Conversational AI - Groups

Reddit Comments dataset

The Reddit Comments dataset is constructed from publicly available user comments on Reddit submissions.

Dataset
JSON

DICES-350

The DICES-350 dataset is a curated sample from an 8k multi-turn conversation corpus generated by human agents interacting with a generative AI chatbot (Thoppilan et al., 2022) in an...

Dataset
JSON

ChatGPT: A conversational AI model

The dataset used in the paper ChatGPT: A conversational AI model.

Dataset
JSON

E-commerce Dialogue Corpus

The E-commerce Dialogue Corpus is used for training and testing response selection models for multi-turn conversations.

Dataset
JSON

Douban Conversation Corpus

The Douban Conversation Corpus is used for training and testing response selection models for multi-turn conversations.

Dataset
JSON

DailyDialog

The DailyDialog dataset is a large-scale multi-turn dialogue dataset, consisting of roughly 13,000 conversations with about 8 turns each on average.

Dataset
JSON

EmpatheticDialogues

The EmpatheticDialogues dataset is a text dataset for training empathetic AI chatbots, consisting of 25k conversations grounded in emotional situations with emotion labels.

Dataset
JSON

Ubuntu Dialogue Corpus

The Ubuntu Dialogue Corpus is the largest freely available multi-turn dialogue corpus, consisting of almost one million two-way conversations extracted from the Ubuntu...

Dataset
JSON
