12 datasets found

Groups: Conversational AI; Organizations: No Organization; Formats: JSON

  • Ubuntu Corpus

    The dataset used in the paper is the Ubuntu Corpus, which consists of dialogues from the Ubuntu technical support chat.
  • Baidu TieBa Corpus

    The dataset is used for the context-oriented response selection task, which is treated as a binary classification problem.
  • Topical-Chat

    The Topical-Chat dataset is a knowledge-grounded open-domain conversational dataset, which consists of dialogues between two Mechanical Turk workers (a.k.a. Turkers).
  • Reddit conversation corpus

    The Reddit conversation corpus consists of data extracted from 95 top-ranked subreddits that discuss topics such as sports, news, education, and politics.
  • STC

    The STC dataset is a short-text conversation dataset collected from Sina Weibo, a Chinese social platform.
  • E-commerce Dialogue Corpus

    The dataset is used for training and testing response selection models for multi-turn conversations.
  • Douban Conversation Corpus

    The dataset is used for training and testing response selection models for multi-turn conversations.
  • ConvAI2 persona-chat dataset

    The ConvAI2 persona-chat dataset is an extended version of the persona-chat dataset, which contains conversations obtained from crowdworkers who were randomly paired and asked...
  • Wizard of Wikipedia

    Wizard of Wikipedia is a recent, large-scale dataset of multi-turn knowledge-grounded dialogues between an “apprentice” and a “wizard”, who has access to information from...
  • DailyDialog

    The DailyDialog dataset is a large-scale multi-turn dialogue dataset, consisting of 10,000 conversations with 5 turns each.
  • EmpatheticDialogues

    The EmpatheticDialogues dataset is a text dataset for training empathetic AI chatbots, consisting of 25k conversations grounded in emotional situations with emotion labels.
  • Ubuntu Dialogue Corpus

    The Ubuntu Dialogue Corpus is the largest freely available multi-turn dialogue corpus, consisting of almost one million two-way conversations extracted from the Ubuntu...