-
Neural User Simulator
The Neural User Simulator (NUS) is a tool for training dialogue managers in spoken dialogue systems. -
Grounded response generation task at DSTC7 -
Schema-Guided Dialogue
The Schema-Guided Dialogue (SGD) dataset contains over 20,000 multi-domain conversations between a human and a virtual assistant. -
Action-Based Conversations Dataset
The Action-Based Conversations Dataset (ABCD) contains over 10,000 human-to-human customer service dialogues across multiple domains. -
Schema-Guided Dataset (SGD)
Schema-Guided Dataset (SGD) is the official dataset for the schema-guided state tracking challenge at DSTC8. The schema-guided dataset consists of 20 domains with a total of 45... -
MultiWOZ 2.0 and MultiWOZ 2.1
Dialogue state tracking (DST) aims to estimate the current dialogue state given the entire preceding conversation. For multi-domain DST, the data sparsity problem is a major... -
OpenSubtitles and DailyDialog
Open-domain dialogue datasets: OpenSubtitles and DailyDialog. OpenSubtitles is a collection of movie subtitles and originally contains over 2 billion utterances. DailyDialog... -
ATIS dataset
The ATIS dataset is a benchmark dataset for spoken language understanding, consisting of audio recordings and corresponding manual transcripts about humans asking for flight... -
IPA dataset
The IPA dataset contains a set of Chinese utterances that were collected and annotated in the development process of a commercialized Intelligent Personal Assistant (IPA) named... -
OSQ dataset
The OSQ dataset covers 150 in-domain (IND) intents and also provides a set of manually labeled Out-of-Scope Queries (OSQ) that the current system does not support. -
Colors in Context (CIC) task
The CIC dataset is built around a referential communication task in which a director identifies a target color patch for a matcher. The dataset is used to analyze the communication trade-offs... -
Towards Empathetic Open-Domain Conversation Models: A New Benchmark and Dataset
A dialogue dataset for open-domain conversation models. -
Personalizing Dialogue Agents: I Have a Dog, Do You Have Pets Too?
A dialogue dataset for personalizing dialogue agents. -
PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior
A dialogue dataset with photo sharing behavior for joint image-text modeling. -
Constructing Multi-Modal Dialogue Dataset by Replacing Text with Semantically...
A multi-modal dialogue dataset created by replacing text with semantically relevant images. -
DialogCC: Large-Scale Multi-Modal Dialogue Dataset
A large-scale multi-modal dialogue dataset created with an automatic pipeline that filters candidates by CLIP similarity.