Dataset - LDM

DailyDialog: A Manually Labelled Multi-Turn Dialogue Dataset

- Dataset
- JSON
IEMOCAP dataset

The IEMOCAP dataset contains five recording sessions, each with one male speaker and one female speaker.
- Dataset
- JSON
MultiWOZ dataset

MultiWOZ is a human-human task-oriented dialogue dataset collected via the Wizard-of-Oz framework. It contains conversations...
- Dataset
- JSON
SMCalFlow

Dialogue datasets for training and evaluating task-oriented dialogue models.
- Dataset
- JSON
Taskmaster

Dialogue datasets for training and evaluating task-oriented dialogue models.
- Dataset
- JSON
MultiWOZ 2.0, CamRest676, SMCalFlow

Dialogue datasets for training and evaluating task-oriented dialogue models.
- Dataset
- JSON
KVRET*

Dialogue contexts have proven helpful in spoken language understanding (SLU) systems, and they are typically encoded with explicit memory representations. However, most of the...
- Dataset
- JSON
KVRET

Dialogue contexts have proven helpful in spoken language understanding (SLU) systems, and they are typically encoded with explicit memory representations. However, most of the...
- Dataset
- JSON
Multi-Session Chat

The Multi-Session Chat dataset is used to evaluate the long-term memory of conversational agents.
- Dataset
- JSON
DBDC4 Japanese

The DBDC4 Japanese dataset contains dialogues from three dialogue systems named DCM, DIT, and IRS, and five other dialogue systems (IRS, MMK, MRK, TRF, and ZNK) which...
- Dataset
- JSON
DBDC4

The Fourth Dialogue Breakdown Detection Challenge (DBDC4) dataset contains dialogues from a dialogue system named IRIS and six other dialogue systems (anonymised as Bot001 to...
- Dataset
- JSON
MultiWOZ-DF

A dataflow implementation of the MultiWOZ dataset
- Dataset
- JSON
Improving open-domain dialogue systems via multi-turn incomplete utterance re...

Improving open-domain dialogue systems via multi-turn incomplete utterance restoration.
- Dataset
- JSON
Mining Clues from Incomplete Utterance: A Query-enhanced Network for Incomple...

Incomplete utterance rewriting has recently attracted wide attention. However, previous work does not consider the semantic structural information between the incomplete utterance and...
- Dataset
- JSON
Stanford Multi-turn, Multi-domain Dialogue Dataset

The Stanford Multi-turn, Multi-domain Dialogue Dataset is a dataset for language understanding in task-oriented dialogue systems. It contains a large number of training...
- Dataset
- JSON
Airline Travel Information System dataset (ATIS)

The Airline Travel Information System dataset (ATIS) is a dataset for language understanding in task-oriented dialogue systems. It contains 4978 training utterances from Class A...
- Dataset
- JSON
Topical-Chat

The Topical-Chat dataset is a knowledge-grounded open-domain conversational dataset, which consists of dialogues between two Mechanical Turk workers (a.k.a. Turkers).
- Dataset
- JSON
Grounded response generation task at DSTC7

- Dataset
- JSON
Schema-Guided Dialogue

The Schema-Guided Dialogue (SGD) dataset contains over 20,000 multi-domain conversations between a human and a virtual assistant.
- Dataset
- JSON
Empathetic Dialogue dataset

The Empathetic Dialogue dataset is a dataset of conversations related to daily life, each with an emotion label, a situation described in text, and a short two-party dialogue.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

35 datasets found