Dataset - LDM

OpenAssistant dataset

The dataset used for the experiments in the paper, consisting of 1000 benign instruction examples.
- Dataset
- JSON
DSTC-FRAMES-ENHI

DSTC-FRAMES-ENHI is an extended dataset containing a total of 37,785 samples and 7 entities with 1,106 unique entity values (with IOB prefixes).
- Dataset
- JSON
DSTC-FRAMES-EN

A combined dataset formed from two public English task-oriented conversational datasets, from the travel and restaurant domains respectively.
- Dataset
- JSON
CoQA

The CoQA dataset is a benchmark for question answering research. It consists of conversational questions.
- Dataset
- JSON
Wizard of Wikipedia

Wizard of Wikipedia is a recent, large-scale dataset of multi-turn knowledge-grounded dialogues between an “apprentice” and a “wizard”, who has access to information from...
- Dataset
- JSON
DailyDialog

The DailyDialog dataset is a large-scale multi-turn dialogue dataset, consisting of 10,000 conversations with 5 turns each.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

6 datasets found