Dataset - LDM

WebKB

The dataset used in this paper is a probabilistic logic programming dataset, which is a probabilistic version of the WebKB dataset.
- Dataset
- JSON
FKTC

FKTC is a test set for evaluating the factual knowledge of large language models. It contains 210,158 prompts in total.
- Dataset
- JSON
SemEval-2021 task 4

The dataset used in the paper for the question answering task.
- Dataset
- JSON
ZJUKLAB at SemEval-2021 task 4

The dataset used in the paper on negative augmentation with a language model for reading comprehension of abstract meaning.
- Dataset
- JSON
Wikidata

The dataset used in the paper is Wikidata, which contains a large number of entities and their corresponding semantic types.
- Dataset
- JSON
BREAK

The BREAK dataset contains Question Decomposition Meaning Representation (QDMR) annotations.
- Dataset
- JSON
DROP

The DROP dataset contains complex compositional questions over natural language passages describing football games and historical events.
- Dataset
- JSON
MovieQA, TVQA, AVSD, EQA, Embodied QA

A collection of datasets for visual question answering, including MovieQA, TVQA, AVSD, EQA, and Embodied QA.
- Dataset
- JSON
Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management

A question answering dataset for measuring social bias in pain management.
- Dataset
- JSON
DuReader

DuReader is a Chinese machine reading comprehension dataset focusing on real-world web data.
- Dataset
- JSON
MS-MARCO

MS-MARCO is a large-scale question answering dataset focusing on real-world web data.
- Dataset
- JSON
CICERO

The CICERO dataset is used for training and evaluation.
- Dataset
- JSON
HiTab

A hierarchical table dataset for question answering and natural language generation.
- Dataset
- JSON
Pandalm Dataset

The dataset used to train Pandalm, a generative safety evaluator for Chinese.
- Dataset
- JSON
Auto-J Dataset

The dataset used to train Auto-J, a generative safety evaluator for English.
- Dataset
- JSON
Jade Dataset

The dataset used to train Jade, a linguistic-based safety evaluation platform for Chinese.
- Dataset
- JSON
ShieldLM Dataset

The dataset used to train ShieldLM, a generative safety evaluator for English.
- Dataset
- JSON
SAFETY-J Dataset

The dataset used to train SAFETY-J, a bilingual generative safety evaluator for English and Chinese.
- Dataset
- JSON
MSRVTT

MSRVTT is a large-scale dataset for video captioning. It contains 10k video clips, and each clip is accompanied by 20 human-edited English sentence descriptions,...
- Dataset
- JSON
CoQA

The CoQA dataset is a benchmark for question answering research. It consists of conversational questions.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

196 datasets found
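
The API mentioned above is not documented in this listing. As a rough sketch, assuming the registry exposes a CKAN-style Action API (an assumption, not confirmed here), a dataset search could be built and parsed as follows. The base URL `registry.example.org` and the helper names `build_search_url` and `parse_search_response` are hypothetical; consult the API Docs for the actual endpoints.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL: the real registry address is not given in this listing.
BASE_URL = "https://registry.example.org"


def build_search_url(query, rows=20, start=0, base_url=BASE_URL):
    """Build a search URL, assuming a CKAN-style package_search endpoint."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{base_url}/api/3/action/package_search?{params}"


def parse_search_response(body):
    """Return (total_count, dataset_titles) from a package_search JSON body.

    Assumes the CKAN response envelope: {"success": ..., "result": {...}}.
    """
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("API reported failure")
    result = payload["result"]
    # Fall back to the machine-readable name when a title is missing.
    titles = [pkg.get("title", pkg.get("name")) for pkg in result["results"]]
    return result["count"], titles
```

If the assumption holds, the body returned by `urllib.request.urlopen(build_search_url("question answering")).read()` could be passed to `parse_search_response`, and paging through all results would mean increasing `start` in steps of `rows`.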