-
SimpleWiki
A dataset for identifying whether a desire expressed by a subject in a given short piece of text was fulfilled. -
SQuAD: 100,000+ Questions for Machine Comprehension of Text
SQuAD is a reading comprehension benchmark of more than 100,000 questions posed on Wikipedia articles, where the answer to each question is a span of text from the corresponding passage. -
TreeMix: Compositional Constituency-based Data Augmentation for Natural Langu...
TreeMix is a compositional data augmentation approach for natural language understanding. It leverages constituency parse trees to decompose sentences into sub-structures and... -
Natural Instructions
The Natural Instructions (NI) dataset is used to evaluate the performance of the DEPTH model on natural language understanding tasks. -
Natural Questions
The Natural Questions dataset consists of questions extracted from web queries, with each question accompanied by a corresponding Wikipedia article containing the answer. -
Bing dataset
The Bing dataset is a large-scale dataset for natural language understanding and question answering. -
MS MARCO dataset
The MS MARCO dataset is a large-scale dataset for natural language understanding and question answering.