12 datasets found

Formats: JSON | Tags: natural language understanding

  • SimpleWiki

    A dataset for the task of identifying whether a desire expressed by a subject in a short piece of text was fulfilled.
  • SQuAD: 100,000+ Questions for Machine Comprehension of Text

    The SQuAD dataset is a benchmark for natural language understanding, centered on reading comprehension: models answer questions about Wikipedia passages.
  • MASSIVE

    The MASSIVE dataset is a comprehensive collection of approximately one million annotated utterances for natural language understanding tasks such as slot filling and intent classification.
  • TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding

    TreeMix is a compositional data augmentation approach for natural language understanding. It leverages constituency parsing trees to decompose sentences into sub-structures and recombines them to generate new augmented sentences.
  • CoQA

    The CoQA dataset is a benchmark for conversational question answering: questions are posed in a dialogue and answered from a given passage.
  • Natural Instructions

    The Natural Instructions (NI) dataset is used for evaluating the performance of the DEPTH model on natural language understanding tasks.
  • SQuAD

    The dataset used in the paper is a multiple-choice reading comprehension dataset consisting of a passage, a question, and an answer. The passage is a script, and the question is a...
  • Natural Questions

    The Natural Questions dataset consists of questions extracted from web queries, with each question accompanied by a corresponding Wikipedia article containing the answer.
  • Bing dataset

    The Bing dataset is a large-scale dataset for natural language understanding and question answering.
  • MS MARCO dataset

    The MS MARCO dataset is a large-scale machine reading comprehension dataset built from real Bing search queries, used for natural language understanding and question answering.
  • SQuAD 2.0

    The SQuAD 2.0 dataset is a challenging benchmark for natural language processing that requires a machine to read, understand, and answer questions about a text. The dataset...
  • GLUE

    GLUE (General Language Understanding Evaluation) is a multi-task benchmark comprising nine sentence- and sentence-pair natural language understanding tasks.
You can also access this registry using the API (see API Docs).
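
Since the registry is accessible programmatically, a filtered query like the one above can be reproduced with a short script. The sketch below is illustrative only: the endpoint URL, parameter names, and response fields are assumptions, not the registry's documented API; consult the API Docs for the actual interface.

```python
# Minimal sketch of querying the registry API for datasets matching a
# format and tag filter. The endpoint, parameters, and response shape
# are hypothetical; see the registry's API Docs for the real interface.
import requests

BASE_URL = "https://example-registry.org/api/datasets"  # hypothetical endpoint


def search_datasets(tag: str, fmt: str) -> dict:
    """Fetch datasets matching a tag and format filter."""
    response = requests.get(BASE_URL, params={"tags": tag, "format": fmt})
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    # Reproduce the listing above: JSON-format NLU datasets.
    results = search_datasets("natural language understanding", "JSON")
    for dataset in results.get("results", []):
        print(dataset.get("name"), "-", dataset.get("description", "")[:80])
```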