196 datasets found

Tags: question answering

  • Open Book Question Answering

    Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
  • OpenHermes 2.5

    OpenHermes 2.5: An Open Dataset of Synthetic Data for Generalist LLM Assistants
  • Wiki-BM

    The Wiki-BM dataset is a benchmark for the Split and Rephrase task, consisting of Wikipedia data.
  • BiSECT

    The BiSECT dataset is a benchmark for the Split and Rephrase task, consisting of bitexts.
  • WebSplit

    The WebSplit dataset is a benchmark for the Split and Rephrase task, consisting of RDF semantic tuples.
  • WirelessLLM

    The WirelessLLM dataset contains questions and answers about wireless communication systems, focusing on questions with a single entity and relation.
  • SPEC5G

    The SPEC5G dataset contains questions and answers about 5G cellular network protocols, focusing on questions with a single entity and relation.
  • StandardsQA

    The StandardsQA dataset contains questions and answers about telecommunications standards, focusing on questions with a single entity and relation.
  • TeleQnA

    The TeleQnA dataset contains 10,000 multiple-choice questions and answers covering telecommunications knowledge.
  • TruthfulQA

    The TruthfulQA dataset contains 817 questions designed to measure whether language models mimic common human falsehoods.
  • BioASQ

    The BioASQ dataset contains biomedical questions and answers grounded in documents from the biomedical literature, such as PubMed articles.
  • BioASQ 2016 Task 4b

    The BioASQ 2016 Task 4b dataset contains biomedical questions along with their relevant documents.
  • SciFact

    The SciFact dataset is a collection of expert-written scientific claims paired with evidence-containing abstracts for fact verification.
  • Yin-Yang dataset

    The Yin-Yang dataset is a benchmark for evaluating the performance of question answering systems.
  • Natural Questions and TriviaQA

    Natural Questions and TriviaQA are two popular datasets for open-domain question answering.
  • FEVER: A Large-Scale Dataset for Fact Extraction and Verification

    The FEVER dataset consists of 185,445 annotated claims, together with 5,416,537 Wikipedia documents containing roughly 25 million sentences as potential evidence.
  • BEIR

    The BEIR benchmark is a heterogeneous suite of datasets for zero-shot evaluation of information retrieval models, spanning tasks such as question answering, fact checking, and duplicate-question retrieval.
  • TREC Deep Learning track

    The TREC Deep Learning track provides benchmarks for passage and document retrieval and ranking, with queries drawn from MS MARCO.
  • MT-bench

    MT-bench is a benchmark of 80 challenging multi-turn questions whose responses are scored automatically by an LLM judge.
  • QQP

    The Quora Question Pairs (QQP) dataset consists of over 400,000 question pairs labeled as paraphrase or non-paraphrase.
You can also access this registry using the API (see API Docs).
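
As a rough illustration of programmatic access, the sketch below queries a registry search endpoint for datasets tagged "question answering". The base URL, path, parameter names, and response schema are assumptions made for illustration only; the actual interface is defined in the API Docs.

    # Minimal sketch of querying the dataset registry API.
    # NOTE: the endpoint, parameter names, and response schema below are
    # assumptions for illustration; see the API Docs for the real interface.
    import requests

    BASE_URL = "https://example.org/api/datasets"  # hypothetical endpoint

    def search_datasets(tag, page=1):
        """Return one page of registry entries matching a tag (assumed schema)."""
        response = requests.get(BASE_URL, params={"tag": tag, "page": page}, timeout=10)
        response.raise_for_status()
        return response.json()

    if __name__ == "__main__":
        results = search_datasets("question answering")
        # Assumed shape: {"count": 196, "results": [{"name": ..., "description": ...}, ...]}
        for item in results.get("results", []):
            print(f"{item.get('name')}: {item.get('description')}")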