Question Answering - Groups

CICERO

The CICERO dataset is used for training and evaluation.
- Dataset
- JSON
HiTab

A hierarchical table dataset for question answering and natural language generation.
- Dataset
- JSON
PandaLM Dataset

The dataset used to train PandaLM, a generative safety evaluator for Chinese.
- Dataset
- JSON
Auto-J Dataset

The dataset used to train Auto-J, a generative safety evaluator for English.
- Dataset
- JSON
Jade Dataset

The dataset used to train Jade, a linguistic-based safety evaluation platform for Chinese.
- Dataset
- JSON
ShieldLM Dataset

The dataset used to train ShieldLM, a generative safety evaluator for English.
- Dataset
- JSON
SAFETY-J Dataset

The dataset used to train SAFETY-J, a bilingual generative safety evaluator for English and Chinese.
- Dataset
- JSON
GreaseLM: Graph Reasoning Enhanced Language Models for Question Answering

GreaseLM: Graph reasoning enhanced language models for question answering
- Dataset
- JSON
Dense Passage Retrieval for Open-Domain Question Answering

Dense passage retrieval for open-domain question answering
- Dataset
- JSON
Large language models struggle to learn long-tail knowledge

Large language models struggle to learn long-tail knowledge
- Dataset
- JSON
Semantics in Question Answering

Semantic parsing on Freebase from question-answer pairs
- Dataset
- JSON
Automatic Question-Answer Generation for Long-Tail Knowledge

Automatic Question-Answer Generation for Long-Tail Knowledge
- Dataset
- JSON
Off-Topic Memento Dataset

The dataset used to evaluate the effectiveness of different similarity measures for identifying off-topic mementos.
- Dataset
- JSON
Forgetting in Answer Set Programming

The dataset used in the paper is a set of answer set programs and their corresponding V-HT-models.
- Dataset
- JSON
MSRVTT

MSRVTT is a large-scale dataset for video captioning. It contains 10k video clips, each accompanied by 20 human-edited English sentence descriptions,...
- Dataset
- JSON
ELI5, FinanceQA, MultiNews, and QMSum datasets

The ELI5, FinanceQA, MultiNews, and QMSum datasets were used in the paper.
- Dataset
- JSON
CoQA

The CoQA dataset is a benchmark for question answering research. It consists of conversational questions.
- Dataset
- JSON
SearchSnippets

The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.
- Dataset
- JSON
MS MARCO Passage Ranking (MARCO Dev Passage)

Dense retrieval (DR) has shown promising results in information retrieval. In essence, DR requires high-quality text representations to support effective search in the...
- Dataset
- JSON
Singapore Rapid Transit Systems Regulations

Singapore Rapid Transit Systems Regulations is a collection of regulations proclaimed by the Singapore government.
- Dataset
- JSON

536 datasets found