Question Answering - Groups

Contextualized Sequence Likelihood

The authors used several question-answering datasets, including CoQA, TriviaQA, and Natural Questions.

Dataset
JSON

SST-2

The dataset used for the experiments across ten models, ranging from bag-of-words models to pre-trained transformers, and find that a model having higher AUC does not necessarily...

Dataset
JSON

FUNSD dataset

FUNSD (Form Understanding in Noisy Scanned Documents) is a dataset of annotated scanned forms used for form understanding tasks.

Dataset
JSON

CORD dataset

CORD (Consolidated Receipt Dataset) is a dataset of receipt images annotated for post-OCR parsing.

Dataset
JSON

Neural Collaborative Filtering

The dataset is used for neural collaborative filtering, which is a type of collaborative filtering that uses neural networks to learn the relationships between users and items.

Dataset
JSON

MS MARCO: A Human-Generated Machine Reading Comprehension Dataset

MS MARCO is a human-generated machine reading comprehension dataset used for training and evaluating question answering models.

Dataset
JSON

VQAv2

Visual Question Answering (VQA) has achieved great success thanks to the fast development of deep neural networks (DNN). On the other hand, the data augmentation, as one of the...

Dataset
JSON

IMDB-RLHF-Pair dataset

The IMDB-RLHF-Pair dataset is generated from the IMDB dataset; in each pair, the response with positive sentiment is preferred.

Dataset
JSON

Stack-Exchange-Paired dataset

The Stack-Exchange-Paired dataset contains questions and answers from the Stack Overflow dataset, where answers with more votes are preferred.

Dataset
JSON

FAQ dataset

A dataset used for FAQ sentence labeling.

Dataset
JSON

XQuAD

The XQuAD dataset is a multilingual question answering dataset.

Dataset
JSON

TyDi QA

Parameter-efficient fine-tuning (PEFT) using labeled task data can significantly improve the performance of large language models (LLMs) on the downstream task. However, there...

Dataset
JSON

Wizard of Wikipedia

Wizard of Wikipedia is a recent, large-scale dataset of multi-turn knowledge-grounded dialogues between an “apprentice” and a “wizard”, who has access to information from...

Dataset
JSON

Synthetic Data

The dataset used in the paper is a synthetic dataset for off-policy contextual bandits, with contexts x ∈ X, a finite set of actions A, and bounded real rewards r ∈ A → [0, 1].

Dataset
JSON

Visual Dialog

Visual dialog is a multi-round extension for VQA. The interactions between the image and multi-round question-answer pairs (history) are progressively changing, and the...

Dataset
JSON

Context-Aware Graph for Visual Dialog

Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts. This task can refer to the relation...

Dataset
JSON

CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding

Legal case retrieval is a critical process for modern legal information systems. This paper proposes CaseEncoder, a pre-trained encoder that utilizes fine-grained legal...

Dataset
JSON

StackOverflow

The paper discusses the use of multi-objective Bayesian optimization for hyperparameter transfer in topic models.

Dataset
JSON

Generalized Category Discovery with Decoupled Prototypical Network

Generalized Category Discovery (GCD) aims to recognize both known and novel categories from a set of unlabeled data, based on another dataset labeled with only known categories.

Dataset
JSON

MathQA

MathQA is an English mathematical problems dataset at GRE level. The original MathQA dataset is annotated in a different way from Math23k with many pre-defined operations.

Dataset
JSON

416 datasets found