Question Answering - Groups

XOR-ATTRIQA

The XOR-ATTRIQA dataset is a classification task in which the model is asked to predict whether the provided answer to the question is supported by the given passage context, which...

Stanford Human Preferences (SHP)

The Stanford Human Preferences (SHP) dataset is sourced from various Reddit subreddits that focus on QA. Preferences have been extracted from the accumulated up- and...

Pile

The Pile dataset consists of 800 GB of text from 22 domains. Cynical selection naturally prefers text data based on the target corpus.

Simple Question dataset

The dataset used in this paper is a set of categorical probability distributions for a finite set of categories A = {a1,..., ak}. The dataset is used to evaluate the proposed...

CelebA-spoof: Large-scale face anti-spoofing dataset with rich annotations

A face anti-spoofing dataset with rich annotations, focusing on questions with a single entity and relation.

Planning by Automatic Prompt Engineering for Large Language Models Agents

The paper proposes a novel method, REPROMPT, for optimizing the step-by-step instructions in the prompt of LLM agents based on the chat history obtained from interactions with...

SimpQ dataset for Question Answering

The SimpQ dataset contains questions answerable using various knowledge graphs.

SimpleQuestions dataset for Question Answering

The SimpleQuestions dataset contains questions answerable using various knowledge graphs.

WebQuestions dataset for Google Suggest

The WebQuestions dataset contains questions answerable using Google Suggest as the knowledge graph.

VANiLLa

The VANiLLa dataset is a question answering dataset with natural language sentences, focusing on simple questions with a single entity and relation.

MS MARCO Dev (small)

The MS MARCO Dev (small) dataset is a small version of the MS MARCO passage dev set.

RetroMAE

The RetroMAE dataset is used for pre-training retrieval-oriented language models.

TREC 2020 Deep Learning (Passage Subtask)

The TREC 2020 Deep Learning (Passage Subtask) dataset consists of 54 queries with manual judgments from NIST annotators (211 relevance assessments per query, on average).

TREC 2019 Deep Learning (Passage Subtask)

The TREC 2019 Deep Learning (Passage Subtask) dataset consists of 43 manually-judged queries using four relevance grades (215 relevance assessments per query, on average).

SemEval-2013 Task 13

The SemEval-2013 Task 13 dataset contains 20 nouns, 20 verbs, and 10 adjectives in WordNet-sense-tagged contexts.

bAbI story-based QA dataset

The bAbI story-based QA dataset is composed of 20 different tasks, each of which has 1,000 synthetically-generated story-question pairs. A story can be as short as two sentences...

Emergent

The Emergent dataset is derived from a digital journalism project at Columbia University and contains 300 rumored claims and 2,595 news articles.

Semantic communications: Principles and challenges

This dataset has no description

Task-oriented multi-user semantic communications for VQA

This dataset has no description

Genes

The dataset used in this paper is a probabilistic logic programming dataset, which is a probabilistic version of the Genes dataset.

73 datasets found