Question Answering

LLM dataset

The paper does not explicitly describe the dataset; it mentions only that it is a large language model (LLM) and that the authors used it to train and evaluate their...

CLEVR-Humans

The CLEVR-Humans dataset consists of 32,164 questions asked by humans, containing words and reasoning steps that were unseen in CLEVR.

CLOSURE

The CLOSURE dataset consists of 25,200 questions with identical vocabulary but different structure than CLEVR, asked on the same set of images.

MMLU dataset

The dataset used in the paper is the Massive Multitask Language Understanding (MMLU) dataset, which consists of 57 tasks from Science, Technology, Engineering, and Math (STEM),...

LLaVA 158k

The LLaVA 158k dataset is a large-scale multimodal learning dataset, which is used for training and testing multimodal large language models.

Multimodal Robustness Benchmark

The Multimodal Robustness (MMR) benchmark is designed to evaluate how well multimodal large language models (MLLMs) comprehend visual content and how robust they are against misleading questions, ensuring models truly leverage multimodal inputs...

Knowledge Graph-Enhanced Large Language Models via Path Selection

Two datasets, MetaQA and FACTKG, are used to evaluate the effectiveness of the proposed method KELP. MetaQA is a critical benchmark dataset containing subsets of questions with...

SQuAD 1.1 and SQuAD 2

The SQuAD 1.1 and SQuAD 2 datasets are used to evaluate the performance of the EQuANt model.

The task is to predict whether the number of edges assigned x is greater than the number of edges assigned y.
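
As a rough illustration of the edge-counting task described in this entry, the sketch below computes the ground-truth label for one graph. It assumes each edge carries a single categorical label and that "x" and "y" are two such labels; the function name `more_x_than_y`, the edge-tuple layout, and the toy graph are hypothetical and are not taken from the dataset itself.

```python
from collections import Counter


def more_x_than_y(edge_labels, x="x", y="y"):
    """Return True if strictly more edges carry label x than label y.

    edge_labels is any iterable of hashable labels (an assumed encoding).
    """
    counts = Counter(edge_labels)
    # Counter returns 0 for labels that never occur, so missing labels are safe.
    return counts[x] > counts[y]


# Hypothetical toy graph: each edge is (source, target, label).
edges = [(0, 1, "x"), (1, 2, "y"), (2, 3, "x"), (3, 0, "z")]
labels = [label for _, _, label in edges]
print(more_x_than_y(labels))
```

A model trained on such a task would be asked to predict this boolean from the graph; ties count as "not greater" under the strict comparison used here.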

Leveraging QA Datasets to Improve Generative Data Augmentation

The paper proposes a method to leverage QA datasets for training generative language models to be context generators for a given question and answer.

VQA-CP

The VQA-CP dataset is a split of the VQA dataset, designed to test generalization skills across changes in the answer distribution between the training and the test sets.

A large annotated corpus for learning natural language inference

QNLI Textual Entailment dataset

The dataset used in this paper is a noisily annotated dataset obtained from a module based on a zero-shot learner.

GraphQueries

The task of Question Answering over Linked Data (QALD) has received increased attention in recent years (see the surveys [14] and [36]). The task consists of mapping natural...

QALD-6

The task of Question Answering over Linked Data (QALD) has received increased attention in recent years (see the surveys [14] and [36]). The task consists of mapping natural...

AMUSE: Multilingual Semantic Parsing for Question Answering over Linked Data

The task of answering natural language questions over RDF data has received wide interest in recent years, in particular in the context of the series of QALD benchmarks. The...

TREC dataset

The dataset used in the paper is the TREC dataset, which consists of 124 queries.

CGQA

The CGQA dataset is a large dataset containing 413 attributes and 674 object categories.

SQuAD: 100,000+ Questions for Machine Comprehension of Text

The SQuAD dataset is a reading-comprehension benchmark of more than 100,000 questions, used to evaluate question answering over text passages.

FarFetched: Entity-centric Reasoning and Claim Validation for the Greek Language

FarFetched is a modular framework that enables people to verify any kind of textual claim based on the incorporated evidence from textual news sources.

105 datasets found