119 datasets found

Tags: Question Answering

  • Visual Question Answering (VQA)

    The VQA dataset consists of 248,349 training questions, 121,512 validation questions and 244,302 testing questions, generated on a total of 123,287 images.
  • NLPbench

    NLPbench is a benchmark for evaluating large language models on solving NLP problems.
  • Natural Questions: A Benchmark for Question Answering Research

    Natural Questions is a large-scale benchmark for question answering research, built from real, anonymized queries issued to the Google search engine, each paired with a Wikipedia page and annotated long and short answers.
  • Causal-VidQA

    Causal-VidQA is a video question answering dataset targeting evidence and causal reasoning; in the paper it is used to evaluate the TranSTR architecture.
  • ActivityNet-QA

    ActivityNet-QA is a video question answering (VideoQA) dataset containing 58,000 human-annotated question-answer pairs on 5,800 complex web videos drawn from ActivityNet.
  • Simple Question dataset

    The dataset used in this paper is a set of categorical probability distributions over a finite set of categories A = {a1, ..., ak}, and is used to evaluate the proposed method.
  • ProKnow-data

    The ProKnow-data dataset is a collection of diagnostic conversations guided by the safety constraints and process knowledge (ProKnow) that healthcare professionals use.
  • QASC

    QASC (Question Answering via Sentence Composition) is a multiple-choice question answering dataset about grade-school science in which answering requires composing facts from multiple sentences.
  • FB15K

    FB15K is a subset of the Freebase knowledge graph containing 14,951 entities and 1,345 relation types, widely used to benchmark knowledge graph embedding and link prediction methods.
  • NarrativeQA

    The NarrativeQA dataset is a reading comprehension challenge over full books and movie scripts, with questions written from human-generated summaries so that answering requires understanding the entire narrative.
  • Legal Document Chatbot

    A legal document chatbot developed using Langchain and Flask, capable of answering questions within the context of the Indian Constitution.
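    The entry above describes a retrieval-based chatbot: a relevant passage is retrieved from the document collection and used to answer the question. A minimal sketch of that retrieve-then-answer flow, with a toy word-overlap retriever standing in for Langchain's vector search (the passages and function names here are hypothetical illustrations, not the project's actual code):

    ```python
    # Toy retrieve-then-answer flow for a document chatbot.
    # A real system would embed passages and pass the retrieved context
    # plus the question to an LLM; here we only do keyword retrieval.

    PASSAGES = [
        "Article 14 guarantees equality before the law to all persons.",
        "Article 19 protects the freedom of speech and expression.",
        "Article 21 protects the right to life and personal liberty.",
    ]

    def tokenize(text):
        """Lowercase and split text into a set of words, dropping punctuation."""
        return {w.strip(".,?!").lower() for w in text.split()}

    def retrieve(question):
        """Return the passage with the largest word overlap with the question."""
        q = tokenize(question)
        return max(PASSAGES, key=lambda p: len(q & tokenize(p)))

    def answer(question):
        # An LLM call would go here; we return the supporting passage instead.
        return retrieve(question)

    print(answer("Which article protects freedom of speech?"))
    ```

    The same shape underlies most retrieval-augmented chatbots: only the retriever (keyword overlap vs. vector similarity) and the answer step (echoing the passage vs. prompting an LLM with it) change.
    
    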
  • Stack Overflow dataset

    The Stack Overflow dataset contains data from a question-answering forum on the topic of computer programming.
  • Hellaswag: Can a machine really finish your sentence?

    HellaSwag is a commonsense inference benchmark for sentence completion, whose machine-generated, adversarially filtered wrong endings are trivial for humans but difficult for models.
  • SuperGLUE

    The dataset used in the paper is the SuperGLUE benchmark, which comprises eight language understanding tasks: BoolQ, CB, COPA, MultiRC, ReCoRD, RTE, WiC, and WSC.
  • LLM dataset

    The dataset used in this paper is not explicitly described; the authors state only that they used a large language model (LLM) to train and evaluate their approach.
  • CLEVR-Humans

    The CLEVR-Humans dataset consists of 32,164 questions asked by humans, containing words and reasoning steps that were unseen in CLEVR.
  • CLOSURE

    The CLOSURE dataset consists of 25,200 questions that use the same vocabulary as CLEVR but different question structures, asked about the same set of images.
  • MMLU dataset

    The dataset used in the paper is the Massive Multitask Language Understanding (MMLU) benchmark, which consists of 57 tasks spanning STEM, the humanities, the social sciences, and other fields.
  • LLaVA 158k

    The LLaVA 158k dataset contains 158K multimodal instruction-following samples used for training and evaluating multimodal large language models.
  • Multimodal Robustness Benchmark

    The MMR benchmark is designed to evaluate MLLMs' comprehension of visual content and their robustness to misleading questions, ensuring that models truly leverage multimodal inputs.