Long Range Arena
The Long Range Arena dataset consists of 6 tasks with sequence lengths of 1K-16K steps, spanning modalities and objectives that require similarity, structural, and visuospatial reasoning.
LLM dataset
The dataset used in this paper is not explicitly described; it is mentioned only that the authors used it to train and evaluate their large language model (LLM).
CLEVR-Humans
The CLEVR-Humans dataset consists of 32,164 questions asked by humans, containing words and reasoning steps that were unseen in CLEVR.
MMLU dataset
The dataset used in the paper is the Massive Multitask Language Understanding (MMLU) dataset, which consists of 57 tasks spanning Science, Technology, Engineering, and Math (STEM), the humanities, the social sciences, and other areas.
Semantic communications: Principles and challenges
This dataset has no description
Task-oriented multi-user semantic communications for VQA
This dataset has no description
Alquist 3.0: Alexa Prize Bot Using Conversational Knowledge Graph
The third version of the socialbot Alquist, a conversational system designed to converse coherently and engagingly with humans on popular topics.
FRACAS: A French Annotated Corpus of Attribution relations in news
A manually annotated corpus of 1,676 French newswire texts for quotation extraction and source attribution.
Measuring Massive Multitask Language Understanding
The dataset used in this paper is a multiple-choice question set for evaluating large language models.
LLaVA 158k
The LLaVA 158k dataset is a large-scale multimodal learning dataset used for training and evaluating multimodal large language models.
Multimodal Robustness Benchmark
The MMR benchmark is designed to evaluate MLLMs' comprehension of visual content and their robustness to misleading questions, ensuring models truly leverage multimodal inputs.
Discord Questions: A Computational Approach To Diversity Analysis in News Cov...
This dataset is used in the paper to evaluate the effectiveness of the Annotated Article, Recomposed Article, and Question Grid interfaces in highlighting diversity in news coverage.
Modality-Aware Integration with Large Language Models for Knowledge-based Vis...
Knowledge-based visual question answering (KVQA) has been extensively studied to answer visual questions with external knowledge, e.g., knowledge graphs (KGs).
A dataset of clinically generated visual questions and answers about radiolog...
A dataset of clinically generated visual questions and answers about radiology images.
Med-HallMark
Med-HallMark is a benchmark for detecting and evaluating hallucinations in medical multimodal language models.