-
SPAGHETTI: Open-Domain Question Answering
SPAGHETTI: A hybrid open-domain question-answering system that combines semantic parsing and information retrieval to handle structured and unstructured data.
-
LIPID dataset
The LIPID dataset is a template-free dataset for probing models with prompts from the biomedical domain.
-
Google-RE (Templates) dataset
The Google-RE (Templates) dataset contains 6.11K template-based prompts drawn from Wikipedia, covering 3 relations.
-
Comparing Template-based and Template-free Language Model Probing
Template-based probing uses expert-made templates to create prompts, while template-free probing uses naturally occurring text.
-
Long Range Arena
The Long Range Arena dataset consists of 6 tasks with sequence lengths of 1K to 16K tokens, spanning modalities and objectives that require similarity, structural, and visuospatial reasoning.
-
LLM dataset
The dataset used in this paper is not explicitly described, but it is mentioned that it is a large language model (LLM) and that the authors used it to train and evaluate their...
-
CLEVR-Humans
The CLEVR-Humans dataset consists of 32,164 questions asked by humans, containing words and reasoning steps that were unseen in CLEVR.
-
MMLU dataset
The dataset used in the paper is the Massive Multitask Language Understanding (MMLU) dataset, which consists of 57 tasks from Science, Technology, Engineering, and Math (STEM),...
-
Semantic communications: Principles and challenges
This dataset has no description
-
Task-oriented multi-user semantic communications for vqa
This dataset has no description
-
Alquist 3.0: Alexa Prize Bot Using Conversational Knowledge Graph
The third version of the socialbot Alquist, a conversational system designed to converse coherently and engagingly with humans on popular topics.
-
FRACAS: A French Annotated Corpus of Attribution relations in news
A manually annotated corpus of 1676 French newswire texts for quotation extraction and source attribution.
-
Measuring Massive Multitask Language Understanding
The dataset used in this paper is a multiple-choice question set for evaluating large language models.
-
LLaVA 158k
The LLaVA 158k dataset is a large-scale multimodal learning dataset used for training and testing multimodal large language models.
-
Multimodal Robustness Benchmark
The MMR benchmark is designed to evaluate MLLMs' comprehension of visual content and robustness against misleading questions, ensuring models truly leverage multimodal inputs...
-
Discord Questions: A Computational Approach To Diversity Analysis in News Cov...
This dataset is used in the paper to evaluate how effectively the Annotated Article, Recomposed Article, and Question Grid interfaces highlight diversity in news coverage.