-
Quantum Language Model with Entanglement Embedding for Question Answering
The proposed QLM-EE model is applied to the question answering task on two benchmark datasets, TREC-QA and WIKIQA. -
User Reported Scenarios (URS) dataset
The User Reported Scenarios (URS) dataset is a collection of real-world use cases of 15 LLMs, gathered from a user study with 712 participants from 23 countries. -
one-million-reddit-questions
The dataset contains 500 questions drawn from one million open-ended requests posted on AskReddit; 129,483 of the one million requests were identified as asking for help. -
MS MARCO NLGen
The MS MARCO NLGen dataset is a collection of natural language generation tasks, where the goal is to generate natural-sounding answers to questions. -
FactCheckQA
FactCheckQA is a refreshable dataset for probing model performance in trusted source alignment. -
SimpleQuestion Dataset
The SimpleQuestion dataset contains questions answerable using Wikidata as the knowledge graph. -
Collective classification in network data
Collective classification in network data. -
Conditional Generative Matching Model for Multi-lingual Reply Suggestion
A Conditional Generative Matching Model for Multi-lingual Reply Suggestion -
CommonsenseQA
CommonsenseQA is a 5-way multiple-choice QA dataset that requires commonsense knowledge. -
Visual Text Question Answering (VTQA)
Visual Text Question Answering (VTQA) is a new challenge with a corresponding dataset, which includes 23,781 questions based on 10,124 image-text pairs. -
Natural Questions
The Natural Questions dataset consists of questions extracted from web queries, with each question accompanied by a corresponding Wikipedia article containing the answer.