- Multilingual CommonsenseQA
  Multilingual CommonsenseQA (mCSQA) is a dataset for evaluating the commonsense reasoning capabilities of multilingual LMs.
- A survey of reasoning with foundation models
  The paper discusses the challenges of using large language models for reasoning tasks.
- TravelPlanner
  The TravelPlanner dataset is a benchmark for real-world planning with language agents.
- Planning by Automatic Prompt Engineering for Large Language Models Agents
  The paper proposes a novel method, REPROMPT, for optimizing the step-by-step instructions in the prompt of LLM agents based on the chat history obtained from interactions with...
- SemEval-2020 Task 4: Commonsense Validation and Explanation (ComVE)
  Each instance in the SemEval-2020 Task 4: Commonsense Validation and Explanation (ComVE) dataset consists of 10 sentences, including two similar sentences and three options.
- Cyberattack Prediction Through Public Text Analysis and Mini-Theories
  Cyberattack Prediction Through Public Text Analysis and Mini-Theories is a dataset used for training machine learning models to predict cyberattacks.
- Markov Logic Networks
  Markov Logic Networks (MLNs) are probabilistic graphical models that can be used for a variety of tasks, including classification, regression, and clustering.
- DeepPSL: End-to-end perception and reasoning
  DeepPSL is a variant of probabilistic soft logic (PSL) that yields an end-to-end trainable system integrating reasoning and perception.
- bAbI dataset
  The bAbI dataset is a benchmark for in-context learning of chat-based Large Language Models (LLMs). It consists of a sequence of stories that give the model new information, and...
- Kinship dataset
  The Kinship dataset is used for relation discovery and reasoning tasks. Given a set of examples about relations, the model infers the direct relation between two people and...
- Cora and Citeseer datasets
  The Cora and Citeseer datasets are used for training machine learning models to classify documents into different categories.