Dataset - LDM

COMET

COMET is a model for commonsense reasoning that can generate coherent and contextually relevant text.
- Dataset
- JSON

CommonsenseQA and OpenBookQA

CommonsenseQA and OpenBookQA are two of the most widely used commonsense reasoning benchmarks.
- Dataset
- JSON

CSQA

The CSQA dataset is a widely used benchmark for conversational KBQA, consisting of around 200K dialogues, where the training, validation, and testing sets contain 153K,...
- Dataset
- JSON

Jericho

A dataset of 32 interactive fiction games, including dungeon crawl, sci-fi, mystery, comedy, and horror games.
- Dataset
- JSON

StrategyQA

The StrategyQA dataset is used to evaluate the ability of LLMs to generate accurate answers to multi-step reasoning questions.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

5 datasets found