Dataset - LDM

BIG-Bench Hard

The BIG-Bench Hard dataset is derived from the original BIG-Bench evaluation suite, focusing on tasks that pose challenges to existing language models.
- Dataset
- JSON
LongPile

LongPile is a diverse dataset derived from the Pile corpus.
- Dataset
- JSON
ChatGPT Language Comprehension and Production

This dataset consists of 12 experiments that explore the extent to which ChatGPT resembles humans in the comprehension and production of language.
- Dataset
- JSON
FACTOR

FACTOR is a benchmark for factuality evaluation of language models.
- Dataset
- JSON
TruthfulQA

TruthfulQA contains 817 questions designed to evaluate language models' tendency to mimic human falsehoods.
- Dataset
- JSON
Edit Distance Robust Watermarks for Language Models

The dataset used in the paper consists of language model outputs, i.e., sequences of tokens generated by a language model.
- Dataset
- JSON
A general theoretical paradigm to understand learning from human preferences

The paper proposes a novel approach to aligning language models with human preferences, focusing on the use of preference optimization in reward-free RLHF.
- Dataset
- JSON
LLaMA: Open and efficient foundation language models

The LLaMA dataset is a large language model dataset used in the paper.
- Dataset
- JSON
Fine-tuning Language Models with Advantage-Induced Policy Alignment

The paper uses two datasets: the Anthropic Helpfulness and Harmlessness dataset and the StackExchange dataset.
- Dataset
- JSON
Mixtral of Experts

The dataset used in the paper for the instruction-following task.
- Dataset
- JSON
HONEST

HONEST is a fairness dataset specifically designed to assess the hurtfulness of LM outputs.
- Dataset
- JSON
FAIRBELIEF

FAIRBELIEF is a language-agnostic analytical approach to capture and assess beliefs embedded in LMs.
- Dataset
- JSON
GreaseLM: Graph Reasoning Enhanced Language Models for Question Answering

GreaseLM: Graph reasoning enhanced language models for question answering
- Dataset
- JSON
Large language models struggle to learn long-tail knowledge

Large language models struggle to learn long-tail knowledge
- Dataset
- JSON
GMEG-wiki and GMEG-yahoo

The GMEG-wiki and GMEG-yahoo datasets are used to evaluate the proposed approach.
- Dataset
- JSON
BEA-2019

The Break-It-Fix-It (BIFI) framework has demonstrated strong results on learning to repair a broken program without any labeled examples.
- Dataset
- JSON
CoNLL-2014

The task of grammatical error correction (GEC) is to map an ungrammatical sentence x_bad into a grammatical version of it, x_good.
- Dataset
- JSON
LM-Critic: Language Models for Unsupervised Grammatical Error Correction

Training a model for grammatical error correction (GEC) requires a set of labeled ungrammatical / grammatical sentence pairs, but manually annotating such pairs can be expensive.
- Dataset
- JSON
GLUE benchmark

The paper does not explicitly describe its dataset, but the authors used three downstream tasks from the GLUE benchmark: Stanford Sentiment Treebank...
- Dataset
- JSON
LAION-5B

A large-scale dataset of image-text pairs for training next-generation image-text models.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

32 datasets found