- Anthropic Helpfulness Base eval: The dataset used in the paper is the Anthropic Helpfulness Base eval dataset.
- Anthropic Helpfulness Base: The datasets used in the paper are the Anthropic Helpfulness Base train dataset and the Anthropic Helpfulness Base eval dataset.
- Dense Reward for Free in RLHF: The dataset used in the paper is not explicitly described; it is noted only to be a preference dataset for language models.
- HIVE: Harnessing Human Feedback for Instructional Visual Editing: The paper Harnessing Human Feedback for Instructional Visual Editing (HIVE) uses a human-feedback dataset collected for instructional visual editing.
- Anthropic HH dataset: The Anthropic HH dataset is a general-purpose preference dataset for helpfulness and harmlessness (a hedged loading sketch appears after this list).
- Training a helpful and harmless assistant with reinforcement learning from human feedback: The authors propose a novel approach that incorporates parameter-efficient tuning to better optimize control tokens, benefiting controllable generation.
- Anthropic's HH-RLHF and OpenAI's summarization datasets: The datasets used in the paper are Anthropic's HH-RLHF dataset and OpenAI's summarization dataset.
- AlpacaFarm: The AlpacaFarm dataset is a large-scale dataset for preference optimization, consisting of instructions and their corresponding responses (see the second sketch after this list).
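For reference, the Anthropic HH data is distributed on the Hugging Face Hub. A minimal sketch of loading it and inspecting one preference pair, assuming the `datasets` library and the `Anthropic/hh-rlhf` repo id, might look like:

```python
# Minimal sketch: load the Anthropic HH preference data and inspect one pair.
# Assumes the `datasets` library is installed and that the data is hosted on
# the Hugging Face Hub under the repo id "Anthropic/hh-rlhf".
from datasets import load_dataset

hh = load_dataset("Anthropic/hh-rlhf")  # DatasetDict with "train" and "test" splits

example = hh["train"][0]
# Each record pairs a preferred ("chosen") and a dispreferred ("rejected")
# conversation transcript for the same prompt.
print(example["chosen"][:200])
print(example["rejected"][:200])
```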
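A similar hedged sketch for AlpacaFarm, assuming the public Hugging Face release under the repo id `tatsu-lab/alpaca_farm` with a human-preference config named `alpaca_human_preference` (both names, and the record schema noted in the comments, are assumptions):

```python
# Hedged sketch: load AlpacaFarm preference data. The repo id
# "tatsu-lab/alpaca_farm" and the config name "alpaca_human_preference"
# are assumptions based on the public Hugging Face release.
from datasets import load_dataset

af = load_dataset("tatsu-lab/alpaca_farm", "alpaca_human_preference")

split = next(iter(af.values()))  # take whichever split the config exposes
row = split[0]
# Records are expected to carry an instruction, two candidate responses
# (e.g. output_1 / output_2), and an annotator preference between them;
# printing the keys confirms the actual schema.
print(sorted(row))
```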