Anthropic-HH-RLHF Dataset

The paper uses the Anthropic-HH-RLHF dataset, a human preference dataset for training helpful and harmless assistants with reinforcement learning from human feedback (RLHF).

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Anthropic (2024). Dataset: Anthropic-HH-RLHF Dataset. https://doi.org/10.57702/vwhenrmb

DOI retrieved: December 2, 2024

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Author	Anthropic
Homepage	https://huggingface.co/datasets/Anthropic/hh-rlhf
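
For readers who follow the homepage link, the sketch below illustrates how a single preference record might be split into dialogue turns. It is not taken from this page: the `chosen` and `rejected` field names, the `Human:` / `Assistant:` turn markers, the sample record, and the helper `split_turns` are all assumptions made for illustration, so check them against the dataset card before relying on them.

```python
import re

# Hypothetical sample record. The "chosen"/"rejected" field names and the
# "\n\nHuman: " / "\n\nAssistant: " turn markers are assumptions, not
# something stated on this catalog page.
record = {
    "chosen": "\n\nHuman: What is 2 + 2?\n\nAssistant: 2 + 2 equals 4.",
    "rejected": "\n\nHuman: What is 2 + 2?\n\nAssistant: I am not sure.",
}

# One capture group, so re.split keeps the speaker names in its output.
TURN_PATTERN = re.compile(r"\n\n(Human|Assistant): ")


def split_turns(transcript):
    """Split a transcript into a list of (speaker, text) pairs."""
    parts = TURN_PATTERN.split(transcript)
    # parts[0] is whatever precedes the first marker (usually empty);
    # after that the list alternates speaker, text, speaker, text, ...
    turns = []
    for i in range(1, len(parts) - 1, 2):
        turns.append((parts[i], parts[i + 1].strip()))
    return turns


chosen_turns = split_turns(record["chosen"])
rejected_turns = split_turns(record["rejected"])

# The two transcripts in this sample share a prompt and differ only in
# the final assistant reply.
print(chosen_turns[-1])
print(rejected_turns[-1])
```

If the Hugging Face `datasets` library is installed, real records could presumably be obtained with `load_dataset("Anthropic/hh-rlhf")` and passed through the same helper. That step is left out here so the sketch runs without network access.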