Anthropic HH dataset

The Anthropic HH (Helpful and Harmless) dataset is a general-purpose preference dataset covering helpfulness and harmlessness.

Cite this as

Aaron J. Li, Satyapriya Krishna, Himabindu Lakkaraju (2024). Dataset: Anthropic HH dataset. https://doi.org/10.57702/zad85q4d

DOI retrieved: December 16, 2024

Additional Info

Field          Value
Created        December 16, 2024
Last update    December 16, 2024
Defined In     https://doi.org/10.48550/arXiv.2404.18870
Author         Aaron J. Li
More Authors   Satyapriya Krishna, Himabindu Lakkaraju
Homepage       https://github.com/aaron-jx-li/rlhf-trustworthiness