Language Model Training - Groups - LDM

UltraRM-13B

The UltraRM-13B dataset is a collection of human feedback for language model training.
- Dataset
- JSON

AlpacaFarm

The AlpacaFarm dataset is a large-scale dataset for preference optimization, which consists of a set of instructions and their corresponding responses.
- Dataset
- JSON

Anthropic-HH

The Anthropic-HH dataset is a collection of human feedback for language model training.
- Dataset
- JSON
