3 datasets found

Tags: Language Model Training

  • UltraRM-13B

    The UltraRM-13B dataset is a collection of human feedback for language model training.
  • AlpacaFarm

    The AlpacaFarm dataset is a large-scale dataset for preference optimization, consisting of instructions and their corresponding responses.
  • Anthropic-HH

    The Anthropic-HH dataset is a collection of human feedback for language model training.
You can also access this registry using the API (see API Docs).