4 datasets found

Formats: JSON

  • SRLLM Training Dataset

    A dataset of annotated text used for training and evaluating the Safe and Responsible Large Language Model (SRLLM).
  • UltraRM-13B

    The UltraRM-13B dataset is a collection of human feedback for language model training.
  • AlpacaFarm

    The AlpacaFarm dataset is a large-scale dataset for preference optimization, consisting of instructions and their corresponding responses.
  • Anthropic-HH

    The Anthropic-HH dataset is a collection of human preference feedback on helpfulness and harmlessness, used for language model training.