1 dataset found

Tags: SafeRLHF

  • SafeRLHF

    The dataset used in the paper for training and testing the DPO and PPO models.

You can also access this registry using the API (see API Docs).