1 dataset found

Tags: preference models

  • Anthropic red-team dataset

    The Anthropic red-team dataset is an open-access dataset released to support AI safety research; it is intended for training preference models and for assessing the safety of language models.
You can also access this registry using the API (see API Docs).
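As a minimal sketch of programmatic access, the query below searches the registry by tag. The base URL, path, and query parameters are assumptions for illustration only; consult the API Docs for the actual interface.

```python
import requests

# Hypothetical registry endpoint -- replace with the base URL, path,
# and query syntax given in the API Docs.
BASE_URL = "https://example-registry.org/api/v1"

def search_datasets(tag: str) -> list[dict]:
    """Return registry entries carrying the given tag (assumed response shape)."""
    response = requests.get(
        f"{BASE_URL}/datasets",
        params={"tag": tag},
        timeout=30,
    )
    response.raise_for_status()
    # Assumes the API returns JSON of the form {"results": [{"name": ...}, ...]}.
    return response.json()["results"]

if __name__ == "__main__":
    for entry in search_datasets("preference models"):
        print(entry["name"])
```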