Preference Optimization - Groups

Anthropic’s Helpfulness and Harmlessness

Anthropic’s Helpfulness and Harmlessness datasets are used for preference optimization. They consist of a set of instructions and their corresponding responses.
- Dataset
- JSON

AlpacaFarm

AlpacaFarm is a large-scale dataset for preference optimization. It consists of a set of instructions and their corresponding responses.
- Dataset
- JSON

2 datasets found