- SRLLM Training Dataset
  A dataset of annotated text used to train and evaluate the Safe and Responsible Large Language Model (SRLLM).
- UltraRM-13B
  A 13B-parameter reward model from the UltraFeedback project, trained on preference data and used to score responses for language model training.
- AlpacaFarm
  A dataset for preference optimization, consisting of instructions and their corresponding responses.
- Anthropic-HH
  A collection of human preference data on helpfulness and harmlessness, used for language model training.