LLM Safety - Groups - LDM

Do-Not-Answer dataset

The Do-Not-Answer dataset is designed to test the safety performance of Large Language Models (LLMs).
- Dataset
- JSON

HH-RLHF dataset

The HH-RLHF dataset is used to evaluate the performance of the proposed Compositional Preference Models (CPMs).
- Dataset
- JSON
