Anthropic red-team dataset

The Anthropic red-team dataset is a significant open-access dataset aimed at improving AI safety; it is used to train preference models and to assess model safety.
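
A minimal loading sketch, under the assumption (not stated on this page) that the red-team transcripts are mirrored in the Anthropic/hh-rlhf repository on the Hugging Face Hub under the "red-team-attempts" directory; the repository name, directory, and field name are assumptions to adjust if the hosting differs.

    # Sketch: load the red-team transcripts via the Hugging Face `datasets` library.
    # Assumed location: Anthropic/hh-rlhf, data_dir="red-team-attempts".
    from datasets import load_dataset

    red_team = load_dataset(
        "Anthropic/hh-rlhf",
        data_dir="red-team-attempts",
        split="train",
    )

    # "transcript" is the assumed field holding one human red-teaming conversation.
    print(red_team[0]["transcript"])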

Cite this as

Bahareh Harandizadeh, Abel Salinas, Fred Morstatter (2024). Dataset: Anthropic red-team dataset. https://doi.org/10.57702/bup1brhp

DOI retrieved: December 2, 2024

Additional Info

Field         Value
Created       December 2, 2024
Last update   December 2, 2024
Defined In    https://doi.org/10.48550/arXiv.2403.14988
Author        Bahareh Harandizadeh
More Authors  Abel Salinas, Fred Morstatter