BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM Safeguards

doi:10.57702/1yuivokg

A structured collection of tests for input-output safeguards, including established failure tests, emerging failure tests, and next-gen architecture tests.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset with its distributions, based on DCAT.

Cite this as

Diego Dorn, Alexandre Variengien, Charbel-Raphaël Segerie, Vincent Corruble (2024). Dataset: BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM Safeguards. https://doi.org/10.57702/1yuivokg

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2406.01364
Author	Diego Dorn
More Authors	Alexandre Variengien, Charbel-Raphaël Segerie, Vincent Corruble
Homepage	https://api.semanticscholar.org/CorpusID:269010944