BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM Safeguards

A structured collection of tests for input-output safeguards, including established failure tests, emerging failure tests, and next-gen architecture tests.

Cite this as

Diego Dorn, Alexandre Variengien, Charbel-Raphaël Segerie, Vincent Corruble (2024). Dataset: BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM Safeguards. https://doi.org/10.57702/1yuivokg

DOI retrieved: December 16, 2024

Additional Info

Field Value
Created December 16, 2024
Last update December 16, 2024
Defined In https://doi.org/10.48550/arXiv.2406.01364
Author Diego Dorn
More Authors
Alexandre Variengien
Charbel-Raphaël Segerie
Vincent Corruble
Homepage https://api.semanticscholar.org/CorpusID:269010944