AdvBench

This is the dataset used in the paper to test Gradient Cuff, a method for detecting jailbreak attacks on large language models.

BibTeX: