1 dataset found

Tags: jailbreak attack

  • AdvBench

    Dataset used in the Gradient Cuff paper to evaluate that method's detection of jailbreak attacks on large language models.

You can also access this registry using the API (see API Docs).