- Comprehensive Assessment of Jailbreak Attacks against LLMs
  The Comprehensive Assessment of Jailbreak Attacks against LLMs dataset is used to evaluate the effectiveness of jailbreak attacks on language models.
- Unified Multi-scenario Summarization Evaluation Model
  UMSE is a unified multi-scenario summarization evaluation framework that can perform semantic evaluation on three typical evaluation scenarios: Sum-Ref, Sum-Doc, and Sum-Doc-Ref...
- Pick-a-Pic
  A large dataset of text-to-image prompts for training and evaluation.
- Kodak True Color Image Suite
  The Kodak True Color Image Suite is a dataset of images used for evaluating image compression algorithms.
- Forbidden Question Dataset
  A dataset used to evaluate the effectiveness of different jailbreak attack methods against LLMs. It contains 160 highly diverse forbidden questions.
- Jailbreak Attack Dataset
  The dataset used in the paper to evaluate the effectiveness of different jailbreak attack methods against Large Language Models (LLMs).
- Do-Not-Answer dataset
  The Do-Not-Answer dataset is designed to test the safety performance of Large Language Models (LLMs).
- PhilEO Bench
  A dataset for evaluating geospatial foundation models.
- ParaPrompts-400
  Datasets for evaluating the performance of text-to-image models, namely ViLG-300 and ParaPrompts-400.
- ViLG-300 and ParaPrompts-400
  Datasets for evaluating the performance of text-to-image models, namely ViLG-300 and ParaPrompts-400.