- Multimodal Robustness Benchmark
The MMR benchmark is designed to evaluate MLLMs' comprehension of visual content and robustness against misleading questions, ensuring models truly leverage multimodal inputs...
- CIFAR-10-C and CIFAR-100-C
CIFAR-10-C and CIFAR-100-C are robustness benchmarks consisting of 19 corruption types, each at five severity levels.
- LAV Dataset
The LAV dataset is used to evaluate the robustness of the proposed Penalty-based Imitation Learning with Cross Semantics Generation approach.
- Imbalanced Gradients
The Imbalanced Gradients dataset is a benchmark for evaluating the robustness of deep neural networks.
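
As an illustration of how the corruption-by-severity layout described in the CIFAR-10-C and CIFAR-100-C entry is often summarized, the sketch below averages a model's error rate over every severity level of each corruption, then over all corruptions. It is a minimal example under stated assumptions, not an official evaluation protocol: the function name `average_corruption_error`, the input layout (corruption name mapped to per-severity error rates), and the numbers in the usage lines are all hypothetical, and the result is a plain unnormalized average rather than a baseline-normalized score.

```python
from statistics import mean

# Severity levels 1 through 5, as stated in the benchmark description.
SEVERITY_LEVELS = (1, 2, 3, 4, 5)


def average_corruption_error(errors):
    """Average error rates over severities, then over corruption types.

    `errors` maps a corruption name to a dict of {severity: error_rate},
    where each error rate is a fraction in [0, 1]. This input layout is
    an assumption made for illustration.

    Returns (overall_average, per_corruption_averages).
    """
    if not errors:
        raise ValueError("no corruption results supplied")

    per_corruption = {}
    for name, by_severity in errors.items():
        # Require all five severities so every corruption is weighted equally.
        missing = [s for s in SEVERITY_LEVELS if s not in by_severity]
        if missing:
            raise ValueError(f"{name} is missing severity levels {missing}")
        per_corruption[name] = mean(by_severity[s] for s in SEVERITY_LEVELS)

    overall = mean(per_corruption.values())
    return overall, per_corruption


# Hypothetical error rates for two corruption types (not real results).
example_errors = {
    "gaussian_noise": {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4, 5: 0.5},
    "fog": {1: 0.2, 2: 0.2, 3: 0.2, 4: 0.2, 5: 0.2},
}
overall_error, per_type_error = average_corruption_error(example_errors)
print(f"overall: {overall_error:.3f}")
for corruption, value in per_type_error.items():
    print(f"{corruption}: {value:.3f}")
```

Averaging within each corruption first keeps every corruption type equally weighted. A full evaluation would supply one entry for each of the 19 corruption types.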