Image Perturbation Dataset
The dataset used in the paper is a collection of perturbed images in which participants are asked to identify the point at which an image becomes just noticeably different from the original.
Multimodal Large Language Models Harmlessness Alignment Dataset
The dataset used in the paper to evaluate the harmlessness alignment of multimodal large language models (MLLMs); it consists of 750 harmful instructions paired with...
BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models
The dataset used in the paper to evaluate how effectively the BEEAR method mitigates safety backdoors in instruction-tuned LLMs.
Labeled Faces in the Wild (LFW)
The Labeled Faces in the Wild (LFW) dataset contains 13,233 images from 5,749 identities, with large variations in pose, expression, and illumination.
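A minimal sketch of one common way to access LFW, using scikit-learn's built-in fetcher; the `min_faces_per_person` and `resize` values are illustrative, not settings taken from the source.

```python
# Load a filtered subset of LFW via scikit-learn (downloads on first use).
# Parameter values below are arbitrary examples, not from the source entry.
from sklearn.datasets import fetch_lfw_people

lfw = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
print(lfw.images.shape)       # (n_samples, height, width) grayscale face crops
print(len(lfw.target_names))  # number of identities remaining after filtering
```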
BLUFF: Interactively Deciphering Adversarial Attacks on Deep Neural Networks
BLUFF is an interactive system for visualizing, characterizing, and deciphering adversarial attacks on DNNs.
Adversarial Robustness of Randomized Neural Networks via Gradient Diversity Regularization
Randomized neural networks are vulnerable to adversarial attacks. We investigate the effect of proxy-gradient-based attacks on randomized neural networks and propose a gradient diversity regularization to improve their robustness.
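This is not the paper's code, only a sketch of what a proxy-gradient attack on a randomized network can look like: since a single backward pass through a stochastic model gives a noisy gradient, one common proxy is the average gradient over several random forward passes. The dropout model, sample count, and step size are placeholders.

```python
# Illustrative proxy-gradient sketch: average gradients over the network's
# randomness (here, dropout kept active), then take one signed step.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Dropout(p=0.3), nn.Linear(28 * 28, 10))
model.train()                                    # keep dropout stochastic at attack time

x = torch.rand(1, 1, 28, 28, requires_grad=True)
y = torch.tensor([3])

grad = torch.zeros_like(x)
for _ in range(16):                              # average over random realizations
    loss = F.cross_entropy(model(x), y)
    grad += torch.autograd.grad(loss, x)[0]
proxy_grad = grad / 16                           # the "proxy gradient" used by the attack
x_adv = (x + 0.03 * proxy_grad.sign()).clamp(0, 1)
```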
Torchattacks: A PyTorch Repository for Adversarial Attacks
Torchattacks is a PyTorch library that implements adversarial attacks for generating adversarial examples and verifying the robustness of deep learning models.
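A minimal sketch of the typical Torchattacks workflow, with PGD as one example attack; the model and the `eps`, `alpha`, and `steps` values are illustrative placeholders.

```python
# Generate PGD adversarial examples with Torchattacks.
# The classifier and attack hyperparameters here are arbitrary examples.
import torch
import torchattacks
from torchvision.models import resnet18

model = resnet18(num_classes=10).eval()      # any classifier that returns logits
images = torch.rand(4, 3, 32, 32)            # inputs expected in [0, 1]
labels = torch.randint(0, 10, (4,))

atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
adv_images = atk(images, labels)             # adversarial examples, same shape as inputs
print((adv_images - images).abs().max())     # perturbation stays within eps
```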
FoolHD: Fooling Speaker Identification by Highly Imperceptible Adversarial Disturbances
Speaker identification models are vulnerable to carefully designed adversarial perturbations of their input signals that induce misclassification.
LocalStyleFool
LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything
Perturbation Benchmark
The dataset used in this paper is a perturbation benchmark containing nineteen common corruptions and five adversarial attacks.
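The benchmark's exact corruption pipeline is not described here, so the sketch below only illustrates the general idea of a "common corruption" by applying additive Gaussian noise at a chosen severity; the severity scale and noise levels are assumptions.

```python
# One representative common-corruption transform (additive Gaussian noise).
# Severity levels and noise scales are illustrative, not the benchmark's own.
import numpy as np

def gaussian_noise(image: np.ndarray, severity: int = 3) -> np.ndarray:
    """Add zero-mean Gaussian noise to a uint8 RGB image; severity in 1..5."""
    scale = [0.04, 0.06, 0.08, 0.09, 0.10][severity - 1]
    x = image.astype(np.float32) / 255.0
    noisy = x + np.random.normal(0.0, scale, size=x.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255).astype(np.uint8)

corrupted = gaussian_noise(np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8))
```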
Adversarial Counterfactual Visual Explanations
Counterfactual explanations and adversarial attacks have a related goal: flipping output labels with minimal perturbations regardless of their characteristics.
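This is not the paper's method, only a minimal FGSM-style sketch of the shared mechanic the description points to: a small input perturbation chosen to flip the predicted label. The model and step size are placeholders.

```python
# Flip a classifier's predicted label with one small signed-gradient step.
# The classifier and epsilon below are arbitrary examples.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(num_classes=10).eval()
x = torch.rand(1, 3, 32, 32, requires_grad=True)
y = model(x).argmax(dim=1)                      # current predicted label

loss = F.cross_entropy(model(x), y)
loss.backward()
x_adv = (x + 0.03 * x.grad.sign()).clamp(0, 1)  # minimal perturbation step
print(model(x_adv).argmax(dim=1), y)            # label may flip even for small eps
```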
Robustification of Segmentation Models Against Adversarial Perturbations in Medical Imaging
A defense framework for segmentation models against adversarial attacks in medical imaging.