-
Multimodal Large Language Models Harmlessness Alignment Dataset
This dataset is used in the paper to evaluate the harmlessness alignment of multimodal large language models (MLLMs). It consists of 750 harmful instructions paired with...
-
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Alpha-CLIP is an enhanced version of CLIP with an auxiliary alpha channel that indicates the regions to attend to, fine-tuned on millions of constructed RGBA region-text pairs.