Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Alpha-CLIP is an enhanced version of CLIP that takes an auxiliary alpha channel indicating the regions to attend to. It is fine-tuned on millions of constructed RGBA region-text pairs.
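
To make the idea of an RGBA region-text pair concrete, here is a minimal sketch in plain Python. It is not the authors' data pipeline. The helper name `make_rgba_region_pair`, the bounding-box region format, and the binary 0.0/1.0 alpha values are all assumptions chosen for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def make_rgba_region_pair(
    rgb: List[List[Pixel]],
    box: Tuple[int, int, int, int],
    text: str,
) -> dict:
    """Attach a binary alpha channel marking `box` and pair it with `text`.

    `box` is (top, left, bottom, right), end-exclusive. The box format and
    the 0.0/1.0 alpha values are illustrative assumptions, not the
    dataset's actual encoding.
    """
    top, left, bottom, right = box
    rgba = [
        [
            # Alpha is 1.0 inside the region of interest and 0.0 elsewhere.
            (r, g, b, 1.0 if top <= y < bottom and left <= x < right else 0.0)
            for x, (r, g, b) in enumerate(row)
        ]
        for y, row in enumerate(rgb)
    ]
    return {"rgba": rgba, "text": text}


# A 2x3 toy image in which only the middle column is marked as the region.
image = [[(10, 20, 30)] * 3 for _ in range(2)]
pair = make_rgba_region_pair(image, box=(0, 1, 2, 2), text="a toy region")
```

In this sketch the RGB values are left untouched and the region is carried only by the fourth channel, so the surrounding image context stays available alongside the description.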

Data and Resources

Cite this as

Zeyi Sun, Ye Fang, Tong Wu, Pan Zhang, Yuhang Zang, Shu Kong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang (2024). Dataset: Alpha-CLIP: A CLIP Model Focusing on Wherever You Want. https://doi.org/10.57702/tx2bj9wp

DOI retrieved: December 3, 2024

Additional Info

Field: Value
Created: December 3, 2024
Last update: December 3, 2024
Defined In: https://doi.org/10.48550/arXiv.2312.03818
Author: Zeyi Sun
More Authors: Ye Fang, Tong Wu, Pan Zhang, Yuhang Zang, Shu Kong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang
Homepage: https://aleafy.github.io/alpha-clip