Laion-20M

The dataset used for pre-training the MS-CLIP model, which consists of 20 million image-text pairs filtered from Laion-400M.

Data and Resources

Cite this as

Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan (2024). Dataset: Laion-20M. https://doi.org/10.57702/070kx7rz

DOI retrieved: December 16, 2024

Additional Info

Field Value
Created December 16, 2024
Last update December 16, 2024
Defined In https://doi.org/10.48550/arXiv.2207.12661
Author Haoxuan You
More Authors
Luowei Zhou
Bin Xiao
Noel Codella
Yu Cheng
Ruochen Xu
Shih-Fu Chang
Lu Yuan
Homepage https://github.com/Hxyou/MSCLIP