Laion-20M

doi:10.57702/070kx7rz

Laion-20M is the dataset used to pre-train the MS-CLIP model. It consists of 20 million image-text pairs filtered from LAION-400M.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Haoxuan You, Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan (2024). Dataset: Laion-20M. https://doi.org/10.57702/070kx7rz

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2207.12661
Author	Haoxuan You
More Authors	Luowei Zhou, Bin Xiao, Noel Codella, Yu Cheng, Ruochen Xu, Shih-Fu Chang, Lu Yuan
Homepage	https://github.com/Hxyou/MSCLIP