YFCC15M

At a mid-scale of 15M samples, the dataset strikes a good balance between training cost and performance. It is used for Contrastive Language-Image Pre-training (CLIP) and its variants.
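
The contrastive objective mentioned above can be summarized with a minimal sketch, assuming paired image and text embeddings produced by separate encoders; the function and variable names below are illustrative and not taken from any particular YFCC15M training codebase.

```python
# Minimal sketch of a CLIP-style symmetric contrastive (InfoNCE) loss.
# Assumes image/text encoders have already produced a batch of paired embeddings.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_features: torch.Tensor,
                          text_features: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    # L2-normalize so the dot product acts as cosine similarity.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Pairwise similarity matrix: logits[i, j] = sim(image_i, text_j).
    logits = image_features @ text_features.t() / temperature

    # The matching image/text pair for each row sits on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)       # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)   # text -> image direction
    return (loss_i2t + loss_t2i) / 2

# Example usage with a batch of 8 paired 512-dimensional embeddings.
imgs = torch.randn(8, 512)
txts = torch.randn(8, 512)
print(clip_contrastive_loss(imgs, txts).item())
```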


Cite this as

Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao (2024). Dataset: YFCC15M. https://doi.org/10.57702/9tsbls5f

DOI retrieved: December 2, 2024

Additional Info

Field         Value
Created       December 2, 2024
Last update   December 2, 2024
Defined In    https://doi.org/10.48550/arXiv.2203.05796
Author        Yufeng Cui
More Authors  Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
Homepage      https://github.com/Sense-GVT/DeCLIP