Dataset - LDM

CC14M

Large-scale image-text dataset for pre-training a collaborative two-stream vision-language model for cross-modal retrieval.
- Dataset
- JSON

CC4M

Large-scale image-text dataset for pre-training a collaborative two-stream vision-language model for cross-modal retrieval.
- Dataset
- JSON

DataComp-1B

The dataset used in the paper is DataComp-1B, a large-scale dataset for training next-generation image-text models.
- Dataset
- JSON

LAION-400M and LAION-5B

The datasets used in the paper are LAION-400M and LAION-5B, which are large-scale datasets for training next-generation image-text models.
- Dataset
- JSON

DataCompDR

The dataset used for CLIP pretraining with high-quality captions.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

5 datasets found