Dataset - LDM

Chinese CLIP

Chinese CLIP is a vision-language pre-training dataset consisting of 100 million image-text pairs.
- Dataset
- JSON

Sticker820K

Sticker820K is a large-scale Chinese sticker dataset consisting of 820k image-text pairs. Each sticker has rich, high-quality textual annotations, including descriptions,...
- Dataset
- JSON

BLIP2

BLIP2 is a vision-language pre-training dataset consisting of 100 million image-text pairs.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
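The API's endpoints and response schema are not described on this page, so the sketch below does not call it. It only illustrates how the three entries listed above might be held and filtered locally once retrieved. The `DatasetEntry` class, its field names, and the `search` and `with_tag` helpers are all hypothetical and are not part of the registry's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    # Hypothetical record layout; the real API schema may differ.
    name: str
    description: str
    tags: tuple


# The three entries exactly as listed on this page.
REGISTRY = [
    DatasetEntry(
        name="Chinese CLIP",
        description="A vision-language pre-training dataset "
                    "consisting of 100 million image-text pairs.",
        tags=("Dataset", "JSON"),
    ),
    DatasetEntry(
        name="Sticker820K",
        description="A large-scale Chinese sticker dataset "
                    "consisting of 820k image-text pairs.",
        tags=("Dataset", "JSON"),
    ),
    DatasetEntry(
        name="BLIP2",
        description="A vision-language pre-training dataset "
                    "consisting of 100 million image-text pairs.",
        tags=("Dataset", "JSON"),
    ),
]


def search(entries, keyword):
    """Return entries whose name or description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        e for e in entries
        if needle in e.name.lower() or needle in e.description.lower()
    ]


def with_tag(entries, tag):
    """Return entries carrying the given tag."""
    return [e for e in entries if tag in e.tags]


if __name__ == "__main__":
    for entry in search(REGISTRY, "sticker"):
        print(entry.name)
```

Matching on both name and description keeps the lookup forgiving: a query such as "vision-language" finds entries by what they describe, not only by their title.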

3 datasets found