Dataset - LDM

Chinese CLIP

Chinese CLIP is a vision-language pre-training dataset consisting of 100 million image-text pairs.
- Dataset
- JSON
BLIP2

BLIP2 is a vision-language pre-training dataset consisting of 100 million image-text pairs.
- Dataset
- JSON
ALADIN

The ALADIN dataset is a custom dataset created for the ALADIN paper.
- Dataset
- JSON
WebLI Dataset

The WebLI dataset, as used for training and evaluation of the CoBIT model.
- Dataset
- JSON
JFT-4B Dataset

The JFT-4B dataset, as used for training and evaluation of the CoBIT model.
- Dataset
- JSON
ALIGN Dataset

The ALIGN dataset, as used for training and evaluation of the CoBIT model.
- Dataset
- JSON
CoBIT Dataset

The dataset used for training and evaluation of the CoBIT model, which consists of image-text pairs from large-scale noisy web-crawled data and image annotation data.
- Dataset
- JSON
VQAv2

Visual Question Answering (VQA) has achieved great success thanks to the fast development of deep neural networks (DNN). On the other hand, the data augmentation, as one of the...
- Dataset
- JSON
MixGen: A New Multi-Modal Data Augmentation

MixGen: a joint data augmentation for vision-language representation learning to further improve data efficiency.
- Dataset
- JSON
MS-COCO

Large scale datasets [18, 17, 27, 6] boosted text conditional image generation quality. However, in some domains it could be difficult to make such datasets and usually it could...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

10 datasets found