-
CC12M dataset
The CC12M dataset is used for training and testing the proposed method. It contains about 12 million image-caption pairs.
-
Flickr8K-Expert dataset
The Flickr8K-Expert dataset is used to evaluate the proposed method. It contains 8,000 images with 8,000 captions.
-
Conceptual Captions 12M and RedCaps
The datasets used in the paper are Conceptual Captions 12M (CC12M) and RedCaps.
-
Conceptual Captions 3M, Conceptual Captions 12M, RedCaps, and LAION-400M
The datasets used in the paper are Conceptual Captions 3M (CC3M), Conceptual Captions 12M (CC12M), RedCaps, and LAION-400M.
-
Conceptual Captions
This dataset is used in the paper "Scaling Laws of Synthetic Images for Model Training" for supervised image classification and zero-shot classification tasks.
-
MSCOCO dataset
The MSCOCO dataset is a large-scale image captioning dataset containing 113,287 training images, 5,000 validation images, and 5,000 test images. The dataset is used for training and...
-
Microsoft COCO Captions
A large dataset of captions for images.
-
COCO Dataset
The COCO dataset is a large-scale dataset for object detection, semantic segmentation, and captioning. It contains 80 object categories and 1,000 image instances per category,...