-
CC12M dataset
The CC12M dataset is used for training and testing the proposed method. It contains about 12 million image-caption pairs.
-
CC3M and CC12M
CC3M and CC12M are used as datasets for training and evaluation.
-
CC3M and CC12M datasets
The CC3M and CC12M datasets are used for training and evaluation of the proposed method.
-
ImageNet and CC12M datasets
The paper does not describe its dataset explicitly, but it mentions that the authors used the ImageNet dataset and a subset of the CC12M dataset for training.