- ImageNet with Adversarial Text Regions
The ImageNet with Adversarial Text Regions (ImageNet-Atr) dataset is a new evaluation set built by adding spotting words to the images of ImageNet evaluation sets.
- YFCC15M-V2
The dataset is used for Contrastive Language-Image Pretraining (CLIP) and its variants.
- YFCC15M-V1
The dataset is used for Contrastive Language-Image Pretraining (CLIP) and its variants.