- Augmented Flickr-8K Dataset
A dataset of images annotated with captions and semantic tuples, created by training a model to predict semantic tuples from image captions.
- Open Image Dataset
The Open Image Dataset (OID) is a large-scale image dataset that contains a diverse set of images.
- Image Segmentation
The Image Segmentation dataset is used to evaluate the performance of the ensemble average rule.
- FLICKR-25K
The dataset used for the cross-modal hashing task, containing image and text data.
- Zero-shot semantic image editing dataset
A zero-shot semantic image editing dataset consisting of 150 tuples, each containing a source image, a source text, and a target text.
- VOC 2007 object detection dataset
The VOC 2007 object detection dataset.
- COCO object detection dataset
The dataset used in the paper is a 2D object detection dataset, where the authors investigate the issues of achieving sufficient rigor in the arguments for the safety of machine...
- LAION COCO 600M
The dataset used for training the text-to-video model consists of 20 million videos and 600 million images.
- GoPro dataset
The GoPro dataset, used for training, contains 2103 pairs of blurred and clear images. The validation set used during training consists of 460 pairs of randomly divided images....
- LAION-Aesthetic
The dataset used in the paper is LAION-Aesthetic, a large-scale image dataset.
- DeepFashion dataset
The DeepFashion dataset is a large-scale dataset for person image synthesis, containing 101,966 pairs of images with different poses and clothing.
- Natural Image-Text Dataset
The dataset used for training the Vary-base model, containing natural image-text pairs.
- Document and Chart Dataset
The dataset used for training the new vision vocabulary network, containing high-resolution document and chart images with corresponding text.