Dataset - LDM

Visual7W dataset

The Visual7W dataset is a visual question answering dataset, which consists of images and corresponding questions.
- Dataset
- JSON
VQAv2

Visual Question Answering (VQA) has achieved great success thanks to the fast development of deep neural networks (DNNs). On the other hand, data augmentation, as one of the...
- Dataset
- JSON
Conceptual Captions

This dataset is used in the paper "Scaling Laws of Synthetic Images for Model Training" for supervised image classification and zero-shot classification tasks.
- Dataset
- JSON
VQA 2.0

The VQA 2.0 dataset is used for the visual question answering task. It consists of three sets: a train set containing 83k images and 444k questions, a validation set containing...
- Dataset
- JSON
Florence

A large-scale dataset for visual question answering.
- Dataset
- JSON
SBU Captions

The SBU Captions dataset is a large-scale image-text dataset used for vision-language pre-training.
- Dataset
- JSON
Amazon Berkeley Objects Dataset (ABO)

The Amazon Berkeley Objects Dataset (ABO) is a publicly available e-commerce dataset with multiple images per product.
- Dataset
- JSON
CLEVR

CLEVR images contain objects characterized by a set of attributes (shape, color, size and material). The questions are grouped into 5 categories: Exist, Count, CompareInteger,...
- Dataset
- JSON
Visual Genome

The Visual Genome dataset is a large-scale visual question answering dataset, containing 1.5 million images, each with 15-30 annotated entities, attributes, and relationships.
- Dataset
- JSON
COCO-QA

The COCO-QA dataset is used for the visual question answering task. It consists of 123,287 images, with 78,736 training questions and 38,948 test questions.
- Dataset
- JSON
COCO

Large-scale datasets [18, 17, 27, 6] boosted text-conditional image generation quality. However, in some domains it could be difficult to make such datasets, and usually it could...
- Dataset
- JSON
MSCOCO

Human Pose Estimation (HPE) aims to estimate the position of each joint point of the human body in a given image. HPE tasks support a wide range of downstream tasks such as...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

32 datasets found