Dataset - LDM

CLEVR-Ref+

The CLEVR-Ref+ dataset targets referring expression comprehension with complex referring expressions.
- Dataset
- JSON

GQA

The GQA dataset is a visual question answering dataset that focuses on compositional question answering and visual reasoning over real-world images.
- Dataset
- JSON

V-COCO

The V-COCO dataset contains 2,533 training images, 2,867 validation images, and 4,946 test images, including 24 action classes.
- Dataset
- JSON

MultiRels

MultiRels is a paired scene graph and image dataset with multiple relations per scene and highly precise labels.
- Dataset
- JSON

CLEVR

CLEVR images contain objects characterized by a set of attributes (shape, color, size and material). The questions are grouped into 5 categories: Exist, Count, CompareInteger,...
- Dataset
- JSON

Visual Genome

The Visual Genome dataset is a large-scale visual question answering dataset containing roughly 108,000 images, each with 15-30 annotated entities, attributes, and relationships.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

6 datasets found