6 datasets found

Tags: scene graph

  • CLEVR-Ref+

    CLEVR-Ref+ is a dataset for referring expression comprehension that features complex referring expressions.
  • GQA

    The GQA dataset is a visual question answering dataset that focuses on compositional question answering and visual reasoning about real-world images.
  • V-COCO

    The V-COCO dataset contains 2,533 training images, 2,867 validation images, and 4,946 test images, including 24 action classes.
  • MultiRels

    MultiRels is a paired scene graph-image dataset with multiple relations and highly precise labels.
  • CLEVR

    CLEVR images contain objects characterized by a set of attributes (shape, color, size and material). The questions are grouped into 5 categories: Exist, Count, CompareInteger,...
  • Visual Genome

    The Visual Genome dataset is a large-scale visual question answering dataset containing about 108,000 images, each with 15-30 annotated entities, attributes, and relationships.
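
As a rough illustration of what the entries above mean by a scene graph, the sketch below models objects with the CLEVR attribute types (shape, color, size, material) plus labeled relationships between them, and answers Exist- and Count-style questions. The names `SceneObject`, `SceneGraph`, `count`, and `exists` are hypothetical and are not taken from the tooling of any listed dataset; the sample objects and relations are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SceneObject:
    """One object, described by the four CLEVR attribute types."""
    shape: str
    color: str
    size: str
    material: str


@dataclass
class SceneGraph:
    """Hypothetical container: objects are nodes, relations are labeled edges."""
    objects: List[SceneObject] = field(default_factory=list)
    # Each relation is (subject index, predicate, object index).
    relations: List[Tuple[int, str, int]] = field(default_factory=list)

    def count(self, **attrs: str) -> int:
        """Count objects whose attributes match every given key/value pair."""
        return sum(
            all(getattr(obj, key) == value for key, value in attrs.items())
            for obj in self.objects
        )

    def exists(self, **attrs: str) -> bool:
        """Return True if at least one object matches all given attributes."""
        return self.count(**attrs) > 0


# Made-up sample scene, for illustration only.
scene = SceneGraph(
    objects=[
        SceneObject("cube", "red", "large", "metal"),
        SceneObject("sphere", "blue", "small", "rubber"),
        SceneObject("cube", "blue", "small", "rubber"),
    ],
    relations=[(0, "left of", 1), (2, "behind", 0)],
)

print(scene.count(shape="cube"))    # prints 2
print(scene.exists(color="green"))  # prints False
```

Storing relations as index triples keeps the objects immutable and lets several edges refer to the same node, which is one common way to represent such graphs; the actual datasets each define their own annotation formats.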
You can also access this registry using the API (see API Docs).