8 datasets found

Tags: image understanding

  • Rel3D

    Rel3D is a large-scale dataset of human-annotated spatial relations in 3D. It consists of spatial relations situated in synthetic 3D scenes, making it possible to extract rich...
  • CC3M-595K

    The dataset used for training the Chat-UniVi model.
  • Visual Analogies of Situation Recognition (VASR)

    The VASR dataset is a large-scale visual analogy dataset for situation recognition, where the task is to select an image candidate B’ that completes the analogy (A to A’ is like...
  • Break-A-Scene dataset

    The Break-A-Scene dataset contains images with multiple concepts extracted from a single image.
  • FIT: Far-reaching Interleaved Transformers

    We present FIT: a transformer-based architecture with efficient self-attention and adaptive computation.
  • Visual Genome

    The Visual Genome dataset is a large-scale visual question answering dataset, containing 1.5 million images, each with 15-30 annotated entities, attributes, and relationships.
  • ADE20k

    Semantic segmentation is one of the fundamental problems in computer vision, whose task is to assign a semantic label to each pixel of an image so that different classes can...
  • COCO

    Large-scale datasets [18, 17, 27, 6] boosted text-conditional image generation quality. However, in some domains it could be difficult to make such datasets and usually it could...
You can also access this registry using the API (see API Docs).