14 datasets found

Groups: Image Understanding Organizations: No Organization

Filter Results
  • Conceptual Caption

    The dataset used in the paper is Conceptual Caption, which is a large-scale dataset of images with captions.
  • Visual Motion Prediction To New Heightfields

    The dataset used in this paper is a large dataset of simulated physical experiments, consisting of images of rolling balls on bowls of varying shape and orientation, and on...
  • SatlasPretrain

    SatlasPretrain: A large-scale dataset for remote sensing image understanding
  • Common objects in context

    Common objects in context.
  • CityScapes dataset

    Monocular depth estimation dataset
  • Cityscapes dataset for semantic urban scene understanding

    The Cityscapes dataset is a large-scale urban scene dataset containing over 25,000 images.
  • BigEarthNet

    BigEarthNet is a large-scale Sentinel-2 dataset collected from a total of 125 Sentinel-2 tiles covering areas of 10 countries in Europe. The dataset was prepared with data from...
  • FIT: Far-reaching Interleaved Transformers

    We present FIT: a transformer-based architecture with efficient self-attention and adaptive computation.
  • CLEVR

    CLEVR images contain objects characterized by a set of attributes (shape, color, size and material). The questions are grouped into 5 categories: Exist, Count, CompareInteger,...
  • Visual Genome

    The Visual Genome dataset is a large-scale visual question answering dataset, containing 1.5 million images, each with 15-30 annotated entities, attributes, and relationships.
  • ADE20k

    Semantic segmentation is one of the fundamental prob-lems in computer vision, whose task is to assign a seman-tic label to each pixel of an image so that different classes can...
  • KITTI dataset

    The dataset used in the paper is the KITTI dataset, which is a benchmark for monocular depth estimation. The dataset consists of a large collection of images and corresponding...
  • Microsoft COCO

    The Microsoft COCO dataset was used for training and evaluating the CNNs because it has become a standard benchmark for testing algorithms aimed at scene understanding and...
  • COCO

    Large scale datasets [18, 17, 27, 6] boosted text conditional image generation quality. However, in some domains it could be difficult to make such datasets and usually it could...