21 datasets found

Groups: Scene Understanding
Formats: JSON

  • Scene Graph Generation (SGG) Benchmark

    The Scene Graph Generation (SGG) dataset contains 300,000 images with 10,000 object categories.
  • COCOStuff

    COCOStuff is a scene-centric dataset with 80 thing categories and 91 stuff categories.
  • RGB-D scenes dataset

    The RGB-D scenes dataset contains RGB-D images of indoor scenes with everyday-life objects.
  • Behave dataset

    The Behave dataset contains various scenes with human-object interactions, and is used to evaluate the proposed object-level 3D semantic mapping approach.
  • Slot Attention

    A dataset of videos of a robot interacting with blocks of different shapes and colors placed on a table in a simulation environment.
  • COCO-stuff dataset

    The COCO-stuff dataset is a large-scale dataset for scene understanding, object detection, and image synthesis.
  • Cityspace

    The dataset used for training and testing the proposed RGBD-based obstacle avoidance system for visually impaired people.
  • RefCOCO dataset

    The authors used the RefCOCO dataset, a large-scale dataset for object detection and scene understanding, to train and evaluate their models.
  • ImageNet: A Large-Scale Hierarchical Image Database

    The ImageNet dataset is a large-scale image database that contains over 14 million images, each labeled with one of 21,841 categories.
  • SUN2012 Dataset

    The SUN2012 dataset is a challenging dataset for object detection, with large cluttered scenes and small objects.
  • Scannet

    The dataset used for training and testing the proposed RGBD-based obstacle avoidance system for visually impaired people.
  • WRGB-D Scenes Dataset

    A large-scale hierarchical multi-view RGB-D object dataset.
  • BigBird Dataset

    A large-scale 3D database of object instances for scene understanding.
  • YFCC100M

    YFCC100M is a large-scale multimedia dataset of roughly 100 million Flickr photos and videos. It is used in the paper for foreground and background patch extraction and object recognition tasks.
  • Stanford dataset

    The Stanford dataset consists of a large-scale collection of aerial images and videos of a university campus containing various agents (cars, buses, bicycles, golf carts,...
  • SUN RGB-D

    RGB-D scene recognition approaches often train two standalone backbones for RGB and depth modalities with the same Places or ImageNet pre-training. However, the pre-trained...
  • Visual Genome

    The Visual Genome dataset is a large-scale dataset of roughly 108,000 images, densely annotated with entities, attributes, and relationships, and is commonly used for scene graph generation and visual question answering.
  • NeRF

    NeRF [33] has demonstrated an impressive ability to synthesize images of 3D scenes from novel views. However, it relies upon specialized volumetric rendering algorithms based on ray...
  • LSUN

    The dataset used for training and validation of the proposed approach to combine semantic segmentation and dense outlier detection.
  • Cityscapes

    The Cityscapes dataset is a large, widely used semantic segmentation dataset of city street scenes. 19 of the dataset's 30 classes are considered for training and...