Scene Understanding - Groups

Scene Graph Generation (SGG) Benchmark

The Scene Graph Generation (SGG) dataset contains 300,000 images with 10,000 object categories.

Dataset
JSON

COCOStuff

COCOStuff is a scene-centric dataset with a total of 80 thing categories and 91 stuff categories.

Dataset
JSON

RGB-D scenes dataset

The RGB-D scenes dataset contains RGB-D images of indoor scenes with everyday-life objects.

Dataset
JSON

Behave dataset

The Behave dataset contains various scenes with human-object interactions, and is used to evaluate the proposed object-level 3D semantic mapping approach.

Dataset
JSON

Slot Attention

A dataset of videos of a robot interacting with blocks of different shapes and colors placed on a table in a simulation environment.

Dataset
JSON

COCO-stuff dataset

The COCO-stuff dataset is a large-scale dataset for scene understanding, object detection, and image synthesis.

Dataset
JSON

Cityspace

This dataset is used for training and testing the proposed RGB-D-based obstacle avoidance system for visually impaired people.

Dataset
JSON

RefCOCO dataset

The authors used the RefCOCO dataset, a large-scale referring-expression dataset built on COCO images, to train and evaluate their models.

Dataset
JSON

ImageNet: A Large-Scale Hierarchical Image Database

The ImageNet dataset is a large-scale image database that contains over 14 million images, each labeled with one of 21,841 categories.

Dataset
JSON

SUN2012 Dataset

The SUN2012 dataset is a challenging dataset for object detection, with large cluttered scenes and small objects.

Dataset
JSON

Scannet

This dataset is used for training and testing the proposed RGB-D-based obstacle avoidance system for visually impaired people.

Dataset
JSON

WRGB-D Scenes Dataset

A large-scale hierarchical multi-view RGB-D object dataset.

Dataset
JSON

BigBird Dataset

A large-scale 3D database of object instances for scene understanding.

Dataset
JSON

YFCC100M

YFCC100M is a large-scale multimedia dataset of roughly 100 million Flickr photos and videos. In the paper, it is used for foreground and background patch extraction and object recognition tasks.

Dataset
JSON

Stanford dataset

The Stanford dataset consists of a large-scale collection of aerial images and videos of a university campus containing various agents (cars, buses, bicycles, golf carts,...

Dataset
JSON

SUN RGB-D

RGB-D scene recognition approaches often train two standalone backbones for RGB and depth modalities with the same Places or ImageNet pre-training. However, the pre-trained...

Dataset
JSON

Visual Genome

The Visual Genome dataset is a large-scale dataset for visual question answering and scene graph tasks, containing roughly 108,000 images, each annotated with entities, attributes, and relationships.

Dataset
JSON

NeRF

NeRF [33] has demonstrated amazing ability to synthesize images of 3D scenes from novel views. However, they rely upon specialized volumetric rendering algorithms based on ray...

Dataset
JSON

LSUN

This dataset is used for training and validation of the proposed approach, which combines semantic segmentation and dense outlier detection.

Dataset
JSON

Cityscapes

The Cityscapes dataset is a large, well-known semantic segmentation dataset of city street scenes. Of the 30 classes in this dataset, 19 are considered for training and...

Dataset
JSON

21 datasets found