-
KITTI tracking dataset
The KITTI tracking dataset provides 21 training and 29 test sequences. It includes 2D bounding box annotations for cars, pedestrians, and 6 other classes, but only the... -
Physics-aware Simulation for Object Detection and Pose Estimation
This paper proposes a dataset generation pipeline that uses physics simulation to generate images of objects in cluttered scenes. -
Imagenette
The Imagenette dataset is used in the paper to study class density and dataset quality in high-dimensional, unstructured data. -
VIMER-UFO Benchmark
The VIMER-UFO benchmark dataset consists of 8 computer vision tasks: CPLFW, Market1501, DukeMTMC, MSMT-17, Veri-776, VehicleId, VeriWild, and SOP. -
Content-Aware Convolutional Neural Networks
Convolutional Neural Networks (CNNs) have achieved great success due to the powerful feature learning ability of convolution layers. Specifically, the standard convolution... -
Open Images Dataset
The dataset used in the experiment consists of 50 images equally distributed between five classes: aircraft, bird, bicycle, boat, and dog. Each class has 5 true positive images... -
Training dataset generation for bridge game registration
The proposed method of automatic dataset generation for cards detection and classification makes it possible to obtain any number of images of any size, which can be used to... -
Argoverse: 3D tracking and forecasting with rich maps
The Argoverse dataset includes 65 training and 24 validation sequences recorded in Miami and Pittsburgh. -
ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset is a large-scale image classification benchmark drawn from ImageNet, which contains over 14 million images from 21,841 categories. -
Amazon Picking Challenge 2016 Dataset
This dataset was used in the Amazon Picking Challenge 2016 by the vision-based robotic picking system developed by Team Applied Robotics. -
ScanNet200
Diff2Scene uses ScanNet, Matterport3D, ScanNet200 and Replica for open-vocabulary 3D semantic segmentation and visual grounding tasks. -
PASCAL Visual Object Classes Challenge
The PASCAL Visual Object Classes Challenge (VOC) is a benchmark dataset for object detection and semantic segmentation. -
COCO object detection and instance segmentation, ADE20K semantic segmentation
The paper uses the COCO object detection and instance segmentation dataset and the ADE20K semantic segmentation dataset. -
Object Tracking Benchmark
The OTB100 dataset is an extension of the OTB50 dataset, containing 100 videos with 1000 frames each. -
MIT-67, CUB-2011, Caltech-101, DTD
MIT-67 is a dataset of 67 indoor scenes, CUB-2011 is a dataset of 200 bird species, Caltech-101 is a dataset of 101 objects, and DTD is a dataset of 47 textures. -
Submanifold Sparse Convolutional Networks
Convolutional networks are the de-facto standard for analysing spatio-temporal data such as images, videos, 3D shapes, etc. Whilst some of this data is naturally dense (for... -
MS COCO dataset
The MS COCO dataset is a large benchmark for image captioning, containing 328K images with 5 caption descriptions each.