- KITTI tracking dataset
  The KITTI tracking dataset provides 21 training and 29 test sequences. The dataset provides 2D bounding box annotations for cars, pedestrians, and 6 other classes, but only the...
- Microsoft COCO 2014 and 2017
  The Microsoft COCO 2014 and 2017 datasets for object detection, segmentation, and captioning.
- Pascal VOC 2007 and 2012
  The Pascal VOC 2007 and 2012 datasets for object detection.
- Cars dataset
  The Cars dataset is a multi-object multi-camera network application.
- Tinyvideos dataset
  Tinyvideos dataset
- CityPersons Dataset for Pedestrian Detection
  The CityPersons dataset is a pedestrian detection dataset consisting of 500 images with annotated objects.
- Caltech Dataset for Pedestrian Detection
  The Caltech dataset is a large-scale dataset for pedestrian detection consisting of 4024 images with annotated objects.
- KITTI Dataset for Autonomous Driving
  The KITTI dataset is a large-scale dataset for autonomous driving consisting of 15,000 images with annotated objects.
- DAVIS Challenge 2017
  The DAVIS Challenge 2017 benchmark is a dataset for video object segmentation.
- DAVIS Challenge 2018
  The DAVIS Challenge 2018 benchmark is a dataset for video object segmentation.
- Places: A Large-Scale Hierarchical Image Database
  A large-scale hierarchical image database for scene recognition.
- Physics-aware Simulation for Object Detection and Pose Estimation
  This paper proposes a dataset generation pipeline that uses physics simulation to generate images of objects in cluttered scenes.
- 3D X-ray Microscopy for 3D Object Detection, Segmentation, and Metrology
  3D X-ray microscope data for 3D object detection, segmentation, and metrology of buried structures in advanced IC packages.
- RGB-D scenes dataset
  The RGB-D scenes dataset contains RGB-D images of indoor scenes with everyday objects.
- VisDrone2019 Dataset
  The VisDrone2019 dataset contains 288 video clips comprising 261,908 frames, plus 10,209 static images.
- Uni3DL: Unified Model for 3D and Language Understanding
  Uni3DL is a unified model for 3D and language understanding. It operates directly on point clouds and supports diverse 3D vision-language tasks, including semantic segmentation,...
- XIMAGENET-12
  XIMAGENET-12 is an explainable visual benchmark dataset for model robustness evaluation. It consists of over 200K images with 15,410 manual semantic annotations. The dataset is...
- Object Detection dataset
  A dataset used for training a convolutional neural network (CNN) for object detection.
- Visual-Inertial Odometry (VIO) dataset
  A dataset used for training a recurrent neural network (RNN) to infer positional uncertainties for a model predictive control (MPC) algorithm.
- AI-TOD remote sensing dataset
  The AI-TOD remote sensing dataset is used for detecting dense small objects in aerial images.