-
MS COCO dataset for object detection
The MS COCO dataset is a benchmark dataset for object detection. -
Track Anything Rapter (TAR)
The Track Anything Rapter (TAR) project utilizes state-of-the-art pre-trained models to accurately detect and track target objects through multimodal queries. -
DOTA1.0, DOTA1.5, HRSC2016
Aerial object detection datasets used for anchor-free oriented object detection. -
CAE v2: Context Autoencoder with CLIP Target
Masked image modeling (MIM) learns visual representation by masking and reconstructing image patches. Applying the reconstruction supervision on the CLIP representation has been... -
Microsoft COCO Dataset
The MS COCO 2014 dataset contains images of 91 object categories, with 82,783 training images, 40,504 validation images, and 40,775 testing images. -
MS COCO Detection Dataset
The MS COCO detection dataset is a large-scale object detection benchmark. -
MatrixNets: A New Scale and Aspect Ratio Aware Architecture for Object Detection
Object detection is one of the most widely studied tasks in computer vision with many applications to tasks such as object tracking, instance segmentation, and image captioning. -
MSCOCO validation set
The dataset used in the paper is the MSCOCO validation set. -
iCubWorld Transformations
iCubWorld Transformations (iCWT): a dataset for object recognition and manipulation -
Pascal VOC-OS
Open-set object detection datasets -
COCO panoptic validation set
Panoptic segmentation aims to unify instance and semantic segmentation in the same framework. Existing works propose to merge instance and semantic segmentation using... -
COCO panoptic segmentation
Panoptic segmentation aims to unify instance and semantic segmentation in the same framework. Existing works propose to merge instance and semantic segmentation using... -
Distilling Object Detectors with Feature Richness
The paper proposes the Feature-Richness Score (FRS) method, which selects important features that are beneficial to distillation. -
CamVid Dataset
The CamVid dataset is a benchmark for semantic segmentation. It consists of 700 images with 11 object classes. -
Stanford Cars dataset
The Stanford Cars dataset is a dataset of images of cars, with 196 categories and approximately 16,000 images. The authors created a synthetic dataset by adding occlusions of... -
GPS-Assisted Automatic Object Annotation in Videos
TagMe is a system that provides automatic object annotation in videos using GPS data. -
Inter-Instance Similarity Modeling for Contrastive Learning
Existing contrastive learning methods widely adopt one-hot instance discrimination as a pretext task for self-supervised learning, which inevitably neglects rich... -
Deformable DETR dataset
The Deformable DETR dataset -
DAVIS-2017
The DAVIS-2017 dataset is a benchmark for video object segmentation.