-
PASCAL VOC 2007
Multi-label image recognition is a practical and challenging task compared to single-label image classification. -
TrackingNet
The TrackingNet dataset is a benchmark for visual tracking, containing 511 video sequences with varying difficulties. -
Objects365
The Objects365 dataset is a large-scale object detection dataset containing 365,000 images with 365 categories. -
ShanghaiTech Part A
ShanghaiTech Part A is a crowd counting dataset that contains 300 training images and 182 test images. -
Pascal VOC
Semantic segmentation is a crucial and challenging task for image understanding. It aims to predict a dense labeling map for the input image, which assigns each pixel a unique... -
OpenImages
Large-scale vision-and-language models trained on curated and web-scraped data have led to significant improvements over task-specific models when transferred to downstream... -
Cityscapes
The Cityscapes dataset is a large, widely used semantic segmentation dataset of urban street scenes. Of its 30 classes, 19 are considered for training and... -
KITTI dataset
The paper uses the KITTI dataset, a benchmark for monocular depth estimation. The dataset consists of a large collection of images and corresponding... -
Microsoft COCO
The Microsoft COCO dataset was used for training and evaluating the CNNs because it has become a standard benchmark for testing algorithms aimed at scene understanding and... -
ImageNet Large Scale Visual Recognition Challenge
A benchmark for low-shot recognition was proposed by Hariharan & Girshick (2017) and consists of a representation learning phase without access to the low-shot classes and a...