-
Microsoft COCO 2014 and 2017
Microsoft COCO 2014 and 2017 datasets for object detection, segmentation, and captioning -
Microsoft Common Objects in Context (MS COCO)
A well-known dataset for object detection and image segmentation -
Foggy Cityscapes
The Foggy Cityscapes dataset is an extension to the Cityscapes dataset, containing 5k diverse real-world urban driving scenes with fog. -
Caltech-UCSD Birds-200-2011 Dataset
The Caltech-UCSD Birds-200-2011 dataset consists of 11,788 bird images from 200 categories, with about 60 images per category on average. -
MIT Scene Parsing Dataset
The MIT scene parsing dataset, used for training the FCN. -
Fully Convolutional Networks for Automatically Generating Image Masks to Trai...
The proposed method for automatically generating image masks to train Mask R-CNN for object detection. -
Microsoft COCO Dataset
The MS COCO 2014 dataset covers 91 object categories and contains 82,783 training images, 40,504 validation images, and 40,775 test images. -
CamVid Dataset
The CamVid dataset is a benchmark for semantic segmentation. It consists of 700 images with 11 object classes. -
BDD100K Dataset
BDD100K is a large-scale dataset for autonomous driving containing 100,000 images, with 20,000 images for training and 80,000 images for testing. -
Pascal VOC 2012
The dataset used in the paper is the Pascal VOC 2012 dataset, which is a benchmark for instance segmentation. The dataset consists of 1464 images with 20 class categories and... -
COCO Stuff
The COCO Stuff dataset is an extension of the COCO dataset in which 164,000 images covering 171 classes are annotated with segmentation masks. -
Microsoft Common Objects in Context (COCO) dataset
The Microsoft Common Objects in Context (COCO) dataset is a benchmark for object detection, segmentation, and image classification. -
Berkeley Segmentation Dataset
A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. -
MS COCO dataset
The MS COCO dataset is a large benchmark for image captioning, containing 328K images with 5 caption descriptions each. -
COCO+LVIS dataset
The COCO+LVIS dataset contains millions of high-quality labels for natural images. -
PASCAL Context
The PASCAL Context dataset is a benchmark for multi-task learning in computer vision. It contains 10103 images with 5 tasks: semantic segmentation, human body part segmentation,... -
LVIS: A dataset for segment anything model (SAM)
A dataset used to evaluate the performance of the Segment Anything Model (SAM).