-
XView2 dataset
The XView2 dataset is a large set of annotated overhead satellite images. -
FLICKR-125K
The FLICKR-125K dataset is a large-scale image annotation dataset. -
FLICKR-60K
The FLICKR-60K dataset is a large-scale image annotation dataset. -
Fluid Annotation: A Human-Machine Collaboration Interface for Full Image Anno...
Fluid Annotation is an intuitive human-machine collaboration interface for annotating the class label and outline of every object and background region in an image. -
COCO panoptic dataset
The COCO panoptic dataset combines the original COCO dataset with COCO-stuff, merging some stuff classes based on [32]. It contains 118K training and 5K validation images... -
SPMDataset
A dataset of images annotated with semantic tuples, including predicates, actors, and locatives. -
Microsoft COCO 2017 dataset
This dataset contains images paired with multiple human-annotated descriptions in the form of sentences. -
Heudiasyc dataset
A dataset for autonomous driving. -
ApolloScape Dataset
The ApolloScape dataset is a large-scale dataset for autonomous driving, containing images and annotations. -
Inria Aerial Image Labeling dataset
The Inria Aerial Image Labeling dataset contains orthorectified aerial color imagery in tiles of 5000 × 5000 pixels at a spatial resolution of 0.3 m. -
AICrowd Mapping Challenge dataset
The AICrowd Mapping Challenge dataset contains 300 × 300 pixel RGB images with corresponding annotations in MS-COCO format. -
ReferItGame
ReferItGame is a dataset for visual grounding, the task of localizing a language query in an image. The output is often a bounding box. -
Flickr30K Entities
The Flickr30K Entities dataset consists of 31,783 images, each paired with 5 captions. The dataset links distinct sentence entities to image bounding boxes, resulting in 70K... -
Pothole Detection Dataset
A dataset of images with pothole annotations from various sources, including Google Earth Pro, AUTOPILOT videos, and GoPro camera images.
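
Several entries above mention annotations in MS-COCO format. As a rough illustration of what that layout looks like, the sketch below builds a small hand-written sample and groups its bounding boxes by image. The file name, ids, category, and coordinates are made up for illustration and do not come from any of the listed datasets; `boxes_by_image` is a hypothetical helper, not part of any dataset toolkit.

```python
import json
from collections import defaultdict

# Hypothetical, hand-written sample in the MS-COCO detection layout.
# File names, ids, and coordinates are invented for illustration only.
sample = {
    "images": [
        {"id": 1, "file_name": "tile_0001.jpg", "width": 300, "height": 300},
    ],
    "categories": [
        {"id": 100, "name": "building"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 100,
         "bbox": [20.0, 30.0, 50.0, 40.0], "area": 2000.0, "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 100,
         "bbox": [120.0, 200.0, 60.0, 80.0], "area": 4800.0, "iscrowd": 0},
    ],
}


def boxes_by_image(coco):
    """Group boxes by image file name.

    COCO stores each box as [x, y, width, height] with (x, y) the top-left
    corner; this converts it to (x_min, y_min, x_max, y_max).
    """
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}
    class_names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]
        grouped[file_names[ann["image_id"]]].append(
            (class_names[ann["category_id"]], (x, y, x + w, y + h))
        )
    return dict(grouped)


# Round-trip through JSON to mimic loading an annotation file from disk.
coco = json.loads(json.dumps(sample))
result = boxes_by_image(coco)
print(result)
```

With a real annotation file, the `sample` dictionary would be replaced by the result of `json.load` on that file; individual datasets may add fields beyond the ones shown here.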