-
SPMDataset
A dataset of images annotated with semantic tuples, including predicates, actors, and locatives. -
Microsoft COCO 2017 dataset
This dataset contains images paired with multiple human-annotated descriptions in the form of sentences. -
Heudiasyc dataset
A dataset for autonomous driving. -
ApolloScape Dataset
The ApolloScape dataset is a large-scale dataset for autonomous driving, containing images and annotations. -
Inria Aerial Image Labeling dataset
The Inria Aerial Image Labeling dataset contains orthorectified color aerial images of 5000 × 5000 pixels at a spatial resolution of 0.3 m. -
AICrowd Mapping Challenge dataset
The AICrowd Mapping Challenge dataset contains 300 × 300 pixel RGB images and corresponding annotations in MS-COCO format. -
ReferItGame
ReferItGame is a dataset for visual grounding, the task of localizing a language query in an image. The output is typically a bounding box. -
Flickr30K Entities
The Flickr30K Entities dataset consists of 31,783 images, each matched with 5 captions. The dataset links distinct sentence entities to image bounding boxes, resulting in 70K... -
Pothole Detection Dataset
A dataset of images with pothole annotations from various sources, including Google Earth Pro, AUTOPILOT videos, and GoPro camera images. -
LabelMe dataset
The LabelMe dataset is a natural scene dataset used for testing the performance of the IBTM model on image classification tasks. -
COCO Dataset
The COCO dataset is a large-scale dataset for object detection, semantic segmentation, and captioning. It contains 80 object categories and 1,000 image instances per category,... -
Visual Genome
The Visual Genome dataset is a large-scale visual question answering dataset, containing more than 100,000 images, each with 15-30 annotated entities, attributes, and relationships. -
Cityscapes
The Cityscapes dataset is a large, widely used dataset for semantic segmentation of city street scenes. Of its 30 classes, 19 are considered for training and...