Dataset - LDM

CrowdHuman dataset

The CrowdHuman dataset is a benchmark dataset for human detection, consisting of 15,000 images for training, 4,370 images for validation, and 5,000 images for testing.
- Dataset
- JSON
COCO Captions

Object detection is a fundamental task in computer vision, requiring large annotated datasets that are difficult to collect.
- Dataset
- JSON
Pascal VOC

Semantic segmentation is a crucial and challenging task for image understanding. It aims to predict a dense labeling map for the input image, which assigns each pixel a unique...
- Dataset
- JSON
COCO 2017

Object detection is one of the most foundational computer vision tasks and is essential for many real-world applications. The object detection pipeline has been developed...
- Dataset
- JSON
PSU dataset

The PSU dataset was collected from two sources: an open dataset of aerial images available on GitHub and our own images acquired by flying a 3DR SOLO drone equipped with a...
- Dataset
- JSON
Stanford dataset

The Stanford dataset consists of a large-scale collection of aerial images and videos of a university campus containing various agents (cars, buses, bicycles, golf carts,...
- Dataset
- JSON
MS COCO captions

The MS COCO captions dataset contains captions for images in the Microsoft COCO dataset.
- Dataset
- JSON
PASCAL VOC 2010

The PASCAL VOC 2010 dataset is an extension of the PASCAL VOC dataset, containing additional images and categories.
- Dataset
- JSON
Faster-LTN: a neuro-symbolic, end-to-end object detection architecture

The detection of semantic relationships between objects represented in an image is one of the fundamental challenges in image interpretation. Neural-Symbolic techniques, such as...
- Dataset
- JSON
ImageNet Dataset

Object recognition is arguably the most important problem at the heart of computer vision. Recently, Barbu et al. introduced a dataset called ObjectNet which includes objects in...
- Dataset
- JSON
COCO Dataset

The COCO dataset is a large-scale dataset for object detection, semantic segmentation, and captioning. It contains 80 object categories and 1,000 image instances per category,...
- Dataset
- JSON
OpenImages

Large-scale vision-and-language models trained on curated and web-scraped data have led to significant improvements over task-specific models when transferred to downstream...
- Dataset
- JSON
Visual Genome

The Visual Genome dataset is a large-scale visual question answering dataset, containing 1.5 million images, each with 15-30 annotated entities, attributes, and relationships.
- Dataset
- JSON
COCO2017

The COCO2017 dataset, as used for the trainability experiment, includes 47,429 images with 210,893 objects.
- Dataset
- JSON
MS-COCO

Large-scale datasets [18, 17, 27, 6] have boosted the quality of text-conditional image generation. However, in some domains it could be difficult to make such datasets and usually it could...
- Dataset
- JSON
RUOD

Underwater object detection dataset
- Dataset
- JSON
URPC

Underwater object detection dataset
- Dataset
- JSON
COCO, PASCAL VOC, Cityscapes, and LVIS

The datasets used in the paper for instance segmentation: COCO, PASCAL VOC, Cityscapes, and LVIS.
- Dataset
- JSON
LVIS

Instance segmentation (IS) is an important computer vision task, aiming at simultaneously predicting the class label and the binary mask for each instance of interest in an image.
- Dataset
- JSON
Cityscapes

The Cityscapes dataset is a large, well-known dataset for semantic segmentation of city street scenes. Of its 30 classes, 19 are considered for training and...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
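
As a rough illustration of API access, the sketch below builds a search request and reads dataset titles from the response. It assumes the registry exposes a CKAN-style `package_search` action; that is an assumption, not something this page confirms. The base URL `https://registry.example.org` and the helper names `build_search_url` and `extract_titles` are hypothetical, so check the API Docs for the real endpoint and response layout.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL: replace with the registry's real address.
BASE_URL = "https://registry.example.org"


def build_search_url(query, rows=20, start=0, base_url=BASE_URL):
    """Build a search URL, assuming a CKAN-style package_search endpoint."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{base_url}/api/3/action/package_search?{params}"


def extract_titles(payload):
    """Return dataset titles from a CKAN-style search response.

    Accepts either a JSON string or an already-parsed dict.
    """
    data = json.loads(payload) if isinstance(payload, str) else payload
    if not data.get("success"):
        return []
    return [item.get("title", "") for item in data["result"]["results"]]


# Offline example with a hand-written response in the assumed shape.
sample = {
    "success": True,
    "result": {
        "count": 2,
        "results": [{"title": "CrowdHuman dataset"}, {"title": "Cityscapes"}],
    },
}
titles = extract_titles(sample)
```

To query a live instance, the URL from `build_search_url` could be fetched with `urllib.request.urlopen` and the response body passed to `extract_titles`. Paging through all results would mean increasing `start` in steps of `rows`.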

249 datasets found