Dataset - LDM

CD-CNN dataset

The CD-CNN dataset contains data for urban resident recognition.
- Dataset
- JSON
SVT

SVT is a very challenging dataset collected by Wang et al. from Google Street View.
- Dataset
- JSON
SFinGe

SFinGe (Synthetic Fingerprint Generator) produces synthetic fingerprints with varying ridge structures and patterns.
- Dataset
- JSON
NIST SD302

NIST Special Database 302 contains plain, rolled and touch-free impressions captured from various devices.
- Dataset
- JSON
FVC 2002

The fingerprint enhancement task can be related to image denoising. However, the inherent properties of biometric data must be considered, and such data should be handled differently...
- Dataset
- JSON
Kinetics400

Video classification is a fundamental problem in many video-based tasks. Applications such as autonomous driving and the control of drones and robots are driving the demand...
- Dataset
- JSON
MJSynth

The OCR and MT datasets are used to train the OCR and MT models respectively.
- Dataset
- JSON
SoBiR dataset

The SoBiR dataset is used for soft biometric retrieval. It contains 8 camera views, 100 persons, and categorical annotations.
- Dataset
- JSON
HMDB-51 and UCF-101

A dataset of real videos for action categorization, including HMDB-51 and UCF-101.
- Dataset
- JSON
CSTR VCTK Corpus

The CSTR VCTK Corpus is a dataset of speech recordings of 109 speakers, each with 20 utterances.
- Dataset
- JSON
Bengali Handwritten Digit Dataset

A dataset of 70,000 handwritten samples of Bengali numerals, for recognition using an artificial neural network architecture pre-trained with a stacked denoising autoencoder.
- Dataset
- JSON
AI-Skin

A dataset for skin disease recognition based on self-learning and wide data collection through a closed loop framework.
- Dataset
- JSON
ICDAR2015

The ICDAR2015 dataset consists of 1,670 images (17,548 annotated text regions) acquired using Google Glass.
- Dataset
- JSON
ICDAR2013

The ICDAR2013 dataset is obtained from the Robust Reading Challenge 2013.
- Dataset
- JSON
SynthText

The SynthText dataset was proposed by Gupta et al. for scene text detection. The original dataset is composed of 800,000 scene text images, each with multiple word instances.
- Dataset
- JSON
Kinetics-400

Motion has been shown to be useful for video understanding, where motion is typically represented by optical flow. However, computing flow from video frames is very time-consuming....
- Dataset
- JSON
Librispeech

The Librispeech dataset is a large-scale speaker-dependent speech corpus containing 1080 hours of speech, 5600 utterances, and 1000 speakers.
- Dataset
- JSON
UCF101

The UCF101 dataset contains 13,320 videos distributed across 101 action categories. This dataset is different from the above ones in that it contains mostly coarse sports activities...
- Dataset
- JSON
HMDB51

Video classification is a fundamental problem in many video-based tasks. Applications such as autonomous driving and the control of drones and robots are driving the demand...
- Dataset
- JSON
Kinetics

The Kinetics dataset is a large-scale human action dataset, which consists of 400 action classes where each category has more than 400 videos.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
