42 datasets found

Tags: Recognition

  • CD-CNN dataset

    The CD-CNN dataset contains data for urban resident recognition.
  • SVT

    SVT is a very challenging dataset collected by Wang et al. from Google Street View.
  • SFinGe

    SFinGe is a synthetic fingerprint dataset whose generator produces fingerprints with varying ridge structures and patterns.
  • NIST SD302

    NIST Special Database 302 contains plain, rolled and touch-free impressions captured from various devices.
  • FVC 2002

    The fingerprint enhancement task can be related to image denoising. However, we need to consider the inherent properties of biometric data and that it should be handled differently...
  • Kinetics400

    Video classification is a fundamental problem in many video-based tasks. Applications such as autonomous driving technology, controlling drones and robots are driving the demand...
  • MJSynth

    The OCR and MT datasets are used to train the OCR and MT models, respectively.
  • SoBiR dataset

    The SoBiR dataset is used for soft biometric retrieval. It contains 8 camera views, 100 persons, and categorical annotations.
  • HMDB-51 and UCF-101

    A dataset of real videos for action categorization, including HMDB-51 and UCF-101.
  • CSTR VCTK Corpus

    The CSTR VCTK Corpus is a dataset of speech recordings of 109 speakers, each with 20 utterances.
  • Bengali Handwritten Digit Dataset

    A dataset of 70,000 handwritten samples of Bengali numerals, used for recognition with an artificial neural network architecture pre-trained by a stacked denoising autoencoder.
  • AI-Skin

    A dataset for skin disease recognition based on self-learning and wide data collection through a closed loop framework.
  • ICDAR2015

    The ICDAR2015 dataset consists of 1,670 images (17,548 annotated text regions) acquired using Google Glass.
  • ICDAR2013

    The ICDAR2013 dataset is obtained from the Robust Reading Challenges 2013.
  • SynthText

    The SynthText dataset was proposed by Gupta et al. for scene text detection. The original dataset is composed of 800,000 scene text images, each with multiple word instances.
  • Kinetics-400

    Motion has been shown to be useful for video understanding, where motion is typically represented by optical flow. However, computing flow from video frames is very time-consuming....
  • Librispeech

    The Librispeech dataset is a large-scale speaker-dependent speech corpus containing 1080 hours of speech, 5600 utterances, and 1000 speakers.
  • UCF101

    The UCF101 dataset contains 13320 videos distributed in 101 action categories. This dataset is different from the above ones in that it contains mostly coarse sports activities...
  • HMDB51

    Video classification is a fundamental problem in many video-based tasks. Applications such as autonomous driving technology, controlling drones and robots are driving the demand...
  • Kinetics

    The Kinetics dataset is a large-scale human action dataset, which consists of 400 action classes where each category has more than 400 videos.
You can also access this registry using the API (see API Docs).