13 datasets found

Formats: JSON
Tags: Audio Dataset

  • VOiCES

    The VOiCES dataset is used for testing the speaker recognition system. It contains a combined total of 7,323 identities.
  • WSJ0-2mix

    The dataset used in the paper is the WSJ0-2mix dataset, which contains 30 hours of training data and 10 hours of validation data generated from the WSJ0 dataset. The speech...
  • Wall Street Journal

    The Wall Street Journal dataset is used for syntactic linearization. It contains a large corpus of news articles with their corresponding syntactic trees.
  • ASVspoof2019

    The ASVspoof2019 LA subset consists of three parts: training, development, and evaluation. Each partition has a disjoint set of speakers. The average duration of the utterances...
  • TIMIT dataset

    The dataset used in this paper is a collection of phonetically and phonologically local allophonic distribution in English, where voiceless stops surface as aspirated...
  • MSP-IMPROV

    The MSP-IMPROV dataset contains 6 sessions of dyadic interactions between pairs of male-female actors. 15 target sentences are used to collect the recordings. For each target...
  • Speech Commands

    The Speech Commands dataset consists of 105,809 one-second audio recordings of 35 spoken words sampled at 16 kHz. The raw speech commands dataset presents audio recordings as a...
  • LJSpeech Dataset

    The LJSpeech dataset is a collection of audio recordings of a single female speaker reading aloud.
  • LJ Speech Dataset

    The LJ Speech dataset consists of speech samples recorded from a single speaker reading passages from 7 non-fiction books.
  • ESC-10

    The ESC-10 dataset is a subset of the ESC-50 dataset with 10 classes and 400 recordings.
  • ESC-50

    The dataset used for training the CNN in cough detection is composed of various modified audio clips gathered from open-source online sources. Each of these audio files...
  • VoxCeleb1

    Speaker recognition aims to identify speaker information from input speech. A type of speaker recognition is speaker verification (SV). It determines whether the test speaker's...
  • LibriSpeech dataset

    The dataset used in the paper is the LibriSpeech dataset, which contains about 1,000 hours of English speech derived from audiobooks.
You can also access this registry using the API (see API Docs).
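
As a rough illustration of the API access mentioned above, the sketch below builds a search URL that mirrors this page's filters (format JSON, tag "Audio Dataset"). It assumes the registry exposes a CKAN-compatible `package_search` action, which the page itself does not confirm. The base URL is a hypothetical placeholder, and the API Docs remain the authoritative source for the real endpoint and parameters.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; replace with the registry's real address.
BASE_URL = "https://registry.example.org"


def build_search_url(base_url, tags, res_format, rows=20):
    """Build a search URL, ASSUMING a CKAN-compatible package_search action."""
    # "fq" is a filter query; quoting the values allows spaces in tag names.
    filters = [f'tags:"{tag}"' for tag in tags]
    filters.append(f'res_format:"{res_format}"')
    params = {"fq": " AND ".join(filters), "rows": rows}
    return f"{base_url}/api/3/action/package_search?{urlencode(params)}"


def dataset_titles(response):
    """Extract dataset titles from a decoded package_search JSON response."""
    # CKAN-style responses wrap the hits in result.results and flag success.
    if not response.get("success"):
        return []
    return [pkg.get("title", pkg.get("name", ""))
            for pkg in response["result"]["results"]]


# Same filters as the listing above: format JSON, tag "Audio Dataset".
url = build_search_url(BASE_URL, ["Audio Dataset"], "JSON")
print(url)
```

The URL is only constructed, not fetched, so the sketch runs without network access. Passing the URL to any HTTP client and handing the decoded JSON to `dataset_titles` would list the matching entries, provided the assumption about the API holds.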