Dataset - LDM

PERCEPT-R Audio Corpus

The PERCEPT-R Audio Corpus is a collection of audio files of children and adults speaking American English.
- Dataset
- JSON
Europarl-ST

Europarl-ST is a multilingual speech corpus that contains transcriptions of parliamentary debates in multiple languages.
- Dataset
- JSON
Mozilla Common Voice

Mozilla Common Voice is a multilingual speech corpus that contains transcriptions of conversations in multiple languages.
- Dataset
- JSON
AISHELL-1

The AISHELL-1 dataset is a Mandarin speech corpus, consisting of 178 hours of speech, with 11 domains and 400 speakers from different accent areas in China.
- Dataset
- JSON
LaMIT corpus

The LaMIT corpus is a speech corpus for Italian, created and labeled specifically for this work.
- Dataset
- JSON
LaMIT database

The LaMIT database is a speech corpus for Italian, created and labeled specifically for this work.
- Dataset
- JSON
WSJ0-mix dataset

The WSJ0-mix dataset contains the "min" version (each mixture truncated to the length of its shortest utterance) of 2-, 3-, 4-, and 5-speaker mixtures simulated from clean speech in the WSJ0 corpus.
- Dataset
- JSON
TED-LIUM 3

TED-LIUM 3 (TL3) is a dataset of TED talks. The speaker adaptation data for TL3 was split randomly: 2/5 went to the train set, 1/5 to the dev set, and...
- Dataset
- JSON
Speech Corpus

A speech corpus of size 7,000 used for training and validation of the FCI module.
- Dataset
- JSON
Tedlium3

Tedlium3: A large-scale English speech corpus for speaker adaptation.
- Dataset
- JSON
TIMIT

The TIMIT corpus is a widely used benchmark for speech recognition tasks. It contains 3,696 training utterances from 462 speakers, excluding the SA sentences. The core test set...
- Dataset
- JSON
speechocean762

speechocean762: An open-source non-native English speech corpus for pronunciation assessment.
- Dataset
- JSON
Voice Bank Corpus

The Voice Bank Corpus is a large regional accent speech database containing over 10 hours of speech data from 20 speakers.
- Dataset
- JSON
HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus

The HKUST dataset is a large dataset of speech recordings, each containing a single speaker speaking a sentence.
- Dataset
- JSON
The Wall Street Journal Corpus

The WSJ dataset is a large dataset of speech recordings, each containing a single speaker speaking a sentence.
- Dataset
- JSON
TIMIT Acoustic-Phonetic Continuous Speech Corpus

The TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM contains a large collection of speech samples from 250 male and 250 female speakers.
- Dataset
- JSON
Chinese Standard Mandarin Speech Corpus (CSMSC)

The Chinese Standard Mandarin Speech Corpus (CSMSC) is a large-scale speech corpus containing 10,000 recorded sentences read by a female speaker.
- Dataset
- JSON
Voice Conversion Challenge 2018 (VCC2018) corpus

The Voice Conversion Challenge 2018 (VCC2018) corpus, which included recordings of 12 professional US English speakers with a sampling rate of 22050 Hz and a sample resolution...
- Dataset
- JSON
LibriLight

The dataset used in this paper is a large-scale production ASR system, which includes multi-domain (MD) data sets in English. The MD data sets include medium-form (MF) and...
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

19 datasets found