- PERCEPT-R audio Corpus
The PERCEPT-R audio Corpus is a collection of audio files of children and adults speaking American English.
- Europarl-ST
Europarl-ST is a multilingual speech corpus that contains transcriptions of parliamentary debates in multiple languages.
- Mozilla Common Voice
Mozilla Common Voice is a crowdsourced multilingual speech corpus of read sentences recorded by volunteers, paired with their transcriptions.
- LaMIT corpus
The LaMIT corpus is a speech corpus for Italian, created and labeled specifically for this work.
- LaMIT database
The LaMIT database is a speech corpus for Italian, created and labeled specifically for this work.
- WSJ0-mix dataset
The WSJ0-mix dataset contains the "min" version of 2-, 3-, 4-, and 5-speaker mixtures simulated from clean speech in the WSJ0 corpus; in the "min" version, each mixture is truncated to the length of its shortest utterance.
- TED-LIUM 3
TED-LIUM 3 (TL3) is a TED talks dataset. The speaker adaptation data for TL3 was split randomly: 2/5 went to the train set, 1/5 to the dev set, and...
- Speech Corpus
A speech corpus of size 7,000 used for training and validation of the FCI module.
- speechocean762
speechocean762 is an open-source non-native English speech corpus for pronunciation assessment.
- Voice Bank Corpus
The Voice Bank Corpus is a large regional accent speech database containing over 10 hours of speech data from 20 speakers.
- HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus
The HKUST/MTS dataset is a very large-scale corpus of Mandarin conversational telephone speech.
- The Wall Street Journal Corpus
The WSJ dataset is a large dataset of speech recordings, each containing a single speaker speaking a sentence.
- TIMIT Acoustic-Phonetic Continuous Speech Corpus
The TIMIT Acoustic-Phonetic Continuous Speech Corpus, distributed on CD-ROM, contains recordings of 630 speakers of eight major dialects of American English, each reading ten phonetically rich sentences.
- Chinese Standard Mandarin Speech Corpus (CSMSC)
The Chinese Standard Mandarin Speech Corpus (CSMSC) is a large-scale speech corpus containing 10,000 recorded sentences read by a female speaker.
- Voice Conversion Challenge 2018 (VCC2018) corpus
The Voice Conversion Challenge 2018 (VCC2018) corpus includes recordings of 12 professional US English speakers with a sampling rate of 22050 Hz and a sample resolution...
- LibriLight
The dataset used in this paper is a large-scale production ASR system, which includes multi-domain (MD) data sets in English. The MD data sets include medium-form (MF) and...