- TIMIT Corpus
The TIMIT corpus is a large database of speech recordings used for speaker recognition and speech synthesis tasks.
- Voice Bank Corpus
The Voice Bank Corpus is a large regional accent speech database containing over 10 hours of speech data from 20 speakers.
- Proprietary Speech Dataset
The proprietary speech dataset consists of 184 hours of high-quality US English speech from 11 female and 10 male speakers.
- CSTR VCTK Corpus
The CSTR VCTK Corpus is a dataset of speech recordings of 109 speakers, each with 20 utterances.
- VCTK Dataset
The VCTK dataset is a large corpus of speech recordings, each containing a single sentence read by a single speaker.
- LibriSpeech Dataset
The dataset used in the paper is the LibriSpeech dataset, which contains about 1,000 hours of English speech derived from audiobooks.