Speech Synthesis - Groups

TIMIT Corpus

The TIMIT corpus is a large database of speech recordings used for speaker recognition and speech synthesis tasks.

Dataset
JSON

TIMIT

The TIMIT corpus is a widely used benchmark for speech recognition tasks. It contains 3,696 training utterances from 462 speakers, excluding the SA sentences. The core test set...

Dataset
JSON

Voice Bank Corpus

The Voice Bank Corpus is a large speech database of regional accents, containing over 10 hours of speech from 20 speakers.

Dataset
JSON

Proprietary Speech Dataset

The proprietary speech dataset consists of 184 hours of high-quality US English speech spoken by 11 female and 10 male speakers.

Dataset
JSON

WSJ

The WSJ corpus is a large-vocabulary continuous speech recognition dataset. It contains 36,416 sequences, representing around 80 hours of speech.

Dataset
JSON

CSTR VCTK Corpus

The CSTR VCTK Corpus is a dataset of speech recordings from 109 speakers, each with 20 utterances.

Dataset
JSON

VCTK Dataset

The VCTK dataset is a large corpus of speech recordings, each containing a single speaker and a single sentence.

Dataset
JSON

LibriSpeech dataset

The LibriSpeech dataset contains about 1,000 hours of English speech derived from audiobooks.

Dataset
JSON

8 datasets found