Speech Synthesis - Groups

TIMIT

The TIMIT corpus is a widely used benchmark for speech recognition tasks. It contains 3,696 training utterances from 462 speakers, excluding the SA sentences. The core test set...


Generative Pre-Training for Speech

Generative models have gained increasing attention in recent years for their remarkable success in tasks that require estimating and sampling from a data distribution to generate...


Reverberation and Noise Contaminated Speech Datasets

Training and test datasets were generated by contaminating the clean data with reverberation and noise.
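
The description above does not say how the contamination was performed. A common approach, assumed here purely for illustration, is to convolve the clean signal with a room impulse response and then add noise scaled to a target signal-to-noise ratio. The function name `contaminate` and its parameters (`rir`, `snr_db`) are hypothetical and do not come from the dataset's tooling.

```python
import numpy as np

def contaminate(clean, rir, noise, snr_db):
    """Sketch: add reverberation and noise to a clean 1-D float signal.

    Assumed procedure (not taken from the dataset description):
    1. convolve with a room impulse response (rir),
    2. add noise scaled so the reverberant-speech-to-noise ratio is snr_db.
    """
    # Reverberation: linear convolution, trimmed to the original length.
    reverberant = np.convolve(clean, rir)[: len(clean)]

    # Repeat or truncate the noise so it matches the speech length.
    noise = np.resize(noise, len(clean))

    # Scale the noise to reach the requested SNR in decibels.
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    return reverberant + scale * noise
```

In practice the impulse responses and noise recordings would come from measured or simulated rooms; here they are simply arrays passed in by the caller.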


VCTK Corpus

The VCTK corpus is an English multi-speaker dataset, with 44 hours of audio spoken by 109 native English speakers.

