4 datasets found

Groups: Speech Synthesis
Formats: JSON

  • TIMIT

    The TIMIT corpus is a widely used benchmark for speech recognition tasks. It contains 3,696 training utterances from 462 speakers, excluding the SA sentences. The core test set...
  • Generative Pre-Training for Speech

    Generative models have gained increasing attention in recent years for their remarkable success in tasks that require estimating and sampling a data distribution to generate...
  • Reverberation and Noise Contaminated Speech Datasets

    Training and test datasets were generated by contaminating the clean data with reverberation and noise.
  • VCTK Corpus

    The VCTK corpus is an English multi-speaker dataset, with 44 hours of audio spoken by 109 native English speakers.