2 datasets found

Groups: Audio Tags: Speaker Recognition

Filter Results
  • VCTK Corpus

    The VCTK corpus is an English multi-speaker dataset, with 44 hours of audio spoken by 109 native English speakers.
  • LibriTTS

    A popular text-based VC approach is to use an automatic speech recognition (ASR) model to extract phonetic posteriorgram (PPG) as content representation.
You can also access this registry using the API (see API Docs).