12 datasets found

Groups: Speech Recognition
Organizations: No Organization

  • The ICSI Meeting Corpus

  • TIMIT Corpus

    The TIMIT corpus is a large database of speech recordings used for speaker recognition and speech synthesis tasks.
  • TIMIT

    The TIMIT corpus is a widely used benchmark for speech recognition tasks. It contains 3,696 training utterances from 462 speakers, excluding the SA sentences. The core test set...
  • Voice Bank Corpus

    The Voice Bank Corpus is a large regional accent speech database containing over 10 hours of speech data from 20 speakers.
  • A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognitio...

    The Kazakh speech corpus (KSC) contains around 332 hours of transcribed audio comprising over 153,000 utterances spoken by participants from different regions and age groups, as...
  • Proprietary Speech Dataset

    The proprietary speech dataset consists of 184 hours of high-quality US English speech spoken by 11 female and 10 male speakers.
  • WSJ

    The WSJ corpus is a large-vocabulary continuous speech recognition dataset. It contains 36,416 sequences, representing around 80 hours of speech.
  • CSTR VCTK Corpus

    The CSTR VCTK Corpus is a dataset of speech recordings of 109 speakers, each with 20 utterances.
  • VCTK Dataset

    The VCTK dataset is a large corpus of speech recordings, each containing a single speaker and a single sentence.
  • VCTK

    Voice conversion (VC) is a technique that alters the voice of a source speaker to a target style, such as speaker identity, prosody, and emotion, while keeping the linguistic...
  • LibriSpeech dataset

    The LibriSpeech dataset contains about 1,000 hours of English speech derived from audiobooks.
  • LibriTTS

    A popular text-based VC approach is to use an automatic speech recognition (ASR) model to extract phonetic posteriorgram (PPG) as content representation.