6 datasets found

Organizations: No Organization

  • Streaming end-to-end bilingual ASR systems with joint language identification

    Multilingual ASR technology simplifies model training and deployment, but its accuracy is known to depend on the availability of language information at runtime.
  • Europarl-ST

    Europarl-ST is a multilingual speech corpus that contains transcriptions of parliamentary debates in multiple languages.
  • Mozilla Commonvoice

    Mozilla Commonvoice is a multilingual speech corpus that contains transcriptions of conversations in multiple languages.
  • MLS

    MLS: A large-scale multilingual dataset for speech research.
  • CommonVoice

    The sequence-to-sequence approach is widely used in speech recognition (SR) nowadays, and many research works are dedicated to show that their capabilities relying on a single...
  • Dictation dataset

    The dictation dataset spans 39 locales, including Latin (Albanian, Icelandic, Slovak), Arabic (Levant, Maghrebi), Cyrillic (Macedonian, Kazakh), Devanagari (Nepali), etc.