6 datasets found

Groups: Multilingual; Tags: Multilingual

  • M4

    The M4 dataset consists of human-written texts from several data sources, including Wikipedia, Reddit, and arXiv in the English subset of the dataset. It pairs the human-written...
  • BABEL-Pashto

    The BABEL-Pashto dataset is a multilingual speech recognition dataset containing Pashto speech recordings.
  • MCV-10

    This work showcases a cost-effective method for generating training data for speech processing tasks. The dataset MCV-10 is a multilingual dataset that contains 50 hours of...
  • TransMuCoRes

    Translated dataset for Multilingual Coreference Resolution (TransMuCoRes) in 31 South Asian languages.
  • Very Deep Multilingual Convolutional Neural Networks for LVCSR

    Convolutional neural networks (CNNs) are a standard component of many current state-of-the-art Large Vocabulary Continuous Speech Recognition (LVCSR) systems. However, CNNs in...
  • CommonVoice

    The sequence-to-sequence approach is now widely used in speech recognition (SR), and many research works are dedicated to showing that their capabilities relying on a single...
You can also access this registry using the API (see API Docs).