3 datasets found

Groups: Multilingualism
Tags: Multilingual

  • Europarl-ST

    Europarl-ST is a multilingual speech corpus that contains transcriptions of parliamentary debates in multiple languages.
  • WikiANN

    The WikiANN dataset is a multilingual dataset for named entity recognition.
  • MuST-C

    MuST-C is a multilingual speech translation dataset, which contains at least 385 hours of audio recordings from TED Talks, with their manual transcriptions and translations at...
You can also access this registry using the API (see API Docs).