3 datasets found

Tags: Multilingual

  • Europarl-ST

    Europarl-ST is a multilingual speech translation corpus built from European Parliament debates, with transcriptions in several languages.
  • WikiANN

    WikiANN is a multilingual dataset for named entity recognition.
  • MuST-C

    MuST-C is a multilingual speech translation dataset, which contains at least 385 hours of audio recordings from TED Talks, with their manual transcriptions and translations at...