3 datasets found

  • Reuters Corpus Volume 2

    A multilingual corpus of 487,000 news stories.
  • OSCAR corpus

    The dataset used in this study is the OSCAR corpus, a multilingual corpus obtained by filtering the Common Crawl corpus.
  • Parallel Meaning Bank

    A semantically annotated parallel corpus for English, German, Italian, and Dutch where sentences are aligned with scoped meaning representations in order to capture the...