7 datasets found

Tags: parallel corpus

  • Umsuka English-isiZulu Parallel Corpus

    The Umsuka English-isiZulu Parallel Corpus provides a novel, high-quality parallel dataset for machine translation, containing English sentences sampled from both News Crawl...
  • ParCor Dataset

    The ParCor dataset is a parallel corpus of annotated pronouns.
  • WIT3 Parallel Corpus

    The WIT3 parallel corpus is a large-scale corpus of transcribed and translated talks.
  • Europarl parallel corpus

    The dataset used in this paper is a multi-view dataset, where each view is a matrix of size I x K, with I being the number of entities and K being the number of features. The...
  • Watchtower corpus (WTC)

    The Watchtower corpus (WTC) is a multilingual parallel corpus of sentences.
  • DGT corpus

    The dataset is a parallel corpus of aligned sentences across nine languages (36 language pairs) from the DGT corpus, used for language comparison experiments.
  • Paralex

    Propose a method for generating paraphrases of English questions that retain the original intent but use a different surface form.
You can also access this registry using the API (see API Docs).