-
Generated Template Sentences for Same-Gender Relationships
Generated template sentences for a variety of relationships in French, Italian, and Spanish, using the format “OCCUPATION RELATIONSHIP-VERB HIS/HER RELATIONSHIP-TARGET.” -
English to Hebrew Transliteration
A dataset for transliterating person names from English to Hebrew, supporting both backward transliteration of Hebrew names and sideways transliteration of Arabic names. -
IWSLT 2014 Shared Task Dataset
The IWSLT 2014 shared task dataset contains 152K, 156K, 141K and 172K training sentences for the de-en, zh-en, en-tr and en-es language pairs, respectively. -
XLNet: Generalized Autoregressive Pretraining for Language Understanding
XLNet is a generalized autoregressive pretraining model for language understanding. -
RoBERTa: A Robustly Optimized BERT Pre-training Approach
RoBERTa is a robustly optimized BERT pre-training approach. -
MARGE: A Pre-trained Sequence-to-Sequence Model for Multi-lingual Paraphrasing
MARGE is a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual multi-document paraphrasing objective. -
LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction
LNMap departs from the isomorphic assumption in bilingual lexicon induction by learning a non-linear mapping in latent space. -
Learning Principled Bilingual Word Embeddings
Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. -
RAPO: An Adaptive Ranking Paradigm for Bilingual Lexicon Induction
Bilingual lexicon induction induces word translations by aligning independently trained word embeddings in two languages. -
NIST Chinese-English
The dataset used in the simultaneous neural machine translation experiments. -
WMT15 English-German
The dataset used in the simultaneous neural machine translation experiments. -
IWSLT16 German-English
The dataset used in the simultaneous neural machine translation experiments. -
WMT17 Zh-En
A dataset used for non-autoregressive machine translation. -
WMT14 En-De
The WMT14 En-De dataset contains 4.5M pairs of English and German sentences. -
newstest2019.orig-en.p
The paraphrased reference translations used for the experiments in the paper. -
newstest2018.orig-en.p
The paraphrased reference translations used for the experiments in the paper. -
WMT 2019 English-German news translation task
The dataset used for the experiments in the paper, drawn from the WMT 2019 English-German news translation task.