-
Machine Translation and Automated Analysis of the Sumerian Language Dataset
The Machine Translation and Automated Analysis of the Sumerian Language dataset contains Sumerian texts in cuneiform script. -
English-German Multi30K dataset
English-German Multi30K dataset -
WMT18 data
The dataset used in the paper is the WMT18 data. -
Eu-En: Basque-English dataset
The Basque-English dataset (eu-en) has been collected from the WMT16 IT-domain translation shared task. -
Cs-En: Czech-English dataset
The Czech-English dataset (cs-en) is also from the IWSLT 2016 TED talks translation task. -
En-Fr: English-French dataset
The English-French dataset (en-fr) has been sourced from the IWSLT 2016 translation shared task. -
De-En: German-English dataset
Four different language pairs have been selected for the experiments. The datasets' size varies from tens of thousands to millions of sentences to test the regularizers' ability... -
DocRepair dataset
The dataset used for testing the DocRepair model contains 30m groups of 4 consecutive sentences in English and Russian. -
WMT 2014 English-German task
The dataset used for the Second Workshop on Neural Machine Translation and Generation. -
IWSLT2014 dataset
Tatoeba and IWSLT2014 datasets for machine translation. -
Tatoeba and IWSLT2014 datasets
Simultaneous machine translation (SMT) datasets for Tatoeba and IWSLT2014. -
WMT16 English-Romanian
Diffusion models have achieved state-of-the-art synthesis quality on both visual and audio tasks, and recent works further adapt them to textual data by diffusing on the... -
WMT14 English-German
Non-Autoregressive Translation: Given a source sentence x, an AT model generates each target word y_t conditioned on previously generated ones y_{<t}, leading to high latency on... -
IWSLT14 German-English
Diffusion models have achieved state-of-the-art synthesis quality on both visual and audio tasks, and recent works further adapt them to textual data by diffusing on the... -
WMT19 English-German
Two widely-used resource-rich benchmarks, WMT17 English-Chinese (20M) and WMT19 English-German (36M) translation tasks -
WMT17 English-Chinese
Two widely-used resource-rich benchmarks, WMT17 English-Chinese (20M) and WMT19 English-German (36M) translation tasks