-
Penn Treebank
The Penn Treebank dataset contains one million words of 1989 Wall Street Journal material annotated in Treebank II style, with 42k sentences of varying lengths.
-
IWSLT14 EN→DE, WMT14 EN→DE, WMT16 EN→DE
The paper does not describe its data in detail; it states only that the authors used the IWSLT14 EN→DE, WMT14 EN→DE, and WMT16 EN→DE tasks.
-
IWSLT14 and WMT14
IWSLT14 and WMT14 are machine translation datasets.
-
WMT’16 English-Romanian dataset
The WMT’16 English-Romanian dataset was used for the machine translation task.
-
WMT’14 English-German and English-French benchmarks, CNN/DailyMail dataset, a...
The WMT’14 English-German and English-French benchmarks, CNN/DailyMail dataset, and CONLL dataset were used for machine translation, abstractive summarization, and grammar error...