WMT17 Chinese-English, WMT14 English-French, WAT17 English-Japanese

The paper uses three large-scale parallel corpora for neural machine translation (NMT): 20.6M sentence pairs for Chinese-English (WMT17), 35.5M sentence pairs for English-French (WMT14), and 1.9M sentence pairs for English-Japanese (WAT17).

Cite this as

Shilin He, Zhaopeng Tu, Xing Wang, Longyue Wang, Michael R. Lyu, Shuming Shi (2024). Dataset: WMT17 Chinese-English, WMT14 English-French, WAT17 English-Japanese. https://doi.org/10.57702/zadhyml4

DOI retrieved: December 16, 2024

Additional Info

Field: Value
Created: December 16, 2024
Last update: December 16, 2024
Defined In: https://doi.org/10.48550/arXiv.1909.00326
Author: Shilin He
More Authors: Zhaopeng Tu, Xing Wang, Longyue Wang, Michael R. Lyu, Shuming Shi
Homepage: https://github.com/attneuralmachine/word-importance