TED

The dataset is used for document-level neural machine translation. It contains 0.23M training sentences, 0.31M development sentences, and 0.21M test sentences.

BibTex: