Massively Multilingual Machine Translation Dataset

A corpus of parallel documents covering 102 languages and English, containing 25 billion training examples for multilingual neural machine translation.

BibTeX: