No language left behind: Scaling human-centered machine translation

The dataset is used to train multilingual language models and to evaluate their performance.

Data and Resources

Cite this as

NLLB Team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar (2025). Dataset: No language left behind: Scaling human-centered machine translation. https://doi.org/10.57702/qg913gvb

DOI retrieved: January 3, 2025

Additional Info

Field Value
Created January 3, 2025
Last update January 3, 2025
Defined In https://doi.org/10.48550/arXiv.2401.13136
Citation
  • https://doi.org/10.48550/arXiv.2310.03686
Author NLLB Team
More Authors
Marta R. Costa-jussà
James Cross
Onur Çelebi
Maha Elbayad
Kenneth Heafield
Kevin Heffernan
Elahe Kalbassi
Janice Lam
Daniel Licht
Jean Maillard
Timothée Lacroix
Baptiste Rozière
Naman Goyal
Eric Hambro
Faisal Azhar
Homepage https://huggingface.co/facebook/