Lang-84

The dataset used in this paper is a collection of parallel sentence pairs from 96 different native languages, with at least 10,000 sentence pairs per language.

Data and Resources

Cite this as

Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan (2025). Dataset: Lang-84. https://doi.org/10.57702/uyqnrio1

DOI retrieved: January 2, 2025

Additional Info

Field Value
Created January 2, 2025
Last update January 2, 2025
Defined In https://doi.org/10.48550/arXiv.2007.09076
Author Yuanyuan Zhao
More Authors Weiwei Sun, Xiaojun Wan
Homepage https://code.google.com/archive/p/berkeleyaligner/