Lang-84

doi:10.57702/uyqnrio1

The dataset used in this paper is a collection of parallel sentence pairs from 96 different native languages, with at least 10,000 sentence pairs per language.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan (2025). Dataset: Lang-84. https://doi.org/10.57702/uyqnrio1

DOI retrieved: January 2, 2025

Additional Info

Field	Value
Created	January 2, 2025
Last update	January 2, 2025
Defined In	https://doi.org/10.48550/arXiv.2007.09076
Author	Yuanyuan Zhao
More Authors	Weiwei Sun, Xiaojun Wan
Homepage	https://code.google.com/archive/p/berkeleyaligner/