WikiMatrix

WikiMatrix is a multilingual dataset of parallel texts that pair English with other languages.

Cite this as

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov (2024). Dataset: WikiMatrix. https://doi.org/10.57702/e02s74au

DOI retrieved: December 3, 2024

Additional Info

Field: Value
Created: December 3, 2024
Last update: December 3, 2024
Defined In: https://doi.org/10.48550/arXiv.2206.00621
Author: Yinhan Liu
More Authors: Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov
Homepage: https://huggingface.co/datasets/WikiMatrix