Mr. TyDi

Mr. TyDi is a multilingual dataset for dense retrieval, consisting of 100,000 passages and 1,000,000 queries.

Cite this as

Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, Jimmy Lin (2024). Dataset: Mr. TyDi. https://doi.org/10.57702/o7nteut0

DOI retrieved: December 16, 2024

Additional Info

Field          Value
Created        December 16, 2024
Last update    December 16, 2024
Defined In     https://doi.org/10.48550/arXiv.2302.07010
Author         Xinyu Zhang
More Authors   Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, Jimmy Lin
Homepage       https://github.com/nyu-dmlab/MrTyDi