MIRACL

doi:10.57702/sv13lanf

MIRACL is a dataset for researchers working on multilingual search, that is, retrieval across multiple languages. It covers 18 languages, each of which is divided into four splits: train, dev, test-A, and test-B.
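
As a rough illustration of the layout described above, the sketch below enumerates (language, split) pairs. The split names come from the description; the language codes are a small subset assumed for the example rather than the dataset's official list, and `enumerate_splits` is a hypothetical helper, not part of any MIRACL tooling.

```python
from itertools import product

# Split names as given in the dataset description.
SPLITS = ("train", "dev", "test-A", "test-B")

# Assumption: an illustrative subset of language codes; the dataset
# covers 18 languages, which are not listed on this page.
EXAMPLE_LANGUAGES = ("ar", "en", "sw")


def enumerate_splits(languages, splits=SPLITS):
    """Return every (language, split) pair in language-major order."""
    return list(product(languages, splits))


pairs = enumerate_splits(EXAMPLE_LANGUAGES)
for language, split in pairs:
    print(f"{language}/{split}")
```

With all 18 languages and four splits each, the same call would yield 72 pairs.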

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Qi Zhang, Zijian Yang, Yilun Huang, Ze Chen, Zijian Cai, Kangxu Wang, Jiewen Zheng, Jiarong He, Jin Gao (2024). Dataset: MIRACL. https://doi.org/10.57702/sv13lanf

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2302.07010
Author	Qi Zhang
More Authors	Zijian Yang, Yilun Huang, Ze Chen, Zijian Cai, Kangxu Wang, Jiewen Zheng, Jiarong He, Jin Gao
Homepage	https://project-miracl.github.io/