Wikipedia dataset

The Wikipedia dataset used in the paper contains over six million English Wikipedia articles, each with a full-text field. It is accompanied by 50 training queries from various domains and 50 evaluation queries.

Cite this as

Daniel Biermann, Fabrizio Palumbo, Morten Goodwin, Ole-Christoffer Granmo (2024). Dataset: Wikipedia dataset. https://doi.org/10.57702/4v69tpmv

DOI retrieved: December 16, 2024

Additional Info

Field: Value
Created: December 16, 2024
Last update: December 16, 2024
Defined In: https://doi.org/10.48550/arXiv.1905.11912
Citation:
  • https://doi.org/10.48550/arXiv.2310.14837
  • https://doi.org/10.48550/arXiv.2305.13088
  • https://doi.org/10.48550/arXiv.2207.10839
  • https://doi.org/10.48550/arXiv.1702.07680
  • https://doi.org/10.48550/arXiv.2106.14610
  • https://doi.org/10.1145/3578337.3605132
  • https://doi.org/10.48550/arXiv.1901.04268
Author: Daniel Biermann
More Authors: Fabrizio Palumbo, Morten Goodwin, Ole-Christoffer Granmo
Homepage: https://dumps.wikimedia.org/