UKWaC and Wackypedia corpora

The dataset used in this paper is a large text corpus compiled from the UKWaC and Wackypedia corpora.

Cite this as

Ka Chun Lam, Francisco Pereira, Maryam Vaziri-Pashkam, Kristin Woodard, Emalie McMahon (2024). Dataset: UKWaC and Wackypedia corpora. https://doi.org/10.57702/dr3xmeq3

DOI retrieved: December 3, 2024

Additional Info

Field          Value
Created        December 3, 2024
Last update    December 3, 2024
Defined In     https://doi.org/10.48550/arXiv.2007.04245
Author         Ka Chun Lam
More Authors   Francisco Pereira
               Maryam Vaziri-Pashkam
               Kristin Woodard
               Emalie McMahon
Homepage       https://www.ukwa-c.org/