OpenWebTextCorpus

The OpenWebText corpus is an open-source recreation of OpenAI's WebText dataset, consisting of web text extracted from URLs shared on Reddit.

Cite this as

Wes Gurnee, Neel Nanda, Matthew Pauly, Katherine Harvey, Dmitrii Troitskii, Dimitris Bertsimas (2024). Dataset: OpenWebTextCorpus. https://doi.org/10.57702/hcb1yh2b

DOI retrieved: December 16, 2024

Additional Info

Field          Value
Created        December 16, 2024
Last update    December 16, 2024
Defined In     https://doi.org/10.48550/arXiv.2406.09519
Author         Wes Gurnee
More Authors   Neel Nanda, Matthew Pauly, Katherine Harvey, Dmitrii Troitskii, Dimitris Bertsimas
Homepage       https://skylion007.github.io/OpenWebTextCorpus