One Billion Words Dataset
Data and Resources
- Original Metadata (JSON): The JSON representation of the dataset with its distributions, based on DCAT.
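As a rough illustration of how a DCAT-based JSON record like the one described above might be read, the sketch below parses a small sample and lists its distributions. The sample record, its placeholder URL, and the helper name `list_distributions` are assumptions made for illustration; the key names follow the DCAT vocabulary, but the actual metadata file for this dataset may be structured differently.

```python
import json

# Hypothetical DCAT-style record: the keys follow the DCAT vocabulary, but the
# values (including the access URL) are placeholders, not this dataset's real metadata.
SAMPLE = """
{
  "@type": "dcat:Dataset",
  "dct:title": "One Billion Words Dataset",
  "dct:identifier": "https://doi.org/10.57702/zujg4t8j",
  "dcat:distribution": [
    {
      "@type": "dcat:Distribution",
      "dct:title": "Original Metadata",
      "dct:format": "JSON",
      "dcat:accessURL": "https://example.org/metadata.json"
    }
  ]
}
"""


def list_distributions(record):
    """Return (title, format, access URL) for each distribution in a record."""
    dists = record.get("dcat:distribution", [])
    # JSON-LD permits a single object where a list is expected.
    if isinstance(dists, dict):
        dists = [dists]
    return [
        (d.get("dct:title"), d.get("dct:format"), d.get("dcat:accessURL"))
        for d in dists
    ]


record = json.loads(SAMPLE)
for title, fmt, url in list_distributions(record):
    print(f"{title} [{fmt}]: {url}")
```

Missing keys come back as `None` rather than raising, which seems a reasonable default when the exact shape of the exported record is not known in advance.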
Cite this as
Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov (2024). Dataset: One Billion Words Dataset. https://doi.org/10.57702/zujg4t8j
DOI retrieved: December 2, 2024
Additional Info
Field | Value |
---|---|
Created | December 2, 2024 |
Last update | December 2, 2024 |
Defined In | https://doi.org/10.48550/arXiv.2406.07524 |
Author | Subham Sekhar Sahoo |
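More Authors | Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov |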
Homepage | https://github.com/louaaron/Score-Entropy-Discrete-Diffusion/blob/main/data.py |