RedPajama

RedPajama is an open-source recipe for reproducing the LLaMA training dataset.

Cite this as

Together Computer (2024). Dataset: RedPajama. https://doi.org/10.57702/qqx3qox1

DOI retrieved: December 2, 2024

Additional Info

Field        Value
Created      December 2, 2024
Last update  December 2, 2024
Defined In   https://doi.org/10.48550/arXiv.2310.16789
Citation     https://doi.org/10.48550/arXiv.2309.00751
Author       Together Computer
Homepage     https://github.com/togethercomputer/RedPajama-INCITE-Chat-3B-v1