1 dataset found

Tags: Diverse Text

  • The Pile

    The Pile dataset contains 3.5 million samples of diverse text for language modeling.

You can also access this registry using the API (see API Docs).