1 dataset found

Tags: text document

  • C4

    A dataset used for pre-training language models, consisting of a large collection of text documents.

You can also access this registry using the API (see API Docs).