1 dataset found

Groups: Natural Language Processing

  • S2ORC

    A collection of 81.1 million English-language scholarly publications from various academic fields, used to pre-train a language model.