S2ORC

A collection of 81.1 million English-language scholarly publications spanning many academic fields, used to pre-train a language model.

BibTeX: