OSCAR Dataset

The dataset used in the paper is a large corpus of real-world programs, used to pre-train a neural network model to learn better code representations.

BibTeX: