Code Representation - Groups

SynCoBERT

SynCoBERT: Syntax-guided multi-modal contrastive pre-training for code representation.

Dataset
JSON

Various Datasets

The paper uses the following datasets: WikiMIA, BookMIA, Temporal Wiki, Temporal arXiv, ArXiv-1 month, Multi-Webdata, LAION-MI, and Gutenberg.

Dataset
JSON

OSCAR Dataset

The dataset used in the paper is a large corpus of real-world programs, used to pre-train a neural network model to learn better code representations.

Dataset
JSON

3 datasets found
