Dataset - LDM

Automated Correction for Syntax Errors in Programming Assignments using Recur...

We present a technique for providing feedback on syntax errors that uses recurrent neural networks (RNNs) to model syntactically valid token sequences.
- Dataset
- JSON
TEGCER Dataset

A dataset of 15,000+ erroneous student programs which fail to compile, along with their corresponding repaired code.
- Dataset
- JSON
MBPP: A Dataset for Language Model Evaluation

The MBPP dataset is a collection of basic programming questions used to evaluate the performance of language models.
- Dataset
- JSON
APPS: A Dataset for Code Generation Evaluation

The APPS dataset is a collection of programming problems used to evaluate the performance of code generation models.
- Dataset
- JSON
APPS

The APPS dataset as used in the paper to train and test the DPO and PPO models.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

5 datasets found