- APPS: A Dataset for Code Generation Evaluation
  The APPS dataset is a collection of 10,000 programming problems, with unit tests, used to evaluate the performance of code generation models.
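  Evaluation harnesses typically consume APPS through the Hugging Face `datasets` library. A minimal loading sketch; the dataset ID `codeparrot/apps` and the field names are assumptions about one public mirror, not a documented schema:

```python
# A minimal sketch of loading APPS; the dataset ID "codeparrot/apps"
# and the field names below are assumptions about one public mirror.
import json
from datasets import load_dataset

apps = load_dataset("codeparrot/apps", split="test")
problem = apps[0]

print(problem["question"][:200])             # natural-language problem statement
tests = json.loads(problem["input_output"])  # {"inputs": [...], "outputs": [...]}
print(len(tests["inputs"]), "test cases")
```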
- Evaluating Large Language Models Trained on Code
  The OpenAI Codex paper: it evaluates large language models on generating Python code from docstrings and introduces the HumanEval benchmark and the pass@k metric (sketched below).
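  pass@k is estimated without bias from n samples per problem, of which c pass the unit tests, as 1 - C(n-c, k)/C(n, k). The numerically stable form below follows the estimator given in the paper:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per problem, c of which
    pass all unit tests."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a passing sample
    # 1 - C(n-c, k) / C(n, k), expanded into a stable running product
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

print(pass_at_k(n=200, c=10, k=1))  # 0.05
```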
- Execution-based Evaluation for NL2Bash
  A set of 50 prompts for execution-based evaluation of the NL2Bash task.
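  Execution-based evaluation judges a generated command by its observable effect rather than its surface form. A minimal sketch of that idea, not the paper's actual harness; generated commands are untrusted, so in practice this must run inside an isolated sandbox:

```python
import subprocess

def run_bash(cmd: str, timeout: float = 5.0) -> str:
    """Run a bash command and return its stdout; empty on timeout or error."""
    try:
        result = subprocess.run(
            ["bash", "-c", cmd],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.stdout
    except (subprocess.TimeoutExpired, OSError):
        return ""

def execution_match(candidate: str, reference: str) -> bool:
    # Two commands count as equivalent for scoring if they produce the
    # same output when run against the same starting state.
    return run_bash(candidate) == run_bash(reference)

print(execution_match("ls -1 | wc -l", "ls | wc -l"))  # True in the same dir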
- CodeUltraFeedback
  CodeUltraFeedback is a preference dataset of 10,000 complex instructions used to tune and align LLMs to coding preferences through AI feedback.
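  Preference datasets like this are usually reduced to (prompt, chosen, rejected) triples for DPO-style alignment. A minimal sketch of that reduction under an illustrative record layout, not the dataset's actual schema:

```python
def to_preference_pair(instruction: str, responses: list[dict]) -> dict:
    """Turn AI-judge ratings into a chosen/rejected pair: keep the
    highest- and lowest-rated responses to the same instruction."""
    ranked = sorted(responses, key=lambda r: r["rating"], reverse=True)
    return {
        "prompt": instruction,
        "chosen": ranked[0]["text"],
        "rejected": ranked[-1]["text"],
    }

pair = to_preference_pair(
    "Write a function that reverses a linked list.",
    [
        {"text": "def reverse(head): ...  # iterative, O(1) space", "rating": 9},
        {"text": "def reverse(head): return head", "rating": 2},
    ],
)
print(pair["chosen"])
```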
- Evol-Instruct-Code-80k
  Evol-Instruct-Code-80k is a dataset of roughly 80,000 code instructions, produced with the Evol-Instruct method, for instruction-tuning code generation models.
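  Evol-Instruct grows a dataset by having an LLM rewrite seed instructions into harder variants via meta-prompts. A rough sketch of that idea; the templates and helper here are illustrative, not the original recipe:

```python
import random

# Illustrative meta-prompts; the original Evol-Instruct prompts differ.
EVOLVE_TEMPLATES = [
    "Rewrite this coding task so it must handle edge cases: {seed}",
    "Rewrite this coding task to add a complexity constraint: {seed}",
    "Rewrite this coding task so it needs several reasoning steps: {seed}",
]

def evolve_prompt(seed: str) -> str:
    """Build the meta-prompt an LLM would complete to evolve `seed`
    into a more complex instruction."""
    return random.choice(EVOLVE_TEMPLATES).format(seed=seed)

print(evolve_prompt("Write a function that sorts a list of integers."))
```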