2 datasets found

Groups: Code Editing

  • LiveCodeBench

    LiveCodeBench is a benchmark that evaluates Large Language Models (LLMs) on code editing tasks, including debugging, translating, polishing, and requirement...
  • Hoppity

    Hoppity is a dataset of graph transformations that replicate small edits in code.