MACHIAVELLI Benchmark

A dataset of traces from the MACHIAVELLI environment, including API calls and their outcomes.

BibTeX: