R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games

The dataset used in the paper is a synthetic two-agent game in which each agent's payoff function is sampled from a Gaussian process.
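
As a rough illustration of how such a game could be generated, the sketch below draws one payoff function per agent from a zero-mean Gaussian process over a grid of joint actions. The RBF kernel, its length scale, the scalar action space [0, 1], the grid size, and the names `rbf_kernel` and `sample_payoffs` are all assumptions made for this example; the paper's actual configuration may differ.

```python
import numpy as np


def rbf_kernel(X, Y, length_scale=0.2):
    """Squared-exponential kernel between the rows of X and Y (assumed kernel)."""
    sq_dist = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def sample_payoffs(n_actions=10, length_scale=0.2, seed=0):
    """Sample two payoff tables from a zero-mean GP prior over joint actions.

    Hypothetical setup: each agent picks a scalar action in [0, 1],
    discretized into n_actions grid points.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_actions)

    # Every joint action (a1, a2) becomes one row of a 2-D input matrix.
    a1, a2 = np.meshgrid(grid, grid, indexing="ij")
    joint = np.column_stack([a1.ravel(), a2.ravel()])

    # Covariance over all joint actions; jitter keeps Cholesky stable.
    K = rbf_kernel(joint, joint, length_scale) + 1e-6 * np.eye(len(joint))
    L = np.linalg.cholesky(K)

    # One independent GP draw per agent, reshaped into a payoff table.
    payoffs = [
        (L @ rng.standard_normal(len(joint))).reshape(n_actions, n_actions)
        for _ in range(2)
    ]
    return grid, payoffs[0], payoffs[1]


grid, u1, u2 = sample_payoffs()
```

Here `u1[i, j]` is agent 1's payoff when agent 1 plays `grid[i]` and agent 2 plays `grid[j]`, and `u2[i, j]` is agent 2's payoff for the same joint action. Drawing the two tables independently gives a general-sum game; other couplings between the agents' payoffs are equally possible.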

BibTeX: