1 dataset found

Tags: contextual bandits

  • Synthetic Data

    The paper uses a synthetic dataset for off-policy contextual bandits, with contexts x ∈ X, a finite set of actions A, and bounded reward functions r : A → [0, 1], so that every action receives a real-valued reward in [0, 1].
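
    As a rough illustration of the setting described above, the sketch below generates logged bandit data of this shape. It is not the paper's generator: the function name, the linear reward model, the uniform logging policy, and the record field names are all assumptions made for this example. Only the structure (a context, a finite action set, rewards in [0, 1], and a reward observed only for the logged action) follows the description.

```python
import random


def make_synthetic_bandit_log(n_rounds, n_actions, context_dim, seed=0):
    """Return a list of logged rounds for an off-policy contextual bandit.

    Hypothetical sketch: the reward model and logging policy are
    illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Assumed reward model: one weight vector per action.
    weights = [
        [rng.uniform(-1.0, 1.0) for _ in range(context_dim)]
        for _ in range(n_actions)
    ]
    log = []
    for _ in range(n_rounds):
        # Context x drawn from the unit cube (assumed context space X).
        x = [rng.uniform(0.0, 1.0) for _ in range(context_dim)]
        # Full reward function r : A -> [0, 1], clipped to stay bounded.
        rewards = []
        for w in weights:
            score = 0.5 + sum(wi * xi for wi, xi in zip(w, x)) / context_dim
            rewards.append(min(1.0, max(0.0, score)))
        # Assumed logging policy: uniform over the finite action set.
        action = rng.randrange(n_actions)
        propensity = 1.0 / n_actions
        # Off-policy data reveals only the reward of the logged action.
        log.append({
            "context": x,
            "action": action,
            "reward": rewards[action],
            "propensity": propensity,
        })
    return log


log = make_synthetic_bandit_log(n_rounds=100, n_actions=3, context_dim=4, seed=1)
```

    Each record stores the logging propensity alongside the action, since off-policy estimators typically need it to reweight the observed rewards.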