Dataset - LDM

Conformal Off-Policy Prediction in Contextual Bandits

Most off-policy evaluation methods for contextual bandits have focused on the expected outcome of a policy, which is estimated via methods that at best provide only asymptotic...
- Dataset
- JSON
Synthetic dataset for opportunistic contextual bandits

The dataset used in the paper is a synthetic dataset with 20 possible arms, each associated with a disjoint unknown coefficient θ(a).
- Dataset
- JSON
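As a rough illustration of what a dataset of this shape could look like, the sketch below draws one coefficient vector θ(a) per arm and produces noisy rewards. Only the 20 arms and the per-arm (disjoint) coefficients come from the description above; the linear reward model, the Gaussian contexts and noise, the context dimension, and the function name `make_disjoint_linear_bandit` are all assumptions made for illustration, not the paper's generator.

```python
import numpy as np

def make_disjoint_linear_bandit(n_rounds=1000, n_arms=20, dim=5,
                                noise_std=0.1, seed=0):
    """Hypothetical generator: each arm a has its own coefficient theta[a]."""
    rng = np.random.default_rng(seed)
    # One coefficient vector per arm; no parameters are shared across arms.
    theta = rng.normal(size=(n_arms, dim))
    # Assumed context distribution: standard Gaussian.
    contexts = rng.normal(size=(n_rounds, dim))
    # Assumed linear model: expected reward of arm a in context x is x . theta[a].
    expected = contexts @ theta.T
    rewards = expected + noise_std * rng.normal(size=expected.shape)
    return theta, contexts, rewards

theta, contexts, rewards = make_disjoint_linear_bandit()
```

Here `rewards[t, a]` is the reward arm `a` would yield in round `t`; a bandit learner would observe only the entry for the arm it pulls.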
Synthetic Data

The dataset used in the paper is a synthetic dataset for off-policy contextual bandits, with contexts x ∈ X, a finite set of actions A, and bounded real rewards r : A → [0, 1].
- Dataset
- JSON
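A minimal sketch of logged data in this setting follows: contexts, a finite action set, and rewards bounded in [0, 1]. The uniform logging policy, the logistic reward model, the sizes, and the function name `make_logged_bandit_data` are hypothetical choices for illustration; the description above does not specify how the paper generates its data.

```python
import numpy as np

def make_logged_bandit_data(n_rounds=500, n_actions=4, dim=3, seed=0):
    """Hypothetical logger producing (context, action, reward, propensity)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(n_actions, dim))
    contexts = rng.normal(size=(n_rounds, dim))
    # Assumed reward model: Bernoulli with a logistic mean, so rewards lie in [0, 1].
    mean_reward = 1.0 / (1.0 + np.exp(-(contexts @ weights.T)))
    # Assumed logging policy: uniform over the finite action set.
    actions = rng.integers(n_actions, size=n_rounds)
    propensities = np.full(n_rounds, 1.0 / n_actions)
    chosen_mean = mean_reward[np.arange(n_rounds), actions]
    rewards = rng.binomial(1, chosen_mean).astype(float)
    return contexts, actions, rewards, propensities

contexts, actions, rewards, propensities = make_logged_bandit_data()
```

Recording the propensity of each logged action is what lets an off-policy estimator reweight the data toward a different target policy.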

You can also access this registry using the API (see API Docs).

3 datasets found