-
Synthetic Data
The dataset used in the paper is a synthetic dataset for off-policy contextual bandits, with contexts x ∈ X, a finite set of actions A, and bounded real rewards r ∈ A → [0, 1]. -
Synthetic Dataset
The dataset used in this work is a custom synthetic dataset generated using the liquid-dsp library, containing 600000 examples of each of 13.8 million examples, with SNRs...