Datasets Activity Stream About Order by Relevance Name Ascending Name Descending Last Modified Go 1 dataset found Filter Results Synthetic Data The dataset used in the paper is a synthetic dataset for off-policy contextual bandits, with contexts x ∈ X, a finite set of actions A, and bounded real rewards r ∈ A → [0, 1]. Dataset JSON