Dataset - LDM

Quantile Off-Policy Evaluation via Deep Conditional Generative Learning

The dataset used in the paper on quantile off-policy evaluation via deep conditional generative learning.
- Dataset
- JSON
Synthetic Data

The dataset used in the paper is a synthetic dataset for off-policy contextual bandits, with contexts x ∈ X, a finite set of actions A, and bounded real rewards r : A → [0, 1].
- Dataset
- JSON
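
To make the Synthetic Data setting concrete, here is a minimal sketch of how a logged off-policy contextual bandit dataset of this shape could be generated. It is not the generator used in the paper. The function name `make_bandit_log`, the record fields, the uniform context distribution, the uniform logging policy, and the context-independent rewards are all assumptions made for illustration.

```python
import random


def make_bandit_log(n_rounds, n_actions, context_dim, seed=0):
    """Generate a toy logged contextual bandit dataset (hypothetical layout)."""
    rng = random.Random(seed)
    log = []
    for _ in range(n_rounds):
        # Context x drawn uniformly from [0, 1]^context_dim (an assumption).
        x = [rng.random() for _ in range(context_dim)]
        # Reward function r : A -> [0, 1], stored as one bounded value per action.
        # Drawn independently of x here, purely to keep the sketch short.
        rewards = [rng.random() for _ in range(n_actions)]
        # Uniform logging policy: every action has propensity 1 / n_actions.
        action = rng.randrange(n_actions)
        log.append({
            "context": x,
            "action": action,
            # Only the reward of the logged action is observed.
            "reward": rewards[action],
            "propensity": 1.0 / n_actions,
        })
    return log


log = make_bandit_log(n_rounds=5, n_actions=3, context_dim=2)
```

Each record stores the logging propensity alongside the observed reward, because off-policy estimators typically need it to reweight logged rewards toward a target policy.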

You can also access this registry using the API (see API Docs).

2 datasets found