Interactive Scoring IRL

The dataset used in the paper is a set of trajectories and scores provided by human teachers, used to train a behavioral policy in a sparse-reward environment.

BibTeX: