-
Interactive Scoring IRL
The dataset used in the paper is a set of trajectories and scores provided by human teachers to train a behavioral policy in a sparse reward environment.
-
Inverted Pendulum
The dataset used in the paper is an Inverted Pendulum dataset; the inverted pendulum is a standard benchmark system in control and reinforcement learning.