- MuJoCo Environment
The dataset used in the paper is a MuJoCo environment with 13 states and 4 control inputs. Its dynamics are nonlinear, with polynomial dependence on the control inputs.
- Corridor Environment
The corridor environment is a simple environment where the agent has to determine whether the rewarding cell (colored yellow) is at the top or bottom, based on the color of the...
- Atari Environment
The dataset used in this work is the set of Atari environments in OpenAI Gym, which are built on the Arcade Learning Environment (ALE).
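
As a rough illustration of what the MuJoCo entry means by nonlinear dynamics with polynomial dependence on the control inputs, the following Python sketch defines a toy system with 13 states and 4 control inputs. It is not the model from the paper: the function names `toy_dynamics` and `euler_step`, the cubic drift term, the coefficients, and the mapping of controls to states are all assumptions chosen only to show the structure.

```python
# Toy illustration only: NOT the dynamics used in the paper.
# The dimensions (13 states, 4 control inputs) come from the description;
# everything else here is a hypothetical choice.
N_STATES = 13
N_CONTROLS = 4


def toy_dynamics(x, u):
    """Return dx/dt for a hypothetical system whose controls enter polynomially."""
    if len(x) != N_STATES or len(u) != N_CONTROLS:
        raise ValueError("expected 13 states and 4 control inputs")
    dx = []
    for i, xi in enumerate(x):
        ui = u[i % N_CONTROLS]  # arbitrary assignment of a control to each state
        drift = -(xi ** 3)  # nonlinear in the state
        control = 0.5 * ui + 0.1 * ui ** 2  # degree-2 polynomial in the control
        dx.append(drift + control)
    return dx


def euler_step(x, u, dt=0.01):
    """Advance the toy system by one explicit Euler step of size dt."""
    return [xi + dt * dxi for xi, dxi in zip(x, toy_dynamics(x, u))]
```

Calling `euler_step([0.0] * 13, [1.0, 0.0, 2.0, -1.0])` advances the toy state by one step. A real MuJoCo model would replace `toy_dynamics` with the simulator's own physics.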