- BRIDGE dataset
The BRIDGE dataset is a collection of 155 deterministic MDPs, each with a horizon of 100 time steps. The dataset is used to evaluate the performance of reinforcement learning...
- Four Rooms
The Four Rooms environment is a stochastic version of the classic Four Rooms gridworld. The environment has 104 states and 4 actions, and the agent can move in any of the 4...
- Markov Decision Process
The dataset used in the paper is a Markov Decision Process in which the state x takes values in a state space X; in each state x ∈ X, we can take an action u ∈ U,...
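
The state and action sets described in these entries can be made concrete with a small sketch. The Python below builds a Four Rooms gridworld as an MDP with state space `X` and action space `U`. The 13x13 wall layout is an assumption (it is the layout commonly used in the reinforcement learning literature, chosen here because it yields 104 free cells); the entries above give only the state and action counts. The `slip` parameter is a hypothetical way to model stochastic moves and is not taken from the source.

```python
import random

# Assumed layout: '#' is a wall, '.' is a free cell.
# The source states only "104 states and 4 actions"; this grid is one layout that matches.
LAYOUT = [
    "#############",
    "#.....#.....#",
    "#.....#.....#",
    "#...........#",
    "#.....#.....#",
    "#.....#.....#",
    "##.####.....#",
    "#.....###.###",
    "#.....#.....#",
    "#.....#.....#",
    "#...........#",
    "#.....#.....#",
    "#############",
]

# State space X: every free cell, identified by (row, column).
X = [(r, c) for r, row in enumerate(LAYOUT) for c, ch in enumerate(row) if ch == "."]

# Action space U: four moves, each mapped to a (row, column) offset.
U = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(x, u, slip=0.0, rng=random):
    """Sample the next state from state x under action u.

    With probability `slip` (hypothetical parameter) the chosen action is
    replaced by a uniformly random one. Moving into a wall leaves x unchanged.
    """
    if rng.random() < slip:
        u = rng.choice(list(U))
    dr, dc = U[u]
    nxt = (x[0] + dr, x[1] + dc)
    return nxt if LAYOUT[nxt[0]][nxt[1]] == "." else x


if __name__ == "__main__":
    print(len(X), len(U))
    print(step((1, 1), "right"))
```

With `slip=0.0` the transitions are deterministic, which is the setting the BRIDGE entry describes; a positive `slip` gives one possible stochastic variant.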