-
CartPole, Pendulum, and LunarLander
Rather than a fixed dataset, the paper uses a set of reinforcement learning environments, including CartPole, Pendulum, and LunarLander.
-
Pendulum and Reacher
In the Pendulum swing-up task, the agent tries to keep the pendulum upright and balanced while staying away from unsafe angles. In the Reacher task, the robotic...