Deterministic Policy Gradients With General State Transitions

The authors ran experiments on the following environments: ComplexPoint-v0, Pendulum-v0, LunarLanderContinuous-v2, Swimmer-v2, HalfCheetah-v2, HumanoidStandup-v2, and Humanoid-v2.

BibTeX: