- MaxDiff RL dataset
The dataset used in this paper is a collection of experiences of embodied agents in various environments, including a point mass, swimmer, and ant. The dataset is used to...
- Soft Actor-Critic With Integer Actions
Addresses reinforcement learning with integer-valued actions by combining the Soft Actor-Critic (SAC) algorithm with an integer reparameterization.
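This entry does not spell out the integer reparameterization. As a rough illustration only, one way such a scheme could look is a Gumbel-softmax sample that is discretized to a one-hot vector and then mapped to an integer by an inner product with the range 0..K-1. The function name `sample_integer_action`, its parameters, and the choice of Gumbel-softmax are assumptions made for this sketch, not details taken from the entry. Only the forward pass is shown; gradient handling would need an autodiff framework.

```python
import math
import random


def sample_integer_action(logits, temperature=1.0, rng=random):
    """Hypothetical sketch: Gumbel-softmax sample mapped to an integer action."""
    # Perturb each logit with Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
    noisy = []
    for logit in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # clamp so log() stays finite
        noisy.append((logit - math.log(-math.log(u))) / temperature)

    # Numerically stable softmax gives the relaxed (continuous) one-hot vector.
    top = max(noisy)
    exps = [math.exp(x - top) for x in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]

    # Discretize to a hard one-hot vector (assumption: argmax of the relaxed sample).
    k = max(range(len(soft)), key=soft.__getitem__)
    hard = [1 if i == k else 0 for i in range(len(soft))]

    # Inner product with [0, 1, ..., K-1] turns the one-hot vector into an integer.
    action = sum(i * h for i, h in enumerate(hard))
    return action, soft


if __name__ == "__main__":
    rng = random.Random(0)
    action, soft = sample_integer_action([0.1, 2.0, -1.0, 0.5], temperature=0.5, rng=rng)
    print(action, [round(p, 3) for p in soft])
```

Lower temperatures push the relaxed vector closer to one-hot, so the discretization step changes the sample less.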