-
Deterministic Policy Gradients With General State Transitions
The authors ran their experiments on the ComplexPoint-v0, Pendulum-v0, LunarLanderContinuous-v2, Swimmer-v2, HalfCheetah-v2, HumanoidStandup-v2, and Humanoid-v2 environments. -
Super Mario Bros
This is the dataset used in the Generative Adversarial Exploration for Reinforcement Learning paper. -
Atari 2600
The paper uses the Atari 2600 dataset, which consists of 49 games, to test the Successor Uncertainties algorithm. -
CartPole, Pendulum, and LunarLander
The dataset used in the paper is a set of environments for reinforcement learning, including CartPole, Pendulum, and LunarLander. -
Autohedger dataset
The dataset is used to train and test the autohedger model. -
Deep Attention Recurrent Q-Network
The Deep Attention Recurrent Q-Network (DARQN) algorithm was tested on several popular Atari 2600 games: Breakout, Seaquest, Space Invaders, Tutankham, and Gopher. -
Direct preference optimization: Your language model is secretly a reward model
The dataset used in the paper is not explicitly described. However, it is mentioned that the authors used a language model to optimize the performance of a reinforcement... -
MetaDrive: Composing diverse driving scenarios for generalizable reinforcemen...
The dataset used in the paper is MetaDrive, a driving simulator. -
LightZero: A unified benchmark for Monte Carlo Tree Search in general sequent...
The dataset used in the paper is not explicitly described. However, it is mentioned that the authors used Atari environments and board games to evaluate the proposed algorithm. -
Distributional Reinforcement Learning with Quantile Regression
Distributional reinforcement learning with quantile regression -
Markov Decision Process
The dataset used in the paper is a Markov Decision Process in which states take values in a state space X; in each state x ∈ X, we can take an action u ∈ U,... -
Meta-World and Robomimic
The dataset used in the paper is a robotic manipulation task dataset, which consists of trajectories and preference labels. -
DeepMind Control Suite
The DeepMind Control Suite is a collection of 20 robotic manipulation tasks, each with 5 different environments and 5 different robot parameters. The tasks are designed to test... -
BBRL Activations Dataset
The dataset used in the paper is a collection of activations from a feature extraction network and a reactive network, used to train a Variational Autoencoder (VAE) to learn... -
Deep Reinforcement Learning Based Controller for Active Heave Compensation
Heave compensation is an essential part of various offshore operations. It is used in applications that include on-loading or off-loading systems, offshore drilling,...