Visualizing MuZero Models
MuZero is a model-based reinforcement learning algorithm that uses a value-equivalent dynamics model. -
Google Research Football
The Google Research Football environment is a reinforcement learning experimental platform focused on training agents to play football. -
Super Mario Bros
This dataset is used in the Generative Adversarial Exploration for Reinforcement Learning paper. -
Atari 2600
The Atari 2600 dataset consists of 49 games and is used in the paper to test the Successor Uncertainties algorithm. -
CartPole, Pendulum, and LunarLander
The dataset used in the paper is a set of environments for reinforcement learning, including CartPole, Pendulum, and LunarLander. -
dm_control
The dataset used for training agents in the dm_control environments. -
PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning
Patch-based attacks introduce a perceptible but localized change to the input that induces misclassification. A limitation of current patch-based black-box attacks is that they... -
Direct preference optimization: Your language model is secretly a reward model
The dataset used in the paper is not explicitly described. However, it is mentioned that the authors used a language model to optimize the performance of a reinforcement... -
Metadrive: Composing diverse driving scenarios for generalizable reinforcemen...
The dataset used in the paper is Metadrive, a driving simulator. -
LightZero: A unified benchmark for Monte Carlo Tree Search in general sequent...
The dataset used in the paper is not explicitly described. However, it is mentioned that the authors used Atari environments and board games to evaluate the proposed algorithm. -
Markov Decision Process
The setting used in the paper is a Markov Decision Process in which states take values in a state space X; in a state x ∈ X, we can take an action u ∈ U,... -
DeepMind Control Suite
The DeepMind Control Suite is a collection of 20 robotic manipulation tasks, each with 5 different environments and 5 different robot parameters. The tasks are designed to test...