-
CartPole and Blackjack environments
The paper uses the CartPole and Blackjack environments from OpenAI Gym. -
PyBulletGym tasks
The dataset used in the paper is a collection of experiences sampled from a replay buffer, used to train and evaluate the proposed Multi-step DDPG (MDDPG) and Mixed Multi-step... -
Visual CartPole
A visual version of the CartPole environment from OpenAI Gym. -
Contra State Dataset
The dataset used in the paper is a collection of instruction sets and states for the Contra game, used to train a language model and a reinforcement learning policy. -
Contra Instruction Dataset
The dataset used in the paper is a collection of instruction sets and states for the Contra game, used to train a language model and a reinforcement learning policy. -
Contra Dataset
The dataset used in the paper is a collection of instruction sets and states for the Contra game, used to train a language model and a reinforcement learning policy. -
Reinforcement Re-ranking with 2D Grid-based Recommendation Panels
A novel Markov decision process (MDP)-based re-ranking model for final-stage recommendation, called Panel-MDP. -
Binary Tree MDP
The paper uses a binary tree MDP in which the agent must execute a sequence of L uninterrupted UP movements. The MDP is used to test the Successor... -
Procgen Dataset
A dataset of procedurally generated environments, used in the experiments. -
Rainbow dataset
The dataset used in the paper is the Rainbow dataset. Rainbow itself is an agent that combines six extensions to the DQN algorithm. -
City Brain Challenge dataset
The dataset used in the City Brain Challenge competition, containing a real-world city-scale road network and its traffic demand derived from real traffic data. -
Cart-Pole Problem
The cart-pole problem is a classic benchmark in robotics and control theory. It is a continuous control problem in which the goal is to keep the pole upright by applying a... -
Generalization in Deep Reinforcement Learning for Robotic Navigation by Rewar...
A novel reward function for reinforcement learning and a Soft Actor-Critic algorithm to train a DRL policy in the context of local navigation for autonomous mobile robots in... -
Dynamic Frame Skip Deep Q-Network (DFDQN) dataset
The DFDQN dataset used in the paper consists of three Atari games: Seaquest, Space Invaders, and Alien. -
Deep Q-Network (DQN) dataset
The DQN dataset used in the paper consists of 15 classic Atari games. -
SCIMAI-Gym
The SCIM environment proposed in this paper is a stochastic and divergent two-echelon supply chain that includes a factory that can produce various product types, a factory...
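
Illustration for the Binary Tree MDP entry above: the listing only states that the agent must execute L uninterrupted UP movements, so the following is a minimal sketch under stated assumptions rather than the paper's environment. It collapses the tree into a single streak counter. The class name `UpChainMDP`, the action encoding `UP`/`DOWN`, the reset-on-interruption rule, the reward of 1.0, and termination on success are all hypothetical choices made for this example.

```python
from dataclasses import dataclass

# Hypothetical action encoding; the listing does not specify one.
UP, DOWN = 0, 1


@dataclass
class UpChainMDP:
    """Toy MDP that pays off only after `length` consecutive UP moves.

    Assumptions (not taken from the listing): any non-UP action resets
    the streak to zero, success yields reward 1.0, and success ends
    the episode.
    """

    length: int
    progress: int = 0

    def reset(self) -> int:
        """Start a new episode and return the initial state (streak 0)."""
        self.progress = 0
        return self.progress

    def step(self, action: int):
        """Apply one action and return (state, reward, done)."""
        if action == UP:
            self.progress += 1
        else:
            # Interruption: the streak of UP moves is lost (assumed rule).
            self.progress = 0
        done = self.progress >= self.length
        reward = 1.0 if done else 0.0
        return self.progress, reward, done


if __name__ == "__main__":
    env = UpChainMDP(length=3)
    env.reset()
    for action in (UP, DOWN, UP, UP, UP):
        print(env.step(action))
```

Under these assumptions, a uniformly random policy reaches the reward with probability 2^-L per attempt, which is why a task of this shape is a natural stress test for exploration methods.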