-
Newsvendor
The Newsvendor problem is a classic inventory management problem: determine the optimal order quantity to satisfy uncertain demand. -
ORL: Reinforcement Learning Benchmarks for Online Stochastic Optimization Pro...
Reinforcement Learning (RL) has achieved state-of-the-art results in domains such as robotics and games. We build on this previous work by applying RL algorithms to a selection... -
Comparison Datasets for Imitation Learning and Reinforcement Learning
Comparison datasets for Imitation Learning and Reinforcement Learning. -
RILe: Reinforced Imitation Learning
The Reinforced Imitation Learning (RILe) dataset consists of expert demonstrations and noisy expert data. -
RL Unplugged
RL Unplugged is a benchmark dataset for offline reinforcement learning, consisting of 20 tasks with varying difficulty levels. -
Explore2Offline
The dataset used in the paper for offline reinforcement learning consists of task-agnostic exploration data collected via curiosity-based intrinsic motivation. -
Multiple Domain Cyberspace Attack and Defense Game Model
The dataset used in the paper is a multiple domain cyberspace attack and defense game model based on reinforcement learning. -
Prioritized Sequence Experience Replay
Prioritized Sequence Experience Replay (PSER) is a framework for prioritizing sequences of transitions so that learning is both more efficient and more effective. -
TorchCraft
The TorchCraft dataset is a collection of games played by a reinforcement learning agent; it can be used to train and evaluate reinforcement learning algorithms. -
Bootstrapped DQN
The Bootstrapped DQN dataset is a collection of 49 Atari games. -
Incentivizing Exploration in Atari
The Incentivizing Exploration in Atari dataset is a collection of 49 Atari games. -
Arcade Learning Environment
The Arcade Learning Environment (ALE) dataset is a collection of 49 Atari games. -
Grid-world environment
The dataset used in the paper is a grid-world environment, which is a discrete MDP. The environment has four walls, some obstacles, a start-state and a reward-state. The goal of... -
New York Road Network
The dataset used in the paper is a real-world traffic signal control dataset, which includes 48 intersections in the New York road network. -
Jinan Road Network
The dataset used in the paper is a real-world traffic signal control dataset, which includes 12 intersections in the Jinan road network. -
Shenzhen Road Network
The dataset used in the paper is a real-world traffic signal control dataset, which includes 33 traffic signals in the Shenzhen road network. -
Transactions on Machine Learning Research
The dataset used in the paper "Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning".