-
Correctional Learning Dataset
The dataset used in the paper is a discrete system with a limited budget for correctional learning.
-
Grid World
The dataset used in the paper is a reinforcement learning environment modeled as a Markov Decision Process (MDP) with a finite set of states and actions.
-
Stochastic MDP
The dataset used in this paper is a stochastic MDP with |S| = 4 and |A| = 4. One of the states is set to the terminal state, and one of the rest is set to the starting state....
-
A Deep Reinforcement Learning Approach for Online Parcel Assignment
The online parcel assignment problem, which is aimed at assigning each incoming parcel to a candidate route for delivery, in order to minimize the total cost under consideration...
-
Mountain Car
The dataset used in the paper is a reinforcement learning environment modeled as a Markov Decision Process (MDP) with a finite set of states and actions.
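
To make the Stochastic MDP entry above more concrete, here is a minimal Python sketch of a tabular MDP with |S| = 4 and |A| = 4, one terminal state, and a starting state chosen from the remaining states. Only those sizes and the terminal/start setup come from the entry. The randomly drawn transition probabilities, the absorbing terminal state, the reward of 1 for reaching it, and the names make_stochastic_mdp and rollout are illustrative assumptions, not details taken from the paper.

```python
import random

N_STATES = 4   # |S| = 4, as stated in the entry
N_ACTIONS = 4  # |A| = 4, as stated in the entry


def make_stochastic_mdp(seed=0):
    """Build a random tabular MDP (hypothetical helper; values are illustrative)."""
    rng = random.Random(seed)
    terminal = rng.randrange(N_STATES)
    # The start state is drawn from the remaining non-terminal states.
    start = rng.choice([s for s in range(N_STATES) if s != terminal])

    # P[s][a] is a probability distribution over next states.
    P = []
    for s in range(N_STATES):
        rows = []
        for a in range(N_ACTIONS):
            if s == terminal:
                # Assumption: the terminal state is absorbing.
                row = [1.0 if s2 == terminal else 0.0 for s2 in range(N_STATES)]
            else:
                # Assumption: transition probabilities are random and normalized.
                weights = [rng.random() for _ in range(N_STATES)]
                total = sum(weights)
                row = [w / total for w in weights]
            rows.append(row)
        P.append(rows)

    # Assumption: reward 1 for entering the terminal state, 0 otherwise.
    def reward(s, a, s_next):
        return 1.0 if s_next == terminal and s != terminal else 0.0

    return P, reward, start, terminal


def rollout(P, reward, start, terminal, max_steps=50, seed=1):
    """Follow a uniformly random policy from the start state until termination."""
    rng = random.Random(seed)
    state, total_reward, steps = start, 0.0, 0
    while state != terminal and steps < max_steps:
        action = rng.randrange(N_ACTIONS)
        # Sample the next state from the distribution P[state][action].
        next_state = rng.choices(range(N_STATES), weights=P[state][action])[0]
        total_reward += reward(state, action, next_state)
        state = next_state
        steps += 1
    return total_reward, steps


P, reward, start, terminal = make_stochastic_mdp()
total_reward, steps = rollout(P, reward, start, terminal)
print(f"start={start}, terminal={terminal}, steps={steps}, return={total_reward}")
```

Storing the transitions as a nested list P[s][a] keeps the sketch dependency-free. A NumPy array of shape (4, 4, 4) would be the more usual choice for larger state spaces.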