-
Stochastic MDP
The dataset used in this paper is a stochastic MDP with |S| = 4 and |A| = 4. One of the states is set to the terminal state, and one of the rest is set to the starting state.... -
A Deep Reinforcement Learning Approach for Online Parcel Assignment
The online parcel assignment problem, which is aimed at assigning each incoming parcel to a candidate route for delivery, in order to minimize the total cost under consideration... -
Forest management problem
The dataset used in this paper is a forest management problem, where the objective is to maintain an old forest for wildlife and make money by selling the cut wood. -
Mountain Car
The dataset used in the paper is a reinforcement learning dataset, specifically a Markov Decision Process (MDP) with a finite set of states and actions.