- Grid World
The dataset used in the paper is a reinforcement learning dataset, specifically a Markov Decision Process (MDP) with a finite set of states and actions.
- Forest management problem
The dataset used in this paper is a forest management problem, where the objective is to maintain an old forest for wildlife and make money by selling the cut wood.
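
To make the trade-off between keeping the forest and selling wood concrete, the problem can be written as a small finite MDP and solved with value iteration. The sketch below is illustrative only. The source gives no states, probabilities, or rewards, so the three age classes, the fire probability, the reward values, and the discount factor are all assumptions, loosely following the common toy formulation of this problem. The function names are hypothetical as well.

```python
# Minimal sketch of a forest management MDP solved by value iteration.
# Every number below is an ASSUMPTION for illustration; none comes from the source.
N_STATES = 3            # forest age classes: 0 = young ... 2 = old
WAIT, CUT = 0, 1        # the two actions available each year
FIRE_PROB = 0.1         # assumed chance that a fire resets the forest to state 0
REWARD_KEEP_OLD = 4.0   # assumed wildlife value of waiting in the oldest state
REWARD_CUT_OLD = 2.0    # assumed income from cutting the oldest forest
REWARD_CUT_MID = 1.0    # assumed income from cutting a mid-age forest
DISCOUNT = 0.9          # assumed discount factor


def transitions(state, action):
    """Return a list of (probability, next_state) pairs."""
    if action == CUT:
        return [(1.0, 0)]  # cutting always resets the forest to the youngest state
    older = min(state + 1, N_STATES - 1)
    return [(FIRE_PROB, 0), (1.0 - FIRE_PROB, older)]


def reward(state, action):
    """Immediate reward for taking `action` in `state`."""
    oldest = N_STATES - 1
    if action == WAIT:
        return REWARD_KEEP_OLD if state == oldest else 0.0
    if state == oldest:
        return REWARD_CUT_OLD
    return REWARD_CUT_MID if state > 0 else 0.0


def q_value(state, action, values):
    """Expected discounted return of one action, given current value estimates."""
    expected_next = sum(p * values[s2] for p, s2 in transitions(state, action))
    return reward(state, action) + DISCOUNT * expected_next


def value_iteration(tol=1e-8, max_iter=10_000):
    """Iterate the Bellman optimality update until the values stop changing."""
    values = [0.0] * N_STATES
    for _ in range(max_iter):
        new_values = [
            max(q_value(s, a, values) for a in (WAIT, CUT))
            for s in range(N_STATES)
        ]
        delta = max(abs(n - v) for n, v in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = [
        max((WAIT, CUT), key=lambda a: q_value(s, a, values))
        for s in range(N_STATES)
    ]
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print("state values:", [round(v, 3) for v in values])
    print("policy (0 = wait, 1 = cut):", policy)
```

With these assumed numbers, waiting comes out as the best action in every state, because the wildlife reward for an old forest outweighs the one-off income from cutting. Raising the fire probability or the cutting rewards can change that, so the result says nothing about the paper's actual setup.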