Markov Decision Processes with Reachability Characterization

The dataset used in the paper is a Markov Decision Process (MDP), defined by a set of states, a set of actions, transition probabilities, and rewards.
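
As a rough illustration of the four components named above, the following Python sketch stores an MDP in plain dictionaries and computes which states can be reached from a start state with positive probability. Everything in it is an assumption made for illustration: the `MDP` class, the `reachable_states` helper, the dictionary layout, and the toy values are hypothetical and do not reflect the dataset's actual file format or the paper's exact notion of reachability.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MDP:
    """Hypothetical in-memory MDP layout (not the dataset's real schema)."""
    states: set
    actions: set
    transitions: dict  # (state, action) -> {next_state: probability}
    rewards: dict      # (state, action) -> float

    def validate(self, tol=1e-9):
        # Each defined (state, action) pair must map to a probability distribution.
        for (state, action), dist in self.transitions.items():
            total = sum(dist.values())
            if abs(total - 1.0) > tol:
                raise ValueError(
                    f"probabilities for {(state, action)} sum to {total}"
                )


def reachable_states(mdp, start):
    """Breadth-first search over transitions with positive probability."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for action in mdp.actions:
            for nxt, prob in mdp.transitions.get((state, action), {}).items():
                if prob > 0 and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


# Toy example with made-up values: s2 can reach s0, but nothing reaches s2.
toy = MDP(
    states={"s0", "s1", "s2"},
    actions={"a", "b"},
    transitions={
        ("s0", "a"): {"s0": 0.5, "s1": 0.5},
        ("s0", "b"): {"s0": 1.0},
        ("s1", "a"): {"s1": 1.0},
        ("s2", "a"): {"s0": 1.0},
    },
    rewards={("s0", "a"): 0.0, ("s1", "a"): 1.0},
)
toy.validate()
print(sorted(reachable_states(toy, "s0")))
```

Treating reachability as graph search over positive-probability transitions is only one possible reading; a quantitative variant would instead compute the probability of eventually reaching a target set.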

BibTeX: