Cart-pole problem dataset

The dataset used for the cart-pole problem is specified as a Markov decision process (MDP): a finite set of states S, a finite set of actions A, a state transition probability matrix P, a reward function R, and a discount factor γ.
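As a rough illustration of how these five components fit together, the sketch below holds them in a small Python container and checks that each row of P is a probability distribution. The class name `MDP`, the helper `discounted_return`, the two-state discretization, and every number in the toy instance are assumptions made for illustration; none of them are taken from the dataset itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MDP:
    """Hypothetical container for the tuple (S, A, P, R, gamma)."""
    states: List[str]                           # S: finite set of states
    actions: List[str]                          # A: finite set of actions
    P: Dict[Tuple[str, str], Dict[str, float]]  # P[(s, a)][s'] = Pr(s' | s, a)
    R: Dict[Tuple[str, str], float]             # R[(s, a)] = expected reward
    gamma: float                                # discount factor

    def validate(self) -> None:
        """Raise ValueError if gamma or any transition row is malformed."""
        if not 0.0 <= self.gamma <= 1.0:
            raise ValueError("gamma must lie in [0, 1]")
        for s in self.states:
            for a in self.actions:
                total = sum(self.P[(s, a)].values())
                if abs(total - 1.0) > 1e-9:
                    raise ValueError(f"P[{s!r}, {a!r}] sums to {total}, not 1")


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Toy instance with made-up values: the pole is either balanced or fallen,
# and "fallen" is absorbing. These numbers are NOT from the dataset.
toy = MDP(
    states=["balanced", "fallen"],
    actions=["left", "right"],
    P={
        ("balanced", "left"): {"balanced": 0.9, "fallen": 0.1},
        ("balanced", "right"): {"balanced": 0.9, "fallen": 0.1},
        ("fallen", "left"): {"fallen": 1.0},
        ("fallen", "right"): {"fallen": 1.0},
    },
    R={
        ("balanced", "left"): 1.0,
        ("balanced", "right"): 1.0,
        ("fallen", "left"): 0.0,
        ("fallen", "right"): 0.0,
    },
    gamma=0.99,
)
toy.validate()
```

Storing P as a dictionary keyed by (state, action) keeps the sketch dependency-free; a dense array indexed by action, state, and next state would be an equally reasonable choice when the state set is large.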

BibTeX: