The dataset used in this paper is the Grid-world task, a simple grid-based environment used to evaluate the performance of the Self-correcting Q-learning algorithm.
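To make "grid-based environment" concrete, the following is a minimal sketch of what such a task might look like. The description above does not specify the grid size, start and goal cells, or reward values, so everything here (the `GridWorld` class, the 3x3 layout, the step cost of -1 and the goal reward of 10) is a hypothetical assumption for illustration, not the paper's actual setup. It covers only the environment, not the Self-correcting Q-learning algorithm.

```python
from dataclasses import dataclass

# Hypothetical action set: four moves on a 2-D grid, given as (row, col) offsets.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


@dataclass
class GridWorld:
    """Illustrative grid world; all defaults are assumptions, not the paper's values."""
    rows: int = 3
    cols: int = 3
    start: tuple = (0, 0)
    goal: tuple = (2, 2)
    step_reward: float = -1.0   # assumed cost for each non-terminal move
    goal_reward: float = 10.0   # assumed reward for reaching the goal

    def reset(self):
        """Place the agent at the start cell and return that state."""
        self.state = self.start
        return self.state

    def step(self, action):
        """Apply one move; moves that would leave the grid keep the agent at the edge."""
        dr, dc = ACTIONS[action]
        r = min(max(self.state[0] + dr, 0), self.rows - 1)
        c = min(max(self.state[1] + dc, 0), self.cols - 1)
        self.state = (r, c)
        done = self.state == self.goal
        reward = self.goal_reward if done else self.step_reward
        return self.state, reward, done


if __name__ == "__main__":
    env = GridWorld()
    env.reset()
    total = 0.0
    for action in ["right", "right", "down", "down"]:
        state, reward, done = env.step(action)
        total += reward
    print(state, total, done)
```

A learning agent would call `reset()` at the start of each episode and `step()` once per move, updating its value estimates from the returned reward.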