Grid World Navigation Task

The dataset used in the paper is a grid world navigation task with four actions: up, down, left, and right. Transitions are stochastic: with 5% probability, the agent moves sideways. The agent receives a reward of 1 for entering the goal cell (a terminal state) in the top-right corner and a reward of 0 otherwise.
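
The task described above can be sketched as a small environment. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions not stated in the description: the grid size (a hypothetical 10 x 10 default), that a sideways move picks one of the two perpendicular directions uniformly, that moving into a wall leaves the agent in place, and that row 0 is the top row. The names `GridWorld`, `step`, and `slip_prob` are hypothetical.

```python
import random

# (row, col) offsets per action; assumes row 0 is the top row of the grid.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
# Directions perpendicular ("sideways") to each action.
SIDEWAYS = {"up": ("left", "right"), "down": ("left", "right"),
            "left": ("up", "down"), "right": ("up", "down")}


class GridWorld:
    def __init__(self, height=10, width=10, slip_prob=0.05, seed=None):
        # height and width are hypothetical; the description gives no grid size.
        self.height = height
        self.width = width
        self.slip_prob = slip_prob
        self.goal = (0, width - 1)  # goal cell in the top-right corner
        self.rng = random.Random(seed)

    def step(self, state, action):
        """Return (next_state, reward, done) for one stochastic transition."""
        if self.rng.random() < self.slip_prob:
            # Assumption: a slip picks either perpendicular direction uniformly.
            action = self.rng.choice(SIDEWAYS[action])
        d_row, d_col = ACTIONS[action]
        # Assumption: moving into a wall leaves the agent where it is.
        row = min(max(state[0] + d_row, 0), self.height - 1)
        col = min(max(state[1] + d_col, 0), self.width - 1)
        next_state = (row, col)
        done = next_state == self.goal  # entering the goal ends the episode
        reward = 1.0 if done else 0.0
        return next_state, reward, done
```

Setting `slip_prob=0.0` makes the transitions deterministic, which can be convenient for checking the reward and boundary logic before sampling stochastic trajectories.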

Cite this as

Lucas Lehnert, Stefanie Tellex, Michael L. Littman (2024). Dataset: Grid World Navigation Task. https://doi.org/10.57702/nugf8iwc

DOI retrieved: December 16, 2024

Additional Info

Field         Value
Created       December 16, 2024
Last update   December 16, 2024
Defined In    https://doi.org/10.48550/arXiv.1708.00102
Author        Lucas Lehnert
More Authors  Stefanie Tellex, Michael L. Littman
Homepage      https://arxiv.org/abs/1606.05312