Partially-Observable Markov Decision Processes - Groups

Mountain Hike and Varying Mountain Hike

Mountain Hike and Varying Mountain Hike are POMDPs in which the agent must reach the top of a mountain.

Dataset
JSON

T-Maze and Stochastic T-Maze

T-Maze and Stochastic T-Maze are POMDPs in which the agent must find the treasure in a T-shaped maze.

Dataset
JSON

Corridor Environment

The corridor environment is a simple task in which the agent must determine whether the rewarding cell (colored yellow) is at the top or bottom, based on the color of the...

Dataset
JSON

Reconnaissance Blind TicTacToe

The Reconnaissance Blind TicTacToe (RBT) dataset is a variation of the Reconnaissance Blind Chess (RBC) challenge. It is a game of TicTacToe in which the agent cannot see the moves...