3 datasets found

Tags: reward function

  • Atari 2600 Game

    The dataset used in the paper is an Atari 2600 game, where the agent receives a reward of 1 when a point is scored and 0 otherwise.
  • Gridworld Environment

    The dataset used in the paper is a gridworld environment, where an agent attempts to navigate to a goal block. Observations are 11x11 greyscale images, and the agent receives...
  • D4RL Benchmark

    The D4RL benchmark dataset, which consists of four offline logged datasets collected by different behavior policies, either a single policy or a mixture of policies.
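
The binary reward described in the Atari 2600 entry above can be illustrated with a minimal sketch. It assumes a point is detected by comparing the game score before and after a step; the function name `point_scored_reward` and its parameters are hypothetical and do not come from the paper or from any Atari library.

```python
def point_scored_reward(prev_score: int, score: int) -> int:
    """Return 1 if a point was scored during the step, 0 otherwise.

    Assumption: "a point is scored" means the game score increased
    between two consecutive steps.
    """
    return 1 if score > prev_score else 0


# Hypothetical score trace over five steps of one episode.
scores = [0, 0, 1, 1, 2]
rewards = [point_scored_reward(a, b) for a, b in zip(scores, scores[1:])]
```

Under this assumption the reward is sparse: it is nonzero only on the steps where the score changes.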
You can also access this registry using the API (see API Docs).