Dataset - LDM

MetaWorld

The dataset is used to train the Make-An-Agent model, a novel policy parameter generator that leverages the power of conditional diffusion models for behavior-to-policy generation.
- Dataset
- JSON
Bipedal Walker, Acrobot, and Continuous Lunar Lander tasks

The datasets used in this paper are reinforcement learning benchmark tasks: Bipedal Walker, Acrobot, and Continuous Lunar Lander.
- Dataset
- JSON
BRIDGE dataset

The BRIDGE dataset is a collection of 155 deterministic MDPs, each with a horizon of 100 time steps. The dataset is used to evaluate the performance of reinforcement learning...
- Dataset
- JSON
DeepMind Control Suite and PyBullet Environments

The datasets used in this paper are the DeepMind Control Suite and the PyBullet environments.
- Dataset
- JSON
The Arcade Learning Environment: An Evaluation Platform for General Agents

The Arcade Learning Environment (ALE) is a lasting and indispensable element of the RL researcher’s toolbox. It is also the focus of our work. Since its inception, hundreds of...
- Dataset
- JSON
Visual Grid World Environment and TextWorld domain

The datasets used in the paper are a Visual Grid World environment and the TextWorld domain.
- Dataset
- JSON
Relay Policy Learning

Relay policy learning: Solving long-horizon tasks via imitation and reinforcement learning.
- Dataset
- JSON
Reinforcement Learning for (Mixed) Integer Programming: Smart Feasibility Pump

Mixed integer programming (MIP) problems with a linear objective, linear constraints, and integral constraints.
- Dataset
- JSON
Fine-tuning Language Models with Advantage-Induced Policy Alignment

The datasets used in the paper are the Anthropic Helpfulness and Harmlessness dataset and the StackExchange dataset.
- Dataset
- JSON
MuJoCo Environments with Noise Augmentation

The dataset used in the paper is a set of MuJoCo environments with noise augmentation.
- Dataset
- JSON
DMLab-30

DMLab-30 is a benchmark for multitask reinforcement learning in partially observable environments.
- Dataset
- JSON
Car Racing game dataset

The dataset used in this paper is the Car Racing game dataset, which consists of pixel frames of a car racing game.
- Dataset
- JSON
OpenAI Gym Environment dataset

The dataset used in this paper is the OpenAI Gym Environment dataset, which consists of various games and environments.
- Dataset
- JSON
Atari 2600 games dataset

The dataset used in this paper is the Atari 2600 games dataset, which consists of 50 Atari 2600 games.
- Dataset
- JSON
CodeContest

This dataset is used in the paper for training and testing the DPO and PPO models.
- Dataset
- JSON
APPS

This dataset is used in the paper for training and testing the DPO and PPO models.
- Dataset
- JSON
HH-RLHF

The HH-RLHF dataset is a human preference dataset for reinforcement learning from human feedback.
- Dataset
- JSON
SafeRLHF

This dataset is used in the paper for training and testing the DPO and PPO models.
- Dataset
- JSON
Cambridge restaurant domain

The dataset used in the paper is the Cambridge restaurant domain from the PyDial toolkit.
- Dataset
- JSON
ProcGen

The dataset used in the paper is ProcGen, a suite of procedurally generated environments.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
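
As a rough illustration of API access, the sketch below builds a search URL and reads titles out of a decoded response. It assumes the registry is backed by CKAN and exposes the standard `package_search` action, which the page does not confirm. The base URL `https://registry.example.org` and the helper names `build_search_url` and `dataset_titles` are hypothetical; consult the API Docs for the real endpoint.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the registry's real address.
BASE_URL = "https://registry.example.org"


def build_search_url(query, rows=20, start=0, base_url=BASE_URL):
    """Build a CKAN-style package_search URL (assumes a CKAN backend)."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Return dataset titles from a decoded package_search JSON response."""
    if not response.get("success"):
        return []
    # Fall back to the machine name when an entry has no title.
    return [pkg.get("title", pkg.get("name", ""))
            for pkg in response["result"]["results"]]


url = build_search_url("reinforcement learning", rows=5)
```

The URL can then be fetched with any HTTP client and the JSON body passed to `dataset_titles`; the `rows` and `start` parameters page through the full result set.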

198 datasets found