Reinforcement Learning - Groups

Pong, Breakout, Space Invaders, and Seaquest games

The datasets used in the paper are the games Pong, Breakout, Space Invaders, and Seaquest.

Atari 2600 Arcade Learning Environment

The dataset used in the paper is the Atari 2600 Arcade Learning Environment.

Minigrid environment

The dataset used in the paper is the Minigrid environment, which is a 2D grid world with a goal at the bottom-right corner. The agent learns to navigate to the goal using human...

LunarLander environment

The dataset used in the paper is the LunarLander environment, which is a classic control problem. The agent learns to land a lunar lander using human feedback.

CartPole environment

The dataset used in the paper is the CartPole environment, which is a classic control problem. The agent learns to balance a pole using human feedback.

Efficient Reinforcement Learning in Deterministic Systems

The dataset is used to test the Optimistic Constraint Propagation algorithm for reinforcement learning in deterministic systems.

Treasure World

The Treasure World domain is a 3D navigation domain within the DM Lab framework. The domain consists of one large room filled with 64 objects of multiple types. Whenever an...

Playing Catan with Cross-dimensional Neural Network

Catan is a strategic board game with many interesting properties, including multi-player, imperfect information, stochasticity, a complex state space structure (hexagonal board...

Crippled-Ant Environment

The Crippled-Ant Environment is a high-dimensional continuous control environment, where a quadruped aims to attain the highest possible velocity in a limited amount of time.

PolicyCleanse: Backdoor Detection and Mitigation for Reinforcement Learning

ResAct: Reinforcing Long-Term Engagement

Long-term engagement is preferred over immediate engagement in sequential recommendation as it directly affects product operational metrics such as daily active users (DAUs) and...

Arcade Learning Environment (ALE) and Gym MuJoCo benchmark

The datasets used in the paper are the Arcade Learning Environment (ALE) and the Gym MuJoCo benchmark.

Atari RAM Games

The dataset is used to demonstrate the effectiveness of the Discovery of Deep Options (DDO) algorithm in accelerating reinforcement learning.

GridWorld

The dataset is used to demonstrate the effectiveness of the Discovery of Deep Options (DDO) algorithm in accelerating reinforcement learning.

A Relearning Approach to Reinforcement Learning for control of Smart Buildings

This paper demonstrates that continual relearning of control policies using incremental deep reinforcement learning can improve policy learning for non-stationary processes.

Mol-AIR: A Molecular Optimization Framework with Adaptive Intrinsic Rewards

Mol-AIR is a molecular optimization framework with adaptive intrinsic rewards that performs efficient exploration for effective goal-directed molecular generation.

Introspection Learning Dataset

The dataset used in the Introspection Learning algorithm consists of a family of subsets of state-action pairs (U_i)_i, which are used to query the oracle ω_π.

Reinforcement Learning from Human Feedback with Active Queries

Aligning large language models (LLMs) with human preference plays a key role in building modern generative models and can be achieved by reinforcement learning from human...

JIGSAWS

The dataset is used to demonstrate the effectiveness of the Discovery of Deep Options (DDO) algorithm in accelerating reinforcement learning.

PASA: Probabilistic Adaptive State Aggregation

The dataset used in the paper is a state aggregation approximation architecture, which is adapted using feedback regarding the frequency with which an agent has visited certain...

397 datasets found