Solving Robust MDPs through No-Regret Dynamics

A robust MDP is a Markov Decision Process in which the goal is to find a policy π that maximizes the value function under worst-case transition dynamics.
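
The objective described above can be written as a max-min problem. The following is a sketch in standard notation, not taken from the listing itself: the uncertainty set \mathcal{P} of transition kernels and the initial-state distribution \rho are assumed symbols, and V^{\pi}_{P}(\rho) stands for the value of policy \pi under transition kernel P.

```latex
\pi^{*} \in \arg\max_{\pi} \; \min_{P \in \mathcal{P}} \; V^{\pi}_{P}(\rho)
```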

Pong Variants

The dataset used in the paper is a set of Pong variants, including Noisy, Black, White, Zoom, and others.

3D Maze Games

The dataset used in the paper is a set of 3D maze games, including Labyrinth and others.

Evolution of Rewards for Food and Motor Action by Simulating Birth and Death

Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback

Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward

We investigate an infinite-horizon average reward Markov Decision Process (MDP) with delayed, composite, and partially anonymous reward feedback.

Double Tunnel

The dataset used in the paper is for the Double Tunnel environment, which is a safety-critical task.

Custom Pong Environment

A new Pong environment with a much higher degree of configurability than the current standard, including the ability to compete against a human opponent.

Guard: A safe reinforcement learning benchmark

The dataset used in the paper is a collection of robot locomotion tasks with various constraints.

State-wise Constrained Policy Optimization

State-wise Constrained Policy Optimization (SCPO) is a general-purpose policy search algorithm for state-wise constrained reinforcement learning.

Pretrained Visual Representations in Reinforcement Learning

Visual reinforcement learning (RL) has made significant progress in recent years, but the choice of visual feature extractor remains a crucial design decision.

DRiLLS: Deep Reinforcement Learning for Logic Synthesis

Logic synthesis requires extensive tuning of the synthesis optimization flow where the quality of results (QoR) depends on the sequence of optimizations used. The authors...

Interactive Scoring IRL

The dataset used in the paper is a set of trajectories and scores provided by human teachers to train a behavioral policy in a sparse reward environment.

MuJoCo Continuous Control Tasks

The dataset used in the paper is a collection of data from the MuJoCo continuous control tasks.

NeoRL

A near real-world benchmark for offline RL, which contains datasets from various domains with controlled sizes, and extra test datasets for policy validation.

Defense Against Reward Poisoning Attacks in Reinforcement Learning

We study defense strategies against reward poisoning attacks in reinforcement learning.

A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Obser...

The proposed network contains clustering layers, based on earlier work by Afshar et al., 2020 and Bethi et al., 2022, with an introduction of TD-error modulation and eligibility...

On the Theory of Reinforcement Learning

The dataset is used to study a theory of reinforcement learning (RL) in which the learner receives binary feedback only once at the end of an episode.

HandManipulateBlock

The HandManipulateBlock environment from the OpenAI Gym robotics suite.

FetchPickAndPlace and HandManipulateBlock

The FetchPickAndPlace and HandManipulateBlock environments from the OpenAI Gym robotics suite.

397 datasets found