DEFENDER: DTW-Based Episode Filtering Using Demonstrations for Enhancing RL S...
The dataset used in this paper is a set of reinforcement learning demonstrations containing both safe and unsafe trajectories.
PKU-SafeRLHF dataset
The dataset used in the paper is the PKU-SafeRLHF dataset.