Safe RL policy synthesis in environments with unknown safety constraints

The paper uses a small initial labeled dataset of safe trajectories Ds and unsafe trajectories Dus, together with a pSTL (parametric Signal Temporal Logic) safety specification template.
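To make these inputs concrete, here is a minimal sketch of how labeled trajectories and a pSTL template might be represented. Everything in it is an assumption made for illustration rather than taken from the paper: the one-dimensional signals, the template G(x < c) with a single unknown parameter c, the toy data, and the names `LabeledTrajectory`, `always_below_robustness`, `satisfies`, and `consistent`.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LabeledTrajectory:
    """A trajectory with a safety label (hypothetical container, not from the paper)."""
    states: Sequence[float]  # assumed 1-D signal sampled at discrete time steps
    safe: bool


def always_below_robustness(signal: Sequence[float], c: float) -> float:
    """Robustness of the assumed template G(x < c) for a concrete c.

    The value is the minimum over time of (c - x_t); it is positive exactly
    when the signal stays strictly below c at every step.
    """
    return min(c - x for x in signal)


def satisfies(signal: Sequence[float], c: float) -> bool:
    """True if the signal satisfies G(x < c)."""
    return always_below_robustness(signal, c) > 0


def consistent(c: float,
               safe_set: List[LabeledTrajectory],
               unsafe_set: List[LabeledTrajectory]) -> bool:
    """True if c makes every safe trajectory satisfy the template
    and every unsafe trajectory violate it."""
    return (all(satisfies(t.states, c) for t in safe_set)
            and not any(satisfies(t.states, c) for t in unsafe_set))


# Toy stand-ins for Ds and Dus (made-up values, for illustration only).
D_s = [LabeledTrajectory([0.1, 0.4, 0.3], True),
       LabeledTrajectory([0.2, 0.5, 0.45], True)]
D_us = [LabeledTrajectory([0.3, 0.9, 0.4], False),
        LabeledTrajectory([0.2, 1.2, 0.8], False)]
```

With this toy data, `consistent(0.7, D_s, D_us)` holds, while `c = 1.0` is too permissive (one unsafe trajectory is accepted) and `c = 0.45` is too strict (one safe trajectory is rejected). A parameter that separates the two sets in this way is one plausible reading of what instantiating a pSTL template from labeled data involves.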

BibTeX: