Policy Optimization for Stochastic Shortest Path

This work studies policy optimization for the stochastic shortest path (SSP) problem, a goal-oriented reinforcement learning model that strictly generalizes the finite-horizon model and better captures many applications.
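
The generalization claim can be illustrated with the standard layered-state construction: a finite-horizon MDP with horizon H becomes an SSP once the time step is added to the state and every last-layer state moves to an absorbing, cost-free goal. The sketch below is only an illustration under that assumption; the function name `finite_horizon_to_ssp` and the array layout are hypothetical and are not taken from this project's code.

```python
import numpy as np

def finite_horizon_to_ssp(P, c, H):
    """Embed a finite-horizon MDP into an SSP (illustrative sketch).

    P: array of shape (S, A, S), transition probabilities.
    c: array of shape (S, A), per-step costs.
    H: horizon (number of steps per episode).
    Returns (P_ssp, c_ssp) over S*H layered states plus one goal state.
    """
    S, A, _ = P.shape
    n = S * H + 1              # layered states (h, s) plus the goal
    goal = n - 1
    P_ssp = np.zeros((n, A, n))
    c_ssp = np.zeros((n, A))
    for h in range(H):
        for s in range(S):
            i = h * S + s      # index of the layered state (h, s)
            c_ssp[i] = c[s]
            if h + 1 < H:
                # Follow the original dynamics into the next layer.
                P_ssp[i, :, (h + 1) * S:(h + 2) * S] = P[s]
            else:
                # After the last step, reach the goal with probability 1.
                P_ssp[i, :, goal] = 1.0
    P_ssp[goal, :, goal] = 1.0  # the goal is absorbing and cost-free
    return P_ssp, c_ssp
```

In this construction every policy reaches the goal in exactly H steps, so the expected cost-to-goal in the SSP equals the expected total cost of the finite-horizon episode. The converse does not hold in general, since an SSP episode may have unbounded random length.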

BibTeX: