Policy Optimization for Stochastic Shortest Path
Policy optimization for stochastic shortest path (SSP) problem, a goal-oriented reinforcement learning model that strictly generalizes the finite-horizon model and better captures many applications.
BibTex: