Policy Optimization - Groups

Linear Quadratic Regulator (LQR)

The Linear Quadratic Regulator (LQR) dataset is used to study the sample complexity of model-based and model-free algorithms for policy evaluation and policy optimization.
- Dataset
- JSON

Policy Optimization for Stochastic Shortest Path

Policy optimization for the stochastic shortest path (SSP) problem, a goal-oriented reinforcement learning model that strictly generalizes the finite-horizon model and better...
- Dataset
- JSON

2 datasets found