-
Solving Robust MDPs through No-Regret Dynamics
The Robust MDPs problem is a Markov Decision Process problem where the goal is to find a policy π that maximizes the Value Function under worst-case transition dynamics. -
Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Lear...
Value-function (VF) approximation is a central problem in Reinforcement Learning (RL). Classical non-parametric VF estimation suffers from the curse of dimensionality. As a...