- Lifelong Hyper-Policy Optimization with Multiple Importance Sampling
The authors propose a lifelong RL approach that learns a hyper-policy: a function that takes time as input and outputs the parameters of the policy to be queried at that time.
- Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks
The CT-graph and Minigrid environments are used to evaluate lifelong reinforcement learning approaches.
- SimpleQuestion dataset for Wikidata
The dataset used in this paper is a reinforcement learning dataset, specifically the SimpleQuestion dataset, which contains questions answerable using Wikidata as the knowledge...