Lifelong Hyper-Policy Optimization with Multiple Importance Sampling

doi:10.57702/ron28kil

License

No License Provided

The authors propose a lifelong reinforcement learning (RL) approach that learns a hyper-policy: a mapping that takes time as input and outputs the parameters of the policy to be queried at that time.
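
To make the idea of a time-indexed hyper-policy concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian hyper-policy whose mean is a linear function of sinusoidal time features; the class `GaussianHyperPolicy`, the function `linear_policy`, the feature choice, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
import random


class GaussianHyperPolicy:
    """Illustrative hyper-policy: maps a time t to a Gaussian over policy parameters.

    The Gaussian form, the sinusoidal time features, and all names here are
    assumptions made for this sketch, not details taken from the source.
    """

    def __init__(self, weights, bias, sigma, period):
        self.weights = weights  # one row of feature weights per policy parameter
        self.bias = bias        # one offset per policy parameter
        self.sigma = sigma      # shared standard deviation of the parameter noise
        self.period = period    # assumed period of the time features

    def mean(self, t):
        """Return the mean policy parameters at time t."""
        phase = 2.0 * math.pi * t / self.period
        features = [math.sin(phase), math.cos(phase)]
        return [
            sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(self.weights, self.bias)
        ]

    def sample(self, t, rng):
        """Draw the parameters of the policy to be queried at time t."""
        return [rng.gauss(m, self.sigma) for m in self.mean(t)]


def linear_policy(theta, state):
    """Hypothetical deterministic policy: action is the dot product of theta and state."""
    return sum(th * s for th, s in zip(theta, state))


hyper_policy = GaussianHyperPolicy(
    weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0], sigma=0.1, period=4.0
)
rng = random.Random(0)
for t in range(3):
    theta_t = hyper_policy.sample(t, rng)        # parameters for time t
    action = linear_policy(theta_t, [1.0, 0.5])  # query the resulting policy
    print(t, theta_t, action)
```

In this sketch, time is the only input to the hyper-policy, and the sampled parameters fully define the policy that is queried at that time step. How the hyper-policy itself is trained is outside the scope of the example.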
