Off-policy Learning - Groups

Soft Actor-Critic

A soft actor-critic algorithm for off-policy maximum entropy deep reinforcement learning.
- Dataset
- JSON

NeoRL

A near real-world benchmark for offline RL, which contains datasets from various domains with controlled sizes, and extra test datasets for policy validation.
- Dataset
- JSON

2 datasets found