2 datasets found

Tags: off-policy learning

  • Soft Actor-Critic

    A soft actor-critic algorithm for off-policy maximum entropy deep reinforcement learning.
  • NeoRL

    A near real-world benchmark for offline RL that contains datasets from various domains with controlled sizes, plus extra test datasets for policy validation.
You can also access this registry using the API (see API Docs).