2 datasets found

Groups: Off-policy Learning

  • Soft Actor-Critic

    A soft actor-critic algorithm for off-policy maximum entropy deep reinforcement learning.
  • NeoRL

    A near-real-world benchmark for offline RL. It contains datasets from various domains with controlled sizes, plus extra test datasets for policy validation.