2 datasets found
Groups: Off-policy Learning

Soft Actor-Critic
A soft actor-critic algorithm for off-policy maximum entropy deep reinforcement learning.

NeoRL
A near real-world benchmark for offline RL, which contains datasets from various domains with controlled sizes, and extra test datasets for policy validation.