Seq2sql: Generating structured queries from natural language using reinforcement learning
Seq2sql: Generating structured queries from natural language using reinforcement learning
Self-Learning Search Engine (SLSE) dataset
The dataset used in this paper is a multimedia search engine dataset associated with the Self-Learning Search Engine (SLSE), an architecture based on reinforcement learning.
Waypoints and Edges
The dataset used in the paper is a set of waypoints and edges for planning.
2D Environment
The dataset used in the paper is a 2D environment in which the experiments are conducted.
Policy Gradients using Variational Quantum Circuits
Variational Quantum Circuits are being used as versatile Quantum Machine Learning models. Some empirical results exhibit an advantage in supervised and generative learning...
Bank Heist
The Bank Heist environment is a 2D maze with four rooms, where the objective is to navigate to banks distributed across the four mazes.
Noisy MNIST
The MNIST environment does not elicit any actions from an agent. Instead, the prediction network simply needs to learn one-step mappings between pairs of MNIST handwritten digits.
Pong Variants
The dataset used in the paper is a set of Pong variants, including Noisy, Black, White, Zoom, and others.
3D Maze Games
The dataset used in the paper is a set of 3D maze games, including Labyrinth and others.
Evolution of Rewards for Food and Motor Action by Simulating Birth and Death
Evolution of Rewards for Food and Motor Action by Simulating Birth and Death
Double Tunnel
The dataset used in the paper comes from the Double Tunnel environment, which is a safety-critical task.
AIOptimizer
AIOptimizer is a cost-optimized software performance optimisation tool. It includes a recommendation system driven by reinforcement learning to improve software system...
Custom Pong Environment
A new Pong environment with a much higher degree of configurability than the current standard, including the ability to compete against a human opponent.
DRiLLS: Deep Reinforcement Learning for Logic Synthesis
Logic synthesis requires extensive tuning of the synthesis optimization flow where the quality of results (QoR) depends on the sequence of optimizations used. The authors...
MuJoCo Continuous Control Tasks
The dataset used in the paper is a collection of data from the MuJoCo continuous control tasks.
SAI Dataset
The dataset used to train the SAI agent. It contains 7x7 Go games with multiple komi values.
MuJoCo environments
The dataset used in the paper is not explicitly described, but the authors mention using MuJoCo environments from OpenAI Gym.
OpenAI Gym benchmark
The dataset used in the paper is the OpenAI Gym benchmark, which provides a set of environments for reinforcement learning.
Discovering Blind Spots in Reinforcement Learning
The dataset used in the paper is a collection of oracle feedback, which is used to learn a blind spot model of the target world.