Noisy MNIST

The MNIST environment does not elicit any actions from an agent. Instead, the prediction network simply needs to learn one step mappings between pairs of MNIST handwritten digits.

BibTex: