Reinforcement Re-ranking with 2D Grid-based Recommendation Panels

Panel-MDP is a novel Markov decision process (MDP)-based re-ranking model for final-stage recommendation.

BibTeX: