Nonlinear Sequential Accepts and Rejects

The dataset used in the paper is an instance of the stochastic multi-armed bandit problem, where the objective is to identify the M arms with the highest expected rewards.
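
To make the top-M objective concrete, here is a minimal sketch of a plain uniform-sampling baseline. It is not the method from the paper. The Bernoulli reward model, the function name `top_m_by_uniform_sampling`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def top_m_by_uniform_sampling(means, m, pulls_per_arm, seed=0):
    """Return the indices of the m arms with the highest empirical mean reward.

    Assumption: each arm pays a Bernoulli reward with the given true mean.
    This is a naive baseline, not the algorithm from the paper.
    """
    rng = random.Random(seed)
    estimates = []
    for mu in means:
        # Pull this arm a fixed number of times and average the observed rewards.
        rewards = [1.0 if rng.random() < mu else 0.0 for _ in range(pulls_per_arm)]
        estimates.append(sum(rewards) / pulls_per_arm)
    # Rank arms by estimated expected reward, best first, and keep the top m.
    ranked = sorted(range(len(means)), key=lambda i: estimates[i], reverse=True)
    return sorted(ranked[:m])


# Hypothetical instance: five arms whose true means are unknown to the learner.
true_means = [0.9, 0.1, 0.8, 0.2, 0.15]
selected = top_m_by_uniform_sampling(true_means, m=2, pulls_per_arm=2000)
print(selected)
```

Uniform sampling spends the same number of pulls on every arm. Sequential accept-and-reject strategies instead aim to stop sampling arms whose status is already clear, which is why this sketch should be read only as a reference point for the problem statement.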

BibTeX: