-
Minimax Regret for Online Learning with Feedback Graphs
The dataset used in the paper is a sequence of strongly observable undirected feedback graphs, all of which share the same independence number α.
-
Lipschitz Bandits
The dataset used in the paper is a Lipschitz bandit problem, where the learner aims at selecting satisficing arms (arms with mean reward exceeding a certain threshold value) as...
-
Concave Bandits
The dataset used in the paper is a concave bandit problem, where the learner aims at selecting satisficing arms (arms with mean reward exceeding a certain threshold value) as...
-
Finite-Armed Bandits
The dataset used in the paper is a finite-armed bandit problem, where the learner aims at selecting satisficing arms (arms with mean reward exceeding a certain threshold value)...
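
The satisficing criterion named in the bandit entries above (mean reward exceeding a threshold) can be illustrated with a minimal sketch. The function name, the sample mean rewards, and the threshold value are hypothetical, and reading "exceeding" as a strict inequality is an assumption; none of this is taken from the paper itself.

```python
from typing import List, Sequence


def satisficing_arms(mean_rewards: Sequence[float], threshold: float) -> List[int]:
    """Return the indices of arms whose mean reward strictly exceeds the threshold.

    Assumption: the true mean rewards are known here; in a bandit problem the
    learner would only have estimates of them.
    """
    return [i for i, mu in enumerate(mean_rewards) if mu > threshold]


# Hypothetical four-armed instance with threshold 0.5.
example_means = [0.2, 0.55, 0.7, 0.5]
print(satisficing_arms(example_means, threshold=0.5))  # → [1, 2]
```

Arm 3 has mean reward equal to the threshold and is excluded under the strict-inequality assumption; a non-strict reading would include it.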