Batched bandit problems

Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg

Research output: Contribution to journalArticle

12 Scopus citations

Abstract

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

Original languageEnglish (US)
Pages (from-to)660-681
Number of pages22
JournalAnnals of Statistics
Volume44
Issue number2
DOIs
StatePublished - Apr 2016

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Keywords

  • Batches
  • Grouped clinical trials, sample size determination, switching cost
  • Multi-armed bandit problems
  • Multi-phase allocation
  • Regret bounds

Fingerprint Dive into the research topics of 'Batched bandit problems'. Together they form a unique fingerprint.

  • Cite this

    Perchet, V., Rigollet, P., Chassang, S., & Snowberg, E. (2016). Batched bandit problems. Annals of Statistics, 44(2), 660-681. https://doi.org/10.1214/15-AOS1381