Competing in the dark: An efficient algorithm for bandit linear optimization

Jacob Abernethy, Elad E. Hazan, Alexander Rakhlin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

107 Scopus citations

Abstract

We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O*(√T) regret. The setting is a natural generalization of the non-stochastic multi-armed bandit problem, and the existence of an efficient optimal algorithm has been posed as an open problem in a number of recent papers. We show how the difficulties encountered by previous approaches are overcome by the use of a self-concordant potential function. Our approach presents a novel connection between online learning and interior point methods.
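To make the high-level idea concrete, below is a minimal, hypothetical Python sketch of a barrier-based follow-the-regularized-leader scheme in the spirit described by the abstract (not the paper's exact algorithm or parameter choices): the learner minimizes the estimated cumulative loss plus a self-concordant barrier, plays a random point on the Dikin ellipsoid of the barrier's Hessian, and forms a one-point unbiased loss estimator from the observed scalar loss. The box domain, the log barrier, the step size `eta`, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def scribe_sketch(loss_vectors, eta=0.05):
    """Hypothetical sketch of barrier-based FTRL for bandit linear optimization
    on the box [-1, 1]^n, using a log barrier as the self-concordant potential.
    `loss_vectors` is an assumed sequence of true loss vectors f_t, revealed
    only through the scalar loss of the played point."""
    n = len(loss_vectors[0])

    # Log barrier for the box; self-concordant on the interior of [-1, 1]^n.
    def barrier_grad_hess(x):
        g = 1.0 / (1.0 - x) - 1.0 / (1.0 + x)
        h = np.diag(1.0 / (1.0 - x) ** 2 + 1.0 / (1.0 + x) ** 2)
        return g, h

    cum_est = np.zeros(n)      # running sum of one-point loss estimators
    x = np.zeros(n)            # analytic center of the box
    total_loss = 0.0

    for f_t in loss_vectors:
        # FTRL step: minimize eta * <cum_est, y> + R(y) by damped Newton,
        # which keeps the iterate strictly inside the domain.
        for _ in range(50):
            g_R, H_R = barrier_grad_hess(x)
            grad = eta * cum_est + g_R
            step = np.linalg.solve(H_R, grad)
            decrement = np.sqrt(step @ H_R @ step)
            x = x - step / (1.0 + decrement)

        # Sample on the Dikin ellipsoid: pick a random eigen-direction of the
        # Hessian at x; the ellipsoid is contained in the feasible set.
        _, H = barrier_grad_hess(x)
        lam, V = np.linalg.eigh(H)
        i = np.random.randint(n)
        sign = np.random.choice([-1.0, 1.0])
        y = x + sign * V[:, i] / np.sqrt(lam[i])

        loss = float(f_t @ y)          # only this scalar is observed
        total_loss += loss

        # One-point unbiased estimator of f_t from the bandit feedback.
        cum_est += n * loss * sign * np.sqrt(lam[i]) * V[:, i]

    return total_loss

# Example usage with synthetic loss vectors (purely illustrative):
# losses = [np.random.uniform(-1, 1, size=5) for _ in range(1000)]
# print(scribe_sketch(losses))
```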

Original language: English (US)
Title of host publication: 21st Annual Conference on Learning Theory, COLT 2008
Pages: 263-273
Number of pages: 11
State: Published - Dec 1 2008
Externally published: Yes
Event: 21st Annual Conference on Learning Theory, COLT 2008 - Helsinki, Finland
Duration: Jul 9 2008 - Jul 12 2008

Other

Other: 21st Annual Conference on Learning Theory, COLT 2008
Country: Finland
City: Helsinki
Period: 7/9/08 - 7/12/08

All Science Journal Classification (ASJC) codes

  • Education


Cite this

Abernethy, J., Hazan, E. E., & Rakhlin, A. (2008). Competing in the dark: An efficient algorithm for bandit linear optimization. In 21st Annual Conference on Learning Theory, COLT 2008 (pp. 263-273).