Interior-point methods for full-information and bandit online learning

Jacob D. Abernethy, Elad Hazan, Alexander Rakhlin

Research output: Contribution to journalArticlepeer-review

47 Scopus citations

Abstract

We study the problem of predicting individual sequences with linear loss with full and partial (or bandit) feedback. Our main contribution is the first efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal Õ(√T) regret. In addition, for the full-information setting, we give a novel regret minimization algorithm. These results are made possible by the introduction of interior-point methods for convex optimization to online learning.

Original languageEnglish (US)
Article number6191328
Pages (from-to)4164-4175
Number of pages12
JournalIEEE Transactions on Information Theory
Volume58
Issue number7
DOIs
StatePublished - 2012
Externally publishedYes

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences

Keywords

  • Bandit feedback
  • interior-point methods
  • online convex optimization
  • online learning

Fingerprint

Dive into the research topics of 'Interior-point methods for full-information and bandit online learning'. Together they form a unique fingerprint.

Cite this