Logarithmic regret algorithms for online convex optimization

Elad Hazan, Adam Kalai, Satyen Kale, Amit Agarwal

Research output: Chapter in Book/Report/Conference proceedingConference contribution

74 Scopus citations

Abstract

In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters a sequence of (possibly unrelated) convex cost functions. Zinkevich [Zin03] introduced this framework, which models many natural repeated decision-making problems and generalizes, many existing problems such as Prediction from Expert Advice and Cover's Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret O(√T), for an arbitrary sequence of T convex cost functions (of bounded gradients), with respect to the best single decision in hindsight. In this paper, we give algorithms that achieve regret O(log(T)) for an arbitrary sequence of strictly convex functions (with bounded first and second derivatives). This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth [KW99], and Universal Portfolios by Cover [Cov91], We propose several algorithms achieving logarithmic regret, which besides being more general are also much more efficient to implement. The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field, Our analysis shows a surprising connection to follow-the-leader method, and builds on the recent work of Agarwal and Hazan [AH05], We also analyze other algorithms, which tie together several different previous approaches including follow-the-leader, exponential weighting, Cover's algorithm and gradient descent.

Original languageEnglish (US)
Title of host publicationLearning Theory - 19th Annual Conference on Learning Theory, COLT 2006, Proceedings
PublisherSpringer Verlag
Pages499-513
Number of pages15
ISBN (Print)3540352945, 9783540352945
DOIs
StatePublished - 2006
Event19th Annual Conference on Learning Theory, COLT 2006 - Pittsburgh, PA, United States
Duration: Jun 22 2006Jun 25 2006

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume4005 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other19th Annual Conference on Learning Theory, COLT 2006
Country/TerritoryUnited States
CityPittsburgh, PA
Period6/22/066/25/06

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Logarithmic regret algorithms for online convex optimization'. Together they form a unique fingerprint.

Cite this