NEWTRON: An efficient bandit algorithm for online multiclass prediction

Elad Hazan, Satyen Kale

Research output: Chapter in Book/Report/Conference proceedingConference contribution

31 Scopus citations

Abstract

We present an efficient algorithm for the problem of online multiclass prediction with bandit feedback in the fully adversarial setting. We measure its regret with respect to the log-loss defined in [AR09], which is parameterized by a scalar α. We prove that the regret of NEWTRON is O(log T) when α is a constant that does not vary with horizon T, and at most O(T 2/3) if α is allowed to increase to infinity with T. For α = O(log T), the regret is bounded by O( √T), thus solving the open problem of [KSST08, AR09]. Our algorithm is based on a novel application of the online Newton method [HAK07]. We test our algorithm and show it to perform well in experiments, even when α is a small constant.

Original languageEnglish (US)
Title of host publicationAdvances in Neural Information Processing Systems 24
Subtitle of host publication25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
PublisherNeural Information Processing Systems
ISBN (Print)9781618395993
StatePublished - 2011
Externally publishedYes
Event25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 - Granada, Spain
Duration: Dec 12 2011Dec 14 2011

Publication series

NameAdvances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011

Other

Other25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
Country/TerritorySpain
CityGranada
Period12/12/1112/14/11

All Science Journal Classification (ASJC) codes

  • Information Systems

Fingerprint

Dive into the research topics of 'NEWTRON: An efficient bandit algorithm for online multiclass prediction'. Together they form a unique fingerprint.

Cite this