TY - GEN

T1 - The computational power of optimization in online learning

AU - Hazan, Elad

AU - Koren, Tomer

N1 - Publisher Copyright: © 2016 ACM.

PY - 2016/6/19

Y1 - 2016/6/19

N2 - We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to N experts in total Õ(√N) computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is Θ̃(N). These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oracle (i.e., an efficient empirical risk minimizer) allows one to learn a finite hypothesis class of size N in time O(log N). We also study the implications of our results for learning in repeated zero-sum games, in a setting where the players have access to oracles that compute, in constant time, their best response to any mixed strategy of their opponent. We show that the runtime required for approximating the minimax value of the game in this setting is Θ̃(√N), yielding again a quadratic improvement upon the oracle-free setting, where Θ̃(N) is known to be tight.

AB - We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to N experts in total Õ(√N) computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is Θ̃(N). These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oracle (i.e., an efficient empirical risk minimizer) allows one to learn a finite hypothesis class of size N in time O(log N). We also study the implications of our results for learning in repeated zero-sum games, in a setting where the players have access to oracles that compute, in constant time, their best response to any mixed strategy of their opponent. We show that the runtime required for approximating the minimax value of the game in this setting is Θ̃(√N), yielding again a quadratic improvement upon the oracle-free setting, where Θ̃(N) is known to be tight.

KW - Best-response dynamics

KW - Learning in games

KW - Local search

KW - Online learning

KW - Optimization oracles

KW - Zero-sum games

UR - http://www.scopus.com/inward/record.url?scp=84979298967&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979298967&partnerID=8YFLogxK

U2 - 10.1145/2897518.2897536

DO - 10.1145/2897518.2897536

M3 - Conference contribution

AN - SCOPUS:84979298967

T3 - Proceedings of the Annual ACM Symposium on Theory of Computing

SP - 128

EP - 141

BT - STOC 2016 - Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing

A2 - Mansour, Yishay

A2 - Wichs, Daniel

PB - Association for Computing Machinery

T2 - 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016

Y2 - 19 June 2016 through 21 June 2016

ER -