TY - JOUR
T1 - Finite-time lower bounds for the two-armed bandit problem
AU - Kulkarni, Sanjeev R.
AU - Lugosi, Gábor
N1 - Funding Information:
Manuscript received November 5, 1997, revised June 1, 1999. Recommended by Associate Editor, D. Yao. This work was supported in part by the National Science Foundation under NYI Grant IRI-9457645. The work of G. Lugosi was supported by DIGES under Grant PB96-0300. S. R. Kulkarni is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: [email protected]). G. Lugosi is with the Department of Economics, Pompeu Fabra University, Ramon Trias Fargas 25-27, 08005 Barcelona, Spain (e-mail: [email protected]). Publisher Item Identifier S 0018-9286(00)04082-4.
PY - 2000/4
Y1 - 2000/4
N2 - We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins. The finite-time lower bound allows us to derive conditions for the amount of time necessary to make any significant gain over a random guessing strategy. These bounds depend on the class of possible distributions of the rewards associated with the arms. For example, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms.
AB - We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins. The finite-time lower bound allows us to derive conditions for the amount of time necessary to make any significant gain over a random guessing strategy. These bounds depend on the class of possible distributions of the rewards associated with the arms. For example, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms.
UR - http://www.scopus.com/inward/record.url?scp=0034171759&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034171759&partnerID=8YFLogxK
U2 - 10.1109/9.847107
DO - 10.1109/9.847107
M3 - Article
AN - SCOPUS:0034171759
SN - 0018-9286
VL - 45
SP - 711
EP - 714
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 4
ER -