On optimal foraging and multi-armed bandits

Vaibhav Srivastava, Paul Reverdy, Naomi E. Leonard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Scopus citations

Abstract

We consider two variants of the standard multi-armed bandit problem, namely, the multi-armed bandit problem with transition costs and the multi-armed bandit problem on graphs. We develop block allocation algorithms for these problems that achieve an expected cumulative regret that is uniformly dominated by a logarithmic function of time, and an expected cumulative number of transitions from one arm to another that is uniformly dominated by a double-logarithmic function of time. We observe that the multi-armed bandit problem with transition costs and the associated block allocation algorithm capture the key features of popular animal foraging models in the literature.
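
The idea of block allocation, choosing an arm only at the start of a block and then holding it for the whole block, can be illustrated with a minimal sketch. This is not the algorithm from the paper: the function name `block_ucb`, the Bernoulli reward model, the standard UCB1 index, and the doubling block-length schedule are all assumptions made for illustration. With doubling blocks the number of transitions is capped by the number of blocks, which grows logarithmically in the horizon; the sketch does not reproduce the double-logarithmic transition bound or the regret guarantee stated in the abstract.

```python
import math
import random


def block_ucb(means, horizon, seed=0):
    """Hypothetical block-wise UCB1 on Bernoulli arms (illustrative only).

    An arm is re-selected only at block boundaries, so the number of
    transitions can never exceed the number of blocks minus one.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # total observed reward per arm
    t = 0                   # elapsed time steps
    switches = 0            # transitions between distinct arms
    blocks = 0              # blocks played so far
    prev_arm = None
    block_len = 1           # assumed schedule: length doubles each block

    while t < horizon:
        untried = [i for i in range(n_arms) if counts[i] == 0]
        if untried:
            # Initialization: sample each arm once in a block of length 1.
            arm, length = untried[0], 1
        else:
            # Standard UCB1 index, evaluated only at the start of a block.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
            length = block_len
            block_len *= 2
        length = min(length, horizon - t)  # truncate the final block

        for _ in range(length):
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
        t += length
        blocks += 1

        if prev_arm is not None and arm != prev_arm:
            switches += 1
        prev_arm = arm

    # Pseudo-regret: expected reward lost relative to always playing the best arm.
    best = max(means)
    regret = sum((best - m) * c for m, c in zip(means, counts))
    return counts, switches, blocks, regret
```

Calling `block_ucb([0.2, 0.5, 0.8], 1000)` returns the per-arm pull counts, the number of transitions, the number of blocks, and the pseudo-regret. In a setting with transition costs, the transition count is the quantity that a block schedule is meant to keep small.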

Original language: English (US)
Title of host publication: 2013 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013
Publisher: IEEE Computer Society
Pages: 494-499
Number of pages: 6
ISBN (Print): 9781479934096
State: Published - 2013
Event: 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013 - Monticello, IL, United States
Duration: Oct 2 2013 → Oct 4 2013

Publication series

Name: 2013 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013

Other

Other: 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013
Country: United States
City: Monticello, IL
Period: 10/2/13 → 10/4/13

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Control and Systems Engineering
