On distributed cooperative decision-making in multiarmed bandits

Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

42 Scopus citations

Abstract

We study the explore-exploit tradeoff in distributed cooperative decision-making using the context of the multiarmed bandit (MAB) problem. For the distributed cooperative MAB problem, we design the cooperative UCB algorithm that comprises two interleaved distributed processes: (i) running consensus algorithms for estimation of rewards, and (ii) upper-confidence-bound-based heuristics for selection of arms. We rigorously analyze the performance of the cooperative UCB algorithm and characterize the influence of communication graph structure on the decision-making performance of the group.
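The two interleaved processes named in the abstract can be illustrated with a minimal sketch. This is not the authors' exact algorithm. It assumes unit-variance Gaussian rewards, a consensus matrix of the form P = I - (kappa / d_max) L built from the graph Laplacian L, and the textbook UCB1 exploration bonus rather than any graph-dependent term. The names `consensus_matrix`, `mix`, `cooperative_ucb`, and `kappa` are hypothetical and chosen only for this illustration.

```python
import math
import random


def consensus_matrix(adjacency, kappa=0.5):
    """Row-stochastic P = I - (kappa / d_max) * L for an undirected graph (assumed form)."""
    m = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    step = kappa / max(degrees)
    return [
        [1.0 - step * degrees[i] if i == j else step * adjacency[i][j]
         for j in range(m)]
        for i in range(m)
    ]


def mix(P, values):
    """One consensus step: each agent takes a weighted average over its neighbors."""
    m, n_arms = len(values), len(values[0])
    return [
        [sum(P[i][k] * values[k][a] for k in range(m)) for a in range(n_arms)]
        for i in range(m)
    ]


def cooperative_ucb(adjacency, arm_means, horizon, seed=0):
    """Simplified cooperative UCB; returns the true pull counts per agent and arm."""
    rng = random.Random(seed)
    P = consensus_matrix(adjacency)
    m, n_arms = len(adjacency), len(arm_means)
    n_hat = [[0.0] * n_arms for _ in range(m)]  # consensus estimate of pulls per arm
    s_hat = [[0.0] * n_arms for _ in range(m)]  # consensus estimate of total reward per arm
    pulls = [[0] * n_arms for _ in range(m)]

    for t in range(1, horizon + 1):
        for i in range(m):
            if t <= n_arms:
                arm = t - 1  # initialization: every agent samples each arm once
            else:
                # (ii) UCB-style selection using agent i's own current estimates
                def index(a, i=i):
                    mean = s_hat[i][a] / n_hat[i][a]
                    return mean + math.sqrt(2.0 * math.log(t) / n_hat[i][a])
                arm = max(range(n_arms), key=index)
            reward = rng.gauss(arm_means[arm], 1.0)  # assumed reward model
            n_hat[i][arm] += 1.0
            s_hat[i][arm] += reward
            pulls[i][arm] += 1
        # (i) running consensus: fold in the new observations, then average
        n_hat = mix(P, n_hat)
        s_hat = mix(P, s_hat)
    return pulls


if __name__ == "__main__":
    ring = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
    for agent, counts in enumerate(cooperative_ucb(ring, [0.1, 0.9, 0.3], 2000)):
        print(agent, counts)
```

Each agent updates only its own row before the consensus step, so all selections at time t rely on information from time t - 1. Changing `adjacency` (for example, from a ring to a line or a complete graph) is one way to experiment with how graph structure affects the pull counts.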

Original language: English (US)
Title of host publication: 2016 European Control Conference, ECC 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 243-248
Number of pages: 6
ISBN (Electronic): 9781509025916
DOIs
State: Published - 2016
Externally published: Yes
Event: 2016 European Control Conference, ECC 2016 - Aalborg, Denmark
Duration: Jun 29 2016 to Jul 1 2016

Publication series

Name: 2016 European Control Conference, ECC 2016

Other

Other: 2016 European Control Conference, ECC 2016
Country/Territory: Denmark
City: Aalborg
Period: 6/29/16 to 7/1/16

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Control and Optimization
