Distributed reinforcement learning in multi-agent networks

Soummya Kar, Jose M.F. Moura, H. Vincent Poor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Distributed reinforcement learning algorithms for collaborative multi-agent Markov decision processes (MDPs) are presented and analyzed. The networked setup consists of a collection of agents (learners) which respond differently (depending on their instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. With the objective of jointly learning the optimal stationary control policy (in the absence of global state transition and local agent cost statistics) that minimizes network-averaged infinite horizon discounted cost, the paper presents distributed variants of Q-learning of the consensus + innovations type in which each agent sequentially refines its learning parameters by locally processing its instantaneous payoff data and the information received from neighboring agents. Under broad conditions on the multi-agent decision model and mean connectivity of the inter-agent communication network, the proposed distributed algorithms are shown to achieve optimal learning asymptotically, i.e., almost surely (a.s.) each network agent is shown to learn the value function and the optimal stationary control policy of the collaborative MDP asymptotically. Further, convergence rate estimates for the proposed class of distributed learning algorithms are obtained.
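The abstract names the consensus + innovations structure but does not state the update rule, so the following is only a minimal sketch of what one such distributed Q-learning step might look like. Every name here (`qd_step`, `neighbors`, `alpha`, `beta`, `gamma`) is hypothetical, and the constant step sizes and fixed neighbor lists are simplifying assumptions rather than the paper's conditions. Each agent combines a consensus term (its disagreement with neighbors' estimates) with an innovation term (a local temporal-difference error built from its own one-stage cost).

```python
def qd_step(Q, neighbors, s, a, costs, s_next, alpha, beta, gamma):
    """One synchronous consensus + innovations update of entry (s, a).

    Q         : list of per-agent tables, Q[n][state][action]
    neighbors : neighbors[n] is the list of agents that agent n hears from
    costs     : costs[n] is agent n's instantaneous one-stage cost
    alpha     : innovation (learning) weight, an assumed constant here
    beta      : consensus weight, an assumed constant here
    gamma     : discount factor
    """
    # Copy every table so all agents update from the same old values.
    new_Q = [[row[:] for row in table] for table in Q]
    for n, table in enumerate(Q):
        # Consensus: how far agent n's estimate is from its neighbors'.
        consensus = sum(table[s][a] - Q[l][s][a] for l in neighbors[n])
        # Innovation: local TD error; costs are minimized, hence min.
        innovation = costs[n] + gamma * min(table[s_next]) - table[s][a]
        new_Q[n][s][a] = table[s][a] - beta * consensus + alpha * innovation
    return new_Q


# Toy usage: 3 agents on a path graph, 2 states, 2 actions, all-zero start.
Q0 = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
path = {0: [1], 1: [0, 2], 2: [1]}
Q1 = qd_step(Q0, path, s=0, a=0, costs=[1.0, 2.0, 3.0], s_next=1,
             alpha=0.5, beta=0.1, gamma=0.9)
print([q[0][0] for q in Q1])  # [0.5, 1.0, 1.5]
```

In this toy call the agents start in agreement, so the consensus term vanishes and each estimate moves only by its own cost. On later steps the consensus term would pull the differing estimates toward one another, which is the mechanism by which agents with different local costs could come to agree on a network-averaged value.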

Original language: English (US)
Title of host publication: 2013 5th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, CAMSAP 2013
Pages: 296-299
Number of pages: 4
State: Published - 2013
Event: 2013 5th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, CAMSAP 2013 - Saint Martin, France
Duration: Dec 15 2013 to Dec 18 2013

All Science Journal Classification (ASJC) codes

  • Computer Science Applications

Keywords

  • Multi-agent stochastic control
  • collaborative network processing
  • consensus + innovations
  • distributed Q-learning
  • distributed stochastic approximation
  • reinforcement learning
