Optimal control of distributed Markov decision processes with network delays

Sachin Adlakha, Ritesh Madan, Sanjay Lall, Andrea Goldsmith

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

We consider the problem of finding an optimal feedback controller for a network of interconnected subsystems, each of which is a Markov decision process. Each subsystem is coupled to its neighbors via communication links by which signals are delayed but are otherwise transmitted noise-free. One of the subsystems receives input from a controller, and the controller receives delayed state measurements from all of the subsystems. We show that an optimal controller requires only a finite amount of memory which does not grow with time, and obtain a bound on the amount of memory that a controller needs to have for each subsystem. This makes the computation of an optimal controller through dynamic programming tractable. We illustrate our result by a numerical example, and show that it generalizes previous results on Markov decision processes with delayed state measurements.
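To make the finite-memory idea concrete, the following is a minimal sketch of the single-subsystem special case that the abstract says the result generalizes: one Markov decision process whose state is observed with a fixed delay. It is not the paper's algorithm for networked subsystems. It assumes a discounted-cost criterion and a controller that remembers the most recent delayed observation together with the actions applied since then, so the augmented state space has size |X| * |A|^delay and value iteration over it is a finite computation. The function name `delayed_value_iteration`, the discount factor `gamma`, and the matrices `P` and `c` are illustrative assumptions, not taken from the paper.

```python
import itertools
import numpy as np


def delayed_value_iteration(P, c, delay, gamma=0.9, tol=1e-10, max_iter=100_000):
    """Value iteration on the augmented state (x_{t-delay}, last `delay` actions).

    P[a, x, y]: probability of moving from state x to state y under action a.
    c[x, a]:    stage cost of taking action a in state x.
    All names and the discounted-cost setting are illustrative assumptions.
    """
    n_actions, n_states, _ = P.shape
    histories = list(itertools.product(range(n_actions), repeat=delay))
    aug_states = [(x, h) for x in range(n_states) for h in histories]

    # Distribution of the current (unobserved) state given the delayed
    # observation x and the actions h applied since it was measured.
    belief = {}
    for x, h in aug_states:
        b = np.eye(n_states)[x]
        for a in h:
            b = b @ P[a]
        belief[(x, h)] = b

    V = {s: 0.0 for s in aug_states}
    policy = {}
    for _ in range(max_iter):
        V_new = {}
        for x, h in aug_states:
            q = np.empty(n_actions)
            for u in range(n_actions):
                applied = h + (u,)
                # The oldest remembered action drives the next observation.
                oldest, h_next = applied[0], applied[1:]
                expected_next = sum(
                    P[oldest, x, y] * V[(y, h_next)] for y in range(n_states)
                )
                q[u] = belief[(x, h)] @ c[:, u] + gamma * expected_next
            V_new[(x, h)] = q.min()
            policy[(x, h)] = int(q.argmin())
        gap = max(abs(V_new[s] - V[s]) for s in aug_states)
        V = V_new
        if gap < tol:
            break
    return V, policy, belief


# Hypothetical two-state, two-action example.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
c = np.array([[0.0, 1.0], [2.0, 0.5]])

V0, policy0, belief0 = delayed_value_iteration(P, c, delay=0)
V2, policy2, belief2 = delayed_value_iteration(P, c, delay=2)
print(len(V2))  # number of augmented states: 2 * 2**2 = 8
```

With `delay=0` the history is empty and the recursion reduces to ordinary value iteration. For larger delays the memory stays fixed at `delay` past actions per observation, which is the kind of time-independent bound the abstract describes, here only for the single-process case.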

Original language: English (US)
Title of host publication: Proceedings of the 46th IEEE Conference on Decision and Control 2007, CDC
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3308-3314
Number of pages: 7
ISBN (Print): 1424414989, 9781424414987
State: Published - 2007
Externally published: Yes
Event: 46th IEEE Conference on Decision and Control 2007, CDC - New Orleans, LA, United States
Duration: Dec 12 2007 to Dec 14 2007

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Other

Other: 46th IEEE Conference on Decision and Control 2007, CDC
Country/Territory: United States
City: New Orleans, LA
Period: 12/12/07 to 12/14/07

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization
