A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications

Warren Buckler Powell, Jun Ma

Research output: Contribution to journal, Review article, peer-review

30 Scopus citations

Abstract

We review the literature on approximate dynamic programming, with the goal of better understanding the theory behind practical algorithms for solving dynamic programs with continuous and vector-valued states and actions and complex information processes. We build on the literature that has addressed the well-known problem of multidimensional (and possibly continuous) states, and the extensive literature on model-free dynamic programming, which also assumes that the expectation in Bellman's equation cannot be computed. However, we point out complications that arise when the actions/controls are vector-valued and possibly continuous. We then describe some recent research by the authors on approximate policy iteration algorithms that offer convergence guarantees (with technical assumptions) for both parametric and nonparametric architectures for the value function.

Original language: English (US)
Pages (from-to): 336-352
Number of pages: 17
Journal: Journal of Control Theory and Applications
Volume: 9
Issue number: 3
State: Published - Aug 2011

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Hardware and Architecture
  • Computer Science Applications

Keywords

  • Approximate dynamic programming
  • Approximation algorithms
  • Optimal control
  • Reinforcement learning
