Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming

Abraham P. George, Warren Buckler Powell

Research output: Contribution to journalArticle

87 Scopus citations

Abstract

We address the problem of determining optimal stepsizes for estimating parameters in the context of approximate dynamic programming. The sufficient conditions for convergence of the stepsize rules have been known for 50 years, but practical computational work tends to use formulas with parameters that have to be tuned for specific applications. The problem is that in most applications in dynamic programming, observations for estimating a value function typically come from a data series that can be initially highly transient. The degree of transience affects the choice of stepsize parameters that produce the fastest convergence. In addition, the degree of initial transience can vary widely among the value function parameters for the same dynamic program. This paper reviews the literature on deterministic and stochastic stepsize rules, and derives formulas for optimal stepsizes for minimizing estimation error. This formula assumes certain parameters are known, and an approximation is proposed for the case where the parameters are unknown. Experimental work shows that the approximation provides faster convergence than other popular formulas.

Original languageEnglish (US)
Pages (from-to)167-198
Number of pages32
JournalMachine Learning
Volume65
Issue number1
DOIs
StatePublished - Oct 2006

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence

Keywords

  • Adaptive learning
  • Approximate dynamic programming
  • Kalman filter
  • Stochastic stepsize

Fingerprint Dive into the research topics of 'Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming'. Together they form a unique fingerprint.

  • Cite this