Bayesian active learning with basis functions

Ilya O. Ryzhov, Warren Buckler Powell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

A common technique for dealing with the curse of dimensionality in approximate dynamic programming is to use a parametric value function approximation, where the value of being in a state is assumed to be a linear combination of basis functions. Even with this simplification, we face the exploration/exploitation dilemma: an inaccurate approximation may lead to poor decisions, making it necessary to sometimes explore actions that appear to be suboptimal. We propose a Bayesian strategy for active learning with basis functions, based on the knowledge gradient concept from the optimal learning literature. The new method performs well in numerical experiments conducted on an energy storage problem.
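The abstract's modeling assumption, a value function that is linear in a set of basis functions and carries a Bayesian belief on the weights, can be illustrated with a minimal sketch. This is not the authors' algorithm and does not compute the knowledge gradient. It shows only the standard conjugate Gaussian update for a linear model with known observation noise. The polynomial basis, the function names, and the noise variance are all assumptions made for illustration.

```python
import numpy as np

def basis(state):
    """Hypothetical basis functions of a scalar state: constant, linear, quadratic."""
    s = float(state)
    return np.array([1.0, s, s * s])

def bayes_update(theta, C, phi, y, noise_var):
    """Update a Gaussian belief N(theta, C) on the weights after observing
    y = phi @ weights + noise, where the noise variance is assumed known."""
    C_phi = C @ phi
    denom = noise_var + phi @ C_phi            # predictive variance of the observation
    theta_new = theta + C_phi * (y - phi @ theta) / denom
    C_new = C - np.outer(C_phi, C_phi) / denom
    return theta_new, C_new

def value(theta, state):
    """Approximate value of a state under the current mean weights."""
    return float(basis(state) @ theta)

# Start from a prior belief and fold in one observed value at state 1.0.
theta0 = np.zeros(3)
C0 = np.eye(3)
theta1, C1 = bayes_update(theta0, C0, basis(1.0), y=3.0, noise_var=1.0)
```

Because the belief includes a full covariance matrix, one observation changes the estimated value of every state whose basis vector is correlated with the one observed. An exploration rule can exploit this when weighing the information an action provides.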

Original language: English (US)
Title of host publication: IEEE SSCI 2011
Subtitle of host publication: Symposium Series on Computational Intelligence - ADPRL 2011: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Pages: 143-150
Number of pages: 8
DOIs
State: Published - Sep 5 2011
Event: Symposium Series on Computational Intelligence, IEEE SSCI2011 - 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2011 - Paris, France
Duration: Apr 11 2011 → Apr 15 2011

Publication series

Name: IEEE SSCI 2011: Symposium Series on Computational Intelligence - ADPRL 2011: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning

Other

Other: Symposium Series on Computational Intelligence, IEEE SSCI2011 - 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2011
Country/Territory: France
City: Paris
Period: 4/11/11 → 4/15/11

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Software
