TY - GEN
T1 - Bayesian active learning with basis functions
AU - Ryzhov, Ilya O.
AU - Powell, Warren Buckler
PY - 2011/9/5
Y1 - 2011/9/5
AB - A common technique for dealing with the curse of dimensionality in approximate dynamic programming is to use a parametric value function approximation, where the value of being in a state is assumed to be a linear combination of basis functions. Even with this simplification, we face the exploration/exploitation dilemma: an inaccurate approximation may lead to poor decisions, making it necessary to sometimes explore actions that appear to be suboptimal. We propose a Bayesian strategy for active learning with basis functions, based on the knowledge gradient concept from the optimal learning literature. The new method performs well in numerical experiments conducted on an energy storage problem.
UR - http://www.scopus.com/inward/record.url?scp=80052219755&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80052219755&partnerID=8YFLogxK
U2 - 10.1109/ADPRL.2011.5967365
DO - 10.1109/ADPRL.2011.5967365
M3 - Conference contribution
AN - SCOPUS:80052219755
SN - 9781424498888
T3 - IEEE SSCI 2011: Symposium Series on Computational Intelligence - ADPRL 2011: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
SP - 143
EP - 150
BT - IEEE SSCI 2011
T2 - Symposium Series on Computational Intelligence, IEEE SSCI2011 - 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2011
Y2 - 11 April 2011 through 15 April 2011
ER -