The knowledge gradient algorithm for online subset selection

Ilya O. Ryzhov, Warren Powell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

22 Scopus citations

Abstract

We derive a one-period look-ahead policy for online subset selection problems, where learning about one subset also gives us information about other subsets. The subset selection problem is treated as a multi-armed bandit problem with correlated prior beliefs. We show that our decision rule is easily computable, and present experimental evidence that the policy is competitive against other online learning policies.
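The abstract gives no formulas, so the following is only a minimal sketch of what a one-period look-ahead rule with correlated beliefs might look like, not the authors' implementation. It assumes multivariate normal beliefs over subset values, a known observation noise variance, and an online score of the form "current estimate plus remaining horizon times the expected one-step improvement". The expectation is approximated here with Gauss-Hermite quadrature, which is a substitute chosen for brevity. All names (kg_factors, online_kg_choice, update, item_mu, item_var) and all numbers are hypothetical.

```python
import itertools
import numpy as np


def kg_factors(mu, Sigma, noise_var, n_nodes=64):
    """Approximate E[max_y mu_y(next)] - max_y mu_y(now) for each alternative x.

    Assumes normal beliefs N(mu, Sigma) and known noise variance (assumptions).
    The expectation over the standard normal Z uses Gauss-Hermite quadrature.
    """
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / np.sqrt(2.0 * np.pi)  # normalise weights to a N(0, 1) expectation
    best_now = mu.max()
    nu = np.empty(len(mu))
    for x in range(len(mu)):
        # Change in every posterior mean per unit of Z if x is measured.
        sigma_tilde = Sigma[:, x] / np.sqrt(noise_var + Sigma[x, x])
        post = mu[:, None] + sigma_tilde[:, None] * z[None, :]
        nu[x] = w @ post.max(axis=0) - best_now
    return nu


def online_kg_choice(mu, Sigma, noise_var, remaining):
    """Pick the alternative maximising mu + remaining * nu (assumed online rule)."""
    nu = kg_factors(mu, Sigma, noise_var)
    return int(np.argmax(mu + remaining * nu))


def update(mu, Sigma, x, y, noise_var):
    """Standard Bayesian update of correlated normal beliefs after observing y at x."""
    col = Sigma[:, x].copy()
    denom = noise_var + Sigma[x, x]
    mu_new = mu + (y - mu[x]) / denom * col
    Sigma_new = Sigma - np.outer(col, col) / denom
    return mu_new, Sigma_new


# Hypothetical example: 3 items, subsets of size 2, subset value = sum of items.
item_mu = np.array([1.0, 0.5, 0.8])
item_var = np.array([1.0, 2.0, 0.5])
subsets = list(itertools.combinations(range(3), 2))
A = np.zeros((len(subsets), 3))
for row, members in enumerate(subsets):
    A[row, list(members)] = 1.0

mu = A @ item_mu                     # prior mean of each subset
Sigma = A @ np.diag(item_var) @ A.T  # subsets sharing an item are correlated

choice = online_kg_choice(mu, Sigma, noise_var=1.0, remaining=10)
mu_next, Sigma_next = update(mu, Sigma, choice, y=2.0, noise_var=1.0)
```

In this sketch, two subsets that share an item have positive prior covariance, so a single observation of one subset shifts the estimate of the other. This is the sense in which learning about one subset informs beliefs about others.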

Original language: English (US)
Title of host publication: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings
Pages: 137-144
Number of pages: 8
State: Published - 2009
Event: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Nashville, TN, United States
Duration: Mar 30 2009 to Apr 2 2009

Publication series

Name: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings

Other

Other: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009
Country/Territory: United States
City: Nashville, TN
Period: 3/30/09 to 4/2/09

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Software
