A knowledge-gradient policy for sequential information collection

Peter I. Frazier, Warren B. Powell, Savas Dayanik

Research output: Contribution to journal, Article, peer-reviewed

327 Scopus citations

Abstract

In a sequential Bayesian ranking and selection problem with independent normal populations and common known variance, we study a previously introduced measurement policy which we refer to as the knowledge-gradient policy. This policy myopically maximizes the expected increment in the value of information in each time period, where the value is measured according to the terminal utility function. We show that the knowledge-gradient policy is optimal both when the horizon is a single time period and in the limit as the horizon extends to infinity. We show furthermore that, in some special cases, the knowledge-gradient policy is optimal regardless of the length of any given fixed total sampling horizon. We bound the knowledge-gradient policy's suboptimality in the remaining cases, and show through simulations that it performs competitively with or significantly better than other policies.

Original language: English (US)
Pages (from-to): 2410-2439
Number of pages: 30
Journal: SIAM Journal on Control and Optimization
Volume: 47
Issue number: 5
State: Published, 2008

All Science Journal Classification (ASJC) codes

  • Control and Optimization
  • Applied Mathematics

Keywords

  • Bayesian statistics
  • Ranking and selection
  • Sequential decision analysis
