A unified framework for stochastic optimization

Warren Buckler Powell

Research output: Contribution to journal › Review article › peer-review

182 Scopus citations

Abstract

Stochastic optimization is an umbrella term that includes over a dozen fragmented communities, using a patchwork of sometimes overlapping notational systems with algorithmic strategies that are suited to specific classes of problems. This paper reviews the canonical models of these communities and proposes a universal modeling framework that encompasses all of these competing approaches. At its heart is an objective function that optimizes over policies, a formulation that is standard in some approaches but foreign to others. We then identify four meta-classes of policies that encompass all of the approaches that we have identified in the research literature or industry practice. In the process, we observe that any adaptive learning algorithm, whether it is derivative-based or derivative-free, is a form of policy that can be tuned to optimize either the cumulative reward (similar to multi-armed bandit problems) or the final reward (as is used in ranking and selection or stochastic search). We argue that the principles of bandit problems, long a niche community, should become a core dimension of mainstream stochastic optimization.
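The distinction between tuning a policy for cumulative reward and tuning it for final reward can be illustrated with a small sketch. This is not taken from the paper: the epsilon-greedy rule, the Gaussian reward model with unit variance, and the names `run_policy`, `evaluate`, and `tune` are all assumptions chosen for illustration. The same parameterized learning policy is simulated once and then scored under either objective.

```python
import random


def run_policy(epsilon, true_means, budget, rng):
    """Simulate one sample path of an epsilon-greedy learning policy.

    Returns (cumulative_reward, final_reward). The final reward is the true
    mean of the alternative the policy would pick after the budget is spent.
    Reward noise is assumed Gaussian with unit variance (illustrative only).
    """
    k = len(true_means)
    counts = [0] * k
    estimates = [0.0] * k
    cumulative = 0.0
    for _ in range(budget):
        if rng.random() < epsilon:
            x = rng.randrange(k)  # explore: sample an alternative at random
        else:
            x = max(range(k), key=lambda i: estimates[i])  # exploit
        reward = rng.gauss(true_means[x], 1.0)
        counts[x] += 1
        estimates[x] += (reward - estimates[x]) / counts[x]  # running mean
        cumulative += reward
    best_guess = max(range(k), key=lambda i: estimates[i])
    return cumulative, true_means[best_guess]


def evaluate(epsilon, true_means, budget, n_paths=200, seed=0):
    """Average both objectives over n_paths simulated sample paths."""
    rng = random.Random(seed)
    total_cumulative = 0.0
    total_final = 0.0
    for _ in range(n_paths):
        c, f = run_policy(epsilon, true_means, budget, rng)
        total_cumulative += c
        total_final += f
    return total_cumulative / n_paths, total_final / n_paths


def tune(objective, grid, true_means, budget):
    """Pick the epsilon in grid that maximizes the chosen objective."""
    if objective not in ("cumulative", "final"):
        raise ValueError("objective must be 'cumulative' or 'final'")
    idx = 0 if objective == "cumulative" else 1
    return max(grid, key=lambda eps: evaluate(eps, true_means, budget)[idx])
```

Calling `tune("cumulative", [0.0, 0.1, 0.5, 1.0], [0.0, 0.5, 1.0], 30)` and then the same call with `"final"` searches the same policy class under the two objectives. The two searches need not select the same value of epsilon, since exploration is penalized while learning under the cumulative objective but not under the final one.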

Original language: English (US)
Pages (from-to): 795-821
Number of pages: 27
Journal: European Journal of Operational Research
Volume: 275
Issue number: 3
DOIs
State: Published - Jun 16 2019

All Science Journal Classification (ASJC) codes

  • Information Systems and Management
  • General Computer Science
  • Modeling and Simulation
  • Management Science and Operations Research

Keywords

  • Bandit problems
  • Dynamic programming
  • Reinforcement learning
  • Robust optimization
  • Simulation optimization
  • Stochastic programming
