Abstract
Here and in a follow-on paper, we consider a simple control problem in which the underlying dynamics depend on a parameter a that is unknown and must be learned. In this paper, we assume that a is bounded, i.e., that |a| ≤ aMAX, and we study two variants of the control problem. In the first variant, Bayesian control, we are given a prior probability distribution for a and we seek a strategy that minimizes the expected value of a given cost function. Assuming that we can solve a certain PDE (the Hamilton–Jacobi–Bellman equation), we produce optimal strategies for Bayesian control. In the second variant, agnostic control, we assume nothing about a and we seek a strategy that minimizes a quantity called the regret. We produce a prior probability distribution dPrior(a) supported on a finite subset of [-aMAX, aMAX] so that the agnostic control problem reduces to the Bayesian control problem for the prior dPrior(a).
Original language | English (US) |
---|---|
Pages (from-to) | 651-744 |
Number of pages | 94 |
Journal | Revista Matematica Iberoamericana |
Volume | 41 |
Issue number | 2 |
State | Published - 2025 |
All Science Journal Classification (ASJC) codes
- General Mathematics
Keywords
- adaptive control
- agnostic control
- optimal control