Abstract
We consider the fundamental problem of online control of a linear dynamical system from two different viewpoints: regret minimization and competitive analysis. We prove that the optimal competitive policy is well-approximated by a convex parameterized policy class, known as the class of disturbance-action control (DAC) policies. Using this structural result, we show that several recently proposed online control algorithms achieve the best of both worlds: sublinear regret vs. the best DAC policy selected in hindsight, and optimal competitive ratio, up to an additive correction which grows sublinearly in the time horizon. We further conclude that sublinear regret vs. the optimal competitive policy is attainable when the linear dynamical system is unknown, and even when a stabilizing controller for the dynamics is not available a priori.
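For intuition, below is a minimal, hypothetical sketch of the DAC policy class referenced in the abstract: the controller acts as u_t = -K x_t + Σ_{i=1}^{H} M^{[i]} w_{t-i} on a system x_{t+1} = A x_t + B u_t + w_t, which makes the cost convex in the matrices M^{[i]}. The system matrices, gain K, memory length H, and quadratic cost used here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sketch of a disturbance-action control (DAC) policy on a
# linear dynamical system x_{t+1} = A x_t + B u_t + w_t.
n, m, H, T = 2, 1, 5, 50
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # assumed stable dynamics
B = np.array([[0.0], [1.0]])
K = np.zeros((m, n))                     # stabilizing gain (zero suffices since A is stable)
M = [0.1 * rng.standard_normal((m, n)) for _ in range(H)]  # learnable DAC parameters

x = np.zeros(n)
past_w = [np.zeros(n) for _ in range(H)]  # most recent disturbance first
total_cost = 0.0
for t in range(T):
    # DAC action: u_t = -K x_t + sum_{i=1}^{H} M^{[i]} w_{t-i}
    u = -K @ x + sum(M[i] @ past_w[i] for i in range(H))
    w = 0.1 * rng.standard_normal(n)      # adversarial in the paper; random here for illustration
    total_cost += x @ x + u @ u           # quadratic cost as a stand-in for the general convex cost
    x = A @ x + B @ u + w
    past_w = [w] + past_w[:-1]

print(f"average cost under this (untrained) DAC policy: {total_cost / T:.4f}")
```

An online control algorithm of the kind studied in the paper would update the matrices M^{[i]} over time (e.g., by online gradient methods on the convex counterfactual cost); the snippet above only rolls out a fixed, randomly initialized DAC policy to show the parameterization.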
Original language | English (US) |
---|---|
Pages (from-to) | 1345-1356 |
Number of pages | 12 |
Journal | Proceedings of Machine Learning Research |
Volume | 211 |
State | Published - 2023 |
Externally published | Yes |
Event | 5th Annual Conference on Learning for Dynamics and Control, L4DC 2023 - Philadelphia, United States |
Duration | Jun 15 2023 → Jun 16 2023 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability
Keywords
- Nonstochastic control
- competitive ratio
- regret minimization