Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret

Gautam Goel, Naman Agarwal, Karan Singh, Elad Hazan

Research output: Contribution to journalConference articlepeer-review

Abstract

We consider the fundamental problem of online control of a linear dynamical system from two different viewpoints: regret minimization and competitive analysis. We prove that the optimal competitive policy is well-approximated by a convex parameterized policy class, known as a disturbance-action control (DAC) policies. Using this structural result, we show that several recently proposed online control algorithms achieve the best of both worlds: sublinear regret vs. the best DAC policy selected in hindsight, and optimal competitive ratio, up to an additive correction which grows sub-linearly in the time horizon. We further conclude that sublinear regret vs. the optimal competitive policy is attainable when the linear dynamical system is unknown, and even when a stabilizing controller for the dynamics is not available a priori.

Original languageEnglish (US)
Pages (from-to)1345-1356
Number of pages12
JournalProceedings of Machine Learning Research
Volume211
StatePublished - 2023
Externally publishedYes
Event5th Annual Conference on Learning for Dynamics and Control, L4DC 2023 - Philadelphia, United States
Duration: Jun 15 2023Jun 16 2023

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Keywords

  • Nonstochastic control
  • competitive ratio
  • regret minimization

Fingerprint

Dive into the research topics of 'Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret'. Together they form a unique fingerprint.

Cite this