Abstract
We consider the problem of generating maximally adversarial disturbances for a given controller, assuming only black-box access to it. We propose an online learning approach to this problem that adaptively generates disturbances based on the control inputs chosen by the controller. The goal of the disturbance generator is to minimize regret against a benchmark class of disturbance-generating policies, i.e., to make the controller incur nearly as much cost as the best disturbance generator in hindsight (chosen from that benchmark class) would. In the setting where the dynamics are linear and the costs are quadratic, we formulate our problem as an online trust region (OTR) problem with memory and present a new online learning algorithm (MOTR) for this problem. We prove that this method competes with the best disturbance generator in hindsight (chosen from a rich class of benchmark policies that includes linear dynamical disturbance-generating policies). We evaluate our approach on two simulated examples: (i) synthetically generated linear systems, and (ii) generating wind disturbances for the popular PX4 controller in the AirSim simulator. On these examples, our approach outperforms several baseline approaches, including H∞ disturbance generation and gradient-based methods.
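To make the setting concrete, the following Python sketch simulates a linear system with quadratic cost, queries a controller only through a function call, and compares two norm-bounded disturbance generators. Everything specific here is an assumption for illustration and is not taken from the paper: the matrices `A` and `B`, the gain `K`, the unit bound on the disturbance norm, the identity cost weights, and the helper names `rollout`, `make_greedy`, and `make_random`. The greedy generator is a simple one-step baseline that also assumes `A` and `B` are known; it is not the MOTR algorithm.

```python
import numpy as np


def rollout(A, B, controller, disturbance, T=50):
    """Simulate x_{t+1} = A x_t + B u_t + w_t from x_0 = 0.

    Returns the total cost sum_t (x_t^T x_t + u_t^T u_t), i.e. the
    quadratic cost with identity weights (an assumption of this sketch).
    """
    x = np.zeros(A.shape[0])
    total = 0.0
    for _ in range(T):
        u = controller(x)        # black-box query: only the input u is observed
        w = disturbance(x, u)    # disturbance is chosen after seeing u
        total += float(x @ x + u @ u)
        x = A @ x + B @ u + w
    return total


def make_greedy(A, B):
    """One-step greedy generator (hypothetical baseline, not MOTR).

    With identity state cost, ||A x + B u + w||^2 over ||w|| <= 1 is
    maximized by pushing along v = A x + B u.
    """
    def greedy(x, u):
        v = A @ x + B @ u
        norm = np.linalg.norm(v)
        if norm < 1e-12:         # no preferred direction: pick the first axis
            e = np.zeros_like(v)
            e[0] = 1.0
            return e
        return v / norm
    return greedy


def make_random(n, seed=0):
    """Generator that returns a uniformly random unit-norm disturbance."""
    rng = np.random.default_rng(seed)

    def rand(x, u):
        w = rng.standard_normal(n)
        return w / np.linalg.norm(w)
    return rand


# Assumed example system: a discretized double integrator with a
# stabilizing linear state-feedback controller u = -K x.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[1.0, 1.5]])


def controller(x):
    return -K @ x


if __name__ == "__main__":
    cost_greedy = rollout(A, B, controller, make_greedy(A, B))
    cost_random = rollout(A, B, controller, make_random(2))
    print("greedy cost:", cost_greedy)
    print("random cost:", cost_random)
```

In this sketch, the gap between the cost achieved by one generator and the cost achieved by the best generator in a chosen comparison class plays the role of the regret described above; the two generators shown are only stand-ins for such a class.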
Original language | English (US) |
---|---|
Pages (from-to) | 1192-1204 |
Number of pages | 13 |
Journal | Proceedings of Machine Learning Research |
Volume | 144 |
State | Published - 2021 |
Event | 3rd Annual Conference on Learning for Dynamics and Control, L4DC 2021 - Virtual, Online, Switzerland. Duration: Jun 7 2021 → Jun 8 2021 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability
Keywords
- Adversarial Disturbances
- Controller Verification
- Online Learning