TY - JOUR
T1 - Safely Learning Dynamical Systems from Short Trajectories
AU - Ahmadi, Amir Ali
AU - Chaudhry, Abraar
AU - Sindhwani, Vikas
AU - Tu, Stephen
N1 - Funding Information:
AAA and AC were partially supported by the MURI award of the AFOSR, the DARPA Young Faculty Award, the CAREER Award of the NSF, the Google Faculty Award, the Innovation Award of the School of Engineering and Applied Sciences at Princeton University, and the Sloan Fellowship.
Publisher Copyright:
© 2021 A.A. Ahmadi, A. Chaudhry, V. Sindhwani & S. Tu.
PY - 2021
Y1 - 2021
AB - A fundamental challenge in learning to control an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety. In this work, we formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize the next trajectory. In our framework, the state of the system is required to stay within a given safety region under the (possibly repeated) action of all dynamical systems that are consistent with the information gathered so far. For our first two results, we consider the setting of safely learning linear dynamics. We present a linear programming-based algorithm that either safely recovers the true dynamics from trajectories of length one or certifies that safe learning is impossible. We also give an efficient semidefinite representation of the set of initial conditions whose resulting trajectories of length two are guaranteed to stay in the safety region. For our final result, we study the problem of safely learning a nonlinear dynamical system. We give a second-order cone programming-based representation of the set of initial conditions that are guaranteed to remain in the safety region after one application of the system dynamics.
KW - conic programming
KW - learning dynamical systems
KW - robust optimization
KW - safe learning
KW - uncertainty quantification
UR - http://www.scopus.com/inward/record.url?scp=85161802994&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161802994&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85161802994
SN - 2640-3498
VL - 144
SP - 498
EP - 509
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 3rd Annual Conference on Learning for Dynamics and Control, L4DC 2021
Y2 - 7 June 2021 through 8 June 2021
ER -