TY - GEN
T1 - Reachability-based safe learning with Gaussian processes
AU - Akametalu, Anayo K.
AU - Kaynama, Shahab
AU - Fisac, Jaime F.
AU - Zeilinger, Melanie N.
AU - Gillula, Jeremy H.
AU - Tomlin, Claire J.
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014
Y1 - 2014
AB - Reinforcement learning for robotic applications faces the challenge of constraint satisfaction, which currently impedes its application to safety critical systems. Recent approaches successfully introduce safety based on reachability analysis, determining a safe region of the state space where the system can operate. However, overly constraining the freedom of the system can negatively affect performance, while attempting to learn less conservative safety constraints might fail to preserve safety if the learned constraints are inaccurate. We propose a novel method that uses a principled approach to learn the system's unknown dynamics based on a Gaussian process model and iteratively approximates the maximal safe set. A modified control strategy based on real-time model validation preserves safety under weaker conditions than current approaches. Our framework further incorporates safety into the reinforcement learning performance metric, allowing a better integration of safety and learning. We demonstrate our algorithm on simulations of a cart-pole system and on an experimental quadrotor application and show how our proposed scheme succeeds in preserving safety where current approaches fail to avoid an unsafe condition.
UR - http://www.scopus.com/inward/record.url?scp=84988231271&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84988231271&partnerID=8YFLogxK
U2 - 10.1109/CDC.2014.7039601
DO - 10.1109/CDC.2014.7039601
M3 - Conference contribution
AN - SCOPUS:84988231271
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 1424
EP - 1431
BT - 53rd IEEE Conference on Decision and Control, CDC 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014
Y2 - 15 December 2014 through 17 December 2014
ER -