TY - GEN
T1 - Stronger Generalization Guarantees for Robot Learning by Combining Generative Models and Real-World Data
AU - Agarwal, Abhinav
AU - Veer, Sushant
AU - Ren, Allen Z.
AU - Majumdar, Anirudha
N1 - Funding Information:
A. Agarwal, A. Z. Ren, and A. Majumdar are with the Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ 08544, U.S.A. S. Veer is with NVIDIA Research, Santa Clara, CA 95051, U.S.A. This work was conducted while S. Veer was with Princeton University. Emails: {abhinav.agarwal, allen.ren, ani.majumdar}@princeton.edu, sveer@nvidia.com. The authors were supported by the NSF CAREER award [2044149], the Office of Naval Research [N00014-21-1-2803], and the Toyota Research Institute (TRI). This article solely reflects the opinions and conclusions of its authors and not ONR, NSF, TRI, or any other Toyota entity.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - We are motivated by the problem of learning policies for robotic systems with rich sensory inputs (e.g., vision) in a manner that allows us to guarantee generalization to environments unseen during training. We present a framework for obtaining such generalization guarantees by leveraging a finite dataset of real-world environments in combination with a (potentially inaccurate) generative model of environments. The key idea behind our approach is to use the generative model to implicitly specify a prior over policies. This prior is updated using the real-world dataset of environments by minimizing an upper bound on the expected cost across novel environments derived via Probably Approximately Correct (PAC)-Bayes generalization theory. We demonstrate our approach on two simulated systems with nonlinear/hybrid dynamics and rich sensing modalities: (i) quadrotor navigation with an onboard vision sensor, and (ii) grasping objects using a depth sensor. Comparisons with prior work demonstrate the ability of our approach to obtain stronger generalization guarantees by utilizing generative models. We also present hardware experiments for validating our bounds for the grasping task.
AB - We are motivated by the problem of learning policies for robotic systems with rich sensory inputs (e.g., vision) in a manner that allows us to guarantee generalization to environments unseen during training. We present a framework for obtaining such generalization guarantees by leveraging a finite dataset of real-world environments in combination with a (potentially inaccurate) generative model of environments. The key idea behind our approach is to use the generative model to implicitly specify a prior over policies. This prior is updated using the real-world dataset of environments by minimizing an upper bound on the expected cost across novel environments derived via Probably Approximately Correct (PAC)-Bayes generalization theory. We demonstrate our approach on two simulated systems with nonlinear/hybrid dynamics and rich sensing modalities: (i) quadrotor navigation with an onboard vision sensor, and (ii) grasping objects using a depth sensor. Comparisons with prior work demonstrate the ability of our approach to obtain stronger generalization guarantees by utilizing generative models. We also present hardware experiments for validating our bounds for the grasping task.
UR - http://www.scopus.com/inward/record.url?scp=85136319915&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85136319915&partnerID=8YFLogxK
U2 - 10.1109/ICRA46639.2022.9811565
DO - 10.1109/ICRA46639.2022.9811565
M3 - Conference contribution
AN - SCOPUS:85136319915
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 4414
EP - 4421
BT - 2022 IEEE International Conference on Robotics and Automation, ICRA 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 39th IEEE International Conference on Robotics and Automation, ICRA 2022
Y2 - 23 May 2022 through 27 May 2022
ER -