TY - JOUR
T1 - Explaining landscape connectivity of low-cost solutions for multilayer nets
AU - Kuditipudi, Rohith
AU - Wang, Xiang
AU - Lee, Holden
AU - Zhang, Yi
AU - Li, Zhiyuan
AU - Hu, Wei
AU - Arora, Sanjeev
AU - Ge, Rong
N1 - Funding Information:
Rong Ge acknowledges funding from NSF CCF-1704656, NSF CCF-1845171 (CAREER), the Sloan Fellowship and Google Faculty Research Award. Sanjeev Arora acknowledges funding from the NSF, ONR, Simons Foundation, Schmidt Foundation, Amazon Research, DARPA and SRC.
Publisher Copyright:
© 2019 Neural information processing systems foundation. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Mode connectivity (Garipov et al., 2018; Draxler et al., 2018) is a surprising phenomenon in the loss landscape of deep nets. Optima, at least those discovered by gradient-based optimization, turn out to be connected by simple paths on which the loss function is almost constant. Often, these paths can be chosen to be piece-wise linear, with as few as two segments. We give mathematical explanations for this phenomenon, assuming generic properties (such as dropout stability and noise stability) of well-trained deep nets, which have previously been identified as part of understanding the generalization properties of deep nets. Our explanation holds for realistic multilayer nets, and experiments are presented to verify the theory.
AB - Mode connectivity (Garipov et al., 2018; Draxler et al., 2018) is a surprising phenomenon in the loss landscape of deep nets. Optima, at least those discovered by gradient-based optimization, turn out to be connected by simple paths on which the loss function is almost constant. Often, these paths can be chosen to be piece-wise linear, with as few as two segments. We give mathematical explanations for this phenomenon, assuming generic properties (such as dropout stability and noise stability) of well-trained deep nets, which have previously been identified as part of understanding the generalization properties of deep nets. Our explanation holds for realistic multilayer nets, and experiments are presented to verify the theory.
UR - http://www.scopus.com/inward/record.url?scp=85090169592&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090169592&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85090169592
SN - 1049-5258
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
Y2 - 8 December 2019 through 14 December 2019
ER -