TY - GEN

T1 - On the Power of Over-parametrization in Neural Networks with Quadratic Activation

AU - Du, Simon S.

AU - Lee, Jason D.

PY - 2018

Y1 - 2018

N2 - We provide new theoretical insights on why over-parametrization is effective in learning neural networks. For a k hidden node shallow network with quadratic activation and n training data points, we show as long as k ≥ √(2n), over-parametrization enables local search algorithms to find a globally optimal solution for general smooth and convex loss functions. Further, despite that the number of parameters may exceed the sample size, using theory of Rademacher complexity, we show with weight decay, the solution also generalizes well if the data is sampled from a regular distribution such as Gaussian. To prove when k ≥ √(2n), the loss function has benign landscape properties, we adopt an idea from smoothed analysis, which may have other applications in studying loss surfaces of neural networks.

AB - We provide new theoretical insights on why over-parametrization is effective in learning neural networks. For a k hidden node shallow network with quadratic activation and n training data points, we show as long as k ≥ √(2n), over-parametrization enables local search algorithms to find a globally optimal solution for general smooth and convex loss functions. Further, despite that the number of parameters may exceed the sample size, using theory of Rademacher complexity, we show with weight decay, the solution also generalizes well if the data is sampled from a regular distribution such as Gaussian. To prove when k ≥ √(2n), the loss function has benign landscape properties, we adopt an idea from smoothed analysis, which may have other applications in studying loss surfaces of neural networks.

UR - http://www.scopus.com/inward/record.url?scp=85057275481&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057275481&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85057275481

T3 - 35th International Conference on Machine Learning, ICML 2018

SP - 2132

EP - 2141

BT - 35th International Conference on Machine Learning, ICML 2018

A2 - Krause, Andreas

A2 - Dy, Jennifer

PB - International Machine Learning Society (IMLS)

T2 - 35th International Conference on Machine Learning, ICML 2018

Y2 - 10 July 2018 through 15 July 2018

ER -