Lexicographic and depth-sensitive margins in homogeneous and non-homogeneous deep models

Mor Shpigel Nacson, Suriya Gunasekar, Jason D. Lee, Nathan Srebro, Daniel Soudry

Research output: Chapter in Book/Report/Conference proceedingConference contribution

7 Scopus citations

Abstract

With an eye toward understanding complexity control in deep learning, we study how infinitesimal regularization or gradient descent optimization lead to margin maximizing solutions in both homogeneous and non homogeneous models, extending previous work that focused on infinitesimal regularization only in homogeneous models. To this end we study the limit of loss minimization with a diverging norm constraint (the "constrained path"), relate it to the limit of a "margin path" and characterize the resulting solution. For non-homogeneous ensemble models, which output is a sum of homogeneous sub-models, we show that this solution discards the shallowest sub-models if they are unnecessary. For homogeneous models, we show convergence to a "lexicographic max-margin solution", and provide conditions under which max-margin solutions are also attained as the limit of unconstrained gradient descent.

Original languageEnglish (US)
Title of host publication36th International Conference on Machine Learning, ICML 2019
PublisherInternational Machine Learning Society (IMLS)
Pages8224-8233
Number of pages10
ISBN (Electronic)9781510886988
StatePublished - 2019
Externally publishedYes
Event36th International Conference on Machine Learning, ICML 2019 - Long Beach, United States
Duration: Jun 9 2019Jun 15 2019

Publication series

Name36th International Conference on Machine Learning, ICML 2019
Volume2019-June

Conference

Conference36th International Conference on Machine Learning, ICML 2019
Country/TerritoryUnited States
CityLong Beach
Period6/9/196/15/19

All Science Journal Classification (ASJC) codes

  • Education
  • Computer Science Applications
  • Human-Computer Interaction

Fingerprint

Dive into the research topics of 'Lexicographic and depth-sensitive margins in homogeneous and non-homogeneous deep models'. Together they form a unique fingerprint.

Cite this