On the learnability of fully-connected neural networks

Yuchen Zhang, Jason D. Lee, Martin J. Wainwright, Michael I. Jordan

Research output: Contribution to conferencePaperpeer-review

32 Scopus citations

Abstract

Despite the empirical success of deep neural networks, there is limited theoretical understanding of the learnability of these models with respect to polynomial-time algorithms. In this paper, we characterize the learnability of fully-connected neural networks via both positive and negative results. We focus on ℓ1-regularized networks, where the ℓ1-norm of the incoming weights of every neuron is assumed to be bounded by a constant B > 0. Our first result shows that such networks are properly learnable in poly(n, d, exp(1/ϵ2)) time, where n and d are the sample size and the input dimension, and ϵ > 0 is the gap to optimality. The bound is achieved by repeatedly sampling over a low-dimensional manifold so as to ensure approximate optimality, but avoids the exp(d) cost of exhaustively searching over the parameter space. We also establish a hardness result showing that the exponential dependence on 1/ϵ is unavoidable unless RP = NP. Our second result shows that the exponential dependence on 1/ϵ can be avoided by exploiting the underlying structure of the data distribution. In particular, if the positive and negative examples can be separated with margin γ > 0 by an unknown neural network, then the network can be learned in poly(n, d, 1/ϵ) time. The bound is achieved by an ensemble method which uses the first algorithm as a weak learner. We further show that the separability assumption can be weakened to tolerate noisy labels. Finally, we show that the exponential dependence on 1/γ is unimprovable under a certain cryptographic assumption.

Original languageEnglish (US)
StatePublished - 2017
Externally publishedYes
Event20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017 - Fort Lauderdale, United States
Duration: Apr 20 2017Apr 22 2017

Conference

Conference20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017
Country/TerritoryUnited States
CityFort Lauderdale
Period4/20/174/22/17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Statistics and Probability

Fingerprint

Dive into the research topics of 'On the learnability of fully-connected neural networks'. Together they form a unique fingerprint.

Cite this