Minimax lower bounds for ridge combinations including neural nets

Jason M. Klusowski, Andrew R. Barron

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

Estimation of functions of $d$ variables is considered using ridge combinations of the form $\sum_{k=1}^{m} c_{1,k}\,\phi\bigl(\sum_{j=1}^{d} c_{0,j,k}\,x_j - b_k\bigr)$, where the activation function $\phi$ is a function with bounded value and derivative. These include single-hidden-layer neural networks, polynomials, and sinusoidal models. From a sample of size $n$ of possibly noisy values at random sites $X \in B = [-1, 1]^d$, the minimax mean square error is examined for functions in the closure of the $\ell_1$ hull of ridge functions with activation $\phi$. It is shown to be of order $d/n$ to a fractional power (when $d$ is of smaller order than $n$), and to be of order $(\log d)/n$ to a fractional power (when $d$ is of larger order than $n$). Dependence on constraints $v_0$ and $v_1$ on the $\ell_1$ norms of the inner parameter $c_0$ and the outer parameter $c_1$, respectively, is also examined. Also, lower and upper bounds on the fractional power are given. The heart of the analysis is the development of information-theoretic packing numbers for these classes of functions.
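As an illustration of the model form in the abstract, the following is a minimal sketch of evaluating one ridge combination at a point of $[-1, 1]^d$. The function names, the array layout, and the choice of tanh as the activation are assumptions made for this example, not taken from the paper. The way the inner $\ell_1$ norm is measured (the largest per-unit norm) is also an assumption, since the abstract does not spell it out.

```python
import numpy as np

def ridge_combination(x, c0, c1, b, phi=np.tanh):
    """Evaluate sum_k c1[k] * phi(sum_j c0[j, k] * x[j] - b[k]).

    x  : (d,)   input point, assumed to lie in [-1, 1]^d
    c0 : (d, m) inner weights, one column per ridge unit
    c1 : (m,)   outer weights
    b  : (m,)   shifts
    phi: activation with bounded value and derivative (tanh is an assumed choice)
    """
    x = np.asarray(x, dtype=float)
    c0 = np.asarray(c0, dtype=float)
    c1 = np.asarray(c1, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(c1, phi(x @ c0 - b)))

def l1_norms(c0, c1):
    """Return (inner, outer) l1 norms.

    Assumption: the inner norm is the largest l1 norm over the m inner
    weight vectors; the outer norm is the l1 norm of c1.
    """
    c0 = np.asarray(c0, dtype=float)
    c1 = np.asarray(c1, dtype=float)
    return float(np.abs(c0).sum(axis=0).max()), float(np.abs(c1).sum())

# Small example with d = 2 inputs and m = 2 ridge units.
x = [0.5, -0.5]
c0 = [[1.0, 0.0], [0.0, 1.0]]
c1 = [2.0, -1.0]
b = [0.0, 0.0]
value = ridge_combination(x, c0, c1, b)
inner_norm, outer_norm = l1_norms(c0, c1)
```

In this sketch, `inner_norm` and `outer_norm` play the role of the quantities that the constraints $v_0$ and $v_1$ would bound.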

Original language: English (US)
Title of host publication: 2017 IEEE International Symposium on Information Theory, ISIT 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1376-1380
Number of pages: 5
ISBN (Electronic): 9781509040964
State: Published - Aug 9 2017
Externally published: Yes
Event: 2017 IEEE International Symposium on Information Theory, ISIT 2017 - Aachen, Germany
Duration: Jun 25 2017 → Jun 30 2017

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
ISSN (Print): 2157-8095

Other

Other: 2017 IEEE International Symposium on Information Theory, ISIT 2017
Country/Territory: Germany
City: Aachen
Period: 6/25/17 → 6/30/17

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Keywords

  • Constant weight codes
  • Generalization error
  • Greedy algorithms
  • High-dimensional data analysis
  • Learning theory
  • Machine learning
  • Metric entropy
  • Neural nets
  • Nonlinear regression
  • Nonparametric regression
  • Packing sets
  • Penalization
  • Polynomial nets
  • Sinusoidal nets
