Abstract
We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces. The general technique is then applied to show that reproducing kernel Hilbert spaces are poor L2-approximators for the class of two-layer neural networks in high dimension, and that multi-layer networks with small path norm are poor approximators for certain Lipschitz functions, also in the L2-topology.
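For orientation, the "width type" separation mentioned in the abstract refers to the classical Kolmogorov n-width. The following is a minimal sketch of the textbook definition only; the symbols K, X, and V_n are chosen here for illustration and are not taken from the paper, whose precise statement replaces linear subspaces by the relevant approximating classes.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Kolmogorov n-width of a class K inside a Banach space X:
% the approximation error achieved by the best n-dimensional
% linear subspace V_n against the worst-case element of K.
\[
  d_n(K, X) \;=\;
  \inf_{\substack{V_n \subseteq X \\ \dim V_n = n}}
  \;\sup_{f \in K}\;
  \inf_{g \in V_n} \| f - g \|_{X}.
\]
\end{document}
```

Roughly speaking, the paper's results bound quantities of this type from below for the function classes named in the abstract; the exact separation theorem is given in the article itself.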
Original language | English (US)
---|---
Article number | 5
Journal | Research in Mathematical Sciences
Volume | 8
Issue number | 1
DOIs |
State | Published - Mar 2021
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Mathematics (miscellaneous)
- Computational Mathematics
- Applied Mathematics
Keywords
- Approximation theory
- Barron space
- Curse of dimensionality
- Kolmogorov width
- Multi-layer network
- Neural tangent kernel
- Population risk
- Random feature model
- Reproducing kernel Hilbert space
- Two-layer network