Abstract
We establish L∞ and L2 error bounds for functions of many variables that are approximated by linear combinations of rectified linear unit (ReLU) and squared ReLU ridge functions with ℓ1 and ℓ0 controls on their inner and outer parameters. With the squared ReLU ridge function, we show that the L2 approximation error is inversely proportional to the inner layer ℓ0 sparsity and it need only be sublinear in the outer layer ℓ0 sparsity. Our constructions are obtained using a variant of the Maurey-Jones-Barron probabilistic method, which can be interpreted as either stratified sampling with proportionate allocation or two-stage cluster sampling. We also provide companion error lower bounds that reveal near optimality of our constructions. Despite the sparsity assumptions, we showcase the richness and flexibility of these ridge combinations by defining a large family of functions, in terms of certain spectral conditions, that are particularly well approximated by them.
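As a rough illustration of the sampling idea named in the abstract, the sketch below applies the plain i.i.d. form of the Maurey-Jones-Barron argument, not the stratified variant developed in the paper. A target function is built as a convex combination of many squared ReLU ridge functions whose inner weight vectors are sparse with unit ℓ1 norm, and it is then approximated by an equal-weight average of m terms drawn at random according to the convex weights. The domain [-1, 1]^d, the dimensions, and all function names are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def squared_relu(z):
    """Squared ReLU activation: max(z, 0)**2."""
    return np.maximum(z, 0.0) ** 2

def random_sparse_l1_unit(rng, d, k):
    """Hypothetical helper: vector in R^d with at most k nonzeros and l1 norm 1."""
    a = np.zeros(d)
    idx = rng.choice(d, size=k, replace=False)
    a[idx] = rng.standard_normal(k)
    return a / np.abs(a).sum()

def ridge_combination(X, A, b, c):
    """Evaluate sum_j c[j] * squared_relu(A[j] . x + b[j]) at each row x of X."""
    return squared_relu(X @ A.T + b) @ c

def maurey_sample(rng, A, b, w, m):
    """Draw m terms i.i.d. with probabilities w; return an equal-weight m-term combination."""
    idx = rng.choice(len(w), size=m, replace=True, p=w)
    return A[idx], b[idx], np.full(m, 1.0 / m)

def demo(seed=0, d=20, k=3, n_terms=500, ms=(4, 64, 1024), n_points=1000):
    """Empirical L2 error of the sampled m-term approximation for several m."""
    rng = np.random.default_rng(seed)
    # Target: convex combination of n_terms ridge functions (assumed setup).
    A = np.stack([random_sparse_l1_unit(rng, d, k) for _ in range(n_terms)])
    b = rng.uniform(-1.0, 1.0, size=n_terms)
    w = rng.dirichlet(np.ones(n_terms))  # nonnegative weights summing to 1
    # Evaluation points drawn uniformly from the assumed domain [-1, 1]^d.
    X = rng.uniform(-1.0, 1.0, size=(n_points, d))
    target = ridge_combination(X, A, b, w)
    errors = {}
    for m in ms:
        A_m, b_m, c_m = maurey_sample(rng, A, b, w, m)
        approx = ridge_combination(X, A_m, b_m, c_m)
        errors[m] = float(np.sqrt(np.mean((target - approx) ** 2)))
    return errors

if __name__ == "__main__":
    for m, err in demo().items():
        print(f"m = {m:5d}   empirical L2 error = {err:.5f}")
```

Under plain i.i.d. sampling the expected squared L2 error shrinks like 1/m, so the printed error should fall roughly like 1/sqrt(m). The abstract's claim is that a stratified or two-stage cluster sampling variant, together with the squared ReLU, gives better rates than this baseline.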
Original language | English (US) |
---|---|
Article number | 8485650 |
Pages (from-to) | 7649-7656 |
Number of pages | 8 |
Journal | IEEE Transactions on Information Theory |
Volume | 64 |
Issue number | 12 |
DOIs | |
State | Published - Dec 2018 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences
Keywords
- Ridge combinations
- approximation error
- rectified linear unit
- sparse models
- spline
- stratified sampling