Sharp bounds for the Rademacher complexity and the generalization error are derived for the residual network model. The Rademacher complexity bound has no explicit dependence on the depth of the network, while the generalization bounds are comparable to the Monte Carlo error rates, suggesting that they are nearly optimal in the high-dimensional setting. These estimates are achieved by constraining the hypothesis space with an appropriately defined path norm, so that the constrained space is simultaneously large enough for the approximation error rates to be optimal and small enough for the estimation error rates to be optimal. Comparisons are made with other norm-based bounds.
All Science Journal Classification (ASJC) codes
- Applied Mathematics
Keywords
- a priori estimate
- residual network
- weighted path norm