Abstract
Heavy-tailed errors impair the accuracy of the least squares estimate, which can be spoiled by a single grossly outlying observation. As argued in the seminal work of Peter Huber in 1973 [Ann. Statist. 1 (1973) 799-821], robust alternatives to the method of least squares are sorely needed. To achieve robustness against heavy-tailed sampling distributions, we revisit the Huber estimator from a new perspective by letting the tuning parameter involved diverge with the sample size. In this paper, we develop nonasymptotic concentration results for such an adaptive Huber estimator, namely, the Huber estimator with the tuning parameter adapted to sample size, dimension and the variance of the noise. Specifically, we obtain a sub-Gaussian-type deviation inequality and a nonasymptotic Bahadur representation when noise variables only have finite second moments. The nonasymptotic results further yield two conventional normal approximation results that are of independent interest, the Berry-Esseen inequality and Cramér-type moderate deviation. As an important application to large-scale simultaneous inference, we apply these robust normal approximation results to analyze a dependence-adjusted multiple testing procedure for moderately heavy-tailed data. It is shown that the robust dependence-adjusted procedure asymptotically controls the overall false discovery proportion at the nominal level under mild moment conditions. Thorough numerical results on both simulated and real datasets are also provided to back up our theory.
Original language | English (US) |
---|---|
Pages (from-to) | 1904-1931 |
Number of pages | 28 |
Journal | Annals of Statistics |
Volume | 46 |
Issue number | 5 |
DOIs | |
State | Published - Oct 2018 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
Keywords
- Approximate factor model
- Bahadur representation
- False discovery proportion
- Heavy-tailed data
- Huber loss
- Large-scale multiple testing
- M-estimator