We propose a semiparametric method for conducting scale-invariant sparse principal component analysis (PCA) on high-dimensional non-Gaussian data. Compared with sparse PCA, our method relies on weaker modeling assumptions and is more robust to possible data contamination. Theoretically, the proposed method achieves a parametric rate of convergence in estimating the parameter of interest under a flexible semiparametric distribution family; computationally, it exploits a rank-based procedure and is as efficient as sparse PCA; empirically, it outperforms most competing methods on both synthetic and real-world datasets.
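To make the idea of a rank-based, scale-invariant procedure concrete, here is a minimal Python sketch. The abstract does not name the estimator or the sparse solver, so both choices are assumptions made for illustration: the correlation matrix is estimated with Kendall's tau followed by the sine transform, and the sparse leading direction is computed with a truncated power iteration. The function names and the sparsity parameter `k` are hypothetical.

```python
import numpy as np


def kendall_tau_matrix(X):
    """Pairwise Kendall's tau for the columns of X (n samples, d variables).

    Assumes continuous data without ties.
    """
    n, _ = X.shape
    i, j = np.triu_indices(n, k=1)      # all sample pairs with i < j
    S = np.sign(X[i] - X[j])            # shape (n_pairs, d), entries in {-1, 0, 1}
    return (S.T @ S) / len(i)           # average concordance of the signs


def rank_correlation(X):
    """Sine-transformed Kendall's tau (assumed estimator, not taken from the abstract)."""
    return np.sin(0.5 * np.pi * kendall_tau_matrix(X))


def truncated_power_method(R, k, n_iter=100):
    """Approximate leading direction of R with at most k nonzero entries."""
    d = R.shape[0]
    v = np.ones(d) / np.sqrt(d)         # deterministic starting vector
    for _ in range(n_iter):
        w = R @ v
        keep = np.argsort(np.abs(w))[-k:]   # indices of the k largest magnitudes
        w_trunc = np.zeros(d)
        w_trunc[keep] = w[keep]
        v = w_trunc / np.linalg.norm(w_trunc)
    return v
```

Because Kendall's tau depends only on the ordering of the observations, `rank_correlation` is unchanged when each column is rescaled or passed through any strictly increasing transform, which is one way to read "scale-invariant" here. The sine-transformed matrix is not guaranteed to be positive semidefinite, so a more careful implementation might project it before running the iteration.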
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
Keywords
- Elliptical distribution
- High-dimensional statistics
- Principal component analysis
- Robust statistics