TY - JOUR
T1 - Scale-invariant sparse PCA on high-dimensional meta-elliptical data
AU - Han, Fang
AU - Liu, Han
N1 - Funding Information:
Fang Han, Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205. Han Liu, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544. The authors thank the associate editor and two anonymous reviewers for their very helpful and constructive comments and suggestions. Fang Han’s research is supported by a Google fellowship in statistics. Han Liu’s research is supported by NSF Grants III-1116730 and III-1332109, an NIH subaward R01HG06841 and an FDA subaward HHSF223201000072C from Johns Hopkins University.
PY - 2014
Y1 - 2014
AB - We propose a semiparametric method for conducting scale-invariant sparse principal component analysis (PCA) on high-dimensional non-Gaussian data. Compared with sparse PCA, our method has a weaker modeling assumption and is more robust to possible data contamination. Theoretically, the proposed method achieves a parametric rate of convergence in estimating the parameter of interest under a flexible semiparametric distribution family; computationally, the proposed method exploits a rank-based procedure and is as efficient as sparse PCA; empirically, our method outperforms most competing methods on both synthetic and real-world datasets.
KW - Elliptical distribution
KW - High-dimensional statistics
KW - Principal component analysis
KW - Robust statistics
UR - http://www.scopus.com/inward/record.url?scp=84901810122&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84901810122&partnerID=8YFLogxK
U2 - 10.1080/01621459.2013.844699
DO - 10.1080/01621459.2013.844699
M3 - Article
C2 - 24932056
AN - SCOPUS:84901810122
SN - 0162-1459
VL - 109
SP - 275
EP - 287
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 505
ER -