TY - JOUR
T1 - ECA: High-Dimensional Elliptical Component Analysis in Non-Gaussian Distributions
AU - Han, Fang
AU - Liu, Han
N1 - Funding Information:
Fang Han’s research was supported by NIBIB-EB012547, NSF DMS-1712536, and a UW faculty start-up grant. Han Liu’s research was supported by the NSF CAREER Award DMS-1454377, NSF IIS-1546482, NSF IIS-1408910, NSF IIS-1332109, NIH R01-MH102339, NIH R01-GM083084, and NIH R01-HG06841.
Publisher Copyright:
© 2018 American Statistical Association.
PY - 2018/1/2
Y1 - 2018/1/2
AB - We present a robust alternative to principal component analysis (PCA), called elliptical component analysis (ECA), for analyzing high-dimensional, elliptically distributed data. ECA estimates the eigenspace of the covariance matrix of the elliptical data. To cope with heavy-tailed elliptical distributions, a multivariate rank statistic is exploited. At the model level, we consider two settings: either that the leading eigenvectors of the covariance matrix are nonsparse or that they are sparse. Methodologically, we propose ECA procedures for both nonsparse and sparse settings. Theoretically, we provide both nonasymptotic and asymptotic analyses quantifying the theoretical performance of ECA. In the nonsparse setting, we show that ECA’s performance is highly related to the effective rank of the covariance matrix. In the sparse setting, the results are twofold: (i) we show that the sparse ECA estimator based on a combinatoric program attains the optimal rate of convergence; (ii) based on some recent developments in estimating sparse leading eigenvectors, we show that a computationally efficient sparse ECA estimator attains the optimal rate of convergence under a suboptimal scaling. Supplementary materials for this article are available online.
KW - Elliptical component analysis
KW - Elliptical distribution
KW - Multivariate Kendall’s tau
KW - Optimality property
KW - Robust estimators
KW - Sparse principal component analysis
UR - http://www.scopus.com/inward/record.url?scp=85029907068&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029907068&partnerID=8YFLogxK
U2 - 10.1080/01621459.2016.1246366
DO - 10.1080/01621459.2016.1246366
M3 - Article
AN - SCOPUS:85029907068
SN - 0162-1459
VL - 113
SP - 252
EP - 268
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 521
ER -