TY - JOUR
T1 - Asymmetry helps
T2 - Eigenvalue and eigenvector analyses of asymmetrically perturbed low-rank matrices
AU - Chen, Yuxin
AU - Cheng, Chen
AU - Fan, Jianqing
N1 - Funding Information:
Y. Chen is supported in part by the AFOSR YIP award FA9550-19-1-0030, by the ARO grants W911NF-20-1-0097 and W911NF-18-1-0303, by the ONR grant N00014-19-1-2120, by the NSF grants CCF-190766, IIS-1900140 and DMS-2014279.
Funding Information:
C. Cheng is supported in part by the Elite Undergraduate Training Program of School of Mathematical Sciences in Peking University, and by the William R. Hewlett Stanford graduate fellowship. We thank Cong Ma for helpful discussions, and Zhou Fan for showing us an example of asymmetrizing the Gaussian matrix. Y. Chen is the corresponding author.
Funding Information:
J. Fan is supported in part by the NSF grants DMS-1662139 and DMS-1712591, by the ONR grant N00014-19-1-2120, and by the NIH grant 2R01-GM072611-13.
Publisher Copyright:
© Institute of Mathematical Statistics, 2021
PY - 2021/2
Y1 - 2021/2
N2 - This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix M* ∈ Rn×n, yet only a randomly perturbed version M is observed. The noise matrix M − M* is composed of independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general. This might arise, for example, when we have two independent samples for each entry of M* and arrange them in an asymmetric fashion. The aim is to estimate the leading eigenvalue and the leading eigenvector of M*. We demonstrate that the leading eigenvalue of the data matrix M can be O(√n) times more accurate (up to some log factor) than the (unadjusted) leading singular value of M in eigenvalue estimation. Moreover, the eigendecomposition approach is fully adaptive to heteroscedasticity of noise, without the need for any prior knowledge about the noise distributions. In a nutshell, this curious phenomenon arises since the statistical asymmetry automatically mitigates the bias of the eigenvalue approach, thus eliminating the need for careful bias correction. Additionally, we develop appealing nonasymptotic eigenvector perturbation bounds; in particular, we are able to bound the perturbation of any linear function of the leading eigenvector of M (e.g., entrywise eigenvector perturbation). We also provide partial theory for the more general rank-r case. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigendecomposition could sometimes be quite beneficial.
AB - This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix M* ∈ Rn×n, yet only a randomly perturbed version M is observed. The noise matrix M − M* is composed of independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general. This might arise, for example, when we have two independent samples for each entry of M* and arrange them in an asymmetric fashion. The aim is to estimate the leading eigenvalue and the leading eigenvector of M*. We demonstrate that the leading eigenvalue of the data matrix M can be O(√n) times more accurate (up to some log factor) than the (unadjusted) leading singular value of M in eigenvalue estimation. Moreover, the eigendecomposition approach is fully adaptive to heteroscedasticity of noise, without the need for any prior knowledge about the noise distributions. In a nutshell, this curious phenomenon arises since the statistical asymmetry automatically mitigates the bias of the eigenvalue approach, thus eliminating the need for careful bias correction. Additionally, we develop appealing nonasymptotic eigenvector perturbation bounds; in particular, we are able to bound the perturbation of any linear function of the leading eigenvector of M (e.g., entrywise eigenvector perturbation). We also provide partial theory for the more general rank-r case. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigendecomposition could sometimes be quite beneficial.
KW - Eigenvalue perturbation
KW - Entrywise eigenvector perturbation
KW - Heteroscedasticity
KW - Linear form of eigenvectors
KW - Spectral methods
UR - http://www.scopus.com/inward/record.url?scp=85101293590&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85101293590&partnerID=8YFLogxK
U2 - 10.1214/20-AOS1963
DO - 10.1214/20-AOS1963
M3 - Article
AN - SCOPUS:85101293590
SN - 0090-5364
VL - 49
SP - 435
EP - 458
JO - Annals of Statistics
JF - Annals of Statistics
IS - 1
ER -