TY - JOUR

T1 - Asymmetry helps

T2 - Eigenvalue and eigenvector analyses of asymmetrically perturbed low-rank matrices

AU - Chen, Yuxin

AU - Cheng, Chen

AU - Fan, Jianqing

N1 - Funding Information:
Y. Chen is supported in part by the AFOSR YIP award FA9550-19-1-0030, by the ARO grants W911NF-20-1-0097 and W911NF-18-1-0303, by the ONR grant N00014-19-1-2120, by the NSF grants CCF-190766, IIS-1900140 and DMS-2014279.
Funding Information:
C. Cheng is supported in part by the Elite Undergraduate Training Program of School of Mathematical Sciences in Peking University, and by the William R. Hewlett Stanford graduate fellowship. We thank Cong Ma for helpful discussions, and Zhou Fan for showing us an example of asymmetrizing the Gaussian matrix. Y. Chen is the corresponding author.
Funding Information:
J. Fan is supported in part by the NSF grants DMS-1662139 and DMS-1712591, by the ONR grant N00014-19-1-2120, and by the NIH grant 2R01-GM072611-13.
Publisher Copyright:
© Institute of Mathematical Statistics, 2021

PY - 2021/2

Y1 - 2021/2

N2 - This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix M* ∈ Rn×n, yet only a randomly perturbed version M is observed. The noise matrix M − M* is composed of independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general. This might arise, for example, when we have two independent samples for each entry of M* and arrange them in an asymmetric fashion. The aim is to estimate the leading eigenvalue and the leading eigenvector of M*. We demonstrate that the leading eigenvalue of the data matrix M can be O(√n) times (up to some log factor) more accurate than the (unadjusted) leading singular value of M for eigenvalue estimation. Moreover, the eigendecomposition approach is fully adaptive to the heteroscedasticity of the noise, without the need for any prior knowledge about the noise distributions. In a nutshell, this curious phenomenon arises because the statistical asymmetry automatically mitigates the bias of the eigenvalue approach, thus eliminating the need for careful bias correction. Additionally, we develop appealing nonasymptotic eigenvector perturbation bounds; in particular, we are able to bound the perturbation of any linear function of the leading eigenvector of M (e.g., entrywise eigenvector perturbation). We also provide partial theory for the more general rank-r case. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigendecomposition could sometimes be quite beneficial.

AB - This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix M* ∈ Rn×n, yet only a randomly perturbed version M is observed. The noise matrix M − M* is composed of independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general. This might arise, for example, when we have two independent samples for each entry of M* and arrange them in an asymmetric fashion. The aim is to estimate the leading eigenvalue and the leading eigenvector of M*. We demonstrate that the leading eigenvalue of the data matrix M can be O(√n) times (up to some log factor) more accurate than the (unadjusted) leading singular value of M for eigenvalue estimation. Moreover, the eigendecomposition approach is fully adaptive to the heteroscedasticity of the noise, without the need for any prior knowledge about the noise distributions. In a nutshell, this curious phenomenon arises because the statistical asymmetry automatically mitigates the bias of the eigenvalue approach, thus eliminating the need for careful bias correction. Additionally, we develop appealing nonasymptotic eigenvector perturbation bounds; in particular, we are able to bound the perturbation of any linear function of the leading eigenvector of M (e.g., entrywise eigenvector perturbation). We also provide partial theory for the more general rank-r case. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigendecomposition could sometimes be quite beneficial.

KW - Eigenvalue perturbation

KW - Entrywise eigenvector perturbation

KW - Heteroscedasticity

KW - Linear form of eigenvectors

KW - Spectral methods

UR - http://www.scopus.com/inward/record.url?scp=85101293590&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85101293590&partnerID=8YFLogxK

U2 - 10.1214/20-AOS1963

DO - 10.1214/20-AOS1963

M3 - Article

AN - SCOPUS:85101293590

VL - 49

SP - 435

EP - 458

JO - Annals of Statistics

JF - Annals of Statistics

SN - 0090-5364

IS - 1

ER -