Abstract
Eigenvector perturbation analysis plays a vital role in various data science applications. A large body of prior work, however, has focused on establishing ℓ2 eigenvector perturbation bounds, which are often inadequate for tasks that rely on the fine-grained behavior of an eigenvector. This paper makes progress on this front by studying the perturbation of linear functions of an unknown eigenvector. Focusing on two fundamental problems, matrix denoising and principal component analysis, in the presence of Gaussian noise, we develop a suite of statistical theory that characterizes the perturbation of arbitrary linear functions of an unknown eigenvector. To mitigate a non-negligible bias issue inherent to the natural "plug-in" estimator, we develop de-biased estimators that (1) achieve minimax lower bounds for a family of scenarios (modulo some logarithmic factor), and (2) can be computed in a data-driven manner without sample splitting. Notably, the proposed estimators are nearly minimax optimal even when the associated eigen-gap is substantially smaller than what is required in prior statistical theory.
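To make the setup concrete, the following is a minimal simulation sketch of the "plug-in" estimator referenced in the abstract, assuming a symmetric rank-one signal matrix observed under additive Gaussian noise; the dimension, noise scaling, and all variable names are illustrative assumptions, and the paper's de-biased estimator is not reproduced here.

```python
# Minimal sketch (not the paper's estimator): the "plug-in" estimate of a
# linear form a^T u*, where u* is the leading eigenvector of a low-rank
# matrix observed under Gaussian noise. All parameter choices are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 500        # ambient dimension (illustrative)
sigma = 0.5    # noise level (illustrative)

# Rank-1 ground truth: M* = lambda* u* u*^T with a unit eigenvector u*.
u_star = rng.standard_normal(n)
u_star /= np.linalg.norm(u_star)
lam_star = 5.0
M_star = lam_star * np.outer(u_star, u_star)

# Observation: the signal matrix corrupted by symmetric Gaussian noise.
W = rng.standard_normal((n, n))
E = sigma * (W + W.T) / np.sqrt(2 * n)   # symmetrized, variance-normalized
M = M_star + E

# Plug-in estimator: use the leading eigenvector of the noisy matrix.
eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
u_hat = eigvecs[:, -1]
u_hat *= np.sign(u_hat @ u_star)         # resolve the global sign ambiguity

# Linear form a^T u* for a fixed direction a (here a standard basis vector).
a = np.zeros(n)
a[0] = 1.0
print("true    a^T u*:", a @ u_star)
print("plug-in a^T u^:", a @ u_hat)
# Averaged over noise draws, the plug-in estimate exhibits the
# non-negligible bias that motivates the paper's de-biasing step.
```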
Original language | English (US) |
---|---|
Pages (from-to) | 1200-1247 |
Number of pages | 48 |
Journal | IEEE Transactions on Information Theory |
Volume | 71 |
Issue number | 2 |
DOIs | |
State | Published - 2025 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences
Keywords
- bias correction
- linear forms of eigenvectors
- matrix denoising
- principal component analysis
- small eigen-gap