TY - JOUR
T1 - Near-optimal stochastic approximation for online principal component estimation
AU - Li, Chris Junchi
AU - Wang, Mengdi
AU - Liu, Han
AU - Zhang, Tong
N1 - Publisher Copyright: © 2017, Springer-Verlag GmbH Germany and Mathematical Optimization Society.
PY - 2018/1/1
Y1 - 2018/1/1
N2 - Principal component analysis (PCA) has been a prominent tool for high-dimensional data analysis. Online algorithms that estimate the principal component by processing streaming data are of tremendous practical and theoretical interest. Despite the rich applications of online PCA, its theoretical convergence analysis remains largely open. In this paper, we cast online PCA as a stochastic nonconvex optimization problem, and we analyze the online PCA algorithm as a stochastic approximation iteration. The stochastic approximation iteration processes data points incrementally and maintains a running estimate of the principal component. We prove for the first time a nearly optimal finite-sample error bound for the online PCA algorithm. Under the subgaussian assumption, we show that the finite-sample error bound closely matches the minimax information lower bound.
AB - Principal component analysis (PCA) has been a prominent tool for high-dimensional data analysis. Online algorithms that estimate the principal component by processing streaming data are of tremendous practical and theoretical interest. Despite the rich applications of online PCA, its theoretical convergence analysis remains largely open. In this paper, we cast online PCA as a stochastic nonconvex optimization problem, and we analyze the online PCA algorithm as a stochastic approximation iteration. The stochastic approximation iteration processes data points incrementally and maintains a running estimate of the principal component. We prove for the first time a nearly optimal finite-sample error bound for the online PCA algorithm. Under the subgaussian assumption, we show that the finite-sample error bound closely matches the minimax information lower bound.
KW - Finite-sample analysis
KW - High-dimensional data
KW - Nonconvex optimization
KW - Online algorithm
KW - Principal component analysis
KW - Stochastic approximation
KW - Stochastic gradient method
UR - http://www.scopus.com/inward/record.url?scp=85027889035&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85027889035&partnerID=8YFLogxK
U2 - 10.1007/s10107-017-1182-z
DO - 10.1007/s10107-017-1182-z
M3 - Article
AN - SCOPUS:85027889035
VL - 167
SP - 75
EP - 97
JO - Mathematical Programming
JF - Mathematical Programming
SN - 0025-5610
IS - 1
ER -