TY - JOUR
T1 - Nonconvex Matrix Factorization from Rank-One Measurements
AU - Li, Yuanxin
AU - Ma, Cong
AU - Chen, Yuxin
AU - Chi, Yuejie
N1 - Funding Information:
Manuscript received October 6, 2018; revised March 2, 2020; accepted December 7, 2020. Date of publication January 8, 2021; date of current version February 17, 2021. The work of Yuanxin Li and Yuejie Chi was supported in part by the Air Force Office of Scientific Research (AFOSR) under Grant FA9550-15-1-0205; in part by the Office of Naval Research (ONR) under Grant N00014-18-1-2142 and Grant N00014-19-1-2404; in part by the Army Research Office (ARO) under Grant W911NF-18-1-0303; and in part by the NSF under Grant CAREER ECCS-1818571, Grant CCF-1806154, and Grant CCF-1901199. The work of Yuxin Chen was supported in part by the AFOSR Young Investigator Program (YIP) under Award FA9550-19-1-0030; in part by the ONR under Grant N00014-19-1-2120; in part by the ARO YIP under Award W911NF-20-1-0097; in part by the ARO under Grant W911NF-18-1-0303; in part by the NSF under Grant CCF-1907661, Grant IIS-1900140, and Grant DMS-2014279; and in part by the Princeton SEAS Innovation Award. This article was presented at the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019. (Corresponding author: Yuejie Chi.) Yuanxin Li was with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213 USA.
Publisher Copyright:
© 2021 IEEE.
PY - 2021/3
Y1 - 2021/3
N2 - We consider the problem of recovering low-rank matrices from random rank-one measurements, which arises in numerous applications, including covariance sketching, phase retrieval, quantum state tomography, and learning shallow polynomial neural networks. Our approach is to directly estimate the low-rank factor by minimizing a nonconvex least-squares loss function via vanilla gradient descent, following a tailored spectral initialization. When the true rank is bounded by a constant, this algorithm is guaranteed to converge to the ground truth (up to global ambiguity) with near-optimal sample complexity and computational complexity. To the best of our knowledge, this is the first guarantee that achieves near-optimality in both metrics. In particular, the key enabler of near-optimal computational guarantees is an implicit regularization phenomenon: without explicit regularization, both the spectral initialization and the gradient descent iterates automatically stay within a region incoherent with the measurement vectors. This feature allows one to employ much more aggressive step sizes than those suggested in the prior literature, without the need for sample splitting.
AB - We consider the problem of recovering low-rank matrices from random rank-one measurements, which arises in numerous applications, including covariance sketching, phase retrieval, quantum state tomography, and learning shallow polynomial neural networks. Our approach is to directly estimate the low-rank factor by minimizing a nonconvex least-squares loss function via vanilla gradient descent, following a tailored spectral initialization. When the true rank is bounded by a constant, this algorithm is guaranteed to converge to the ground truth (up to global ambiguity) with near-optimal sample complexity and computational complexity. To the best of our knowledge, this is the first guarantee that achieves near-optimality in both metrics. In particular, the key enabler of near-optimal computational guarantees is an implicit regularization phenomenon: without explicit regularization, both the spectral initialization and the gradient descent iterates automatically stay within a region incoherent with the measurement vectors. This feature allows one to employ much more aggressive step sizes than those suggested in the prior literature, without the need for sample splitting.
KW - Matrix factorization
KW - gradient descent
KW - nonconvex optimization
KW - rank-one measurements
UR - http://www.scopus.com/inward/record.url?scp=85099549718&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099549718&partnerID=8YFLogxK
U2 - 10.1109/TIT.2021.3050427
DO - 10.1109/TIT.2021.3050427
M3 - Article
AN - SCOPUS:85099549718
SN - 0018-9448
VL - 67
SP - 1928
EP - 1950
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 3
M1 - 9317779
ER -
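
A note on the method described in the abstract above: the procedure is a tailored spectral initialization followed by vanilla gradient descent on a nonconvex least-squares loss over the low-rank factor. Below is a minimal Python/NumPy sketch of that pipeline for Gaussian rank-one measurements y_i = a_i^T M a_i with M = X X^T; it is assembled from the abstract alone, not from the authors' code, and the step size, iteration count, and eigenvalue de-biasing details are illustrative assumptions.

import numpy as np

def recover_factor(A, y, r, eta=0.05, iters=500):
    """Estimate an n x r factor X with X @ X.T ~ M from rank-one
    measurements y_i = a_i^T M a_i (the rows of A are the a_i).

    The constant step size eta should scale inversely with ||M||;
    the default here is purely illustrative.
    """
    m, n = A.shape

    # Spectral initialization: for Gaussian a_i,
    #   E[(1/m) * sum_i y_i a_i a_i^T] = 2*M + tr(M)*I,
    # so the top-r eigenvectors of Y align with those of M, and
    # tr(M) is estimated by y.mean() since E[y_i] = tr(M).
    Y = (A.T * y) @ A / m
    w, V = np.linalg.eigh(Y)                          # eigenvalues ascending
    lam = np.maximum((w[-r:] - y.mean()) / 2.0, 0.0)  # de-biased top-r eigenvalues
    X = V[:, -r:] * np.sqrt(lam)

    # Vanilla gradient descent on
    #   f(X) = (1/(4m)) * sum_i (a_i^T X X^T a_i - y_i)^2,
    # with gradient (1/m) * sum_i (a_i^T X X^T a_i - y_i) a_i a_i^T X.
    for _ in range(iters):
        AX = A @ X                                    # row i is a_i^T X
        resid = np.einsum('ij,ij->i', AX, AX) - y     # a_i^T X X^T a_i - y_i
        grad = (A.T * resid) @ AX / m
        X = X - eta * grad
    return X

# Toy usage on synthetic data (dimensions are arbitrary choices):
rng = np.random.default_rng(0)
n, r, m = 50, 2, 2000
X_star = rng.standard_normal((n, r)) / np.sqrt(n)
A = rng.standard_normal((m, n))
y = np.einsum('ij,ij->i', A @ X_star, A @ X_star)     # y_i = ||X*^T a_i||^2
X_hat = recover_factor(A, y, r)
M_star = X_star @ X_star.T
print(np.linalg.norm(X_hat @ X_hat.T - M_star) / np.linalg.norm(M_star))

Consistent with the abstract, the sketch uses no explicit regularization, projection step, or sample splitting: the paper's point is that the plain iterates above automatically remain incoherent with the measurement vectors, which is what permits a constant, aggressive step size.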