TY - CONF
T1 - Nonconvex matrix factorization from rank-one measurements
AU - Li, Yuanxin
AU - Ma, Cong
AU - Chen, Yuxin
AU - Chi, Yuejie
N1 - Funding Information:
The work of Y. Li and Y. Chi is supported in part by AFOSR under the grant FA9550-15-1-0205, by ONR under the grant N00014-18-1-2142, by ARO under the grant W911NF-18-1-0303, and by NSF under the grants CAREER ECCS-1818571 and CCF-1704245. The work of Y. Chen is supported in part by the AFOSR YIP award FA9550-19-1-0030, by the ARO grant W911NF-18-1-0303, and by the Princeton SEAS innovation award.
Publisher Copyright:
© 2019 by the author(s).
PY - 2019
Y1 - 2019
N2 - We consider the problem of recovering low-rank matrices from random rank-one measurements, which spans numerous applications including phase retrieval, quantum state tomography, and learning shallow neural networks with quadratic activations, among others. Our approach is to directly estimate the low-rank factor by minimizing a nonconvex least-squares loss function via vanilla gradient descent, following a tailored spectral initialization. When the true rank is small, this algorithm is guaranteed to converge to the ground truth (up to global ambiguity) with near-optimal sample and computational complexities with respect to the problem size. To the best of our knowledge, this is the first theoretical guarantee that achieves near optimality in both metrics. In particular, the key enabler of near-optimal computational guarantees is an implicit regularization phenomenon: without explicit regularization, both the spectral initialization and the gradient descent iterates automatically stay within a region incoherent with the measurement vectors. This feature allows one to employ much more aggressive step sizes than those suggested in the prior literature, without the need for sample splitting.
AB - We consider the problem of recovering low-rank matrices from random rank-one measurements, which spans numerous applications including phase retrieval, quantum state tomography, and learning shallow neural networks with quadratic activations, among others. Our approach is to directly estimate the low-rank factor by minimizing a nonconvex least-squares loss function via vanilla gradient descent, following a tailored spectral initialization. When the true rank is small, this algorithm is guaranteed to converge to the ground truth (up to global ambiguity) with near-optimal sample and computational complexities with respect to the problem size. To the best of our knowledge, this is the first theoretical guarantee that achieves near optimality in both metrics. In particular, the key enabler of near-optimal computational guarantees is an implicit regularization phenomenon: without explicit regularization, both the spectral initialization and the gradient descent iterates automatically stay within a region incoherent with the measurement vectors. This feature allows one to employ much more aggressive step sizes than those suggested in the prior literature, without the need for sample splitting.
UR - http://www.scopus.com/inward/record.url?scp=85077733931&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85077733931&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85077733931
T2 - 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019
Y2 - 16 April 2019 through 18 April 2019
ER -