TY - JOUR
T1 - Efficiently escaping saddle points on manifolds
AU - Criscitiello, Chris
AU - Boumal, Nicolas
N1 - Funding Information:
We thank Yue Sun, Nicolas Flammarion and Maryam Fazel, authors of [31], for numerous relevant discussions. NB is partially supported by NSF grant DMS-1719558.
Publisher Copyright:
© 2019 Neural information processing systems foundation. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Smooth, non-convex optimization problems on Riemannian manifolds occur in machine learning as a result of orthonormality, rank or positivity constraints. First- and second-order necessary optimality conditions state that the Riemannian gradient must be zero, and the Riemannian Hessian must be positive semidefinite. Generalizing Jin et al.'s recent work on perturbed gradient descent (PGD) for optimization on linear spaces [How to Escape Saddle Points Efficiently (2017) [17], Stochastic Gradient Descent Escapes Saddle Points Efficiently (2019) [18]], we propose a version of perturbed Riemannian gradient descent (PRGD) to show that necessary optimality conditions can be met approximately with high probability, without evaluating the Hessian. Specifically, for an arbitrary Riemannian manifold M of dimension d, a sufficiently smooth (possibly non-convex) objective function f, and under weak conditions on the retraction chosen to move on the manifold, with high probability, our version of PRGD produces a point with gradient smaller than ε and Hessian within √ε of being positive semidefinite in O((log d)^4/ε^2) gradient queries. This matches the complexity of PGD in the Euclidean case. Crucially, the dependence on dimension is low. This matters for large-scale applications including PCA and low-rank matrix completion, which both admit natural formulations on manifolds. The key technical idea is to generalize PRGD with a distinction between two types of gradient steps: “steps on the manifold” and “perturbed steps in a tangent space of the manifold.” Ultimately, this distinction makes it possible to extend Jin et al.'s analysis seamlessly.
AB - Smooth, non-convex optimization problems on Riemannian manifolds occur in machine learning as a result of orthonormality, rank or positivity constraints. First- and second-order necessary optimality conditions state that the Riemannian gradient must be zero, and the Riemannian Hessian must be positive semidefinite. Generalizing Jin et al.'s recent work on perturbed gradient descent (PGD) for optimization on linear spaces [How to Escape Saddle Points Efficiently (2017) [17], Stochastic Gradient Descent Escapes Saddle Points Efficiently (2019) [18]], we propose a version of perturbed Riemannian gradient descent (PRGD) to show that necessary optimality conditions can be met approximately with high probability, without evaluating the Hessian. Specifically, for an arbitrary Riemannian manifold M of dimension d, a sufficiently smooth (possibly non-convex) objective function f, and under weak conditions on the retraction chosen to move on the manifold, with high probability, our version of PRGD produces a point with gradient smaller than ε and Hessian within √ε of being positive semidefinite in O((log d)^4/ε^2) gradient queries. This matches the complexity of PGD in the Euclidean case. Crucially, the dependence on dimension is low. This matters for large-scale applications including PCA and low-rank matrix completion, which both admit natural formulations on manifolds. The key technical idea is to generalize PRGD with a distinction between two types of gradient steps: “steps on the manifold” and “perturbed steps in a tangent space of the manifold.” Ultimately, this distinction makes it possible to extend Jin et al.'s analysis seamlessly.
UR - http://www.scopus.com/inward/record.url?scp=85084700018&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084700018&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85084700018
SN - 1049-5258
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
Y2 - 8 December 2019 through 14 December 2019
ER -