Abstract
We establish that first-order methods avoid strict saddle points for almost all initializations. Our results apply to a wide variety of first-order methods, including (manifold) gradient descent, block coordinate descent, mirror descent and variants thereof. The connecting thread is that such algorithms can be studied from a dynamical systems perspective in which appropriate instantiations of the Stable Manifold Theorem allow for a global stability analysis. Thus, neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid strict saddle points.
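The phenomenon described above can be illustrated on a toy objective (this sketch is not from the paper itself): for f(x, y) = x² − y², the only critical point (0, 0) is a strict saddle, since the Hessian has a strictly negative eigenvalue along y. Plain gradient descent from a random initialization almost surely picks up a nonzero component along the unstable direction and escapes, with no second-order information or added noise.

```python
import numpy as np

# Illustrative sketch (hypothetical example, not the paper's experiments):
# gradient descent on f(x, y) = x^2 - y^2, whose only critical point
# (0, 0) is a strict saddle (Hessian eigenvalues +2 and -2).
def grad(p):
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

def gradient_descent(p0, step=0.1, iters=200):
    """Run fixed-step gradient descent and return the final iterate."""
    p = np.array(p0, dtype=float)
    for _ in range(iters):
        p = p - step * grad(p)
    return p

# With step 0.1, the x-coordinate contracts by 0.8 per iteration while
# the y-coordinate expands by 1.2, so a random initialization (which
# almost surely has y != 0) escapes the saddle along the y-axis.
rng = np.random.default_rng(0)
p = gradient_descent(rng.standard_normal(2))
```

The set of initializations that converge to the saddle is exactly the x-axis, a measure-zero stable manifold, which is the picture the Stable Manifold Theorem formalizes in the general setting.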
| Original language | English (US) |
|---|---|
| Pages (from-to) | 311-337 |
| Number of pages | 27 |
| Journal | Mathematical Programming |
| Volume | 176 |
| Issue number | 1-2 |
| DOIs | |
| State | Published - Jul 1 2019 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- General Mathematics
Keywords
- Dynamical systems
- Gradient descent
- Local minimum
- Saddle points
- Smooth optimization