Abstract
We show that gradient descent converges to a local minimizer, almost surely with random initialization. This is proved by applying the Stable Manifold Theorem from dynamical systems theory.
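The statement can be illustrated numerically. The sketch below is not the paper's proof technique (which is analytic, via the Stable Manifold Theorem); it simply runs the gradient descent iteration x_{k+1} = x_k − α∇f(x_k) from random initializations on f(x, y) = (x² − 1)² + y², a function with a strict saddle at the origin and local minimizers at (±1, 0). The choice of function, step size, and iteration budget are illustrative assumptions, not from the paper.

```python
import numpy as np

def grad(p):
    # Gradient of f(x, y) = (x^2 - 1)^2 + y^2, which has a strict saddle
    # at (0, 0) and local minimizers at (+1, 0) and (-1, 0).
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])

rng = np.random.default_rng(0)
step = 0.05  # illustrative step size, small enough for stable iterates here

for trial in range(5):
    p = rng.uniform(-2.0, 2.0, size=2)  # random initialization
    for _ in range(1000):
        p = p - step * grad(p)
    print(f"trial {trial}: converged to {np.round(p, 4)}")
```

Every trial ends near (+1, 0) or (−1, 0) rather than at the saddle, consistent with the almost-sure claim: only initializations on the saddle's stable manifold, a measure-zero set, would converge to it.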
| Original language | English (US) |
|---|---|
| Pages (from-to) | 1246-1257 |
| Number of pages | 12 |
| Journal | Journal of Machine Learning Research |
| Volume | 49 |
| Issue number | June |
| Publication status | Published - Jun 6 2016 |
| Externally published | Yes |
| Event | 29th Conference on Learning Theory, COLT 2016 |
| Event location | New York, United States |
| Event duration | Jun 23 2016 → Jun 26 2016 |
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence
Keywords
- Gradient descent
- Local minimum
- Non-convex
- Saddle points