Abstract
We show that gradient descent with random initialization converges to a local minimizer almost surely. This is proved by applying the Stable Manifold Theorem from dynamical systems theory.
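To make the statement concrete, here is a minimal sketch of the phenomenon on a toy objective: gradient descent on f(x, y) = x⁴/4 − x²/2 + y²/2, which has a strict saddle point at the origin and local minima at (±1, 0). The objective, step size, and iteration count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative sketch (not from the paper): gradient descent on
# f(x, y) = x**4/4 - x**2/2 + y**2/2, which has a strict saddle at
# the origin and local minima at (-1, 0) and (1, 0).

def grad_f(p):
    """Gradient of f(x, y) = x^4/4 - x^2/2 + y^2/2."""
    x, y = p
    return np.array([x**3 - x, y])

rng = np.random.default_rng(0)
alpha = 0.1                    # assumed constant step size, small enough for stability
p = rng.standard_normal(2)     # random initialization

for _ in range(200):
    p = p - alpha * grad_f(p)  # update: p_{k+1} = p_k - alpha * grad f(p_k)

# The set of initial points attracted to the saddle (the line x = 0)
# has measure zero, so a random start lands off it with probability one
# and the iterates converge to one of the minimizers (+/-1, 0).
print(p)
```

The measure-zero stable set of the saddle in this toy example is exactly the structure the Stable Manifold Theorem supplies in general: random initialization avoids it almost surely.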
Original language | English (US)
---|---
Pages (from-to) | 1246-1257
Number of pages | 12
Journal | Journal of Machine Learning Research
Volume | 49
Issue number | June
State | Published - Jun 6 2016
Externally published | Yes
Event | 29th Conference on Learning Theory, COLT 2016 - New York, United States (Jun 23 2016 → Jun 26 2016)
All Science Journal Classification (ASJC) codes
- Software
- Artificial Intelligence
- Control and Systems Engineering
- Statistics and Probability
Keywords
- Gradient descent
- Local minimum
- Non-convex
- Saddle points