Abstract
This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity? We prove that the stochastic subgradient method, on any semialgebraic locally Lipschitz function, produces limit points that are all first-order stationary. More generally, our result applies to any function with a Whitney stratifiable graph. In particular, this work endows the stochastic subgradient method, and its proximal extension, with rigorous convergence guarantees for a wide class of problems arising in data science—including all popular deep learning architectures.
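As a rough illustration of the kind of iteration the abstract refers to, the following minimal sketch runs a stochastic subgradient method on a one-dimensional test function. Everything in it is an assumption made for illustration and is not taken from the paper: the function f(x) = |x^2 - 1| (semialgebraic, locally Lipschitz, nonsmooth, and nonconvex), the Gaussian noise model, the step-size schedule, and the hypothetical names `subgradient` and `stochastic_subgradient`.

```python
import random


def subgradient(x):
    """Return one element of the Clarke subdifferential of f(x) = |x**2 - 1|.

    Away from the kinks at x = +/-1 this is the ordinary derivative.
    At a kink the subdifferential is an interval containing 0, so 0 is returned.
    """
    r = x * x - 1.0
    sign = 1.0 if r > 0 else (-1.0 if r < 0 else 0.0)
    return sign * 2.0 * x


def stochastic_subgradient(x0, steps=20000, seed=0, noise_std=0.1):
    """Iterate x_{k+1} = x_k - alpha_k * (g_k + noise_k), with g_k a subgradient at x_k.

    Assumed (illustrative) choices: zero-mean Gaussian noise and step sizes
    alpha_k = 0.5 / (k + 10)**0.75, which are not summable but are square summable.
    """
    rng = random.Random(seed)
    x = x0
    for k in range(steps):
        alpha = 0.5 / (k + 10) ** 0.75
        noisy_g = subgradient(x) + rng.gauss(0.0, noise_std)
        x -= alpha * noisy_g
    return x
```

For this test function the first-order stationary points are x = -1, 0, and 1, so calling `stochastic_subgradient(2.0)` should return a value close to 1 under the assumed settings. This only illustrates the shape of the iteration; it does not reproduce the paper's assumptions or proof technique.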
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 119-154 |
Number of pages | 36 |
Journal | Foundations of Computational Mathematics |
Volume | 20 |
Issue number | 1 |
State | Published - Feb 1 2020 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Analysis
- Computational Mathematics
- Computational Theory and Mathematics
- Applied Mathematics
Keywords
- Differential inclusion
- Lyapunov function
- Proximal
- Semialgebraic
- Stochastic subgradient method
- Subgradient
- Tame