Abstract
This paper proposes a stochastic variant of a classic algorithm, the cubic-regularized Newton method [Nesterov and Polyak, 2006]. The proposed algorithm efficiently escapes saddle points and finds approximate local minima for general smooth, nonconvex functions in only Õ(ε^{-3.5}) stochastic gradient and stochastic Hessian-vector product evaluations; the latter can be computed as efficiently as stochastic gradients. This improves upon the Õ(ε^{-4}) rate of stochastic gradient descent. Our rate matches the best-known result for finding local minima, without requiring any delicate acceleration or variance-reduction techniques.
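The sketch below illustrates the general idea behind a stochastic cubic-regularized Newton iteration: at each step, minibatch gradients and Hessian-vector products define a cubic submodel, which is minimized approximately using only Hessian-vector products. This is a minimal illustration assuming a least-squares loss; the function names, batch sizes, regularization constant, and the gradient-descent subsolver are illustrative choices, not the paper's exact algorithm or parameters.

```python
import numpy as np

def stochastic_gradient(w, X, y, batch):
    """Minibatch gradient of the squared loss 0.5 * ||X w - y||^2 / n."""
    Xb, yb = X[batch], y[batch]
    return Xb.T @ (Xb @ w - yb) / len(batch)

def stochastic_hvp(v, X, batch):
    """Minibatch Hessian-vector product; as cheap to form as a minibatch gradient."""
    Xb = X[batch]
    return Xb.T @ (Xb @ v) / len(batch)

def solve_cubic_submodel(g, hvp, rho, steps=50, eta=0.01):
    """Approximately minimize m(s) = g^T s + 0.5 s^T H s + (rho/3) ||s||^3
    by gradient descent, touching H only through Hessian-vector products."""
    s = np.zeros_like(g)
    for _ in range(steps):
        grad_m = g + hvp(s) + rho * np.linalg.norm(s) * s
        s -= eta * grad_m
    return s

def stochastic_cubic_newton(w, X, y, rho=1.0, iters=100, batch_size=64, seed=0):
    """Illustrative stochastic cubic-regularized Newton loop (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    for _ in range(iters):
        batch = rng.choice(n, size=min(batch_size, n), replace=False)
        g = stochastic_gradient(w, X, y, batch)
        hvp = lambda v: stochastic_hvp(v, X, batch)
        w = w + solve_cubic_submodel(g, hvp, rho)
    return w

# Usage: a least-squares problem as a simple (convex) stand-in.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=500)
w_hat = stochastic_cubic_newton(np.zeros(10), X, y)
print(np.linalg.norm(w_hat - w_true))
```

The key design point, reflected in the abstract, is that the second-order information enters only through Hessian-vector products, so each iteration costs about as much as a handful of stochastic gradient evaluations.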
| Field | Value |
|---|---|
| Original language | English (US) |
| Pages (from-to) | 2899-2908 |
| Number of pages | 10 |
| Journal | Advances in Neural Information Processing Systems |
| Volume | 2018-December |
| State | Published - 2018 |
| Externally published | Yes |
| Event | 32nd Conference on Neural Information Processing Systems, NeurIPS 2018, Montreal, Canada, Dec 2-8, 2018 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing