Abstract
We give novel algorithms for stochastic strongly-convex optimization in the gradient oracle model which return a O( 1/T )-approximate solution after T iterations. The first algorithm is deterministic, and achieves this rate via gradient updates and historical averaging. The second algorithm is randomized, and is based on pure gradient steps with a random step size. This rate of convergence is optimal in the gradient oracle model. This improves upon the previously known best rate of O( log(T)/T ), which was obtained by applying an online strongly-convex optimization algorithm with regret O(log(T)) to the batch setting. We complement this result by proving that any algorithm has expected regret of Ω(log(T)) in the online stochastic strongly-convex optimization setting. This shows that any online-to-batch conversion is inherently suboptimal for stochastic strongly-convex optimization. This is the first formal evidence that online convex optimization is strictly more difficult than batch stochastic convex optimization.
Original language | English (US) |
---|---|
Pages (from-to) | 2489-2512 |
Number of pages | 24 |
Journal | Journal of Machine Learning Research |
Volume | 15 |
State | Published - Jul 2014 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence
Keywords
- Convex optimization
- Online Learning
- Regret minimization
- Stochastic gradient descent