Abstract
We describe, analyze, and experiment with a framework for empirical loss minimization with regularization. Our algorithmic framework alternates between two phases. On each iteration we first perform an unconstrained gradient descent step. We then cast and solve an instantaneous optimization problem that trades off minimization of a regularization term while keeping close proximity to the result of the first phase. This view yields a simple yet effective algorithm that can be used for batch penalized risk minimization and online learning. Furthermore, the two phase approach enables sparse solutions when used in conjunction with regularization functions that promote sparsity, such as ℓ1. We derive concrete and very simple algorithms for minimization of loss functions with ℓ1, ℓ2, ℓ22, and ℓ∞ regularization. We also show how to construct efficient algorithms for mixed-norm ℓ1 / ℓq regularization. We further extend the algorithms and give efficient implementations for very high-dimensional data with sparsity. We demonstrate the potential of the proposed framework in a series of experiments with synthetic and natural data sets.
Original language | English (US) |
---|---|
Pages (from-to) | 2899-2934 |
Number of pages | 36 |
Journal | Journal of Machine Learning Research |
Volume | 10 |
State | Published - 2009 |
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence
Keywords
- Convex optimization
- Group sparsity
- Online learning
- Subgradient methods