Abstract
We derive generalizations of AdaBoost and related gradient-based coordinate descent methods that incorporate sparsity-promoting penalties for the norm of the predictor that is being learned. The end result is a family of coordinate descent algorithms that integrate forward feature induction and back-pruning through regularization and give an automatic stopping criterion for feature induction. We study penalties based on the ℓ1, ℓ2, and ℓ∞ norms of the predictor and introduce mixed-norm penalties that build upon the initial penalties. The mixed-norm regularizers facilitate structural sparsity in parameter space, which is a useful property in multiclass prediction and other related tasks. We report empirical results that demonstrate the power of our approach in building accurate and structurally sparse models.
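To make the abstract's ingredients concrete, below is a minimal sketch (not the authors' exact derivation) of ℓ1-penalized coordinate descent on the exponential loss: a greedy coordinate step grows the model (forward feature induction), soft-thresholding can drive a weight back to exactly zero (back-pruning), and iteration ends once no coordinate's gradient exceeds the penalty (a rough automatic stopping check). All names and the specific update rule here are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np


def soft_threshold(v, tau):
    """Shrink v toward zero by tau; magnitudes below tau become exactly zero."""
    return np.sign(v) * max(abs(v) - tau, 0.0)


def l1_boost(H, y, lam=0.1, step=0.5, iters=200):
    """Greedy l1-penalized coordinate descent on the exponential loss.

    H : (n, d) array whose columns are weak-hypothesis outputs in [-1, 1].
    y : (n,) array of labels in {-1, +1}.
    Returns a (typically sparse) weight vector over the weak hypotheses.
    """
    n, d = H.shape
    w = np.zeros(d)
    for _ in range(iters):
        losses = np.exp(-y * (H @ w))                       # exponential loss per example
        grad = -(H * (y * losses)[:, None]).mean(axis=0)    # gradient for every coordinate
        if np.max(np.abs(grad)) <= lam:                     # no feature is worth its penalty:
            break                                           # rough automatic stopping check
        j = int(np.argmax(np.abs(grad)))                    # forward step: pick the best feature
        w[j] = soft_threshold(w[j] - step * grad[j], step * lam)  # shrink; may prune w[j] to 0
    return w


# Illustrative usage on synthetic data: most weak hypotheses end up pruned.
rng = np.random.default_rng(0)
H = np.sign(rng.standard_normal((200, 50)))    # 50 stump-like weak hypotheses
y = np.sign(H[:, 0] + 0.3 * rng.standard_normal(200))
w = l1_boost(H, y, lam=0.05)
print((w != 0).sum(), "of", w.size, "weak hypotheses kept")
```

In the same spirit, a mixed-norm variant for multiclass problems would group all class weights attached to one feature and threshold the group's ℓ2 or ℓ∞ norm jointly, so a feature is either kept or dropped for all classes at once; that is the structural sparsity the abstract refers to.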
| Original language | English (US) |
| --- | --- |
| Title of host publication | Proceedings of the 26th International Conference On Machine Learning, ICML 2009 |
| Pages | 297-304 |
| Number of pages | 8 |
| State | Published - Dec 9 2009 |
| Externally published | Yes |
| Event | 26th International Conference On Machine Learning, ICML 2009 - Montreal, QC, Canada. Duration: Jun 14 2009 → Jun 18 2009 |
Other
| Other | 26th International Conference On Machine Learning, ICML 2009 |
| --- | --- |
| Country/Territory | Canada |
| City | Montreal, QC |
| Period | 6/14/09 → 6/18/09 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Networks and Communications
- Software