The rate of convergence of AdaBoost

Indraneel Mukherjee, Cynthia Rudin, Robert E. Schapire

Research output: Contribution to journalArticlepeer-review

31 Scopus citations


The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that the exponential loss of AdaBoost's computed parameter vector will be at most e more than that of any parameter vector of l1-norm bounded by B in a number of rounds that is at most a polynomial in B and 1/ε. We also provide lower bounds showing that a polynomial dependence is necessary. Our second result is that within C/ε iterations, AdaBoost achieves a value of the exponential loss that is at most e more than the best possible value, where C depends on the data set. We show that this dependence of the rate on ε is optimal up to constant factors, that is, at least Ω(1/ε) rounds are necessary to achieve within e of the optimal exponential loss.

Original languageEnglish (US)
Pages (from-to)2315-2347
Number of pages33
JournalJournal of Machine Learning Research
StatePublished - Aug 2013
Externally publishedYes

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
  • Control and Systems Engineering
  • Statistics and Probability


  • AdaBoost
  • Convergence rate
  • Coordinate descent
  • Optimization


Dive into the research topics of 'The rate of convergence of AdaBoost'. Together they form a unique fingerprint.

Cite this