Sharp Analysis of a Simple Model for Random Forests

Research output: Contribution to journalConference articlepeer-review

5 Scopus citations

Abstract

Random forests have become an important tool for improving accuracy in regression and classification problems since their inception by Leo Breiman in 2001. In this paper, we revisit a historically important random forest model, called centered random forests, originally proposed by Breiman in 2004 and later studied by Gérard Biau in 2012, where a feature is selected at random and the splits occurs at the midpoint of the node along the chosen feature. If the regression function is d-dimensional and Lipschitz, we show that, given access to n observations, the mean-squared prediction er-1 ror is (Equation presented). This positively answers an outstanding question of Biau about whether the rate of convergence for this random forest model could be im-1 proved beyond (Equation presented). Furthermore, by a refined analysis of the approximation and estimation errors for linear models, we show that our new rate cannot be improved in general. Finally, we generalize our analysis and improve current prediction error bounds for another random forest model, called median random forests, in which each tree is constructed from subsampled data and the splits are performed at the empirical median along a chosen feature.

Original languageEnglish (US)
Pages (from-to)757-765
Number of pages9
JournalProceedings of Machine Learning Research
Volume130
StatePublished - 2021
Event24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021 - Virtual, Online, United States
Duration: Apr 13 2021Apr 15 2021

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Fingerprint

Dive into the research topics of 'Sharp Analysis of a Simple Model for Random Forests'. Together they form a unique fingerprint.

Cite this