Sparse nonparametric density estimation in high dimensions using the rodeo

Han Liu, John Lafferty, Larry Wasserman

Research output: Contribution to journal › Conference article › peer-review

32 Scopus citations


We consider the problem of estimating the joint density of a d-dimensional random vector X = (X_1, X_2, ..., X_d) when d is large. We assume that the density is a product of a parametric component and a nonparametric component which depends on an unknown subset of the variables. Using a modification of a recently developed nonparametric regression framework called rodeo (regularization of derivative expectation operator), we propose a method to greedily select bandwidths in a kernel density estimate. It is shown empirically that the density rodeo works well even for very high-dimensional problems. When the unknown density function satisfies a suitably defined sparsity condition, and the parametric baseline density is smooth, the approach is shown to achieve near-optimal minimax rates of convergence, and thus avoids the curse of dimensionality.
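The greedy bandwidth selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian product kernel, a uniform baseline (so the parametric component is ignored), evaluation at a single query point, and a threshold of the form s_j * sqrt(2 log(n c_n)) with c_n = log n. The function name `density_rodeo_sketch` and the parameters `h0`, `beta`, `h_min`, and `max_iter` are hypothetical choices made for illustration. The idea: start every bandwidth large, estimate the derivative of the kernel density estimate with respect to each bandwidth, shrink the bandwidths whose derivative is significantly non-zero, and freeze the rest.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, applied elementwise."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def density_rodeo_sketch(X, x, h0=1.0, beta=0.9, h_min=1e-3, max_iter=500):
    """Greedy per-dimension bandwidth selection at one query point x.

    Illustrative only: Gaussian product kernel, uniform baseline,
    and c_n = log(n) are assumptions, not taken from the abstract.
    Expects X with shape (n, d), n >= 3, and x with shape (d,).
    """
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    n, d = X.shape
    u = x - X                          # (n, d) offsets from the query point
    h = np.full(d, float(h0))          # start with large bandwidths
    active = np.ones(d, dtype=bool)    # dimensions still being shrunk
    c_n = np.log(n)                    # assumed slowly growing constant

    for _ in range(max_iter):
        if not active.any():
            break
        K = gaussian_kernel(u / h) / h          # per-dimension kernel values
        prod = K.prod(axis=1)                   # product kernel per sample
        # Per-sample derivative of the product kernel w.r.t. each h_j
        # (Gaussian case: factor (u^2 - h^2) / h^3).
        Zi = prod[:, None] * (u**2 - h**2) / h**3
        Z = Zi.mean(axis=0)                     # derivative estimates Z_j
        s = Zi.std(axis=0, ddof=1) / np.sqrt(n) # their standard errors
        lam = s * np.sqrt(2.0 * np.log(n * c_n))
        # Shrink only active dimensions with a significant derivative;
        # everything else is frozen for the remaining iterations.
        shrink = active & (np.abs(Z) > lam) & (h * beta >= h_min)
        h[shrink] *= beta
        active = shrink

    f_hat = (gaussian_kernel(u / h) / h).prod(axis=1).mean()
    return h, f_hat


# Usage: one peaked (relevant) coordinate, the others uniform on [0, 1].
rng = np.random.default_rng(0)
n, d = 1000, 4
X = rng.uniform(0.0, 1.0, size=(n, d))
X[:, 0] = rng.normal(0.5, 0.05, size=n)
h, f_hat = density_rodeo_sketch(X, x=np.full(d, 0.5))
print("selected bandwidths:", h)
print("density estimate at x:", f_hat)
```

Because the Gaussian kernel has unbounded support, the uniform coordinates in this toy example also shrink somewhat from boundary effects, so the output should be read qualitatively rather than as a reproduction of the paper's experiments.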

Original language: English (US)
Pages (from-to): 283-290
Number of pages: 8
Journal: Journal of Machine Learning Research
State: Published - 2007
Event: 11th International Conference on Artificial Intelligence and Statistics, AISTATS 2007 - San Juan, Puerto Rico
Duration: Mar 21 2007 to Mar 24 2007

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence

