Sparse nonparametric density estimation in high dimensions using the rodeo

Abstract
We consider the problem of estimating the joint density of a d-dimensional random vector X = (X_1, X_2, ..., X_d) when d is large. We assume that the density is a product of a parametric component and a nonparametric component that depends on an unknown subset of the variables. Using a modification of a recently developed nonparametric regression framework called rodeo (regularization of derivative expectation operator), we propose a method to greedily select bandwidths in a kernel density estimate. It is shown empirically that the density rodeo works well even for very high dimensional problems. When the unknown density function satisfies a suitably defined sparsity condition, and the parametric baseline density is smooth, the approach is shown to achieve near-optimal minimax rates of convergence, and thus avoids the curse of dimensionality.
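The greedy bandwidth selection described in the abstract can be sketched as follows. This is a minimal illustrative implementation under simplifying assumptions, not the authors' exact estimator: the function names (`rodeo_bandwidths`, `gaussian_kde_at`), the shrinkage factor `beta`, and the variance-based threshold `lam` are all choices made for the sketch. The idea it illustrates is the rodeo's: start with large bandwidths, test the derivative of the estimator with respect to each bandwidth, and shrink only the coordinates where that derivative is significantly nonzero, so irrelevant (flat) coordinates keep large bandwidths.

```python
import numpy as np

def gaussian_kde_at(x, data, h):
    """Product-Gaussian kernel density estimate at a single point x.

    data: (n, d) sample; h: (d,) per-coordinate bandwidths.
    """
    z = (x - data) / h                                  # (n, d)
    k = np.exp(-0.5 * z**2) / (np.sqrt(2 * np.pi) * h)  # per-dim kernels
    return np.mean(np.prod(k, axis=1))

def rodeo_bandwidths(x, data, h0=1.0, beta=0.9, n_iter=50):
    """Rodeo-style greedy bandwidth selection at a point x (sketch).

    Shrink h_j while the derivative statistic Z_j exceeds a crude
    threshold; freeze coordinates whose derivative looks like noise.
    """
    n, d = data.shape
    h = np.full(d, h0)
    active = np.ones(d, dtype=bool)
    for _ in range(n_iter):
        if not active.any():
            break
        z = (x - data) / h
        k = np.exp(-0.5 * z**2) / (np.sqrt(2 * np.pi) * h)
        prod_k = np.prod(k, axis=1)
        for j in np.where(active)[0]:
            # d/dh_j of the Gaussian product kernel is
            # ((z_j^2 - 1) / h_j) * K, so average that over the sample.
            terms = (z[:, j]**2 - 1) / h[j] * prod_k
            Zj = np.mean(terms)
            # Illustrative threshold: estimated std error times a
            # log factor (an assumption, simpler than the paper's).
            lam = np.std(terms) / np.sqrt(n) * np.sqrt(2 * np.log(n * d))
            if abs(Zj) > lam:
                h[j] *= beta        # derivative significant: shrink
            else:
                active[j] = False   # looks flat: freeze this coordinate
    return h
```

On data whose density is concentrated in coordinate 0 but flat in coordinate 1, the selected bandwidth for coordinate 0 ends up smaller than for coordinate 1, matching the sparsity behavior the abstract describes.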
| Original language | English (US) |
|---|---|
| Pages (from-to) | 283-290 |
| Number of pages | 8 |
| Journal | Journal of Machine Learning Research |
| Volume | 2 |
| State | Published - 2007 |
| Event | 11th International Conference on Artificial Intelligence and Statistics, AISTATS 2007; San Juan, Puerto Rico; Mar 21 2007 → Mar 24 2007 |
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Software
- Statistics and Probability
- Artificial Intelligence