## Abstract

We consider the problem of nonparametric regression, consisting of learning an arbitrary mapping f:X→Y from a data set of (x,y) pairs in which the y values are corrupted by noise of mean zero. This statistical task is known to be subject to a severe curse of dimensionality: if X⊂ ^{RD}, and if the only smoothness assumption on f is that it satisfies a Lipschitz condition, it is known that any estimator based on n data points will have an error rate (risk) of Ω(n- ^{2/(2+D)}). Here we present a tree-based regressor whose risk depends only on the doubling dimension of X, not on D. This notion of dimension generalizes two cases of contemporary interest: when X is a low-dimensional manifold, and when X is sparse. The tree is built using random hyperplanes as splitting criteria, building upon recent work of Dasgupta and Freund (2008) [5]; and we show that axis-parallel splits cannot achieve the same finite-sample rate of convergence.

Original language | English (US) |
---|---|

Pages (from-to) | 1496-1515 |

Number of pages | 20 |

Journal | Journal of Computer and System Sciences |

Volume | 78 |

Issue number | 5 |

DOIs | |

State | Published - Sep 2012 |

## All Science Journal Classification (ASJC) codes

- Theoretical Computer Science
- Computer Networks and Communications
- Computational Theory and Mathematics
- Applied Mathematics

## Keywords

- Manifold
- Nonparametric regression
- Notions of dimension
- Sparse data