Rates of Convergence of Nearest Neighbor Estimation Under Arbitrary Sampling

Sanjeev R. Kulkarni, Steven E. Posner

Research output: Contribution to journalArticle

67 Scopus citations

Abstract

Rates of convergence for nearest neighbor estimation are established in a general framework in terms of metric covering numbers of the underlying space. Our first result is to find explicit finite sample upper bounds for the classical independent and identically distributed (i.i.d.) random sampling problem in a separable metric space setting. The convergence rate is a function of the covering numbers of the support of the distribution. For example, for bounded subsets of ℝr, the convergence rate is O (1/n2/r). Our main result is to extend the problem to allow samples drawn from a completely arbitrary random process in a separable metric space and to examine the performance in terms of the individual sample sequences. We show that for every sequence of samples the asymptotic time-average of nearest neighbor risks equals twice the time-average of the conditional Bayes risks of the sequence. Finite sample upper bounds under arbitrary sampling are again obtained in terms of the covering numbers of the underlying space. In particular, for bounded subsets of ℝr the convergence rate of the time-averaged risk is O (1/n2/r). We then establish a consistency result for kn-nearest neighbor estimation under arbitrary sampling and prove a convergence rate matching established rates for i.i.d. sampling. Finally, we show how our arbitrary sampling results lead to some classical i.i.d. sampling results and in fact extend them to stationary sampling. Our framework and results are quite general while the proof techniques are surprisingly elementary.

Original languageEnglish (US)
Pages (from-to)1028-1039
Number of pages12
JournalIEEE Transactions on Information Theory
Volume41
Issue number4
DOIs
StatePublished - Jul 1995

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences

Keywords

  • Nearest neighbor
  • arbitrary sampling
  • consistency
  • covering numbers
  • deterministic sampling
  • metric entropy
  • nonparametric regression estimation
  • rates of convergence
  • worst case

Fingerprint Dive into the research topics of 'Rates of Convergence of Nearest Neighbor Estimation Under Arbitrary Sampling'. Together they form a unique fingerprint.

  • Cite this