Demystifying fixed k-nearest neighbor information estimators

Weihao Gao, Sewoong Oh, Pramod Viswanath

Research output: Chapter in Book/Report/Conference proceedingConference contribution

15 Scopus citations

Abstract

Estimating mutual information from i.i.d. samples drawn from an unknown joint density function is a basic statistical problem of broad interest with multitudinous applications. The most popular estimator is one proposed by Kraskov and Stogbauer and Grassberger (KSG) in 2004, and is nonparametric and based on the distances of each sample to its kth nearest neighboring sample, where k is a fixed small integer. Despite its widespread use (part of scientific software packages), theoretical properties of this estimator have been largely unexplored. In this paper we demonstrate that the estimator is consistent and also identify an upper bound on the rate of convergence of the ℓ2 error as a function of number of samples. We argue that the performance benefits of the KSG estimator stems from a curious 'correlation boosting' effect and build on this intuition to modify the KSG estimator in novel ways to construct a superior estimator. As a byproduct of our investigations, we obtain nearly tight rates of convergence of the ℓ2 error of the well known fixed k nearest neighbor estimator of differential entropy by Kozachenko and Leonenko.

Original languageEnglish (US)
Title of host publication2017 IEEE International Symposium on Information Theory, ISIT 2017
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1267-1271
Number of pages5
ISBN (Electronic)9781509040964
DOIs
StatePublished - Aug 9 2017
Externally publishedYes
Event2017 IEEE International Symposium on Information Theory, ISIT 2017 - Aachen, Germany
Duration: Jun 25 2017Jun 30 2017

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
ISSN (Print)2157-8095

Other

Other2017 IEEE International Symposium on Information Theory, ISIT 2017
Country/TerritoryGermany
CityAachen
Period6/25/176/30/17

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Demystifying fixed k-nearest neighbor information estimators'. Together they form a unique fingerprint.

Cite this