Abstract
Estimators of information theoretic measures, such as entropy and mutual information, are a basic workhorse for many downstream applications in modern data science. State-of-the-art approaches have been either geometric [nearest neighbor (NN)-based] or kernel-based (with a globally chosen bandwidth). In this paper, we combine both these approaches to design new estimators of entropy and mutual information that outperform the state-of-the-art methods. Our estimator uses local bandwidth choices of k-NN distances with a finite k, independent of the sample size. Such a local and data-dependent choice ameliorates boundary bias and improves performance in practice, but the bandwidth vanishes at a fast rate, leading to a non-vanishing bias. We show that the asymptotic bias of the proposed estimator is universal; it is independent of the underlying distribution. Hence, it can be precomputed and subtracted from the estimate. As a byproduct, we obtain a unified way of obtaining both the kernel and NN estimators. The corresponding theoretical contribution relating the asymptotic geometry of nearest neighbors to order statistics is of independent mathematical interest.
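The abstract describes the estimator only at a high level. As a point of reference, below is a minimal sketch of the classical fixed-k nearest-neighbor (Kozachenko-Leonenko) entropy estimator that this line of work builds on, not the paper's proposed local-bandwidth estimator. The function name `kl_entropy` and the default `k=4` are illustrative choices, not from the paper; note how the digamma terms act as a precomputed, distribution-independent bias correction for fixed k, exactly the kind of universal correction the abstract refers to.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(samples, k=4):
    """Kozachenko-Leonenko k-NN entropy estimate, in nats.

    A fixed k (independent of sample size) is used; the distance to the
    k-th nearest neighbor plays the role of a local, data-dependent
    bandwidth. Assumes samples are distinct (zero k-NN distances would
    make log(r) diverge).
    """
    x = np.asarray(samples, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape
    tree = cKDTree(x)
    # Query k+1 neighbors: the nearest neighbor of each point is itself.
    dist, _ = tree.query(x, k=k + 1)
    r = dist[:, -1]  # distance to the k-th nearest neighbor
    # log-volume of the unit ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1)
    log_c_d = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    # The digamma terms are the universal (distribution-free) bias correction.
    return digamma(n) - digamma(k) + log_c_d + d * np.mean(np.log(r))

if __name__ == "__main__":
    # Sanity check on a standard Gaussian: true entropy is
    # 0.5 * log(2 * pi * e) ~ 1.4189 nats.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((5000, 1))
    print(kl_entropy(x, k=4))
```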
| Field | Value |
|---|---|
| Original language | English (US) |
| Pages (from-to) | 3313-3330 |
| Number of pages | 18 |
| Journal | IEEE Transactions on Information Theory |
| Volume | 64 |
| Issue number | 5 |
| DOIs | |
| State | Published - May 2018 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences
Keywords
- Information theory
- density estimation
- robust algorithm
- information entropy
- mutual information
- nearest neighbor methods