Abstract
Estimators of information-theoretic measures, such as entropy and mutual information, are a basic workhorse for many downstream applications in modern data science. State-of-the-art approaches have been either geometric [nearest neighbor (NN)-based] or kernel-based (with a globally chosen bandwidth). In this paper, we combine both these approaches to design new estimators of entropy and mutual information that outperform the state-of-the-art methods. Our estimator uses local bandwidth choices given by k-NN distances with a finite k, independent of the sample size. Such a local and data-dependent choice ameliorates boundary bias and improves performance in practice, but the bandwidth vanishes at a fast rate, leading to a non-vanishing bias. We show that the asymptotic bias of the proposed estimator is universal; it is independent of the underlying distribution. Hence, it can be precomputed and subtracted from the estimate. As a byproduct, we obtain a unified way of obtaining both the kernel and NN estimators. The corresponding theoretical contribution relating the asymptotic geometry of nearest neighbors to order statistics is of independent mathematical interest.
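To illustrate the general idea described above, here is a minimal sketch, not the paper's exact estimator: a resubstitution entropy estimate in which each sample point's kernel bandwidth is set to its k-NN distance, with a constant subtracted at the end to account for the non-vanishing asymptotic bias. The function name `local_bandwidth_entropy`, the Gaussian kernel choice, and the placeholder constant `BIAS_CORRECTION` are illustrative assumptions; the paper's result is that the true correction is distribution-free and can be precomputed.

```python
# Sketch of a local-bandwidth (k-NN distance) kernel entropy estimator.
# Assumptions: Gaussian kernel, leave-one-out resubstitution, and a
# placeholder bias constant; see the paper for the actual estimator.
import numpy as np
from scipy.spatial import cKDTree

BIAS_CORRECTION = 0.0  # placeholder for the precomputed universal bias term


def local_bandwidth_entropy(x, k=5):
    """Differential entropy estimate (in nats) for samples x of shape (n, d)."""
    x = np.asarray(x, dtype=float)
    n, d = x.shape

    # Distance to the k-th nearest neighbor of each point
    # (query k+1 neighbors because the nearest one is the point itself).
    tree = cKDTree(x)
    knn_dist, _ = tree.query(x, k=k + 1)
    h = knn_dist[:, -1]  # local, data-dependent bandwidth

    log_dens = np.empty(n)
    for i in range(n):
        # Gaussian kernel density at x[i] with bandwidth h[i],
        # excluding the point itself (leave-one-out).
        diff = x - x[i]
        weights = np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * h[i] ** 2))
        weights[i] = 0.0
        dens = weights.sum() / ((n - 1) * (2.0 * np.pi * h[i] ** 2) ** (d / 2.0))
        log_dens[i] = np.log(dens)

    # Average negative log-density, then subtract the asymptotic bias.
    return -np.mean(log_dens) - BIAS_CORRECTION


if __name__ == "__main__":
    # Standard 2-D Gaussian: true entropy is log(2*pi*e) ≈ 2.8379 nats.
    rng = np.random.default_rng(0)
    samples = rng.standard_normal((2000, 2))
    print(local_bandwidth_entropy(samples, k=5))
```

Because the bandwidth h[i] shrinks at the k-NN rate rather than the usual kernel-smoothing rate, the raw average is biased; the point of the paper is that this bias converges to a constant that does not depend on the underlying density, so it can be computed once and subtracted as above.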
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 3313-3330 |
| Number of pages | 18 |
| Journal | IEEE Transactions on Information Theory |
| Volume | 64 |
| Issue number | 5 |
| DOIs | |
| State | Published - May 2018 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Library and Information Sciences
Keywords
- Information theory
- density estimation robust algorithm
- information entropy
- mutual information
- nearest neighbor methods