TY - GEN
T1 - Efficient K-nearest neighbor graph construction for generic similarity measures
AU - Dong, Wei
AU - Charikar, Moses
AU - Li, Kai
PY - 2011
Y1 - 2011
N2 - K-Nearest Neighbor Graph (K-NNG) construction is an important operation with many web related applications, including collaborative filtering, similarity search, and many others in data mining and machine learning. Existing methods for K-NNG construction either do not scale, or are specific to certain similarity measures. We present NN-Descent, a simple yet efficient algorithm for approximate K-NNG construction with arbitrary similarity measures. Our method is based on local search, has minimal space overhead and does not rely on any shared global index. Hence, it is especially suitable for large-scale applications where data structures need to be distributed over the network. We have shown with a variety of datasets and similarity measures that the proposed method typically converges to above 90% recall with each point comparing only to several percent of the whole dataset on average.
AB - K-Nearest Neighbor Graph (K-NNG) construction is an important operation with many web related applications, including collaborative filtering, similarity search, and many others in data mining and machine learning. Existing methods for K-NNG construction either do not scale, or are specific to certain similarity measures. We present NN-Descent, a simple yet efficient algorithm for approximate K-NNG construction with arbitrary similarity measures. Our method is based on local search, has minimal space overhead and does not rely on any shared global index. Hence, it is especially suitable for large-scale applications where data structures need to be distributed over the network. We have shown with a variety of datasets and similarity measures that the proposed method typically converges to above 90% recall with each point comparing only to several percent of the whole dataset on average.
KW - Arbitrary similarity measure
KW - Iterative method
KW - K-nearest neighbor graph
UR - http://www.scopus.com/inward/record.url?scp=84862193372&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862193372&partnerID=8YFLogxK
U2 - 10.1145/1963405.1963487
DO - 10.1145/1963405.1963487
M3 - Conference contribution
AN - SCOPUS:84862193372
SN - 9781450306324
T3 - Proceedings of the 20th International Conference on World Wide Web, WWW 2011
SP - 577
EP - 586
BT - Proceedings of the 20th International Conference on World Wide Web, WWW 2011
T2 - 20th International Conference on World Wide Web, WWW 2011
Y2 - 28 March 2011 through 1 April 2011
ER -