Efficient K-nearest neighbor graph construction for generic similarity measures

Wei Dong, Moses Charikar, Kai Li

Research output: Chapter in Book/Report/Conference proceedingConference contribution

238 Scopus citations

Abstract

K-Nearest Neighbor Graph (K-NNG) construction is an important operation with many web related applications, including collaborative filtering, similarity search, and many others in data mining and machine learning. Existing methods for K-NNG construction either do not scale, or are specific to certain similarity measures. We present NN-Descent, a simple yet efficient algorithm for approximate K-NNG construction with arbitrary similarity measures. Our method is based on local search, has minimal space overhead and does not rely on any shared global index. Hence, it is especially suitable for large-scale applications where data structures need to be distributed over the network. We have shown with a variety of datasets and similarity measures that the proposed method typically converges to above 90% recall with each point comparing only to several percent of the whole dataset on average.

Original languageEnglish (US)
Title of host publicationProceedings of the 20th International Conference on World Wide Web, WWW 2011
Pages577-586
Number of pages10
DOIs
StatePublished - Dec 1 2011
Event20th International Conference on World Wide Web, WWW 2011 - Hyderabad, India
Duration: Mar 28 2011Apr 1 2011

Publication series

NameProceedings of the 20th International Conference on World Wide Web, WWW 2011

Other

Other20th International Conference on World Wide Web, WWW 2011
CountryIndia
CityHyderabad
Period3/28/114/1/11

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications

Keywords

  • Arbitrary similarity measure
  • Iterative method
  • K-nearest neighbor graph

Fingerprint Dive into the research topics of 'Efficient K-nearest neighbor graph construction for generic similarity measures'. Together they form a unique fingerprint.

Cite this