Intelligent probing for locality sensitive hashing: Multi-probe LSH and beyond

Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li

Research output: Contribution to journalConference articlepeer-review

14 Scopus citations

Abstract

The past decade has been marked by the (continued) explosion of diverse data content and the fast development of intelligent data analytics techniques. One problem we identified in the mid-2000s was similarity search of feature-rich data. The challenge here was achieving both high accuracy and high effciency in high-dimensional spaces. Locality sensitive hashing (LSH), which uses certain random space partitions and hash table lookups to find approximate nearest neighbors, was a promising approach with theoretical guarantees. But LSH alone was insuffcient since a large number of hash tables were required to achieve good search quality. Building on an idea of Panigrahy, our multi-probe LSH method introduced the idea of intelligent probing. Given a query object, we strategically probe its neighboring hash buckets (in a query-dependent fashion) by calculating the statistical probabilities of similar objects falling into each bucket. Such intelligent probing can significantly reduce the number of hash tables while achieving high quality. In this paper, we revisit the problem motivation, the challenges, the key design considerations of multi-probe LSH, as well as discuss recent developments in this space and some questions for further research.

Original languageEnglish (US)
Pages (from-to)2021-2024
Number of pages4
JournalProceedings of the VLDB Endowment
Volume10
Issue number12
DOIs
StatePublished - Aug 1 2017
Event43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany
Duration: Aug 28 2017Sep 1 2017

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • General Computer Science

Fingerprint

Dive into the research topics of 'Intelligent probing for locality sensitive hashing: Multi-probe LSH and beyond'. Together they form a unique fingerprint.

Cite this