Intelligent probing for locality sensitive hashing

Multi-probe LSH and beyond

Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li

Research output: Contribution to journalConference article

1 Citation (Scopus)

Abstract

The past decade has been marked by the (continued) explosion of diverse data content and the fast development of intelligent data analytics techniques. One problem we identified in the mid-2000s was similarity search of feature-rich data. The challenge here was achieving both high accuracy and high effciency in high-dimensional spaces. Locality sensitive hashing (LSH), which uses certain random space partitions and hash table lookups to find approximate nearest neighbors, was a promising approach with theoretical guarantees. But LSH alone was insuffcient since a large number of hash tables were required to achieve good search quality. Building on an idea of Panigrahy, our multi-probe LSH method introduced the idea of intelligent probing. Given a query object, we strategically probe its neighboring hash buckets (in a query-dependent fashion) by calculating the statistical probabilities of similar objects falling into each bucket. Such intelligent probing can significantly reduce the number of hash tables while achieving high quality. In this paper, we revisit the problem motivation, the challenges, the key design considerations of multi-probe LSH, as well as discuss recent developments in this space and some questions for further research.

Original languageEnglish (US)
Pages (from-to)2021-2024
Number of pages4
JournalProceedings of the VLDB Endowment
Volume10
Issue number12
DOIs
StatePublished - Aug 1 2017
Event43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany
Duration: Aug 28 2017Sep 1 2017

Fingerprint

Explosions
Nearest neighbor search

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Computer Science(all)

Cite this

Lv, Qin ; Josephson, William ; Wang, Zhe ; Charikar, Moses ; Li, Kai. / Intelligent probing for locality sensitive hashing : Multi-probe LSH and beyond. In: Proceedings of the VLDB Endowment. 2017 ; Vol. 10, No. 12. pp. 2021-2024.
@article{96e9a93851124dc8b27624e7c6b43a9b,
title = "Intelligent probing for locality sensitive hashing: Multi-probe LSH and beyond",
abstract = "The past decade has been marked by the (continued) explosion of diverse data content and the fast development of intelligent data analytics techniques. One problem we identified in the mid-2000s was similarity search of feature-rich data. The challenge here was achieving both high accuracy and high effciency in high-dimensional spaces. Locality sensitive hashing (LSH), which uses certain random space partitions and hash table lookups to find approximate nearest neighbors, was a promising approach with theoretical guarantees. But LSH alone was insuffcient since a large number of hash tables were required to achieve good search quality. Building on an idea of Panigrahy, our multi-probe LSH method introduced the idea of intelligent probing. Given a query object, we strategically probe its neighboring hash buckets (in a query-dependent fashion) by calculating the statistical probabilities of similar objects falling into each bucket. Such intelligent probing can significantly reduce the number of hash tables while achieving high quality. In this paper, we revisit the problem motivation, the challenges, the key design considerations of multi-probe LSH, as well as discuss recent developments in this space and some questions for further research.",
author = "Qin Lv and William Josephson and Zhe Wang and Moses Charikar and Kai Li",
year = "2017",
month = "8",
day = "1",
doi = "10.14778/3137765.3137836",
language = "English (US)",
volume = "10",
pages = "2021--2024",
journal = "Proceedings of the VLDB Endowment",
issn = "2150-8097",
publisher = "Very Large Data Base Endowment Inc.",
number = "12",

}

Intelligent probing for locality sensitive hashing : Multi-probe LSH and beyond. / Lv, Qin; Josephson, William; Wang, Zhe; Charikar, Moses; Li, Kai.

In: Proceedings of the VLDB Endowment, Vol. 10, No. 12, 01.08.2017, p. 2021-2024.

Research output: Contribution to journalConference article

TY - JOUR

T1 - Intelligent probing for locality sensitive hashing

T2 - Multi-probe LSH and beyond

AU - Lv, Qin

AU - Josephson, William

AU - Wang, Zhe

AU - Charikar, Moses

AU - Li, Kai

PY - 2017/8/1

Y1 - 2017/8/1

N2 - The past decade has been marked by the (continued) explosion of diverse data content and the fast development of intelligent data analytics techniques. One problem we identified in the mid-2000s was similarity search of feature-rich data. The challenge here was achieving both high accuracy and high effciency in high-dimensional spaces. Locality sensitive hashing (LSH), which uses certain random space partitions and hash table lookups to find approximate nearest neighbors, was a promising approach with theoretical guarantees. But LSH alone was insuffcient since a large number of hash tables were required to achieve good search quality. Building on an idea of Panigrahy, our multi-probe LSH method introduced the idea of intelligent probing. Given a query object, we strategically probe its neighboring hash buckets (in a query-dependent fashion) by calculating the statistical probabilities of similar objects falling into each bucket. Such intelligent probing can significantly reduce the number of hash tables while achieving high quality. In this paper, we revisit the problem motivation, the challenges, the key design considerations of multi-probe LSH, as well as discuss recent developments in this space and some questions for further research.

AB - The past decade has been marked by the (continued) explosion of diverse data content and the fast development of intelligent data analytics techniques. One problem we identified in the mid-2000s was similarity search of feature-rich data. The challenge here was achieving both high accuracy and high effciency in high-dimensional spaces. Locality sensitive hashing (LSH), which uses certain random space partitions and hash table lookups to find approximate nearest neighbors, was a promising approach with theoretical guarantees. But LSH alone was insuffcient since a large number of hash tables were required to achieve good search quality. Building on an idea of Panigrahy, our multi-probe LSH method introduced the idea of intelligent probing. Given a query object, we strategically probe its neighboring hash buckets (in a query-dependent fashion) by calculating the statistical probabilities of similar objects falling into each bucket. Such intelligent probing can significantly reduce the number of hash tables while achieving high quality. In this paper, we revisit the problem motivation, the challenges, the key design considerations of multi-probe LSH, as well as discuss recent developments in this space and some questions for further research.

UR - http://www.scopus.com/inward/record.url?scp=85036657889&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85036657889&partnerID=8YFLogxK

U2 - 10.14778/3137765.3137836

DO - 10.14778/3137765.3137836

M3 - Conference article

VL - 10

SP - 2021

EP - 2024

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

SN - 2150-8097

IS - 12

ER -