Data-Dependent hashing via nonlinear spectral gaps

Alexandr Andoni, Assaf Naor, Aleksandar Nikolov, Ilya Razenshteyn, Erik Waingarten

Research output: Chapter in Book/Report/Conference proceedingConference contribution

20 Scopus citations

Abstract

We establish a generic reduction from nonlinear spectral gaps of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near Neighbor Search problem (ANN) under various distance functions. Using this reduction, we obtain the following results: For general d-dimensional normed spaces and n-point datasets, we obtain a cell-probe ANN data structure with approximation O(log ε2 d), space dO(1)n1+ε, and dO(1)nε cell probes per query, for any > 0. No non-trivial approximation was known before in this generality other than the O(d) bound which follows from embedding a general norm into ℓ2. For ℓp and Schatten-p norms, we improve the data structure further, to obtain approximation O(p) and sublinear query time. For ℓp, this improves upon the previous best approximation 2O(p) (which required polynomial as opposed to near-linear in n space). For the Schatten-p norm, no non-trivial ANN data structure was known before this work. Previous approaches to the ANN problem either exploit the low dimensionality of a metric, requiring space exponential in the dimension, or circumvent the curse of dimensionality by embedding a metric into a “tractable” space, such as ℓ1. Our new generic reduction proceeds differently from both of these approaches using a novel partitioning method.

Original languageEnglish (US)
Title of host publicationSTOC 2018 - Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing
EditorsMonika Henzinger, David Kempe, Ilias Diakonikolas
PublisherAssociation for Computing Machinery
Pages685-698
Number of pages14
ISBN (Electronic)9781450355599
DOIs
StatePublished - Jun 20 2018
Event50th Annual ACM Symposium on Theory of Computing, STOC 2018 - Los Angeles, United States
Duration: Jun 25 2018Jun 29 2018

Publication series

NameProceedings of the Annual ACM Symposium on Theory of Computing
ISSN (Print)0737-8017

Other

Other50th Annual ACM Symposium on Theory of Computing, STOC 2018
Country/TerritoryUnited States
CityLos Angeles
Period6/25/186/29/18

All Science Journal Classification (ASJC) codes

  • Software

Keywords

  • Locality-sensitive hashing
  • Nearest neighbor search
  • Nonlinear spectral gaps
  • Randomized space partitions

Fingerprint

Dive into the research topics of 'Data-Dependent hashing via nonlinear spectral gaps'. Together they form a unique fingerprint.

Cite this