Image similarity search with compact data structures

Qin Lv, Moses Charikar, Kai Li

Research output: Contribution to conferencePaperpeer-review

77 Scopus citations


The recent theoretical advances on compact data structures (also called "sketches" ) have raised the question of whether they can effectively be applied to content-based image retrieval systems. The main challenge is to derive an algorithm that achieves high-quality similarity searches while using compact metadata. This paper proposes a new similarity search method consisting of three parts. The first is a new region feature representation with weighted L1 distance function, and EMD* match, an improved EMD match, to compute image similarity. The second is a thresholding and transformation algorithm to convert feature vectors into very compact data structures. The third is an EMD embedding based filtering method to speed up the query process. We have implemented a prototype system with the proposed method and performed experiments with a 10,000 image database. Our results show that the proposed method can achieve more effective similarity searches than previous approaches with metadata 3 to 72 times more compact than previous systems. The experiments also show that our EMD embedding based filtering technique can speed up the query process by a factor of 5 or more with little loss in query effectiveness.

Original languageEnglish (US)
Number of pages10
StatePublished - 2004
Externally publishedYes
EventCIKM 2004: Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management - Washington, DC, United States
Duration: Nov 8 2004Nov 13 2004


OtherCIKM 2004: Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management
Country/TerritoryUnited States
CityWashington, DC

All Science Journal Classification (ASJC) codes

  • General Decision Sciences
  • General Business, Management and Accounting


  • Compact data structures
  • Image similarity
  • Search


Dive into the research topics of 'Image similarity search with compact data structures'. Together they form a unique fingerprint.

Cite this