Efficient filtering with sketches in the ferret toolkit

Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations

Abstract

Ferret is a toolkit for building content-based similarity search systems for feature-rich data types such as audio, video, and digital photos. The key component of this toolkit is a content-based similarity search engine for generic, multi-feature object representations. This paper describes the filtering mechanism used in the Ferret toolkit and presents experimental results with several datasets. The filtering mechanism uses approximation algorithms to generate a candidate set, and then ranks the objects in the candidate set with a more sophisticated multi-feature distance measure. The paper compares two filtering methods: one using segment feature vectors, and one using sketches constructed from segment feature vectors. Our experimental results show that filtering can substantially speed up the search process and reduce memory requirements while maintaining good search quality. To help system designers choose the filtering parameters, we have developed a rank-based analytical model for the filtering algorithm using sketches. Our experiments show that the model gives conservative and good predictions for different datasets.
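The filter-then-rank idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper or from the Ferret code: each object is reduced to a single feature vector (the abstract speaks of segment feature vectors and multi-feature objects), the sketch is a simple random-threshold bit string, Hamming distance serves as the cheap filter, and plain L1 distance stands in for the more sophisticated multi-feature distance measure. The names `make_sketcher`, `hamming`, `l1`, and `filtered_search` are hypothetical.

```python
import random


def make_sketcher(dim, n_bits, lo=0.0, hi=1.0, seed=0):
    """Return a function that maps a feature vector to an n_bits-bit integer.

    Assumption: each bit compares one randomly chosen coordinate against a
    random threshold in [lo, hi]. This is an illustrative construction only.
    """
    rng = random.Random(seed)
    probes = [(rng.randrange(dim), rng.uniform(lo, hi)) for _ in range(n_bits)]

    def sketch(vec):
        bits = 0
        for i, threshold in probes:
            bits = (bits << 1) | (1 if vec[i] >= threshold else 0)
        return bits

    return sketch


def hamming(a, b):
    """Number of differing bits between two integer sketches."""
    return bin(a ^ b).count("1")


def l1(u, v):
    """Exact L1 distance, a stand-in for the expensive ranking distance."""
    return sum(abs(x - y) for x, y in zip(u, v))


def filtered_search(query, vectors, sketches, sketch, n_candidates, k):
    """Filter by sketch distance, then rank the candidates exactly."""
    q = sketch(query)
    # Filtering step: keep the ids whose sketches are closest to the query's.
    by_sketch = sorted(range(len(vectors)), key=lambda i: hamming(q, sketches[i]))
    candidates = by_sketch[:n_candidates]
    # Ranking step: apply the exact distance only to the candidate set.
    return sorted(candidates, key=lambda i: l1(query, vectors[i]))[:k]
```

In this toy setup only the compact sketches are scanned for every object, and the full vectors are touched only for the `n_candidates` survivors. The candidate set size is the kind of filtering parameter the abstract says the analytical model helps to choose.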

Original language: English (US)
Title of host publication: Proceedings of the 8th ACM Multimedia International Workshop on Multimedia Information Retrieval, MIR 2006
Pages: 279-288
Number of pages: 10
State: Published - 2006
Event: 8th ACM Multimedia International Workshop on Multimedia Information Retrieval, MIR 2006, co-located with the 2006 ACM International Multimedia Conferenc - Santa Barbara, CA, United States
Duration: Oct 26 2006 → Oct 27 2006

Publication series

Name: Proceedings of the ACM International Multimedia Conference and Exhibition

Other

Other: 8th ACM Multimedia International Workshop on Multimedia Information Retrieval, MIR 2006, co-located with the 2006 ACM International Multimedia Conferenc
Country/Territory: United States
City: Santa Barbara, CA
Period: 10/26/06 → 10/27/06

All Science Journal Classification (ASJC) codes

  • General Computer Science

Keywords

  • Feature-rich data
  • Filtering
  • Similarity search
  • Sketch
  • Toolkit
