Learning globally-consistent local distance functions for shape-based image retrieval and classification

Andrea Frome, Fei Sha, Yoram Singer, Jitendra Malik

Research output: Contribution to conference (Paper)

250 Scopus citations

Abstract

We address the problem of visual category recognition by learning an image-to-image distance function that attempts to satisfy the following property: the distance between images from the same category should be less than the distance between images from different categories. We use patch-based feature vectors common in object recognition work as a basis for our image-to-image distance functions. Our large-margin formulation for learning the distance functions is similar to formulations used in the machine learning literature on distance metric learning; however, we differ in that we learn local distance functions - a different parameterized function for every image of our training set - whereas typically a single global distance function is learned. This was a novel approach first introduced in Frome, Singer, & Malik, NIPS 2006. In that work we learned the local distance functions independently, and the outputs of these functions could not be compared at test time without the use of additional heuristics or training. Here we introduce a different approach that has the advantage that it learns distance functions that are globally consistent in that they can be directly compared for purposes of retrieval and classification. The output of the learning algorithm is a set of weights assigned to the image features, which is intuitively appealing in the computer vision setting: some features are more salient than others, and which are more salient depends on the category, or image, being considered. We train and test using the Caltech 101 object recognition benchmark. Using fifteen training images per category, we achieved a mean recognition rate of 63.2%, and using twenty images per category, a rate of 66.6%.
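The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea under stated assumptions: the distance from a focal image to another image is taken to be a non-negative weighted sum of per-patch elementary distances, and the weights are nudged with a hinge (large-margin) penalty until a different-category image is farther away than a same-category image. All function names, the toy numbers, the margin of 1.0, and the projected subgradient update are illustrative assumptions, not the authors' algorithm.

```python
# Hypothetical sketch: names, toy data, and the update rule are assumptions,
# not taken from the paper.

def weighted_distance(weights, elementary_distances):
    """Distance from a focal image to another image, assumed here to be a
    weighted sum of per-feature (per-patch) elementary distances."""
    return sum(w * d for w, d in zip(weights, elementary_distances))


def triplet_hinge_loss(weights, d_same, d_diff, margin=1.0):
    """Hinge penalty that is zero only when the different-category image is
    farther from the focal image than the same-category image by `margin`."""
    gap = weighted_distance(weights, d_diff) - weighted_distance(weights, d_same)
    return max(0.0, margin - gap)


def subgradient_step(weights, d_same, d_diff, lr=0.1, margin=1.0):
    """One projected subgradient step on the hinge loss, keeping weights >= 0."""
    if triplet_hinge_loss(weights, d_same, d_diff, margin) == 0.0:
        return list(weights)  # constraint already satisfied, nothing to do
    # When the hinge is active, the loss gradient per weight is
    # (d_same - d_diff); step against it, then clip at zero.
    updated = [w - lr * (ds - dd) for w, ds, dd in zip(weights, d_same, d_diff)]
    return [max(0.0, w) for w in updated]


# Toy focal image with three patch features (made-up numbers).
weights = [1.0, 1.0, 1.0]
d_same = [0.2, 0.9, 0.4]  # elementary distances to a same-category image
d_diff = [0.8, 0.3, 0.5]  # elementary distances to a different-category image

loss_before = triplet_hinge_loss(weights, d_same, d_diff)
for _ in range(50):
    weights = subgradient_step(weights, d_same, d_diff)
loss_after = triplet_hinge_loss(weights, d_same, d_diff)
```

In this toy run the first feature, which separates the two images well, ends up with a larger weight than the second, which does not. That mirrors the intuition in the abstract that some features are more salient than others for a given image.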

Original language: English (US)
DOI: https://doi.org/10.1109/ICCV.2007.4408839
State: Published - Dec 1 2007
Externally published: Yes
Event: 2007 IEEE 11th International Conference on Computer Vision, ICCV - Rio de Janeiro, Brazil
Duration: Oct 14 2007 to Oct 21 2007

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition

  • Cite this

    Frome, A., Sha, F., Singer, Y., & Malik, J. (2007). Learning globally-consistent local distance functions for shape-based image retrieval and classification. Paper presented at 2007 IEEE 11th International Conference on Computer Vision, ICCV, Rio de Janeiro, Brazil. https://doi.org/10.1109/ICCV.2007.4408839