Predicting Entry-Level Categories

Vicente Ordonez, Wei Liu, Jia Deng, Yejin Choi, Alexander C. Berg, Tamara L. Berg

Research output: Contribution to journal (Article)

12 Scopus citations

Abstract

Entry-level categories—the labels people use to name an object—were originally defined and studied by psychologists in the 1970s and 1980s. In this paper we extend these ideas to study entry-level categories at a larger scale and to learn models that can automatically predict entry-level categories for images. Our models combine visual recognition predictions with linguistic resources like WordNet and proxies for word “naturalness” mined from the enormous amount of text on the web. We demonstrate the usefulness of our models for predicting nouns (entry-level words) associated with images by people, and for learning mappings between concepts predicted by existing visual recognition systems and entry-level concepts. In this work we make use of recent successful efforts on convolutional network models for visual recognition by training classifiers for 7404 object categories on ConvNet activation features. Results for category mapping and entry-level category prediction for images show promise for producing more natural human-like labels. We also demonstrate the potential applicability of our results to the task of image description generation.
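As an illustration only, the idea of combining a taxonomy such as WordNet with a "naturalness" proxy can be sketched as a trade-off: starting from a specific leaf concept, prefer an ancestor that is common in text but not too many steps away. The sketch below is not the authors' model. The hypernym chain, the counts, the `entry_level` function, and the `lam` weight are all hypothetical stand-ins chosen for this example.

```python
import math

# Hypothetical, simplified hypernym chain from a leaf concept toward the root.
# This is NOT real WordNet data.
HYPERNYM_CHAIN = [
    "grampus griseus", "dolphin", "toothed whale", "whale", "mammal", "animal",
]

# Hypothetical text-frequency counts standing in for a "naturalness" proxy.
TOY_COUNTS = {
    "grampus griseus": 1_000,
    "dolphin": 5_000_000,
    "toothed whale": 20_000,
    "whale": 6_000_000,
    "mammal": 4_000_000,
    "animal": 50_000_000,
}


def entry_level(chain, counts, lam):
    """Return the word in `chain` maximizing log(count) - lam * hops from the leaf.

    `lam` (an assumed parameter) controls how strongly we penalize moving
    away from the specific leaf concept toward more general ancestors.
    """
    def score(item):
        hops, word = item
        # Unseen words get a count of 1, i.e. a log-frequency of 0.
        return math.log(counts.get(word, 1)) - lam * hops

    _, best_word = max(enumerate(chain), key=score)
    return best_word


if __name__ == "__main__":
    for lam in (0.0, 1.0, 20.0):
        print(lam, entry_level(HYPERNYM_CHAIN, TOY_COUNTS, lam))
```

With these toy numbers, a weight of zero always favors the most frequent word regardless of how general it is, while a very large weight keeps the original leaf label; intermediate values select a frequent word close to the leaf.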

Original language: English (US)
Pages (from-to): 29-43
Number of pages: 15
Journal: International Journal of Computer Vision
Volume: 115
Issue number: 1
DOIs: https://doi.org/10.1007/s11263-015-0815-z
State: Published - Oct 29 2015
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Keywords

  • Categorization
  • Entry-level categories
  • Psychology
  • Recognition

Cite this

Ordonez, V., Liu, W., Deng, J., Choi, Y., Berg, A. C., & Berg, T. L. (2015). Predicting Entry-Level Categories. International Journal of Computer Vision, 115(1), 29-43. https://doi.org/10.1007/s11263-015-0815-z