What does classifying more than 10,000 image categories tell us?

Jia Deng, Alexander C. Berg, Kai Li, Li Fei-Fei

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

206 Scopus citations

Abstract

Image classification is a critical task for both humans and computers. One of the challenges lies in the large scale of the semantic space. In particular, humans can recognize tens of thousands of object classes and scenes. No computer vision algorithm today has been tested at this scale. This paper presents a study of large scale categorization including a series of challenging experiments on classification with more than 10,000 image classes. We find that a) computational issues become crucial in algorithm design; b) conventional wisdom from a couple of hundred image categories on relative performance of different classifiers does not necessarily hold when the number of categories increases; c) there is a surprisingly strong relationship between the structure of WordNet (developed for studying language) and the difficulty of visual categorization; d) classification can be improved by exploiting the semantic hierarchy. Toward the future goal of developing automatic vision algorithms to recognize tens of thousands or even millions of image categories, we make a series of observations and arguments about dataset scale, category density, and image hierarchy.

Original language: English (US)
Title of host publication: Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings
Publisher: Springer Verlag
Pages: 71-84
Number of pages: 14
Edition: PART 5
ISBN (Print): 3642155545, 9783642155543
DOIs: https://doi.org/10.1007/978-3-642-15555-0_6
State: Published - Jan 1 2010
Event: 11th European Conference on Computer Vision, ECCV 2010 - Heraklion, Crete, Greece
Duration: Sep 10 2010 to Sep 11 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 5
Volume: 6315 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th European Conference on Computer Vision, ECCV 2010
Country: Greece
City: Heraklion, Crete
Period: 9/10/10 to 9/11/10

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Deng, J., Berg, A. C., Li, K., & Fei-Fei, L. (2010). What does classifying more than 10,000 image categories tell us? In Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings (PART 5 ed., pp. 71-84). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6315 LNCS, No. PART 5). Springer Verlag. https://doi.org/10.1007/978-3-642-15555-0_6