TY - GEN
T1 - What does classifying more than 10,000 image categories tell us?
AU - Deng, Jia
AU - Berg, Alexander C.
AU - Li, Kai
AU - Fei-Fei, Li
PY - 2010
Y1 - 2010
N2 - Image classification is a critical task for both humans and computers. One of the challenges lies in the large scale of the semantic space. In particular, humans can recognize tens of thousands of object classes and scenes. No computer vision algorithm today has been tested at this scale. This paper presents a study of large scale categorization including a series of challenging experiments on classification with more than 10,000 image classes. We find that a) computational issues become crucial in algorithm design; b) conventional wisdom from a couple of hundred image categories on relative performance of different classifiers does not necessarily hold when the number of categories increases; c) there is a surprisingly strong relationship between the structure of WordNet (developed for studying language) and the difficulty of visual categorization; d) classification can be improved by exploiting the semantic hierarchy. Toward the future goal of developing automatic vision algorithms to recognize tens of thousands or even millions of image categories, we make a series of observations and arguments about dataset scale, category density, and image hierarchy.
AB - Image classification is a critical task for both humans and computers. One of the challenges lies in the large scale of the semantic space. In particular, humans can recognize tens of thousands of object classes and scenes. No computer vision algorithm today has been tested at this scale. This paper presents a study of large scale categorization including a series of challenging experiments on classification with more than 10,000 image classes. We find that a) computational issues become crucial in algorithm design; b) conventional wisdom from a couple of hundred image categories on relative performance of different classifiers does not necessarily hold when the number of categories increases; c) there is a surprisingly strong relationship between the structure of WordNet (developed for studying language) and the difficulty of visual categorization; d) classification can be improved by exploiting the semantic hierarchy. Toward the future goal of developing automatic vision algorithms to recognize tens of thousands or even millions of image categories, we make a series of observations and arguments about dataset scale, category density, and image hierarchy.
UR - http://www.scopus.com/inward/record.url?scp=78149302207&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149302207&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-15555-0_6
DO - 10.1007/978-3-642-15555-0_6
M3 - Conference contribution
AN - SCOPUS:78149302207
SN - 3642155545
SN - 9783642155543
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 71
EP - 84
BT - Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings
PB - Springer Verlag
T2 - 11th European Conference on Computer Vision, ECCV 2010
Y2 - 10 September 2010 through 11 September 2010
ER -