Mining semantic affordances of visual object categories

Yu Wei Chao, Zhan Wang, Rada Mihalcea, Jia Deng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

25 Scopus citations

Abstract

Affordances are fundamental attributes of objects. Affordances reveal the functionalities of objects and the possible actions that can be performed on them. Understanding affordances is crucial for recognizing human activities in visual data and for robots to interact with the world. In this paper we introduce the new problem of mining the knowledge of semantic affordance: given an object, determining whether an action can be performed on it. This is equivalent to connecting verb nodes and noun nodes in WordNet, or filling an affordance matrix encoding the plausibility of each action-object pair. We introduce a new benchmark with crowdsourced ground truth affordances on 20 PASCAL VOC object classes and 957 action classes. We explore a number of approaches including text mining, visual mining, and collaborative filtering. Our analyses yield a number of significant insights that reveal the most effective ways of collecting knowledge of semantic affordances.
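The affordance matrix and the collaborative-filtering idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which collaborative-filtering variant the authors used, so the neighbor-based scheme below is only an assumed stand-in. The action and object names, the matrix values, and the helper functions `column`, `cosine`, and `predict` are all hypothetical and are not taken from the benchmark.

```python
from math import sqrt

# Toy affordance matrix: rows are actions (verbs), columns are objects (nouns).
# 1 = plausible, 0 = implausible, None = unknown.
# All values here are illustrative assumptions, not benchmark data.
ACTIONS = ["ride", "feed", "drive", "pet"]
OBJECTS = ["horse", "dog", "car", "bicycle"]
MATRIX = [
    # horse, dog, car, bicycle
    [1, 0, 0, 1],     # ride
    [1, 1, 0, 0],     # feed
    [0, 0, 1, 0],     # drive
    [None, 1, 0, 0],  # pet (pet-horse is the unknown entry)
]


def column(matrix, j):
    """Return the affordance vector of object j across all actions."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine similarity over positions where both vectors are observed."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    dot = sum(a * b for a, b in pairs)
    norm_u = sqrt(sum(a * a for a, _ in pairs))
    norm_v = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(matrix, i, j):
    """Estimate entry (action i, object j) as a similarity-weighted
    average of the labels that other objects have for action i."""
    target = column(matrix, j)
    num = den = 0.0
    for k in range(len(matrix[0])):
        if k == j or matrix[i][k] is None:
            continue
        weight = cosine(target, column(matrix, k))
        num += weight * matrix[i][k]
        den += weight
    return num / den if den else 0.0


if __name__ == "__main__":
    score = predict(MATRIX, ACTIONS.index("pet"), OBJECTS.index("horse"))
    print(f"plausibility of pet-horse: {score:.2f}")
```

In this toy setting the unknown pet-horse entry is scored from the objects whose known affordances resemble those of the horse. A real system would work with the full 957 x 20 matrix and could combine such scores with text-mining or visual-mining signals.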

Original language: English (US)
Title of host publication: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Publisher: IEEE Computer Society
Pages: 4259-4267
Number of pages: 9
ISBN (Electronic): 9781467369640
State: Published - Oct 14 2015
Externally published: Yes
Event: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 - Boston, United States
Duration: Jun 7 2015 → Jun 12 2015

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 07-12-June-2015
ISSN (Print): 1063-6919

Other

Other: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Country: United States
City: Boston
Period: 6/7/15 → 6/12/15

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
