The more you look, the more you see: Towards general object understanding through recursive refinement

Jingyan Wang, Olga Russakovsky, Deva Ramanan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Comprehensive object understanding is a central challenge in visual recognition, yet most advances with deep neural networks reason about each aspect in isolation. In this work, we present a unified framework to tackle this broader object understanding problem. We formalize a refinement module that recursively develops understanding across space and semantics: 'the more it looks, the more it sees.' More concretely, we cluster the objects within each semantic category into fine-grained subcategories; our recursive model extracts features for each region of interest, recursively predicts the location and the content of the region, and selectively chooses a small subset of the regions to process in the next step. Our model can quickly determine if an object is present, followed by its class ('Is this a person?'), and finally report fine-grained predictions ('Is this person standing?'). Our experiments demonstrate the advantages of joint reasoning about spatial layout and fine-grained semantics. On the PASCAL VOC dataset, our proposed model simultaneously achieves strong performance on instance segmentation, part segmentation and keypoint detection in a single efficient pipeline that does not require explicit training for each task. One of the reasons for our strong performance is the ability to naturally leverage highly-engineered architectures, such as Faster-RCNN, within our pipeline. Source code is available at https://github.com/jingyanw/recursive-refinement.
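
The loop described in the abstract (predict location and content per region, then keep a small subset for the next step) can be illustrated with a toy sketch. This is not the authors' implementation: the names `Region`, `recursive_refine`, `toy_predict`, and `LEVELS`, the score rule, and the box-shrinking step are all hypothetical stand-ins chosen only to show the control flow; the actual model uses learned networks and is available at the repository linked above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Region:
    box: Box
    label: str = "?"
    score: float = 1.0


# Hypothetical predictor signature: (region, step) -> (refined box, label, score).
Predictor = Callable[[Region, int], Tuple[Box, str, float]]


def recursive_refine(regions: List[Region], predict: Predictor,
                     steps: int, keep: int) -> List[Region]:
    """Refine every region, then keep only the `keep` best for the next step."""
    for step in range(steps):
        refined = []
        for r in regions:
            box, label, score = predict(r, step)
            # Accumulate confidence across steps (an assumption of this sketch).
            refined.append(Region(box, label, r.score * score))
        refined.sort(key=lambda r: r.score, reverse=True)
        regions = refined[:keep]
    return regions


# Coarse-to-fine labels mirroring the abstract's example questions.
LEVELS = ["object", "person", "person/standing"]


def toy_predict(region: Region, step: int) -> Tuple[Box, str, float]:
    """Stand-in for a learned predictor: shrink the box and favor wide boxes."""
    x1, y1, x2, y2 = region.box
    box = (x1 + 1, y1 + 1, x2 - 1, y2 - 1)
    score = 0.9 if (x2 - x1) >= 10 else 0.2
    return box, LEVELS[step], score


if __name__ == "__main__":
    start = [Region((0, 0, 20, 20)), Region((0, 0, 4, 4)), Region((5, 5, 30, 30))]
    for r in recursive_refine(start, toy_predict, steps=3, keep=2):
        print(r)
```

Each pass moves one level down the label hierarchy while the set of regions shrinks, so later, finer-grained predictions are computed only for the regions that survived the earlier steps.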

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1794-1803
Number of pages: 10
ISBN (Electronic): 9781538648865
DOIs
State: Published - May 3 2018
Event: 18th IEEE Winter Conference on Applications of Computer Vision, WACV 2018 - Lake Tahoe, United States
Duration: Mar 12 2018 to Mar 15 2018

Publication series

Name: Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
Volume: 2018-January

Other

Other: 18th IEEE Winter Conference on Applications of Computer Vision, WACV 2018
Country/Territory: United States
City: Lake Tahoe
Period: 3/12/18 to 3/15/18

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Computer Science Applications
