Best of both worlds: Human-machine collaboration for object annotation

Olga Russakovsky, Li-Jia Li, Li Fei-Fei

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

130 Scopus citations

Abstract

The long-standing goal of localizing every object in an image remains elusive. Manually annotating objects is quite expensive despite crowd engineering innovations. Current state-of-the-art automatic object detectors can accurately detect at most a few objects per image. This paper brings together the latest advancements in object detection and in crowd engineering into a principled framework for accurately and efficiently localizing objects in images. The input to the system is an image to annotate and a set of annotation constraints: desired precision, utility and/or human cost of the labeling. The output is a set of object annotations, informed by human feedback and computer vision. Our model seamlessly integrates multiple computer vision models with multiple sources of human input in a Markov Decision Process. We empirically validate the effectiveness of our human-in-the-loop labeling approach on the ILSVRC2014 object detection dataset.
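The abstract frames annotation as a Markov Decision Process that trades off labeling utility against human cost. As a rough illustration only (not the authors' implementation), the sketch below shows a greedy one-step policy for such a loop: given detector boxes with confidence scores, it spends a human-verification budget where the expected precision gain is largest. All names (`annotate`, `expected_gain`, the fixed costs and gains) are illustrative assumptions.

```python
# Hedged sketch of a human-in-the-loop labeling loop in the spirit of an MDP:
# each step chooses how to spend limited human effort to maximize expected
# annotation utility. Costs/gains below are toy values, not from the paper.

ACTIONS = {"detect": 0.0, "verify": 1.0}  # action -> human cost (assumed units)

def expected_gain(action, confidence):
    """Expected utility gain of an action; verifying helps most for uncertain boxes."""
    if action == "detect":
        return 0.2                # a fresh detection adds modest expected utility
    return 1.0 - confidence       # verification resolves remaining uncertainty

def annotate(boxes, budget, target_precision=0.9):
    """Greedy one-step policy: verify the least-confident boxes first,
    stopping when the human-cost budget is exhausted or the precision
    constraint is already met for every box."""
    spent = 0.0
    for box in sorted(boxes, key=lambda b: b["conf"]):
        if box["conf"] >= target_precision:
            continue              # already meets the desired precision
        cost = ACTIONS["verify"]
        if spent + cost <= budget and expected_gain("verify", box["conf"]) > 0:
            box["conf"] = 1.0     # simulate a human confirming the box
            box["verified"] = True
            spent += cost
    return boxes, spent
```

A fuller treatment would estimate gains from learned computer-vision and human-response models and plan over multiple steps, which is what casting the problem as an MDP enables.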

Original language: English (US)
Title of host publication: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Publisher: IEEE Computer Society
Pages: 2121-2131
Number of pages: 11
ISBN (Electronic): 9781467369640
DOIs
State: Published - Oct 14 2015
Externally published: Yes
Event: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 - Boston, United States
Duration: Jun 7 2015 to Jun 12 2015

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 07-12-June-2015
ISSN (Print): 1063-6919

Other

Other: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Country/Territory: United States
City: Boston
Period: 6/7/15 to 6/12/15

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition
