Learning to Detect Human-Object Interactions

Yu Wei Chao, Yunfan Liu, Xieyang Liu, Huayi Zeng, Jia Deng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

365 Scopus citations

Abstract

We study the problem of detecting human-object interactions (HOI) in static images, defined as predicting a human and an object bounding box with an interaction class label that connects them. HOI detection is a fundamental problem in computer vision as it provides semantic information about the interactions among the detected objects. We introduce HICO-DET, a new large benchmark for HOI detection, by augmenting the current HICO classification benchmark with instance annotations. To solve the task, we propose Human-Object Region-based Convolutional Neural Networks (HO-RCNN). At the core of our HO-RCNN is the Interaction Pattern, a novel DNN input that characterizes the spatial relations between two bounding boxes. Experiments on HICO-DET demonstrate that our HO-RCNN, by exploiting human-object spatial relations through Interaction Patterns, significantly improves the performance of HOI detection over baseline approaches.
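
As an illustration of how a pair of bounding boxes might be turned into a network input, the sketch below builds a two-channel binary map over the smallest window that encloses both boxes. The abstract says only that the Interaction Pattern "characterizes the spatial relations between two bounding boxes", so this particular encoding, the function name `interaction_pattern`, and the `(x1, y1, x2, y2)` box convention are assumptions made for the example, not a statement of the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical box convention: (x1, y1, x2, y2) in pixels, x2/y2 exclusive.
Box = Tuple[int, int, int, int]


def interaction_pattern(human: Box, obj: Box) -> List[List[List[int]]]:
    """Return a 2-channel binary map over the window enclosing both boxes.

    Channel 0 marks pixels inside the human box, channel 1 marks pixels
    inside the object box. This encoding is an assumption for illustration.
    """
    # Smallest axis-aligned window that contains both boxes.
    wx1, wy1 = min(human[0], obj[0]), min(human[1], obj[1])
    wx2, wy2 = max(human[2], obj[2]), max(human[3], obj[3])
    width, height = wx2 - wx1, wy2 - wy1

    def mask(box: Box) -> List[List[int]]:
        x1, y1, x2, y2 = box
        # A cell is 1 when its absolute pixel coordinate lies inside the box.
        return [
            [1 if (x1 <= wx1 + c < x2 and y1 <= wy1 + r < y2) else 0
             for c in range(width)]
            for r in range(height)
        ]

    return [mask(human), mask(obj)]


if __name__ == "__main__":
    pattern = interaction_pattern(human=(0, 0, 2, 3), obj=(1, 2, 4, 4))
    for name, channel in zip(("human", "object"), pattern):
        print(name)
        for row in channel:
            print(" ".join(str(v) for v in row))
```

Because the map depends only on the relative position and size of the two boxes, it carries spatial layout but no appearance information. A real pipeline would presumably also resize the map to a fixed resolution before feeding it to a network; that step is omitted here.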

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 381-389
Number of pages: 9
ISBN (Electronic): 9781538648865
State: Published - May 3 2018
Externally published: Yes
Event: 18th IEEE Winter Conference on Applications of Computer Vision, WACV 2018 - Lake Tahoe, United States
Duration: Mar 12 2018 to Mar 15 2018

Publication series

Name: Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
Volume: 2018-January

Other

Other: 18th IEEE Winter Conference on Applications of Computer Vision, WACV 2018
Country/Territory: United States
City: Lake Tahoe
Period: 3/12/18 to 3/15/18

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Computer Science Applications
