Learning actionness from action/background discrimination

Ozge Yalcinkaya Simsek, Olga Russakovsky, Pinar Duygulu

Research output: Contribution to journal › Article › peer-review

Abstract

Localizing actions in instructional web videos is a complex problem because such videos contain background scenes unrelated to the task being demonstrated. Incorrect predictions of action step labels can be reduced by separating backgrounds from actions. Yet discriminating actions from backgrounds is challenging due to the variety of styles in which the same activity can be performed. In this study, we aim to improve action localization by learning the actionness of video clips, i.e., the likelihood that a clip contains an action. We present a method that learns an actionness score for each video clip and uses it to post-process the baseline video-clip-to-step-label assignment scores. We further propose an auxiliary representation, formed from the baseline assignment scores, to reinforce the discrimination of video clips. Experiments on the CrossTask and COIN datasets show that our actionness score improves the performance of both action step localization and action segmentation.
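The abstract's core idea, using a learned per-clip actionness score to post-process baseline clip-to-step assignment scores, can be illustrated with a minimal sketch. The function name, the simple multiplicative reweighting rule, and all values below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def postprocess_with_actionness(assignment_scores, actionness):
    """Reweight baseline clip-to-step assignment scores by per-clip actionness.

    assignment_scores: (num_clips, num_steps) baseline assignment scores.
    actionness: (num_clips,) scores in [0, 1]; higher means the clip more
        likely contains an action rather than background.

    A simple assumed rule: scale each clip's row of step scores by its
    actionness, suppressing step predictions on background-like clips.
    """
    return assignment_scores * actionness[:, None]

# Toy example: 3 clips, 2 step labels; clip 1 looks like background.
baseline = np.array([[0.9, 0.1],
                     [0.4, 0.6],
                     [0.2, 0.8]])
actionness = np.array([0.95, 0.10, 0.90])

weighted = postprocess_with_actionness(baseline, actionness)
predicted_steps = weighted.argmax(axis=1)
```

Here the background-like clip (actionness 0.10) keeps its relative step ranking but contributes much weaker scores overall, which is the hedged intuition behind suppressing wrong step-label predictions on background segments.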

Original language: English (US)
Journal: Signal, Image and Video Processing
State: Accepted/In press - 2022

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Electrical and Electronic Engineering

Keywords

  • Action localization
  • Action segmentation
  • Actionness
  • Video representation
