LifeQA: A real-life dataset for video question answering

Santiago Castro, Mahmoud Azab, Jonathan C. Stroud, Cristina Noujaim, Ruoyao Wang, Jia Deng, Rada Mihalcea

Research output: Chapter in Book/Report/Conference proceedingConference contribution

21 Scopus citations

Abstract

We introduce LifeQA, a benchmark dataset for video question answering that focuses on day-to-day real-life situations. Current video question answering datasets consist of movies and TV shows. However, it is well-known that these visual domains are not representative of our day-to-day lives. Movies and TV shows, for example, benefit from professional camera movements, clean editing, crisp audio recordings, and scripted dialog between professional actors. While these domains provide a large amount of data for training models, their properties make them unsuitable for testing real-life question answering systems. Our dataset, by contrast, consists of video clips that represent only real-life scenarios. We collect 275 such video clips and over 2.3k multiple-choice questions. In this paper, we analyze the challenging but realistic aspects of LifeQA, and we apply several state-of-the-art video question answering models to provide benchmarks for future research. The full dataset is publicly available at https://lit.eecs.umich.edu/lifeqa/.

Original languageEnglish (US)
Title of host publicationLREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
EditorsNicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
PublisherEuropean Language Resources Association (ELRA)
Pages4352-4358
Number of pages7
ISBN (Electronic)9791095546344
StatePublished - 2020
Externally publishedYes
Event12th International Conference on Language Resources and Evaluation, LREC 2020 - Marseille, France
Duration: May 11 2020May 16 2020

Publication series

NameLREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings

Conference

Conference12th International Conference on Language Resources and Evaluation, LREC 2020
Country/TerritoryFrance
CityMarseille
Period5/11/205/16/20

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Education
  • Library and Information Sciences
  • Linguistics and Language

Keywords

  • Computer vision
  • Natural language processing
  • Question answering
  • Video question answering

Fingerprint

Dive into the research topics of 'LifeQA: A real-life dataset for video question answering'. Together they form a unique fingerprint.

Cite this