TY - GEN
T1 - Speaker naming in movies
AU - Azab, Mahmoud
AU - Wang, Mingzhe
AU - Smith, Max
AU - Kojima, Noriyuki
AU - Deng, Jia
AU - Mihalcea, Rada
N1 - Publisher Copyright: © 2018 The Association for Computational Linguistics.
PY - 2018
Y1 - 2018
N2 - We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework. To evaluate the performance of our model, we introduce a new dataset consisting of six episodes of the Big Bang Theory TV show and eighteen full movies covering different genres. Our experiments show that our multimodal model significantly outperforms several competitive baselines on the average weighted F-score metric. To demonstrate the effectiveness of our framework, we design an end-to-end memory network model that leverages our speaker naming model and achieves state-of-the-art results on the subtitles task of the MovieQA 2017 Challenge.
AB - We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework. To evaluate the performance of our model, we introduce a new dataset consisting of six episodes of the Big Bang Theory TV show and eighteen full movies covering different genres. Our experiments show that our multimodal model significantly outperforms several competitive baselines on the average weighted F-score metric. To demonstrate the effectiveness of our framework, we design an end-to-end memory network model that leverages our speaker naming model and achieves state-of-the-art results on the subtitles task of the MovieQA 2017 Challenge.
UR - http://www.scopus.com/inward/record.url?scp=85061738416&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061738416&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85061738416
T3 - NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
SP - 2206
EP - 2216
BT - Long Papers
PB - Association for Computational Linguistics (ACL)
T2 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
Y2 - 1 June 2018 through 6 June 2018
ER -