TY - GEN
T1 - A thorough examination of the CNN/Daily Mail reading comprehension task
AU - Chen, Danqi
AU - Bolton, Jason
AU - Manning, Christopher D.
N1 - Publisher Copyright:
© 2016 Association for Computational Linguistics.
PY - 2016
Y1 - 2016
N2 - Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved goal of NLP. A key factor impeding its solution by machine-learned systems is the limited availability of human-annotated data. Hermann et al. (2015) seek to solve this problem by creating over a million training examples by pairing CNN and Daily Mail news articles with their summarized bullet points, and show that a neural network can then be trained to give good performance on this task. In this paper, we conduct a thorough examination of this new reading comprehension task. Our primary aim is to understand what depth of language understanding is required to do well on this task. We approach this from one side by doing a careful hand-analysis of a small subset of the problems and from the other by showing that simple, carefully designed systems can obtain accuracies of 72.4% and 75.8% on these two datasets, exceeding current state-of-the-art results by over 5% and approaching what we believe is the ceiling for performance on this task.
AB - Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved goal of NLP. A key factor impeding its solution by machine-learned systems is the limited availability of human-annotated data. Hermann et al. (2015) seek to solve this problem by creating over a million training examples by pairing CNN and Daily Mail news articles with their summarized bullet points, and show that a neural network can then be trained to give good performance on this task. In this paper, we conduct a thorough examination of this new reading comprehension task. Our primary aim is to understand what depth of language understanding is required to do well on this task. We approach this from one side by doing a careful hand-analysis of a small subset of the problems and from the other by showing that simple, carefully designed systems can obtain accuracies of 72.4% and 75.8% on these two datasets, exceeding current state-of-the-art results by over 5% and approaching what we believe is the ceiling for performance on this task.
UR - http://www.scopus.com/inward/record.url?scp=85012005003&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85012005003&partnerID=8YFLogxK
U2 - 10.18653/v1/p16-1223
DO - 10.18653/v1/p16-1223
M3 - Conference contribution
AN - SCOPUS:85012005003
T3 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
SP - 2358
EP - 2367
BT - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
PB - Association for Computational Linguistics (ACL)
T2 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Y2 - 7 August 2016 through 12 August 2016
ER -