TY - GEN
T1 - TopicCheck
T2 - Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2015
AU - Chuang, Jason
AU - Roberts, Margaret E.
AU - Stewart, Brandon Michael
AU - Weiss, Rebecca
AU - Tingley, Dustin
AU - Grimmer, Justin
AU - Heer, Jeffrey
PY - 2015
Y1 - 2015
N2 - Content analysis, a widely-applied social science research method, is increasingly being supplemented by topic modeling. However, while the discourse on content analysis centers heavily on reproducibility, computer scientists often focus more on scalability and less on coding reliability, leading to growing skepticism on the usefulness of topic models for automated content analysis. In response, we introduce TopicCheck, an interactive tool for assessing topic model stability. Our contributions are threefold. First, from established guidelines on reproducible content analysis, we distill a set of design requirements on how to computationally assess the stability of an automated coding process. Second, we devise an interactive alignment algorithm for matching latent topics from multiple models, and enable sensitivity evaluation across a large number of models. Finally, we demonstrate that our tool enables social scientists to gain novel insights into three active research questions.
UR - http://www.scopus.com/inward/record.url?scp=84956624433&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84956624433&partnerID=8YFLogxK
U2 - 10.3115/v1/n15-1018
DO - 10.3115/v1/n15-1018
M3 - Conference contribution
AN - SCOPUS:84956624433
T3 - NAACL HLT 2015 - 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
SP - 175
EP - 184
BT - NAACL HLT 2015 - 2015 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
Y2 - 31 May 2015 through 5 June 2015
ER -