Topics, Concepts, and Measurement: A Crowdsourced Procedure for Validating Topics as Measures

Luwei Ying, Jacob M. Montgomery, Brandon M. Stewart

Research output: Contribution to journal › Article › peer-review

Abstract

Topic models, as developed in computer science, are effective tools for exploring and summarizing large document collections. When applied in social science research, however, they are commonly used for measurement, a task that requires careful validation to ensure that the model outputs actually capture the desired concept of interest. In this paper, we review current practices for topic validation in the field and show that extensive model validation is increasingly rare, or at least not systematically reported in papers and appendices. To supplement current practices, we refine an existing crowd-sourcing method by Chang and coauthors for validating topic quality and go on to create new procedures for validating conceptual labels provided by the researcher. We illustrate our method with an analysis of Facebook posts by U.S. Senators and provide software and guidance for researchers wishing to validate their own topic models. While tailored, case-specific validation exercises will always be best, we aim to improve standard practices by providing a general-purpose tool to validate topics as measures.

Original language: English (US)
Journal: Political Analysis
State: Accepted/In press - 2021

All Science Journal Classification (ASJC) codes

  • Sociology and Political Science
  • Political Science and International Relations

Keywords

  • crowd-sourcing
  • measurement
  • text as data
  • topic model
  • validation
