Naïve regression requires weaker assumptions than factor models to adjust for multiple cause confounding

Justin Grimmer, Dean Knox, Brandon M. Stewart

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

The empirical practice of using factor models to adjust for shared, unobserved confounders, Z, in observational settings with multiple treatments, A, is widespread in fields including genetics, networks, medicine, and politics. Wang and Blei (2019, WB) generalize these procedures to develop the “deconfounder,” a causal inference method that uses factor models of A to estimate “substitute confounders,” Ẑ, and then estimates treatment effects by regressing the outcome, Y, on part of A while adjusting for Ẑ. WB claim the deconfounder is unbiased when (among other assumptions) there are no single-cause confounders and Ẑ is “pinpointed.” We clarify that pinpointing requires each confounder to affect infinitely many treatments. We prove that when the conditions hold for the deconfounder to be asymptotically unbiased, a naïve semiparametric regression of Y on A that ignores confounding is also asymptotically unbiased. We provide bias formulas for finite numbers of treatments and show that different deconfounders exhibit different kinds of bias. We replicate every deconfounder analysis with available data and find that neither the naïve regression nor the deconfounder consistently outperforms the other. In practice, the deconfounder produces implausible estimates in WB’s case study of movie earnings: estimates suggest comic author Stan Lee’s cameo appearances causally contributed $15.5 billion, most of Marvel movie revenue. We conclude that neither approach is a viable substitute for careful research design in real-world applications.

Original language: English (US)
Journal: Journal of Machine Learning Research
Volume: 24
State: Published - 2023

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence

Keywords

  • causal inference
  • deconfounder
  • machine learning
  • unmeasured confounding
