Abstract
The empirical practice of using factor models to adjust for shared, unobserved confounders, Z, in observational settings with multiple treatments, A, is widespread in fields including genetics, networks, medicine, and politics. Wang and Blei (2019, WB) generalize these procedures to develop the “deconfounder,” a causal inference method that uses factor models of A to estimate “substitute confounders,” Ẑ, and then estimates treatment effects by regressing the outcome, Y, on part of A while adjusting for Ẑ. WB claim the deconfounder is unbiased when (among other assumptions) there are no single-cause confounders and Ẑ is “pinpointed.” We clarify that pinpointing requires each confounder to affect infinitely many treatments. We prove that when the conditions hold for the deconfounder to be asymptotically unbiased, a naïve semiparametric regression of Y on A that ignores confounding is also asymptotically unbiased. We provide bias formulas for finite numbers of treatments and show that different deconfounders exhibit different kinds of bias. We replicate every deconfounder analysis with available data and find that neither the naïve regression nor the deconfounder consistently outperforms the other. In practice, the deconfounder produces implausible estimates in WB’s case study of movie earnings: the estimates suggest comic author Stan Lee’s cameo appearances causally contributed $15.5 billion, most of Marvel movie revenue. We conclude that neither approach is a viable substitute for careful research design in real-world applications.
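The two estimators compared in the abstract can be sketched on simulated data. This is only an illustrative sketch, not the authors' or WB's implementation: it assumes a linear outcome model, a single Gaussian confounder, and PCA (computed via SVD) as the factor model, and every function name, sample size, and coefficient below is hypothetical. Because a PCA-based Ẑ is a linear function of A, the deconfounder step regresses Y on only part of A, as the abstract describes.

```python
import numpy as np


def simulate(n=2000, m=10, seed=0):
    """Simulate m treatments A sharing one unobserved confounder Z (assumed setup)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                       # unobserved confounder Z
    loadings = rng.normal(size=m)                # how Z drives each treatment
    a = np.outer(z, loadings) + rng.normal(size=(n, m))
    beta = rng.normal(size=m)                    # true treatment effects
    y = a @ beta + 2.0 * z + rng.normal(size=n)  # outcome confounded by Z
    return a, y, beta


def ols(x, y):
    """Least-squares slopes of y on x, with an intercept that is then dropped."""
    x1 = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(x1, y, rcond=None)
    return coef[1:]


def substitute_confounder(a, k=1):
    """Estimate Z-hat as the first k principal-component scores of A."""
    centered = a - a.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def naive_estimate(a, y):
    """Naive regression of Y on all of A, ignoring confounding."""
    return ols(a, y)


def deconfounder_estimate(a, y, n_focal=3, k=1):
    """Regress Y on the first n_focal treatments while adjusting for Z-hat."""
    z_hat = substitute_confounder(a, k)
    x = np.column_stack([a[:, :n_focal], z_hat])
    return ols(x, y)[:n_focal]


a, y, beta = simulate()
naive = naive_estimate(a, y)
deconf = deconfounder_estimate(a, y)
print("true effects (first 3):", beta[:3])
print("naive estimates       :", naive[:3])
print("deconfounder estimates:", deconf)
```

The sketch makes no claim about which estimate lands closer to the truth. That depends on the number of treatments, the factor model, and the outcome model, which is the comparison the paper analyzes formally.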
| Field | Value |
|---|---|
| Original language | English (US) |
| Journal | Journal of Machine Learning Research |
| Volume | 24 |
| State | Published - 2023 |
All Science Journal Classification (ASJC) codes
- Control and Systems Engineering
- Software
- Statistics and Probability
- Artificial Intelligence
Keywords
- causal inference
- deconfounder
- machine learning
- unmeasured confounding