TY - JOUR

T1 - Logistic or Linear? Estimating Causal Effects of Experimental Treatments on Binary Outcomes Using Regression Analysis

AU - Gomila, Robin

N1 - Publisher Copyright:
© 2020 American Psychological Association.

PY - 2020

Y1 - 2020

N2 - When the outcome is binary, psychologists often use nonlinear modeling strategies such as logit or probit. These strategies are often neither optimal nor justified when the objective is to estimate causal effects of experimental treatments. Researchers need to take extra steps to convert logit and probit coefficients into interpretable quantities, and when they do, these quantities often remain difficult to understand. Odds ratios, for instance, are described as obscure in many textbooks (e.g., Gelman & Hill, 2006, p. 83). I draw on econometric theory and established statistical findings to demonstrate that linear regression is generally the best strategy to estimate causal effects of treatments on binary outcomes. Linear regression coefficients are directly interpretable in terms of probabilities and, when interaction terms or fixed effects are included, linear regression is safer. I review the Neyman-Rubin causal model, which I use to prove analytically that linear regression yields unbiased estimates of treatment effects on binary outcomes. Then, I run simulations and analyze existing data on 24,191 students from 56 middle schools (Paluck, Shepherd, & Aronow, 2013) to illustrate the effectiveness of linear regression. Based on these grounds, I recommend that psychologists use linear regression to estimate treatment effects on binary outcomes.

AB - When the outcome is binary, psychologists often use nonlinear modeling strategies such as logit or probit. These strategies are often neither optimal nor justified when the objective is to estimate causal effects of experimental treatments. Researchers need to take extra steps to convert logit and probit coefficients into interpretable quantities, and when they do, these quantities often remain difficult to understand. Odds ratios, for instance, are described as obscure in many textbooks (e.g., Gelman & Hill, 2006, p. 83). I draw on econometric theory and established statistical findings to demonstrate that linear regression is generally the best strategy to estimate causal effects of treatments on binary outcomes. Linear regression coefficients are directly interpretable in terms of probabilities and, when interaction terms or fixed effects are included, linear regression is safer. I review the Neyman-Rubin causal model, which I use to prove analytically that linear regression yields unbiased estimates of treatment effects on binary outcomes. Then, I run simulations and analyze existing data on 24,191 students from 56 middle schools (Paluck, Shepherd, & Aronow, 2013) to illustrate the effectiveness of linear regression. Based on these grounds, I recommend that psychologists use linear regression to estimate treatment effects on binary outcomes.

KW - Average treatment effects

KW - Binary outcomes

KW - Causal effects

KW - Linear regression

KW - Logistic regression

UR - http://www.scopus.com/inward/record.url?scp=85091608335&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85091608335&partnerID=8YFLogxK

U2 - 10.1037/xge0000920

DO - 10.1037/xge0000920

M3 - Article

C2 - 32969684

AN - SCOPUS:85091608335

JO - Journal of Experimental Psychology: General

JF - Journal of Experimental Psychology: General

SN - 0096-3445

ER -