TY - GEN
T1 - The Worst of Both Worlds
T2 - 5th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, AIES 2022
AU - Hullman, Jessica
AU - Kapoor, Sayash
AU - Nanayakkara, Priyanka
AU - Gelman, Andrew
AU - Narayanan, Arvind
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/7/26
Y1 - 2022/7/26
N2 - Arguments that machine learning (ML) is facing a reproducibility and replication crisis suggest that some published claims in research cannot be taken at face value. Concerns inspire analogies to the replication crisis affecting the social and medical sciences. A deeper understanding of what reproducibility concerns in supervised ML research have in common with the replication crisis in experimental science puts the new concerns in perspective, and helps researchers avoid "the worst of both worlds," where ML researchers begin borrowing methodologies from explanatory modeling without understanding their limitations and vice versa. We contribute a comparative analysis of concerns about inductive learning that arise in causal attribution as exemplified in psychology versus predictive modeling as exemplified in ML. We identify common themes in reform discussions, like overreliance on asymptotic theory and non-credible beliefs about real-world data generating processes. We argue that in both fields, claims from learning are implied to generalize outside the specific environment studied (e.g., the input dataset or subject sample, modeling implementation, etc.) but are often difficult to refute due to underspecification of key parts of the learning pipeline. We conclude by discussing risks that arise when sources of errors are misdiagnosed and the need to acknowledge the role of human inductive biases in learning and reform.
KW - generalizability
KW - machine learning
KW - replication
KW - science reform
UR - http://www.scopus.com/inward/record.url?scp=85137156206&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137156206&partnerID=8YFLogxK
U2 - 10.1145/3514094.3534196
DO - 10.1145/3514094.3534196
M3 - Conference contribution
AN - SCOPUS:85137156206
T3 - AIES 2022 - Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society
SP - 335
EP - 348
BT - AIES 2022 - Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society
PB - Association for Computing Machinery, Inc
Y2 - 1 August 2022 through 3 August 2022
ER -