Enhancing early autism prediction based on electronic records using clinical narratives

Junya Chen, Matthew Engelhard, Ricardo Henao, Samuel Berchuck, Brian Eichner, Eliana M. Perrin, Guillermo Sapiro, Geraldine Dawson

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Recent work has shown that predictive models can be applied to structured electronic health record (EHR) data to stratify autism likelihood from an early age (<1 year). Integrating clinical narratives (or notes) with structured data has been shown to improve prediction performance in other clinical applications, but the added predictive value of this information in early autism prediction has not yet been explored. In this study, we aimed to enhance the performance of early autism prediction by using both structured EHR data and clinical narratives. We built separate models based on structured data and on clinical narratives, and then an ensemble model that integrated both sources of data. We assessed the predictive value of these models using data from the Duke University Health System spanning 14 years, evaluating ensemble models that predict later autism diagnosis (by age 4 years) from data collected between ages 30 and 360 days. Our sample included 11,750 children who had reached at least age 3 years (385 meeting autism diagnostic criteria). The ensemble model for autism prediction showed superior performance and at age 30 days achieved 46.8% sensitivity (95% confidence interval, CI: 22.0%, 52.9%), 28.0% positive predictive value (PPV) at high (90%) specificity (CI: 2.0%, 33.1%), and AUC4 (with at least 4-year follow-up for controls) reaching 0.769 (CI: 0.715, 0.811). Prediction by 360 days achieved 44.5% sensitivity (CI: 23.6%, 62.9%), 13.7% PPV at high (90%) specificity (CI: 9.6%, 18.9%), and AUC4 reaching 0.797 (CI: 0.746, 0.840). Results show that incorporating clinical narratives in early autism prediction achieved promising accuracy by age 30 days, outperforming models based on structured data only. Furthermore, findings suggest that additional features learned from clinician narratives might be hypothesis-generating for understanding early development in autism.
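
The operating-point metrics reported in the abstract (sensitivity and PPV at a fixed 90% specificity) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the ensemble combines the two models, so the plain score averaging in `average_ensemble` is an assumption, and both function names and the toy scores are hypothetical.

```python
import math
from typing import List, Tuple


def average_ensemble(structured: List[float], narrative: List[float]) -> List[float]:
    """Combine two models' scores by simple averaging.

    Assumption for illustration only; the abstract does not specify
    how the ensemble integrates the two data sources.
    """
    if len(structured) != len(narrative):
        raise ValueError("score lists must have the same length")
    return [(a + b) / 2 for a, b in zip(structured, narrative)]


def metrics_at_specificity(
    scores: List[float], labels: List[int], target_specificity: float = 0.90
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, ppv) at the lowest threshold whose
    specificity among negatives is at least target_specificity.

    A child is flagged when the score is strictly above the threshold.
    """
    negatives = sorted(s for s, y in zip(scores, labels) if y == 0)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative label")

    # Number of negatives that must score at or below the threshold.
    # The small epsilon guards against floating-point round-up.
    n_below = max(1, math.ceil(target_specificity * len(negatives) - 1e-9))
    threshold = negatives[n_below - 1]

    tp = sum(s > threshold for s in positives)  # flagged, later diagnosed
    fp = sum(s > threshold for s in negatives)  # flagged, not diagnosed
    sensitivity = tp / len(positives)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return threshold, sensitivity, ppv


if __name__ == "__main__":
    # Toy data: 10 controls followed by 4 cases (hypothetical scores).
    toy_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50,
                  0.30, 0.48, 0.60, 0.70]
    toy_labels = [0] * 10 + [1] * 4
    print(metrics_at_specificity(toy_scores, toy_labels))
```

Because PPV depends on how common the diagnosis is in the sample, a PPV computed on toy data like this says nothing about the values reported above; the sketch only shows how the threshold is fixed from the controls before sensitivity and PPV are read off.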

Original language: English (US)
Article number: 104390
Journal: Journal of Biomedical Informatics
Volume: 144
State: Published - Aug 2023
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Health Informatics
  • Computer Science Applications

Keywords

  • Autism
  • EHR data
  • Ensemble model
  • Language models
  • Unstructured data
