COP-E-CAT: Cleaning and organization pipeline for EHR computational and analytic tasks

Aishwarya Mandyam, Elizabeth C. Yoo, Jeff Soules, Krzysztof Laudanski, Barbara E. Engelhardt

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

To ensure that analyses of complex electronic healthcare record (EHR) data are reproducible and generalizable, it is crucial for researchers to use comparable preprocessing, filtering, and imputation strategies. We introduce COP-E-CAT: Cleaning and Organization Pipeline for EHR Computational and Analytic Tasks, an open-source processing and analysis software package for MIMIC-IV, a ubiquitous benchmark EHR dataset. COP-E-CAT allows users to select filtering characteristics and preprocess covariates to generate data structures for use in downstream analysis tasks. This user-friendly approach shows promise in facilitating reproducibility and comparability among studies that leverage the MIMIC-IV data, and it makes EHR data accessible to a wider spectrum of researchers than current data processing methods do. We demonstrate the versatility of our workflow by describing three use cases: ensemble prediction, reinforcement learning, and dimension reduction. The software is available at: https://github.com/eyeshoe/cop-e-cat.

Original language: English (US)
Title of host publication: Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450384506
State: Published - Jan 18 2021
Event: 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021 - Virtual, Online, United States
Duration: Aug 1 2021 to Aug 4 2021

Publication series

Name: Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021

Conference

Conference: 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021
Country/Territory: United States
City: Virtual, Online
Period: 8/1/21 to 8/4/21

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Software
  • Biomedical Engineering
  • Health Informatics

Keywords

  • electronic health records
  • health informatics
  • reinforcement learning
