Identification of temporal condition patterns associated with pediatric obesity incidence using sequence mining and big data

Elizabeth A. Campbell, Ting Qian, Jeffrey M. Miller, Ellen J. Bass, Aaron J. Masino

Research output: Contribution to journal › Article › peer-review


Background: Electronic health records (EHRs) are potentially important components in addressing pediatric obesity in clinical settings and at the population level. This work aims to identify temporal condition patterns surrounding obesity incidence in a large pediatric population that may inform clinical care and childhood obesity policy and prevention efforts.

Methods: EHR data from healthcare visits with an initial record of obesity incidence (index visit) from 2009 through 2016 at the Children’s Hospital of Philadelphia, and visits immediately before (pre-index) and after (post-index), were compared with a matched control population of patients with a healthy weight to characterize the prevalence of common diagnoses and condition trajectories. The study population consisted of 49,694 patients with pediatric obesity and their corresponding matched controls. The SPADE algorithm was used to identify common temporal condition patterns in the case population. McNemar’s test was used to assess the statistical significance of pattern prevalence differences between the case and control populations.

Results: SPADE identified 163 condition patterns that were present in at least 1% of cases; 80 were significantly more common among cases and 45 were significantly more common among controls (p < 0.05). Asthma and allergic rhinitis were strongly associated with childhood obesity incidence, particularly during the pre-index and index visits. Seven conditions were commonly diagnosed for cases exclusively during pre-index visits, including ear, nose, and throat disorders and gastroenteritis.

Conclusions: The novel application of SPADE on a large retrospective dataset revealed temporally dependent condition associations with obesity incidence. Allergic rhinitis and asthma had a particularly high prevalence during pre-index visits. These conditions, along with those exclusively observed during pre-index visits, may represent signals of future obesity. While causation cannot be inferred from these associations, the temporal condition patterns identified here represent hypotheses that can be investigated to determine causal relationships in future obesity research.
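
As a rough illustration of the comparison described in the Methods, the sketch below checks whether a temporal condition pattern occurs in a patient's visit sequence and then applies an exact two-sided McNemar test to matched case-control pairs. It is not the authors' pipeline: it does not implement SPADE (it only tests a pattern that is already given), the abstract does not say which variant of McNemar's test was used, and the function names and the toy data are hypothetical.

```python
from math import comb

def contains_pattern(visits, pattern):
    """Return True if the condition sets in `pattern` occur, in order,
    in distinct visits of `visits` (a list of sets, e.g. pre-index,
    index, post-index)."""
    pos = 0
    for visit in visits:
        # Greedy earliest match: advance when this visit covers the next step.
        if pos < len(pattern) and pattern[pos] <= visit:
            pos += 1
    return pos == len(pattern)

def discordant_counts(pairs, pattern):
    """Count matched pairs where only the case (b) or only the control (c)
    shows the pattern. Concordant pairs do not enter McNemar's test."""
    b = c = 0
    for case_visits, control_visits in pairs:
        in_case = contains_pattern(case_visits, pattern)
        in_control = contains_pattern(control_visits, pattern)
        if in_case and not in_control:
            b += 1
        elif in_control and not in_case:
            c += 1
    return b, c

def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value: under the null hypothesis the
    discordant pairs split as Binomial(b + c, 0.5)."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Invented toy data: each pair is (case visits, matched control visits).
pattern = [{"asthma"}, {"allergic rhinitis"}]
pairs = [
    ([{"asthma"}, {"allergic rhinitis"}, set()], [{"otitis media"}, set(), set()]),
    ([{"asthma"}, set(), {"allergic rhinitis"}], [set(), {"asthma"}, set()]),
    ([{"gastroenteritis"}, set(), set()], [{"asthma"}, {"allergic rhinitis"}, set()]),
]
b, c = discordant_counts(pairs, pattern)
p_value = mcnemar_exact(b, c)
```

With tens of thousands of matched pairs and many candidate patterns, a chi-square approximation and a multiple-comparison adjustment might be considered; the abstract reports only the p < 0.05 threshold.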

Original language: English (US)
Pages (from-to): 1753-1765
Number of pages: 13
Journal: International Journal of Obesity
Issue number: 8
State: Published - Aug 1 2020

All Science Journal Classification (ASJC) codes

  • Nutrition and Dietetics
  • Medicine (miscellaneous)
  • Endocrinology, Diabetes and Metabolism

