Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers

Todd Lingren, Vidhu Thaker, Cassandra Brady, Bahram Namjou, Stephanie Kennebeck, Jonathan Bickel, Nandan Patibandla, Yizhao Ni, Sara L. Van Driest, Lixin Chen, Ashton Roach, Beth Cobb, Jacqueline Kirby, Josh Denny, Lisa Bailey-Davis, Marc S. Williams, Keith Marsolo, Imre Solti, Ingrid A. Holm, John Harley, Isaac S. Kohane, Guergana Savova, Nancy Crimmins

Research output: Contribution to journal › Article › peer-review

13 Scopus citations


Objective: The objective of this study is to develop an algorithm to accurately identify children with severe early onset childhood obesity (ages 1-5.99 years) using structured and unstructured data from the electronic health record (EHR).

Introduction: Childhood obesity increases risk factors for cardiovascular morbidity and vascular disease. Accurate definition of a high-precision phenotype through a standardized tool is critical to the success of large-scale genomic studies and to validating rare monogenic variants causing severe early onset obesity.

Data and Methods: Rule-based and machine learning-based algorithms were developed using structured and unstructured data from two EHR databases from Boston Children’s Hospital (BCH) and Cincinnati Children’s Hospital and Medical Center (CCHMC). Exclusion criteria, including medications or comorbid diagnoses, were defined. Machine learning algorithms were developed using cross-site training and testing, in addition to experimenting with natural language processing features.

Results: Precision was emphasized for a high-fidelity cohort. The rule-based algorithm performed the best overall, 0.895 (CCHMC) and 0.770 (BCH). The best feature set for machine learning employed Unified Medical Language System (UMLS) concept unique identifiers (CUIs), ICD-9 codes, and RxNorm codes.

Conclusions: Detecting severe early childhood obesity is essential for the intervention potential in children at the highest long-term risk of developing comorbidities related to obesity, and excluding patients with underlying pathological and non-syndromic causes of obesity assists in developing a high-precision cohort for genetic study. Furthermore, such phenotyping efforts inform future practical application in health care environments utilizing clinical decision support.
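The general shape of a rule-based phenotype filter, and of the precision metric emphasized in the results, can be illustrated with a minimal Python sketch. This is not the authors' algorithm. The age window (1-5.99 years) comes from the abstract, but the abstract does not state how obesity was measured or which exclusions were applied, so the `Patient` record, the `bmi_percentile` field, the cutoff value, and the exclusion flag names are all hypothetical placeholders.

```python
from dataclasses import dataclass, field

# Age window taken from the abstract (ages 1-5.99 years).
AGE_MIN, AGE_MAX = 1.0, 5.99

# HYPOTHETICAL placeholder: the abstract does not state the obesity
# criterion or its threshold. Chosen only to make the sketch runnable.
BMI_PERCENTILE_CUTOFF = 99.0


@dataclass
class Patient:
    """Hypothetical, simplified patient record for illustration only."""
    age_years: float
    bmi_percentile: float
    # Names of any exclusion criteria that apply, e.g. a medication or a
    # comorbid diagnosis (the flag names themselves are assumptions).
    exclusion_flags: set = field(default_factory=set)


def rule_based_case(p: Patient, cutoff: float = BMI_PERCENTILE_CUTOFF) -> bool:
    """Return True if the patient passes every rule and has no exclusion."""
    in_age_window = AGE_MIN <= p.age_years <= AGE_MAX
    meets_bmi_rule = p.bmi_percentile >= cutoff
    return in_age_window and meets_bmi_rule and not p.exclusion_flags


def precision(predicted: list, gold: list) -> float:
    """Precision = true positives / (true positives + false positives)."""
    tp = sum(1 for pr, g in zip(predicted, gold) if pr and g)
    fp = sum(1 for pr, g in zip(predicted, gold) if pr and not g)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

In a sketch like this, precision would be computed by comparing the filter's output against chart-review labels. A stricter cutoff or a longer exclusion list tends to trade recall for precision, which matches the abstract's stated goal of a high-fidelity cohort.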

Original language: English (US)
Pages (from-to): 693-706
Number of pages: 14
Journal: Applied Clinical Informatics
Issue number: 3
State: Published - Jul 20 2016

All Science Journal Classification (ASJC) codes

  • Health Informatics
  • Computer Science Applications
  • Health Information Management
