Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers

Todd Lingren, Vidhu Thaker, Cassandra Brady, Bahram Namjou, Stephanie Kennebeck, Jonathan Bickel, Nandan Patibandla, Yizhao Ni, Sara L. Van Driest, Lixin Chen, Ashton Roach, Beth Cobb, Jacqueline Kirby, Josh Denny, Lisa Bailey-Davis, Marc S. Williams, Keith Marsolo, Imre Solti, Ingrid A. Holm, John Harley, Isaac S. Kohane, Guergana Savova, Nancy Crimmins

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Objective: The objective of this study is to develop an algorithm to accurately identify children with severe early-onset childhood obesity (ages 1-5.99 years) using structured and unstructured data from the electronic health record (EHR). Introduction: Childhood obesity increases risk factors for cardiovascular morbidity and vascular disease. Accurate definition of a high-precision phenotype through a standardized tool is critical to the success of large-scale genomic studies and to validating rare monogenic variants causing severe early-onset obesity. Data and Methods: Rule-based and machine-learning-based algorithms were developed using structured and unstructured data from two EHR databases from Boston Children’s Hospital (BCH) and Cincinnati Children’s Hospital and Medical Center (CCHMC). Exclusion criteria, including medications or comorbid diagnoses, were defined. Machine learning algorithms were developed using cross-site training and testing, in addition to experimenting with natural language processing features. Results: Precision was emphasized for a high-fidelity cohort. The rule-based algorithm performed the best overall, 0.895 (CCHMC) and 0.770 (BCH). The best feature set for machine learning employed Unified Medical Language System (UMLS) concept unique identifiers (CUIs), ICD-9 codes, and RxNorm codes. Conclusions: Detecting severe early childhood obesity is essential for the intervention potential in children at the highest long-term risk of developing comorbidities related to obesity, and excluding patients with underlying pathological and non-syndromic causes of obesity assists in developing a high-precision cohort for genetic study. Further, such phenotyping efforts inform future practical application in health care environments utilizing clinical decision support.

Original language: English (US)
Pages (from-to): 693-706
Number of pages: 14
Journal: Applied Clinical Informatics
Volume: 7
Issue number: 3
DOIs: https://doi.org/10.4338/ACI-2016-01-RA-0015
State: Published - Jul 20 2016

Fingerprint

Pediatrics
Pediatric Obesity
Learning systems
Obesity
Electronic Health Records
International Classification of Diseases
RxNorm
Unified Medical Language System
Clinical Decision Support Systems
Health
Natural Language Processing
Vascular Diseases
Health care
Learning algorithms
Comorbidity
Cohort Studies
Cardiovascular Diseases
Databases
Morbidity
Delivery of Health Care

All Science Journal Classification (ASJC) codes

  • Health Informatics
  • Computer Science Applications
  • Health Information Management

Cite this

Lingren, T., Thaker, V., Brady, C., Namjou, B., Kennebeck, S., Bickel, J., ... Crimmins, N. (2016). Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers. Applied Clinical Informatics, 7(3), 693-706. https://doi.org/10.4338/ACI-2016-01-RA-0015
Lingren, Todd; Thaker, Vidhu; Brady, Cassandra; Namjou, Bahram; Kennebeck, Stephanie; Bickel, Jonathan; Patibandla, Nandan; Ni, Yizhao; Van Driest, Sara L.; Chen, Lixin; Roach, Ashton; Cobb, Beth; Kirby, Jacqueline; Denny, Josh; Bailey-Davis, Lisa; Williams, Marc S.; Marsolo, Keith; Solti, Imre; Holm, Ingrid A.; Harley, John; Kohane, Isaac S.; Savova, Guergana; Crimmins, Nancy. / Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers. In: Applied Clinical Informatics. 2016; Vol. 7, No. 3. pp. 693-706.
@article{fc10609afac44cd3b881d8922303ab46,
title = "Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers",
abstract = "Objective: The objective of this study is to develop an algorithm to accurately identify children with severe early-onset childhood obesity (ages 1-5.99 years) using structured and unstructured data from the electronic health record (EHR). Introduction: Childhood obesity increases risk factors for cardiovascular morbidity and vascular disease. Accurate definition of a high-precision phenotype through a standardized tool is critical to the success of large-scale genomic studies and to validating rare monogenic variants causing severe early-onset obesity. Data and Methods: Rule-based and machine-learning-based algorithms were developed using structured and unstructured data from two EHR databases from Boston Children’s Hospital (BCH) and Cincinnati Children’s Hospital and Medical Center (CCHMC). Exclusion criteria, including medications or comorbid diagnoses, were defined. Machine learning algorithms were developed using cross-site training and testing, in addition to experimenting with natural language processing features. Results: Precision was emphasized for a high-fidelity cohort. The rule-based algorithm performed the best overall, 0.895 (CCHMC) and 0.770 (BCH). The best feature set for machine learning employed Unified Medical Language System (UMLS) concept unique identifiers (CUIs), ICD-9 codes, and RxNorm codes. Conclusions: Detecting severe early childhood obesity is essential for the intervention potential in children at the highest long-term risk of developing comorbidities related to obesity, and excluding patients with underlying pathological and non-syndromic causes of obesity assists in developing a high-precision cohort for genetic study. Further, such phenotyping efforts inform future practical application in health care environments utilizing clinical decision support.",
author = "Todd Lingren and Vidhu Thaker and Cassandra Brady and Bahram Namjou and Stephanie Kennebeck and Jonathan Bickel and Nandan Patibandla and Yizhao Ni and {Van Driest}, {Sara L.} and Lixin Chen and Ashton Roach and Beth Cobb and Jacqueline Kirby and Josh Denny and Lisa Bailey-Davis and Williams, {Marc S.} and Keith Marsolo and Imre Solti and Holm, {Ingrid A.} and John Harley and Kohane, {Isaac S.} and Guergana Savova and Nancy Crimmins",
year = "2016",
month = "7",
day = "20",
doi = "10.4338/ACI-2016-01-RA-0015",
language = "English (US)",
volume = "7",
pages = "693--706",
journal = "Applied Clinical Informatics",
issn = "1869-0327",
publisher = "Schattauer GmbH",
number = "3",

}

Lingren, T, Thaker, V, Brady, C, Namjou, B, Kennebeck, S, Bickel, J, Patibandla, N, Ni, Y, Van Driest, SL, Chen, L, Roach, A, Cobb, B, Kirby, J, Denny, J, Bailey-Davis, L, Williams, MS, Marsolo, K, Solti, I, Holm, IA, Harley, J, Kohane, IS, Savova, G & Crimmins, N 2016, 'Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers', Applied Clinical Informatics, vol. 7, no. 3, pp. 693-706. https://doi.org/10.4338/ACI-2016-01-RA-0015

Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers. / Lingren, Todd; Thaker, Vidhu; Brady, Cassandra; Namjou, Bahram; Kennebeck, Stephanie; Bickel, Jonathan; Patibandla, Nandan; Ni, Yizhao; Van Driest, Sara L.; Chen, Lixin; Roach, Ashton; Cobb, Beth; Kirby, Jacqueline; Denny, Josh; Bailey-Davis, Lisa; Williams, Marc S.; Marsolo, Keith; Solti, Imre; Holm, Ingrid A.; Harley, John; Kohane, Isaac S.; Savova, Guergana; Crimmins, Nancy.

In: Applied Clinical Informatics, Vol. 7, No. 3, 20.07.2016, p. 693-706.

TY  - JOUR
T1  - Developing an algorithm to detect early childhood obesity in two tertiary pediatric medical centers
AU  - Lingren, Todd
AU  - Thaker, Vidhu
AU  - Brady, Cassandra
AU  - Namjou, Bahram
AU  - Kennebeck, Stephanie
AU  - Bickel, Jonathan
AU  - Patibandla, Nandan
AU  - Ni, Yizhao
AU  - Van Driest, Sara L.
AU  - Chen, Lixin
AU  - Roach, Ashton
AU  - Cobb, Beth
AU  - Kirby, Jacqueline
AU  - Denny, Josh
AU  - Bailey-Davis, Lisa
AU  - Williams, Marc S.
AU  - Marsolo, Keith
AU  - Solti, Imre
AU  - Holm, Ingrid A.
AU  - Harley, John
AU  - Kohane, Isaac S.
AU  - Savova, Guergana
AU  - Crimmins, Nancy
PY  - 2016/7/20
Y1  - 2016/7/20
AB  - Objective: The objective of this study is to develop an algorithm to accurately identify children with severe early-onset childhood obesity (ages 1-5.99 years) using structured and unstructured data from the electronic health record (EHR). Introduction: Childhood obesity increases risk factors for cardiovascular morbidity and vascular disease. Accurate definition of a high-precision phenotype through a standardized tool is critical to the success of large-scale genomic studies and to validating rare monogenic variants causing severe early-onset obesity. Data and Methods: Rule-based and machine-learning-based algorithms were developed using structured and unstructured data from two EHR databases from Boston Children’s Hospital (BCH) and Cincinnati Children’s Hospital and Medical Center (CCHMC). Exclusion criteria, including medications or comorbid diagnoses, were defined. Machine learning algorithms were developed using cross-site training and testing, in addition to experimenting with natural language processing features. Results: Precision was emphasized for a high-fidelity cohort. The rule-based algorithm performed the best overall, 0.895 (CCHMC) and 0.770 (BCH). The best feature set for machine learning employed Unified Medical Language System (UMLS) concept unique identifiers (CUIs), ICD-9 codes, and RxNorm codes. Conclusions: Detecting severe early childhood obesity is essential for the intervention potential in children at the highest long-term risk of developing comorbidities related to obesity, and excluding patients with underlying pathological and non-syndromic causes of obesity assists in developing a high-precision cohort for genetic study. Further, such phenotyping efforts inform future practical application in health care environments utilizing clinical decision support.
UR  - http://www.scopus.com/inward/record.url?scp=84979255229&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84979255229&partnerID=8YFLogxK
U2  - 10.4338/ACI-2016-01-RA-0015
DO  - 10.4338/ACI-2016-01-RA-0015
M3  - Article
C2  - 27452794
AN  - SCOPUS:84979255229
VL  - 7
SP  - 693
EP  - 706
JO  - Applied Clinical Informatics
JF  - Applied Clinical Informatics
SN  - 1869-0327
IS  - 3
ER  - 