Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint

Jing Qin, Tanya P. Garcia, Yanyuan Ma, Ming Xin Tang, Karen Marder, Yuanjia Wang

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

In certain genetic studies, clinicians and genetic counselors are interested in estimating the cumulative risk of a disease for individuals with and without a rare deleterious mutation. Estimating the cumulative risk is difficult, however, when the estimates are based on family history data. Often, the genetic mutation status of many family members is unknown; instead, only estimated probabilities of a patient having a certain mutation status are available. Also, ages at disease onset are subject to right censoring. Existing methods to estimate the cumulative risk using such family-based data provide estimates only at individual time points and are not guaranteed to be monotonic or nonnegative. In this paper, we develop a novel method that combines Expectation-Maximization and isotonic regression to estimate the cumulative risk across the entire support. Our estimator is monotonic, satisfies self-consistent estimating equations, and has high power in detecting differences between the cumulative risks of different populations. Application of our estimator to a Parkinson's disease (PD) study provides the age-at-onset distribution of PD in PARK2 mutation carriers and noncarriers, and reveals a significant difference between the distributions in compound heterozygous carriers and noncarriers, but not between heterozygous carriers and noncarriers.
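The monotonicity issue mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator, which combines isotonic regression with an EM algorithm to handle right censoring and uncertain mutation status. It only shows the isotonic-regression ingredient: the pool-adjacent-violators algorithm (PAVA) projects a sequence of pointwise risk estimates onto the set of nondecreasing sequences, and the result is then clipped to [0, 1]. The function names `pava` and `monotone_cumulative_risk` and the input numbers are hypothetical, chosen for illustration.

```python
from typing import List, Optional, Sequence


def pava(values: Sequence[float],
         weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted least-squares fit of a nondecreasing sequence (PAVA)."""
    n = len(values)
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("values and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")

    # Each block stores [weighted mean, total weight, number of points].
    blocks: List[List[float]] = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool adjacent blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])

    # Expand each pooled block back to one fitted value per input point.
    fitted: List[float] = []
    for mean, _, count in blocks:
        fitted.extend([mean] * int(count))
    return fitted


def monotone_cumulative_risk(pointwise: Sequence[float]) -> List[float]:
    """Make pointwise risk estimates nondecreasing, then clip to [0, 1]."""
    return [min(1.0, max(0.0, x)) for x in pava(pointwise)]


if __name__ == "__main__":
    # Hypothetical pointwise estimates at increasing ages; some decrease
    # or fall outside [0, 1], as unconstrained estimates can.
    raw = [-0.02, 0.05, 0.04, 0.20, 0.18, 0.35, 1.03]
    print(monotone_cumulative_risk(raw))
```

Clipping after the isotonic fit keeps the sequence nondecreasing, so the output can be read as a valid cumulative risk curve on the chosen age grid.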

Original language: English (US)
Pages (from-to): 1182-1208
Number of pages: 27
Journal: Annals of Applied Statistics
Volume: 8
Issue number: 2
DOI: 10.1214/14-AOAS730
State: Published - Jun 2014

Fingerprint

Isotonic Regression
EM Algorithm
Monotonicity
Mutation
Predict
Parkinson's Disease
Monotonic
Estimate
Estimator
Right Censoring
Expectation Maximization
Estimating Equation
High Power
Non-negative
Entire
Unknown
Family

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modeling and Simulation
  • Statistics, Probability and Uncertainty

Cite this

Qin, Jing; Garcia, Tanya P.; Ma, Yanyuan; Tang, Ming Xin; Marder, Karen; Wang, Yuanjia. / Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint. In: Annals of Applied Statistics. 2014; Vol. 8, No. 2. pp. 1182-1208.
@article{46717a13a3aa42b1800c38fe2eed8f88,
title = "Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint",
author = "Jing Qin and Garcia, {Tanya P.} and Yanyuan Ma and Tang, {Ming Xin} and Karen Marder and Yuanjia Wang",
year = "2014",
month = "6",
doi = "10.1214/14-AOAS730",
language = "English (US)",
volume = "8",
pages = "1182--1208",
journal = "Annals of Applied Statistics",
issn = "1932-6157",
publisher = "Institute of Mathematical Statistics",
number = "2",

}

TY - JOUR
T1 - Combining isotonic regression and EM algorithm to predict genetic risk under monotonicity constraint
AU - Qin, Jing
AU - Garcia, Tanya P.
AU - Ma, Yanyuan
AU - Tang, Ming Xin
AU - Marder, Karen
AU - Wang, Yuanjia
PY - 2014/6
Y1 - 2014/6
UR - http://www.scopus.com/inward/record.url?scp=84903788452&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903788452&partnerID=8YFLogxK
U2 - 10.1214/14-AOAS730
DO - 10.1214/14-AOAS730
M3 - Article
C2 - 25404955
AN - SCOPUS:84903788452
VL - 8
SP - 1182
EP - 1208
JO - Annals of Applied Statistics
JF - Annals of Applied Statistics
SN - 1932-6157
IS - 2
ER -