Nonparametric modeling and analysis of association between Huntington's disease onset and CAG repeats

Yanyuan Ma, Yuanjia Wang

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Huntington's disease (HD) is a neurodegenerative disorder with a dominant genetic mode of inheritance caused by an expansion of CAG repeats on chromosome 4. Typically, a longer sequence of CAG repeat length is associated with increased risk of experiencing earlier onset of HD. Previous studies of the association between HD onset age and CAG length have favored a logistic model, where the CAG repeat length enters the mean and variance components of the logistic model in a complex exponential-linear form. To relax the parametric assumption of the exponential-linear association to the true HD onset distribution, we propose to leave both mean and variance functions of the CAG repeat length unspecified and perform semiparametric estimation in this context through a local kernel and backfitting procedure. Motivated by including family history of HD information available in the family members of participants in the Cooperative Huntington's Observational Research Trial (COHORT), we develop the methodology in the context of mixture data, where some subjects have a positive probability of being risk free. We also allow censoring on the age at onset of disease and accommodate covariates other than the CAG length. We study the theoretical properties of the proposed estimator and derive its asymptotic distribution. Finally, we apply the proposed methods to the COHORT data to estimate the HD onset distribution using a group of study participants and the disease family history information available on their family members.
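
The model structure described in the abstract can be sketched as follows. This is only an illustration under assumed notation: the symbols T, x, m, v, s and the coefficients a_j, b_j are introduced here and are not taken from the paper. A logistic onset distribution whose mean and variance depend on the CAG repeat length might be written as:

```latex
% Assumed notation (not from the paper): T = age at onset, x = CAG repeat length,
% m(x) = mean function, v(x) = variance function, s(x) = logistic scale.
\[
  F(t \mid x) = \Pr(T \le t \mid x)
  = \frac{1}{1 + \exp\{-(t - m(x))/s(x)\}},
  \qquad s(x) = \frac{\sqrt{3\,v(x)}}{\pi}.
\]
% A parametric exponential-linear form (coefficients a_j, b_j are hypothetical):
\[
  m(x) = a_1 + \exp(a_2 + a_3 x), \qquad v(x) = b_1 + \exp(b_2 + b_3 x).
\]
```

The scale s(x) follows from the fact that a logistic distribution with scale s has variance s^2 pi^2 / 3. In the approach the abstract proposes, m(x) and v(x) would be left as unspecified functions of x rather than given the exponential-linear form above.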

Original language: English (US)
Pages (from-to): 1369-1382
Number of pages: 14
Journal: Statistics in Medicine
Volume: 33
Issue number: 8
DOI: 10.1002/sim.5971
State: Published - Apr 15 2014

Fingerprint

Huntington Disease
Modeling
Age of Onset
Logistic Model
Logistic Models
Chromosomes, Human, Pair 4
Backfitting
Research
Components of Variance
Neurodegenerative Diseases
Semiparametric Estimation
Variance Function
Linear Forms
Theoretical Models
Censoring
Asymptotic distribution
Chromosome
Disorder
Covariates
kernel

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Statistics and Probability

Cite this

@article{fb7a37cbb9b34290a60eb8f9ff8b23df,
title = "Nonparametric modeling and analysis of association between {H}untington's disease onset and {CAG} repeats",
abstract = "Huntington's disease (HD) is a neurodegenerative disorder with a dominant genetic mode of inheritance caused by an expansion of CAG repeats on chromosome 4. Typically, a longer sequence of CAG repeat length is associated with increased risk of experiencing earlier onset of HD. Previous studies of the association between HD onset age and CAG length have favored a logistic model, where the CAG repeat length enters the mean and variance components of the logistic model in a complex exponential-linear form. To relax the parametric assumption of the exponential-linear association to the true HD onset distribution, we propose to leave both mean and variance functions of the CAG repeat length unspecified and perform semiparametric estimation in this context through a local kernel and backfitting procedure. Motivated by including family history of HD information available in the family members of participants in the Cooperative Huntington's Observational Research Trial (COHORT), we develop the methodology in the context of mixture data, where some subjects have a positive probability of being risk free. We also allow censoring on the age at onset of disease and accommodate covariates other than the CAG length. We study the theoretical properties of the proposed estimator and derive its asymptotic distribution. Finally, we apply the proposed methods to the COHORT data to estimate the HD onset distribution using a group of study participants and the disease family history information available on their family members.",
author = "Yanyuan Ma and Yuanjia Wang",
year = "2014",
month = "4",
day = "15",
doi = "10.1002/sim.5971",
language = "English (US)",
volume = "33",
pages = "1369--1382",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "8",
}

Nonparametric modeling and analysis of association between Huntington's disease onset and CAG repeats. / Ma, Yanyuan; Wang, Yuanjia.

In: Statistics in Medicine, Vol. 33, No. 8, 15.04.2014, p. 1369-1382.

TY - JOUR

T1 - Nonparametric modeling and analysis of association between Huntington's disease onset and CAG repeats

AU - Ma, Yanyuan

AU - Wang, Yuanjia

PY - 2014/4/15

Y1 - 2014/4/15

AB - Huntington's disease (HD) is a neurodegenerative disorder with a dominant genetic mode of inheritance caused by an expansion of CAG repeats on chromosome 4. Typically, a longer sequence of CAG repeat length is associated with increased risk of experiencing earlier onset of HD. Previous studies of the association between HD onset age and CAG length have favored a logistic model, where the CAG repeat length enters the mean and variance components of the logistic model in a complex exponential-linear form. To relax the parametric assumption of the exponential-linear association to the true HD onset distribution, we propose to leave both mean and variance functions of the CAG repeat length unspecified and perform semiparametric estimation in this context through a local kernel and backfitting procedure. Motivated by including family history of HD information available in the family members of participants in the Cooperative Huntington's Observational Research Trial (COHORT), we develop the methodology in the context of mixture data, where some subjects have a positive probability of being risk free. We also allow censoring on the age at onset of disease and accommodate covariates other than the CAG length. We study the theoretical properties of the proposed estimator and derive its asymptotic distribution. Finally, we apply the proposed methods to the COHORT data to estimate the HD onset distribution using a group of study participants and the disease family history information available on their family members.

UR - http://www.scopus.com/inward/record.url?scp=84896717351&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84896717351&partnerID=8YFLogxK

U2 - 10.1002/sim.5971

DO - 10.1002/sim.5971

M3 - Article

C2 - 24027120

AN - SCOPUS:84896717351

VL - 33

SP - 1369

EP - 1382

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 8

ER -