Nonparametric estimation for censored mixture data with application to the Cooperative Huntington's Observational Research Trial

Yuanjia Wang, Tanya P. Garcia, Yanyuan Ma

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

This work presents methods for estimating genotype-specific outcome distributions from genetic epidemiology studies where the event times are subject to right censoring, the genotypes are not directly observed, and the data arise from a mixture of scientifically meaningful subpopulations. Examples of such studies include kin-cohort studies and quantitative trait locus (QTL) studies. Current methods for analyzing censored mixture data include two types of nonparametric maximum likelihood estimators (NPMLEs; Type I and Type II) that do not make parametric assumptions on the genotype-specific density functions. Although both NPMLEs are commonly used, we show that one is inefficient and the other inconsistent. To overcome these deficiencies, we propose three classes of consistent nonparametric estimators that do not assume parametric density models and are easy to implement. They are based on inverse probability weighting (IPW), augmented IPW (AIPW), and nonparametric imputation (IMP). AIPW achieves the efficiency bound without additional modeling assumptions. Extensive simulation experiments demonstrate satisfactory performance of these estimators even when the data are heavily censored. We apply these estimators to the Cooperative Huntington's Observational Research Trial (COHORT), and provide age-specific estimates of the effect of mutation in the Huntington gene on mortality using a sample of family members. The close agreement between the estimated noncarrier survival rates and those of the U.S. population indicates small ascertainment bias in the COHORT family sample. Our analyses underscore an elevated risk of death in Huntington gene mutation carriers compared with that in noncarriers for a wide age range, and suggest that the mutation equally affects survival rates in both genders. The estimated survival rates are useful in genetic counseling for providing guidelines on interpreting the risk of death associated with a positive genetic test, and in helping future subjects at risk to make informed decisions on whether to undergo genetic mutation testing. Technical details and additional numerical results are provided in the online supplementary materials.
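To make the IPW idea in the abstract concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that each subject i has a known vector of mixture probabilities q_i, so that the subject's distribution function is q_i^T F(t), and that censoring is independent of the event time and of q_i. Uncensored observations are reweighted by the inverse of a Kaplan-Meier estimate of the censoring survival function, and the genotype-specific distribution functions are then recovered by least squares. The function names `km_censoring_left_limit` and `ipw_mixture_cdf` are hypothetical, and ties between event and censoring times are handled naively.

```python
import numpy as np

def km_censoring_left_limit(x, delta):
    """Kaplan-Meier estimate of the censoring survival function G(t-),
    evaluated just before each observed time x[i].
    delta[i] = 1 for an observed event, 0 for a censored time."""
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=int)
    cens_times = np.unique(x[delta == 0])  # distinct censoring times, sorted
    g = np.ones_like(x)
    for i, t in enumerate(x):
        s = 1.0
        for u in cens_times[cens_times < t]:
            at_risk = np.sum(x >= u)                   # subjects still under observation at u
            n_cens = np.sum((x == u) & (delta == 0))   # subjects censored at u
            s *= 1.0 - n_cens / at_risk
        g[i] = s
    return g

def ipw_mixture_cdf(x, delta, q, t):
    """IPW least-squares estimate of the p genotype-specific CDFs at time t.
    q is an (n, p) array of assumed-known mixture probabilities."""
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=float)
    q = np.asarray(q, dtype=float)
    weights = delta / km_censoring_left_limit(x, delta)  # zero for censored subjects
    y = weights * (x <= t)                               # weighted event indicators
    # Solve the normal equations (q^T q) F = q^T y.
    return np.linalg.solve(q.T @ q, q.T @ y)
```

With a single population (every q_i equal to 1), this weighting reproduces the Kaplan-Meier estimate of the distribution function when there are no ties, which gives a quick sanity check. The AIPW and imputation estimators mentioned in the abstract add further terms for censored subjects and are not sketched here.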

Original language: English (US)
Pages (from-to): 1324-1338
Number of pages: 15
Journal: Journal of the American Statistical Association
Volume: 107
Issue number: 500
State: Published - 2012

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
