A small-sample kernel association test for correlated data with application to microbiome association studies

Xiang Zhan, Lingzhou Xue, Haotian Zheng, Anna Plantinga, Michael C. Wu, Daniel J. Schaid, Ni Zhao, Jun Chen

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

Recent research has highlighted the importance of the human microbiome in many human disease and health conditions. Most current microbiome association analyses focus on unrelated samples; such methods are not appropriate for analysis of data collected from more advanced study designs such as longitudinal and pedigree studies, where outcomes can be correlated. Ignoring such correlations can sometimes lead to suboptimal results or even possibly biased conclusions. Thus, new methods to handle correlated outcome data in microbiome association studies are needed. In this paper, we propose the correlated sequence kernel association test (CSKAT) to address such correlations using the linear mixed model. Specifically, random effects are used to account for the outcome correlations and a variance component test is used to examine the microbiome effect. Compared to existing genetic association tests for longitudinal and family samples, we implement a correction procedure to better calibrate the null distribution of the score test statistic to accommodate the small sample size nature of data collected from a typical microbiome study. Comprehensive simulation studies are conducted to demonstrate the validity and efficiency of our method, and we show that CSKAT achieves a higher power than existing methods while correctly controlling the Type I error rate. We also apply our method to a microbiome data set collected from a UK twin study to illustrate its potential usefulness. A free implementation of our method in R software is available at https://github.com/jchen1981/SSKAT.
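The variance component score test mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CSKAT implementation and it omits their small-sample correction. It assumes a model y = X beta + h + e with h ~ N(0, tau K), treats the null covariance V (random effects plus residual noise) as known rather than estimated, and approximates the null mixture of chi-squares by Monte Carlo. The function name `kernel_score_test`, the linear kernel, and the toy twin-pair data are all hypothetical choices made for illustration.

```python
import numpy as np

def kernel_score_test(y, X, K, V, n_draws=20000, seed=0):
    """Score test of H0: tau = 0 in y = X beta + h + e, h ~ N(0, tau K).

    V is the null covariance of y (assumed known here). Returns the
    statistic Q = y' P K P y and a Monte Carlo p-value.
    """
    rng = np.random.default_rng(seed)
    V_inv = np.linalg.inv(V)
    XtVi = X.T @ V_inv
    # Projection that removes the fixed effects under covariance V.
    P = V_inv - XtVi.T @ np.linalg.solve(XtVi @ X, XtVi)
    Py = P @ y
    Q = float(Py @ K @ Py)

    # Under H0, Q is distributed as sum_i lam_i * chi2_1, where lam_i are
    # the eigenvalues of V^{1/2} P K P V^{1/2}.
    w, U = np.linalg.eigh(V)
    V_half = (U * np.sqrt(w)) @ U.T
    A = V_half @ P @ K @ P @ V_half
    lam = np.linalg.eigvalsh((A + A.T) / 2)
    lam = lam[lam > 1e-10]
    draws = rng.chisquare(1, size=(n_draws, lam.size)) @ lam
    p_value = (1 + np.sum(draws >= Q)) / (n_draws + 1)
    return Q, float(p_value)

# Toy data: 20 twin pairs sharing a pair-level random intercept (assumed design).
rng = np.random.default_rng(1)
n_pairs = 20
n = 2 * n_pairs
pair = np.repeat(np.arange(n_pairs), 2)
V = 0.5 * (pair[:, None] == pair[None, :]) + 1.0 * np.eye(n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, 10))   # stand-in for microbiome features
K = Z @ Z.T                    # linear kernel (illustrative choice)
y = rng.multivariate_normal(np.zeros(n), V)  # outcome simulated under H0

Q, p = kernel_score_test(y, X, K, V)
print(f"Q = {Q:.3f}, p = {p:.4f}")
```

In practice the variance components in V are estimated from the null model, and with small samples the asymptotic null distribution can be miscalibrated; that is the issue the paper's correction procedure targets.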

Original language: English (US)
Pages (from-to): 772-782
Number of pages: 11
Journal: Genetic Epidemiology
Volume: 42
Issue number: 8
DOI: 10.1002/gepi.22160
State: Published - Dec 1 2018

Fingerprint

Microbiota
Twin Studies
Pedigree
Sample Size
Longitudinal Studies
Linear Models
Software
Health
Research

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Genetics (clinical)

Cite this

Zhan, Xiang ; Xue, Lingzhou ; Zheng, Haotian ; Plantinga, Anna ; Wu, Michael C. ; Schaid, Daniel J. ; Zhao, Ni ; Chen, Jun. / A small-sample kernel association test for correlated data with application to microbiome association studies. In: Genetic Epidemiology. 2018 ; Vol. 42, No. 8. pp. 772-782.
@article{590f45b2b27b4d3191b8e53b8a6b692c,
title = "A small-sample kernel association test for correlated data with application to microbiome association studies",
abstract = "Recent research has highlighted the importance of the human microbiome in many human disease and health conditions. Most current microbiome association analyses focus on unrelated samples; such methods are not appropriate for analysis of data collected from more advanced study designs such as longitudinal and pedigree studies, where outcomes can be correlated. Ignoring such correlations can sometimes lead to suboptimal results or even possibly biased conclusions. Thus, new methods to handle correlated outcome data in microbiome association studies are needed. In this paper, we propose the correlated sequence kernel association test (CSKAT) to address such correlations using the linear mixed model. Specifically, random effects are used to account for the outcome correlations and a variance component test is used to examine the microbiome effect. Compared to existing genetic association tests for longitudinal and family samples, we implement a correction procedure to better calibrate the null distribution of the score test statistic to accommodate the small sample size nature of data collected from a typical microbiome study. Comprehensive simulation studies are conducted to demonstrate the validity and efficiency of our method, and we show that CSKAT achieves a higher power than existing methods while correctly controlling the Type I error rate. We also apply our method to a microbiome data set collected from a UK twin study to illustrate its potential usefulness. A free implementation of our method in R software is available at https://github.com/jchen1981/SSKAT.",
author = "Xiang Zhan and Lingzhou Xue and Haotian Zheng and Anna Plantinga and Wu, {Michael C.} and Schaid, {Daniel J.} and Ni Zhao and Jun Chen",
year = "2018",
month = "12",
day = "1",
doi = "10.1002/gepi.22160",
language = "English (US)",
volume = "42",
pages = "772--782",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "8",

}

A small-sample kernel association test for correlated data with application to microbiome association studies. / Zhan, Xiang; Xue, Lingzhou; Zheng, Haotian; Plantinga, Anna; Wu, Michael C.; Schaid, Daniel J.; Zhao, Ni; Chen, Jun.

In: Genetic Epidemiology, Vol. 42, No. 8, 01.12.2018, p. 772-782.

TY - JOUR

T1 - A small-sample kernel association test for correlated data with application to microbiome association studies

AU - Zhan, Xiang

AU - Xue, Lingzhou

AU - Zheng, Haotian

AU - Plantinga, Anna

AU - Wu, Michael C.

AU - Schaid, Daniel J.

AU - Zhao, Ni

AU - Chen, Jun

PY - 2018/12/1

Y1 - 2018/12/1

AB - Recent research has highlighted the importance of the human microbiome in many human disease and health conditions. Most current microbiome association analyses focus on unrelated samples; such methods are not appropriate for analysis of data collected from more advanced study designs such as longitudinal and pedigree studies, where outcomes can be correlated. Ignoring such correlations can sometimes lead to suboptimal results or even possibly biased conclusions. Thus, new methods to handle correlated outcome data in microbiome association studies are needed. In this paper, we propose the correlated sequence kernel association test (CSKAT) to address such correlations using the linear mixed model. Specifically, random effects are used to account for the outcome correlations and a variance component test is used to examine the microbiome effect. Compared to existing genetic association tests for longitudinal and family samples, we implement a correction procedure to better calibrate the null distribution of the score test statistic to accommodate the small sample size nature of data collected from a typical microbiome study. Comprehensive simulation studies are conducted to demonstrate the validity and efficiency of our method, and we show that CSKAT achieves a higher power than existing methods while correctly controlling the Type I error rate. We also apply our method to a microbiome data set collected from a UK twin study to illustrate its potential usefulness. A free implementation of our method in R software is available at https://github.com/jchen1981/SSKAT.

UR - http://www.scopus.com/inward/record.url?scp=85053464289&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85053464289&partnerID=8YFLogxK

U2 - 10.1002/gepi.22160

DO - 10.1002/gepi.22160

M3 - Article

C2 - 30218543

AN - SCOPUS:85053464289

VL - 42

SP - 772

EP - 782

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 8

ER -