A novel Bayesian graphical model for genome-wide multi-SNP association mapping

Yu Zhang

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

Most disease association mapping algorithms are based on hypothesis testing procedures that test one variant at a time. Those methods lose power when the disease mutations are jointly tagged by multiple variants, or when gene-gene interactions exist. Nearby variants are also correlated, for which procedures ignoring the dependence between variants will inevitably produce redundant results. With a large number of variants genotyped in current genome-wide disease association studies, simultaneous multivariant association mapping algorithms are strongly desired. We present a novel Bayesian method for automatic detection of multivariant joint association in genome-wide case-control studies. Our method has improved power and specificity over existing tools. We fit a joint probabilistic model to the entire data and identify disease variants simultaneously. The method dynamically accounts for the strong linkage disequilibrium (LD) between variants. As a result, only the primary disease variants will be identified, with all secondary associations due to LD effects filtered out. Our method better pinpoints the disease variants with improved resolution. The method is also computationally efficient for genome-wide studies. When applied to a real data set of inflammatory bowel disease (IBD) containing 401,473 variants in 4,720 individuals, our method detected all previously reported IBD loci in the same data, and recovered two missed loci. We further detected two novel interchromosome interactions. The first is between STAT3 and PARD6G, and the second is between DLG5 and an intergenic region at 5p14. We further validated the two interactions in an independent study.

Original language: English (US)
Pages (from-to): 36-47
Number of pages: 12
Journal: Genetic Epidemiology
Volume: 36
Issue number: 1
DOI: 10.1002/gepi.20661
State: Published - Jan 1 2012

Fingerprint

Single Nucleotide Polymorphism
Genome
Linkage Disequilibrium
Inflammatory Bowel Diseases
Intergenic DNA
Bayes Theorem
Genome-Wide Association Study
Statistical Models
Genes
Case-Control Studies
Joints
Mutation

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Genetics (clinical)

Cite this

@article{86b28f4d77de4d5a9064e4a461b6cfb8,
title = "A novel Bayesian graphical model for genome-wide multi-SNP association mapping",
abstract = "Most disease association mapping algorithms are based on hypothesis testing procedures that test one variant at a time. Those methods lose power when the disease mutations are jointly tagged by multiple variants, or when gene-gene interactions exist. Nearby variants are also correlated, for which procedures ignoring the dependence between variants will inevitably produce redundant results. With a large number of variants genotyped in current genome-wide disease association studies, simultaneous multivariant association mapping algorithms are strongly desired. We present a novel Bayesian method for automatic detection of multivariant joint association in genome-wide case-control studies. Our method has improved power and specificity over existing tools. We fit a joint probabilistic model to the entire data and identify disease variants simultaneously. The method dynamically accounts for the strong linkage disequilibrium (LD) between variants. As a result, only the primary disease variants will be identified, with all secondary associations due to LD effects filtered out. Our method better pinpoints the disease variants with improved resolution. The method is also computationally efficient for genome-wide studies. When applied to a real data set of inflammatory bowel disease (IBD) containing 401,473 variants in 4,720 individuals, our method detected all previously reported IBD loci in the same data, and recovered two missed loci. We further detected two novel interchromosome interactions. The first is between STAT3 and PARD6G, and the second is between DLG5 and an intergenic region at 5p14. We further validated the two interactions in an independent study.",
author = "Yu Zhang",
year = "2012",
month = "1",
day = "1",
doi = "10.1002/gepi.20661",
language = "English (US)",
volume = "36",
pages = "36--47",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "1",

}

A novel Bayesian graphical model for genome-wide multi-SNP association mapping. / Zhang, Yu.

In: Genetic Epidemiology, Vol. 36, No. 1, 01.01.2012, p. 36-47.

TY - JOUR

T1 - A novel Bayesian graphical model for genome-wide multi-SNP association mapping

AU - Zhang, Yu

PY - 2012/1/1

Y1 - 2012/1/1

AB - Most disease association mapping algorithms are based on hypothesis testing procedures that test one variant at a time. Those methods lose power when the disease mutations are jointly tagged by multiple variants, or when gene-gene interactions exist. Nearby variants are also correlated, for which procedures ignoring the dependence between variants will inevitably produce redundant results. With a large number of variants genotyped in current genome-wide disease association studies, simultaneous multivariant association mapping algorithms are strongly desired. We present a novel Bayesian method for automatic detection of multivariant joint association in genome-wide case-control studies. Our method has improved power and specificity over existing tools. We fit a joint probabilistic model to the entire data and identify disease variants simultaneously. The method dynamically accounts for the strong linkage disequilibrium (LD) between variants. As a result, only the primary disease variants will be identified, with all secondary associations due to LD effects filtered out. Our method better pinpoints the disease variants with improved resolution. The method is also computationally efficient for genome-wide studies. When applied to a real data set of inflammatory bowel disease (IBD) containing 401,473 variants in 4,720 individuals, our method detected all previously reported IBD loci in the same data, and recovered two missed loci. We further detected two novel interchromosome interactions. The first is between STAT3 and PARD6G, and the second is between DLG5 and an intergenic region at 5p14. We further validated the two interactions in an independent study.

UR - http://www.scopus.com/inward/record.url?scp=84859100582&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84859100582&partnerID=8YFLogxK

U2 - 10.1002/gepi.20661

DO - 10.1002/gepi.20661

M3 - Article

C2 - 22127647

AN - SCOPUS:84859100582

VL - 36

SP - 36

EP - 47

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 1

ER -