TEAM: Efficient two-locus epistasis tests in human genome-wide association study

Xiang Zhang, Shunping Huang, Fei Zou, Wei Wang

Research output: Contribution to journal › Article

108 Citations (Scopus)

Abstract

As a promising tool for identifying genetic markers underlying phenotypic differences, genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable over single locus study since many diseases are known to be complex traits. A brute force search is infeasible for epistasis detection in the genomewide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for dataset consisting of homozygous markers and small sample size. In human study, however, the genotype may be heterozygous, and number of individuals can be up to thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large sample study. It supports any statistical test that is based on contingency tables, and enables both family-wise error rate and false discovery rate controlling. Extensive experiments show that our algorithm only needs to examine a small portion of the individuals to update the contingency tables, and it achieves at least an order of magnitude speed up over the brute force approach. Contact: xiang@cs.unc.edu.
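The incremental update described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes genotypes coded 0/1/2, a binary phenotype, and that two markers `b` and `c` are neighbors in some tree over the markers, so the list of individuals on which they differ is computed once and reused. The function names `full_table`, `differing`, and `update_table` are hypothetical. Given the table for the pair (a, b), the table for (a, c) is obtained by moving only the individuals whose genotype differs between `b` and `c`.

```python
from itertools import product


def full_table(a, b, y):
    """Build the 3 x 3 x 2 contingency table for markers a, b and phenotype y
    by scanning every individual (the brute-force baseline)."""
    table = {cell: 0 for cell in product(range(3), range(3), range(2))}
    for ga, gb, ph in zip(a, b, y):
        table[(ga, gb, ph)] += 1
    return table


def differing(b, c):
    """Indices of individuals whose genotype differs between markers b and c.
    Assumed to be computed once per pair of neighboring markers and reused."""
    return [i for i, (gb, gc) in enumerate(zip(b, c)) if gb != gc]


def update_table(table_ab, a, b, c, y, diff):
    """Derive the (a, c) table from the (a, b) table, touching only the
    individuals listed in diff instead of rescanning everyone."""
    table_ac = dict(table_ab)
    for i in diff:
        table_ac[(a[i], b[i], y[i])] -= 1  # leave the old cell
        table_ac[(a[i], c[i], y[i])] += 1  # enter the new cell
    return table_ac


# Toy data (illustrative only): six individuals, three markers.
a = [0, 1, 2, 1, 0, 2]
b = [0, 0, 1, 2, 1, 2]
c = [0, 0, 1, 2, 2, 2]  # differs from b for one individual only
y = [0, 1, 1, 0, 1, 0]

diff_bc = differing(b, c)
table_ac = update_table(full_table(a, b, y), a, b, c, y, diff_bc)
```

In this toy case only one of six individuals is touched to move from the (a, b) table to the (a, c) table. The fewer individuals differ between neighboring markers, the cheaper each update becomes, which is one reading of why a minimum spanning tree over the markers is a natural structure to use.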

Original language: English (US)
Article number: btq186
Pages (from-to): i217-i227
Journal: Bioinformatics
Volume: 26
Issue number: 12
DOI: 10.1093/bioinformatics/btq186
State: Published - Jun 1 2010

Fingerprint

Epistasis
Genome-Wide Association Study
Human Genome
Locus
Genome
Genes
Contingency Table
Speedup
Update
Gene
Familywise Error Rate
Minimum Spanning Tree
Small Sample Size
Tree Structure
Statistical test
Genotype
Interaction
Statistical tests
Scanning
Genetic Markers

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Zhang, Xiang; Huang, Shunping; Zou, Fei; Wang, Wei. / TEAM: Efficient two-locus epistasis tests in human genome-wide association study. In: Bioinformatics. 2010; Vol. 26, No. 12. pp. i217-i227.
@article{a0718e7b1ef541be83ce803d3b5f111e,
title = "TEAM: Efficient two-locus epistasis tests in human genome-wide association study",
abstract = "As a promising tool for identifying genetic markers underlying phenotypic differences, genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable over single locus study since many diseases are known to be complex traits. A brute force search is infeasible for epistasis detection in the genomewide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for dataset consisting of homozygous markers and small sample size. In human study, however, the genotype may be heterozygous, and number of individuals can be up to thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large sample study. It supports any statistical test that is based on contingency tables, and enables both family-wise error rate and false discovery rate controlling. Extensive experiments show that our algorithm only needs to examine a small portion of the individuals to update the contingency tables, and it achieves at least an order of magnitude speed up over the brute force approach. Contact: xiang@cs.unc.edu.",
author = "Xiang Zhang and Shunping Huang and Fei Zou and Wei Wang",
year = "2010",
month = "6",
day = "1",
doi = "10.1093/bioinformatics/btq186",
language = "English (US)",
volume = "26",
pages = "i217--i227",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "12",

}

TY - JOUR

T1 - TEAM: Efficient two-locus epistasis tests in human genome-wide association study

AU - Zhang, Xiang

AU - Huang, Shunping

AU - Zou, Fei

AU - Wang, Wei

PY - 2010/6/1

Y1 - 2010/6/1

AB - As a promising tool for identifying genetic markers underlying phenotypic differences, genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable over single locus study since many diseases are known to be complex traits. A brute force search is infeasible for epistasis detection in the genomewide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for dataset consisting of homozygous markers and small sample size. In human study, however, the genotype may be heterozygous, and number of individuals can be up to thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large sample study. It supports any statistical test that is based on contingency tables, and enables both family-wise error rate and false discovery rate controlling. Extensive experiments show that our algorithm only needs to examine a small portion of the individuals to update the contingency tables, and it achieves at least an order of magnitude speed up over the brute force approach. Contact: xiang@cs.unc.edu.

UR - http://www.scopus.com/inward/record.url?scp=77954182718&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77954182718&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btq186

DO - 10.1093/bioinformatics/btq186

M3 - Article

C2 - 20529910

AN - SCOPUS:77954182718

VL - 26

SP - i217

EP - i227

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 12

M1 - btq186

ER -