Analysis pipeline for the epistasis search - statistical versus biological filtering

Xiangqing Sun, Qing Lu, Shubhabrata Mukherjee, Paul K. Crane, Robert Elston, Marylyn Deriggi Ritchie

Research output: Contribution to journal › Short survey

34 Citations (Scopus)

Abstract

Gene-gene interactions may contribute to the genetic variation underlying complex traits but have not always been taken fully into account. Statistical analyses that consider gene-gene interaction may increase the power of detecting associations, especially for low-marginal-effect markers, and may explain in part the "missing heritability." Detecting pair-wise and higher-order interactions genome-wide requires enormous computational power. Filtering pipelines increase the computational speed by limiting the number of tests performed. We summarize existing filtering approaches to detect epistasis, after distinguishing the purposes that lead us to search for epistasis. Statistical filtering includes quality control on the basis of single marker statistics to avoid the analysis of bad and least informative data, and limits the search space for finding interactions. Biological filtering includes targeting specific pathways, integrating various databases based on known biological and metabolic pathways, gene function ontology and protein-protein interactions. It is increasingly possible to target single-nucleotide polymorphisms that have defined functions on gene expression, though not belonging to protein-coding genes. Filtering can improve the power of an interaction association study, but also increases the chance of missing important findings.
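
To make the computational argument concrete, the sketch below counts how many interaction tests an exhaustive search needs and how a single-marker statistical filter shrinks that count. It is a minimal illustration, not the authors' pipeline: the function names, the toy marker panel, and the thresholds (minor allele frequency of at least 0.05, call rate of at least 0.95) are assumptions chosen for the example.

```python
from math import comb


def n_interaction_tests(n_markers: int, order: int = 2) -> int:
    """Count the distinct marker combinations of a given interaction order."""
    return comb(n_markers, order)


def statistical_filter(markers, min_maf=0.05, min_call_rate=0.95):
    """Keep markers that pass illustrative single-marker QC thresholds.

    The thresholds are assumptions for this sketch, not values from the paper.
    """
    return [
        name
        for name, stats in markers.items()
        if stats["maf"] >= min_maf and stats["call_rate"] >= min_call_rate
    ]


# Hypothetical toy panel; real studies have hundreds of thousands of markers.
toy_markers = {
    "snp1": {"maf": 0.30, "call_rate": 0.99},
    "snp2": {"maf": 0.01, "call_rate": 0.99},  # rare variant: little information
    "snp3": {"maf": 0.20, "call_rate": 0.80},  # poorly genotyped
    "snp4": {"maf": 0.12, "call_rate": 0.97},
    "snp5": {"maf": 0.45, "call_rate": 0.98},
}

kept = statistical_filter(toy_markers)

# Pairwise tests before and after filtering the toy panel.
print(n_interaction_tests(len(toy_markers)), "->", n_interaction_tests(len(kept)))

# The same count at genome-wide scale, before and after a strong filter.
print(n_interaction_tests(500_000), "->", n_interaction_tests(10_000))
```

Because the number of pairwise tests grows roughly with the square of the number of markers, even a modest reduction in markers removes a large share of the tests. The trade-off noted in the abstract still applies: any marker dropped by the filter can no longer appear in a detected interaction.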

Original language: English (US)
Article number: Article 106
Journal: Frontiers in Genetics
Volume: 5
Issue number: APR
DOI: https://doi.org/10.3389/fgene.2014.00106
State: Published - Jan 1 2014

Fingerprint

Genes
Gene Ontology
Proteins
Metabolic Networks and Pathways
Quality Control
Single Nucleotide Polymorphism
Genome
Databases
Gene Expression

All Science Journal Classification (ASJC) codes

  • Molecular Medicine
  • Genetics
  • Genetics (clinical)

Cite this

Sun, X., Lu, Q., Mukherjee, S., Crane, P. K., Elston, R., & Ritchie, M. D. (2014). Analysis pipeline for the epistasis search - statistical versus biological filtering. Frontiers in Genetics, 5(APR), [Article 106]. https://doi.org/10.3389/fgene.2014.00106
Sun, Xiangqing; Lu, Qing; Mukherjee, Shubhabrata; Crane, Paul K.; Elston, Robert; Ritchie, Marylyn Deriggi. / Analysis pipeline for the epistasis search - statistical versus biological filtering. In: Frontiers in Genetics. 2014; Vol. 5, No. APR.
@article{8c2119bc1aa0423fab71909c3074dca4,
title = "Analysis pipeline for the epistasis search - statistical versus biological filtering",
abstract = "Gene-gene interactions may contribute to the genetic variation underlying complex traits but have not always been taken fully into account. Statistical analyses that consider gene-gene interaction may increase the power of detecting associations, especially for low-marginal-effect markers, and may explain in part the {"}missing heritability.{"} Detecting pair-wise and higher-order interactions genome-wide requires enormous computational power. Filtering pipelines increase the computational speed by limiting the number of tests performed. We summarize existing filtering approaches to detect epistasis, after distinguishing the purposes that lead us to search for epistasis. Statistical filtering includes quality control on the basis of single marker statistics to avoid the analysis of bad and least informative data, and limits the search space for finding interactions. Biological filtering includes targeting specific pathways, integrating various databases based on known biological and metabolic pathways, gene function ontology and protein-protein interactions. It is increasingly possible to target single-nucleotide polymorphisms that have defined functions on gene expression, though not belonging to protein-coding genes. Filtering can improve the power of an interaction association study, but also increases the chance of missing important findings.",
author = "Xiangqing Sun and Qing Lu and Shubhabrata Mukherjee and Crane, {Paul K.} and Robert Elston and Ritchie, {Marylyn Deriggi}",
year = "2014",
month = "1",
day = "1",
doi = "10.3389/fgene.2014.00106",
language = "English (US)",
volume = "5",
journal = "Frontiers in Genetics",
issn = "1664-8021",
publisher = "Frontiers Media S. A.",
number = "APR",

}

Sun, X, Lu, Q, Mukherjee, S, Crane, PK, Elston, R & Ritchie, MD 2014, 'Analysis pipeline for the epistasis search - statistical versus biological filtering', Frontiers in Genetics, vol. 5, no. APR, Article 106. https://doi.org/10.3389/fgene.2014.00106

Analysis pipeline for the epistasis search - statistical versus biological filtering. / Sun, Xiangqing; Lu, Qing; Mukherjee, Shubhabrata; Crane, Paul K.; Elston, Robert; Ritchie, Marylyn Deriggi.

In: Frontiers in Genetics, Vol. 5, No. APR, Article 106, 01.01.2014.

TY - JOUR

T1 - Analysis pipeline for the epistasis search - statistical versus biological filtering

AU - Sun, Xiangqing

AU - Lu, Qing

AU - Mukherjee, Shubhabrata

AU - Crane, Paul K.

AU - Elston, Robert

AU - Ritchie, Marylyn Deriggi

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Gene-gene interactions may contribute to the genetic variation underlying complex traits but have not always been taken fully into account. Statistical analyses that consider gene-gene interaction may increase the power of detecting associations, especially for low-marginal-effect markers, and may explain in part the "missing heritability." Detecting pair-wise and higher-order interactions genome-wide requires enormous computational power. Filtering pipelines increase the computational speed by limiting the number of tests performed. We summarize existing filtering approaches to detect epistasis, after distinguishing the purposes that lead us to search for epistasis. Statistical filtering includes quality control on the basis of single marker statistics to avoid the analysis of bad and least informative data, and limits the search space for finding interactions. Biological filtering includes targeting specific pathways, integrating various databases based on known biological and metabolic pathways, gene function ontology and protein-protein interactions. It is increasingly possible to target single-nucleotide polymorphisms that have defined functions on gene expression, though not belonging to protein-coding genes. Filtering can improve the power of an interaction association study, but also increases the chance of missing important findings.

UR - http://www.scopus.com/inward/record.url?scp=84901004089&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84901004089&partnerID=8YFLogxK

U2 - 10.3389/fgene.2014.00106

DO - 10.3389/fgene.2014.00106

M3 - Short survey

C2 - 24817878

AN - SCOPUS:84901004089

VL - 5

JO - Frontiers in Genetics

JF - Frontiers in Genetics

SN - 1664-8021

IS - APR

M1 - Article 106

ER -