Rapid and robust resampling-based multiple-testing correction with application in a genome-wide expression quantitative trait loci study

Xiang Zhang, Shunping Huang, Wei Sun, Wei Wang

Research output: Contribution to journalArticle

7 Citations (Scopus)

Abstract

Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. In a typical eQTL study, the huge number of genetic markers and expression traits and their complicated correlations present a challenging multiple-testing correction problem. The resampling-based test using permutation or bootstrap procedures is a standard approach to address the multiple-testing problem in eQTL studies. A brute force application of the resampling-based test to large-scale eQTL data sets is often computationally infeasible. Several computationally efficient methods have been proposed to calculate approximate resampling-based P-values. However, these methods rely on certain assumptions about the correlation structure of the genetic markers, which may not be valid for certain studies. We propose a novel algorithm, rapid and exact multiple testing correction by resampling (REM), to address this challenge. REM calculates the exact resampling-based P-values in a computationally efficient manner. The computational advantage of REM lies in its strategy of pruning the search space by skipping genetic markers whose upper bounds on test statistics are small. REM does not rely on any assumption about the correlation structure of the genetic markers. It can be applied to a variety of resampling-based multiple-testing correction methods including permutation and bootstrap methods. We evaluate REM on three eQTL data sets (yeast, inbred mouse, and human rare variants) and show that it achieves accurate resampling-based P-value estimation with much less computational cost than existing methods. The software is available at http://csbio.unc.edu/eQTL.

Original languageEnglish (US)
Pages (from-to)1511-1520
Number of pages10
JournalGenetics
Volume190
Issue number4
DOIs
StatePublished - Apr 1 2012

Fingerprint

Quantitative Trait Loci
Genome
Genetic Markers
Software
Yeasts
Gene Expression
Costs and Cost Analysis

All Science Journal Classification (ASJC) codes

  • Genetics

Cite this

@article{5b490ba7fa20426eaf0675607d695b63,
title = "Rapid and robust resampling-based multiple-testing correction with application in a genome-wide expression quantitative trait loci study",
abstract = "Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. In a typical eQTL study, the huge number of genetic markers and expression traits and their complicated correlations present a challenging multiple-testing correction problem. The resampling-based test using permutation or bootstrap procedures is a standard approach to address the multiple-testing problem in eQTL studies. A brute force application of the resampling-based test to large-scale eQTL data sets is often computationally infeasible. Several computationally efficient methods have been proposed to calculate approximate resampling-based P-values. However, these methods rely on certain assumptions about the correlation structure of the genetic markers, which may not be valid for certain studies. We propose a novel algorithm, rapid and exact multiple testing correction by resampling (REM), to address this challenge. REM calculates the exact resampling-based P-values in a computationally efficient manner. The computational advantage of REM lies in its strategy of pruning the search space by skipping genetic markers whose upper bounds on test statistics are small. REM does not rely on any assumption about the correlation structure of the genetic markers. It can be applied to a variety of resampling-based multiple-testing correction methods including permutation and bootstrap methods. We evaluate REM on three eQTL data sets (yeast, inbred mouse, and human rare variants) and show that it achieves accurate resampling-based P-value estimation with much less computational cost than existing methods. The software is available at http://csbio.unc.edu/eQTL.",
author = "Xiang Zhang and Shunping Huang and Wei Sun and Wei Wang",
year = "2012",
month = "4",
day = "1",
doi = "10.1534/genetics.111.137737",
language = "English (US)",
volume = "190",
pages = "1511--1520",
journal = "Genetics",
issn = "0016-6731",
publisher = "Genetics Society of America",
number = "4",

}

Rapid and robust resampling-based multiple-testing correction with application in a genome-wide expression quantitative trait loci study. / Zhang, Xiang; Huang, Shunping; Sun, Wei; Wang, Wei.

In: Genetics, Vol. 190, No. 4, 01.04.2012, p. 1511-1520.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Rapid and robust resampling-based multiple-testing correction with application in a genome-wide expression quantitative trait loci study

AU - Zhang, Xiang

AU - Huang, Shunping

AU - Sun, Wei

AU - Wang, Wei

PY - 2012/4/1

Y1 - 2012/4/1

N2 - Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. In a typical eQTL study, the huge number of genetic markers and expression traits and their complicated correlations present a challenging multiple-testing correction problem. The resampling-based test using permutation or bootstrap procedures is a standard approach to address the multiple-testing problem in eQTL studies. A brute force application of the resampling-based test to large-scale eQTL data sets is often computationally infeasible. Several computationally efficient methods have been proposed to calculate approximate resampling-based P-values. However, these methods rely on certain assumptions about the correlation structure of the genetic markers, which may not be valid for certain studies. We propose a novel algorithm, rapid and exact multiple testing correction by resampling (REM), to address this challenge. REM calculates the exact resampling-based P-values in a computationally efficient manner. The computational advantage of REM lies in its strategy of pruning the search space by skipping genetic markers whose upper bounds on test statistics are small. REM does not rely on any assumption about the correlation structure of the genetic markers. It can be applied to a variety of resampling-based multiple-testing correction methods including permutation and bootstrap methods. We evaluate REM on three eQTL data sets (yeast, inbred mouse, and human rare variants) and show that it achieves accurate resampling-based P-value estimation with much less computational cost than existing methods. The software is available at http://csbio.unc.edu/eQTL.

AB - Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. In a typical eQTL study, the huge number of genetic markers and expression traits and their complicated correlations present a challenging multiple-testing correction problem. The resampling-based test using permutation or bootstrap procedures is a standard approach to address the multiple-testing problem in eQTL studies. A brute force application of the resampling-based test to large-scale eQTL data sets is often computationally infeasible. Several computationally efficient methods have been proposed to calculate approximate resampling-based P-values. However, these methods rely on certain assumptions about the correlation structure of the genetic markers, which may not be valid for certain studies. We propose a novel algorithm, rapid and exact multiple testing correction by resampling (REM), to address this challenge. REM calculates the exact resampling-based P-values in a computationally efficient manner. The computational advantage of REM lies in its strategy of pruning the search space by skipping genetic markers whose upper bounds on test statistics are small. REM does not rely on any assumption about the correlation structure of the genetic markers. It can be applied to a variety of resampling-based multiple-testing correction methods including permutation and bootstrap methods. We evaluate REM on three eQTL data sets (yeast, inbred mouse, and human rare variants) and show that it achieves accurate resampling-based P-value estimation with much less computational cost than existing methods. The software is available at http://csbio.unc.edu/eQTL.

UR - http://www.scopus.com/inward/record.url?scp=84859641368&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84859641368&partnerID=8YFLogxK

U2 - 10.1534/genetics.111.137737

DO - 10.1534/genetics.111.137737

M3 - Article

VL - 190

SP - 1511

EP - 1520

JO - Genetics

JF - Genetics

SN - 0016-6731

IS - 4

ER -