TY - GEN
T1 - FastChi
T2 - 14th Pacific Symposium on Biocomputing, PSB 2009
AU - Zhang, Xiang
AU - Zou, Fei
AU - Wang, Wei
PY - 2009
Y1 - 2009
AB - Recent advances in high-throughput genotyping have inspired increasing research interest in genome-wide association studies of diseases. To understand the underlying biological mechanisms of many diseases, we need to consider the genetic effects across multiple loci simultaneously. The large number of SNPs often makes multilocus association studies computationally challenging, because all possible SNP combinations must be explicitly enumerated at the genome-wide scale. Moreover, because many of the SNPs are correlated, a permutation procedure is often needed to properly control family-wise error rates. This makes the problem even more computationally demanding, since the test procedure must be repeated for each permuted dataset. In this paper, we present FastChi, an exhaustive yet efficient algorithm for the genome-wide two-locus chi-square test. FastChi utilizes an upper bound on the two-locus chi-square test statistic that can be expressed as the sum of two terms, both of which are efficient to compute: the first term is based on the single-locus chi-square test for the given phenotype, and the second term depends only on the genotypes and is independent of the phenotype. This upper bound enables the algorithm to perform the two-locus chi-square test on only a small number of candidate SNP pairs without the risk of missing any significant ones. Since the second term of the upper bound needs to be computed only once and can be stored for subsequent use, the advantage is more prominent in large permutation tests. Extensive experimental results demonstrate that our method is an order of magnitude faster than the brute-force alternative.
UR - http://www.scopus.com/inward/record.url?scp=61949439681&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=61949439681&partnerID=8YFLogxK
M3 - Conference contribution
C2 - 19209728
AN - SCOPUS:61949439681
SN - 9812836926
SN - 9789812836922
T3 - Pacific Symposium on Biocomputing 2009, PSB 2009
SP - 528
EP - 539
BT - Pacific Symposium on Biocomputing 2009, PSB 2009
Y2 - 5 January 2009 through 9 January 2009
ER -