Seqchip: A powerful method to integrate sequence and genotype data for the detection of rare variant associations

Dajiang Liu, Suzanne M. Leal

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Motivation: Next-generation sequencing greatly increases the capacity to detect rare-variant complex-trait associations. However, it is still expensive to sequence a large number of samples and therefore often small datasets are used. Given cost constraints, a potentially more powerful two-step strategy is to sequence a subset of the sample to discover variants, and genotype the identified variants in the remaining sample. If only cases are sequenced, directly combining sequence and genotype data will lead to inflated type-I errors in rare-variant association analysis. Although several methods have been developed to correct for the bias, they are either underpowered or theoretically invalid. We proposed a new method SEQCHIP to integrate genotype and sequence data, which can be used with most existing rare-variant tests. Results: It is demonstrated using both simulated and real datasets that the SEQCHIP method has controlled type-I errors, and is substantially more powerful than all other currently available methods.
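The bias described in the abstract (discovering variants only in sequenced cases, then pooling sequence and genotype data) can be illustrated with a small null simulation. The sketch below is not the SEQCHIP method. It only contrasts the naive combined analysis with a design in which every sample is sequenced. All parameter values, the function names, and the simplified count-based z-test (a stand-in for a burden-style rare-variant test, reasonable here only because cases and controls are equal in number) are assumptions made for illustration.

```python
import math
import random


def carrier_count(rng, n_people, carrier_prob):
    """Number of carriers of one rare variant among n_people."""
    return sum(rng.random() < carrier_prob for _ in range(n_people))


def rejects(case_count, control_count, z_crit=1.96):
    """Two-sided test that cases and controls carry equally many variants.

    Simplified stand-in for a burden test: with equal group sizes and rare
    variants, (a - b) / sqrt(a + b) is approximately standard normal.
    """
    total = case_count + control_count
    if total == 0:
        return False  # nothing observed, nothing to test
    z = (case_count - control_count) / math.sqrt(total)
    return abs(z) > z_crit


def simulate(n_reps=400, n_sites=30, n_cases=300, n_seq=100,
             n_controls=300, carrier_prob=0.002, seed=1):
    """Empirical type-I error rates under the null (no true association).

    Returns (naive_rate, full_rate):
      naive_rate: sites discovered in n_seq sequenced cases only, then
                  counted in all cases and controls without correction.
      full_rate:  every site counted in everyone (all samples sequenced).
    """
    rng = random.Random(seed)
    naive_hits = full_hits = 0
    for _ in range(n_reps):
        naive_case = naive_ctrl = full_case = full_ctrl = 0
        for _ in range(n_sites):
            seq = carrier_count(rng, n_seq, carrier_prob)
            rest = carrier_count(rng, n_cases - n_seq, carrier_prob)
            ctrl = carrier_count(rng, n_controls, carrier_prob)
            full_case += seq + rest
            full_ctrl += ctrl
            if seq > 0:  # site is "discovered" only if a sequenced case carries it
                naive_case += seq + rest
                naive_ctrl += ctrl
        naive_hits += rejects(naive_case, naive_ctrl)
        full_hits += rejects(full_case, full_ctrl)
    return naive_hits / n_reps, full_hits / n_reps


if __name__ == "__main__":
    naive_rate, full_rate = simulate()
    print(f"naive two-step design: {naive_rate:.3f}")
    print(f"all samples sequenced: {full_rate:.3f}")
```

Because a site enters the naive analysis only when at least one sequenced case carries it, the selected sites are enriched for case carriers even though cases and controls share the same carrier frequency, so the naive rejection rate should sit well above the nominal 5% level.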

Original language: English (US)
Article number: bts263
Pages (from-to): 1745-1751
Number of pages: 7
Journal: Bioinformatics
Volume: 28
Issue number: 13
DOI: 10.1093/bioinformatics/bts263
State: Published - Jul 1 2012

Fingerprint

Genotype
Integrate
Type I error
Sequencing
Costs
Costs and Cost Analysis
Subset
Datasets

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability
  • Medicine(all)

Cite this

@article{2d9f47d9e2b54d1d95c9936c7898cd17,
title = "Seqchip: A powerful method to integrate sequence and genotype data for the detection of rare variant associations",
abstract = "Motivation: Next-generation sequencing greatly increases the capacity to detect rare-variant complex-trait associations. However, it is still expensive to sequence a large number of samples and therefore often small datasets are used. Given cost constraints, a potentially more powerful two-step strategy is to sequence a subset of the sample to discover variants, and genotype the identified variants in the remaining sample. If only cases are sequenced, directly combining sequence and genotype data will lead to inflated type-I errors in rare-variant association analysis. Although several methods have been developed to correct for the bias, they are either underpowered or theoretically invalid. We proposed a new method SEQCHIP to integrate genotype and sequence data, which can be used with most existing rare-variant tests. Results: It is demonstrated using both simulated and real datasets that the SEQCHIP method has controlled type-I errors, and is substantially more powerful than all other currently available methods.",
author = "Liu, Dajiang and Leal, {Suzanne M.}",
year = "2012",
month = "7",
day = "1",
doi = "10.1093/bioinformatics/bts263",
language = "English (US)",
volume = "28",
pages = "1745--1751",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "13",

}

Seqchip: A powerful method to integrate sequence and genotype data for the detection of rare variant associations. / Liu, Dajiang; Leal, Suzanne M.

In: Bioinformatics, Vol. 28, No. 13, bts263, 01.07.2012, p. 1745-1751.

TY - JOUR

T1 - Seqchip

T2 - A powerful method to integrate sequence and genotype data for the detection of rare variant associations

AU - Liu, Dajiang

AU - Leal, Suzanne M.

PY - 2012/7/1

Y1 - 2012/7/1

AB - Motivation: Next-generation sequencing greatly increases the capacity to detect rare-variant complex-trait associations. However, it is still expensive to sequence a large number of samples and therefore often small datasets are used. Given cost constraints, a potentially more powerful two-step strategy is to sequence a subset of the sample to discover variants, and genotype the identified variants in the remaining sample. If only cases are sequenced, directly combining sequence and genotype data will lead to inflated type-I errors in rare-variant association analysis. Although several methods have been developed to correct for the bias, they are either underpowered or theoretically invalid. We proposed a new method SEQCHIP to integrate genotype and sequence data, which can be used with most existing rare-variant tests. Results: It is demonstrated using both simulated and real datasets that the SEQCHIP method has controlled type-I errors, and is substantially more powerful than all other currently available methods.

UR - http://www.scopus.com/inward/record.url?scp=84863995321&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863995321&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bts263

DO - 10.1093/bioinformatics/bts263

M3 - Article

VL - 28

SP - 1745

EP - 1751

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 13

M1 - bts263

ER -