Improving the power of structural variation detection by augmenting the reference

Jan Schröder, Santhosh Girirajan, Anthony T. Papenfuss, Paul Medvedev

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

The uses of the Genome Reference Consortium's human reference sequence can be roughly categorized into three related but distinct categories: as a representative species genome, as a coordinate system for identifying variants, and as an alignment reference for variation detection algorithms. However, the use of this reference sequence as simultaneously a representative species genome and as an alignment reference leads to unnecessary artifacts for structural variation detection algorithms and limits their accuracy. We show how decoupling these two references and developing a separate alignment reference can significantly improve the accuracy of structural variation detection, lead to improved genotyping of disease-related genes, and decrease the cost of studying polymorphism in a population.

Original language: English (US)
Article number: e0136771
Journal: PloS one
Volume: 10
Issue number: 8
DOI: 10.1371/journal.pone.0136771
State: Published - Aug 31 2015

Fingerprint

Genes
Genome
genome
Artifacts
genotyping
Limit of Detection
Costs and Cost Analysis
Population
genes
Costs

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)
  • General

Cite this

Schröder, Jan ; Girirajan, Santhosh ; Papenfuss, Anthony T. ; Medvedev, Paul. / Improving the power of structural variation detection by augmenting the reference. In: PloS one. 2015 ; Vol. 10, No. 8.
@article{00500b3271cf4cd985e8632e44270581,
title = "Improving the power of structural variation detection by augmenting the reference",
abstract = "The uses of the Genome Reference Consortium's human reference sequence can be roughly categorized into three related but distinct categories: as a representative species genome, as a coordinate system for identifying variants, and as an alignment reference for variation detection algorithms. However, the use of this reference sequence as simultaneously a representative species genome and as an alignment reference leads to unnecessary artifacts for structural variation detection algorithms and limits their accuracy. We show how decoupling these two references and developing a separate alignment reference can significantly improve the accuracy of structural variation detection, lead to improved genotyping of disease-related genes, and decrease the cost of studying polymorphism in a population.",
author = "Jan Schr{\"o}der and Santhosh Girirajan and Papenfuss, {Anthony T.} and Paul Medvedev",
year = "2015",
month = "8",
day = "31",
doi = "10.1371/journal.pone.0136771",
language = "English (US)",
volume = "10",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "8",

}

Improving the power of structural variation detection by augmenting the reference. / Schröder, Jan; Girirajan, Santhosh; Papenfuss, Anthony T.; Medvedev, Paul.

In: PloS one, Vol. 10, No. 8, e0136771, 31.08.2015.

TY - JOUR

T1 - Improving the power of structural variation detection by augmenting the reference

AU - Schröder, Jan

AU - Girirajan, Santhosh

AU - Papenfuss, Anthony T.

AU - Medvedev, Paul

PY - 2015/8/31

Y1 - 2015/8/31

AB - The uses of the Genome Reference Consortium's human reference sequence can be roughly categorized into three related but distinct categories: as a representative species genome, as a coordinate system for identifying variants, and as an alignment reference for variation detection algorithms. However, the use of this reference sequence as simultaneously a representative species genome and as an alignment reference leads to unnecessary artifacts for structural variation detection algorithms and limits their accuracy. We show how decoupling these two references and developing a separate alignment reference can significantly improve the accuracy of structural variation detection, lead to improved genotyping of disease-related genes, and decrease the cost of studying polymorphism in a population.

UR - http://www.scopus.com/inward/record.url?scp=84943339647&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84943339647&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0136771

DO - 10.1371/journal.pone.0136771

M3 - Article

C2 - 26322511

AN - SCOPUS:84943339647

VL - 10

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 8

M1 - e0136771

ER -