Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach

Benjamin Dickins, Boris Rebolledo-Jaramillo, Marcia Shu Wei Su, Ian M. Paul, Daniel Blankenberg, Nicholas Stoler, Kateryna D. Makova, Anton Nekrutenko

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

Polymorphism discovery is a routine application of next-generation sequencing technology where multiple samples are sent to a service provider for library preparation, subsequent sequencing, and bioinformatic analyses. The decreasing cost and advances in multiplexing approaches have made it possible to analyze hundreds of samples at a reasonable cost. However, because of the manual steps involved in the initial processing of samples and handling of sequencing equipment, cross-contamination remains a significant challenge. It is especially problematic in cases where polymorphism frequencies do not adhere to diploid expectation, for example, heterogeneous tumor samples, organellar genomes, as well as during bacterial and viral sequencing. In these instances, low levels of contamination may be readily mistaken for polymorphisms, leading to false results. Here we describe practical steps designed to reliably detect contamination and uncover its origin, and also provide new, Galaxy-based, readily accessible computational tools and workflows for quality control. All results described in this report can be reproduced interactively on the web as described at http://usegalaxy.org/contamination.
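
The difficulty described in the abstract can be shown with a small numerical sketch. It is an illustration only, not the authors' Galaxy workflow: the function names, genome positions, bases, and sample names below are all hypothetical. The first function shows that a sample contaminated at 2% yields the same allele frequency at a site as a genuine 2% polymorphism. The second shows one simple way to look for the origin of contamination, assuming that a sample's minor alleles can be compared with the consensus bases of other samples from the same run.

```python
# Illustrative sketch only: names, data, and thresholds are hypothetical and
# are not taken from the paper or its Galaxy workflows.

def mixed_frequency(sample_freq, contaminant_freq, contamination_fraction):
    """Expected alternate-allele frequency after reads from two sources mix."""
    if not 0.0 <= contamination_fraction <= 1.0:
        raise ValueError("contamination_fraction must be in [0, 1]")
    return ((1.0 - contamination_fraction) * sample_freq
            + contamination_fraction * contaminant_freq)


def minor_allele_match(minor_alleles, candidate_consensus):
    """Fraction of a sample's minor alleles that equal a candidate source's
    consensus base at the same positions (positions missing from the
    candidate are ignored)."""
    shared = [pos for pos in minor_alleles if pos in candidate_consensus]
    if not shared:
        return 0.0
    hits = sum(minor_alleles[pos] == candidate_consensus[pos] for pos in shared)
    return hits / len(shared)


# A sample fixed for the reference allele (frequency 0.0), contaminated at 2%
# by a source fixed for the alternate allele (frequency 1.0) ...
contaminated = mixed_frequency(0.0, 1.0, 0.02)
# ... shows the same frequency as a genuine 2% polymorphism with no contamination.
genuine = mixed_frequency(0.02, 0.0, 0.0)

# Hypothetical minor alleles seen in one sample, keyed by genome position.
minor = {152: "C", 3010: "A", 16519: "C"}
# Hypothetical consensus bases of two other samples from the same run.
candidates = {
    "sample_B": {152: "C", 3010: "A", 16519: "C"},
    "sample_C": {152: "T", 3010: "G", 16519: "C"},
}
scores = {name: minor_allele_match(minor, cons) for name, cons in candidates.items()}
likely_source = max(scores, key=scores.get)
```

In this toy data, every minor allele in the sample matches the consensus of `sample_B`. A match that holds across many sites is the kind of pattern that suggests contamination from that sample and not a set of independent polymorphisms. The published workflows at the URL above remain the reference for the actual procedure.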

Original language: English (US)
Pages (from-to): 134-141
Number of pages: 8
Journal: BioTechniques
Volume: 56
Issue number: 3
DOI: 10.2144/000114146
State: Published - Mar 2014

Fingerprint

Equipment Contamination
Contamination
Galaxies
Library Services
Polymorphism
Costs and Cost Analysis
Workflow
Computational Biology
Diploidy
Quality Control
Genome
Technology
Bioinformatics
Multiplexing
Costs
Tumors
Neoplasms
Genes
Processing

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Biochemistry, Genetics and Molecular Biology(all)

Cite this

Dickins, Benjamin ; Rebolledo-Jaramillo, Boris ; Su, Marcia Shu Wei ; Paul, Ian M. ; Blankenberg, Daniel ; Stoler, Nicholas ; Makova, Kateryna D. ; Nekrutenko, Anton. / Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach. In: BioTechniques. 2014 ; Vol. 56, No. 3. pp. 134-141.
@article{61a1551c7e964d90ba5711aca99c6182,
title = "Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach",
author = "Benjamin Dickins and Boris Rebolledo-Jaramillo and Su, {Marcia Shu Wei} and Paul, {Ian M.} and Daniel Blankenberg and Nicholas Stoler and Makova, {Kateryna D.} and Anton Nekrutenko",
year = "2014",
month = "3",
doi = "10.2144/000114146",
language = "English (US)",
volume = "56",
pages = "134--141",
journal = "BioTechniques",
issn = "0736-6205",
publisher = "Eaton Publishing Company",
number = "3",

}

Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach. / Dickins, Benjamin; Rebolledo-Jaramillo, Boris; Su, Marcia Shu Wei; Paul, Ian M.; Blankenberg, Daniel; Stoler, Nicholas; Makova, Kateryna D.; Nekrutenko, Anton.

In: BioTechniques, Vol. 56, No. 3, 03.2014, p. 134-141.

TY  - JOUR
T1  - Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach
AU  - Dickins, Benjamin
AU  - Rebolledo-Jaramillo, Boris
AU  - Su, Marcia Shu Wei
AU  - Paul, Ian M.
AU  - Blankenberg, Daniel
AU  - Stoler, Nicholas
AU  - Makova, Kateryna D.
AU  - Nekrutenko, Anton
PY  - 2014/3
Y1  - 2014/3
AB  - Polymorphism discovery is a routine application of next-generation sequencing technology where multiple samples are sent to a service provider for library preparation, subsequent sequencing, and bioinformatic analyses. The decreasing cost and advances in multiplexing approaches have made it possible to analyze hundreds of samples at a reasonable cost. However, because of the manual steps involved in the initial processing of samples and handling of sequencing equipment, cross-contamination remains a significant challenge. It is especially problematic in cases where polymorphism frequencies do not adhere to diploid expectation, for example, heterogeneous tumor samples, organellar genomes, as well as during bacterial and viral sequencing. In these instances, low levels of contamination may be readily mistaken for polymorphisms, leading to false results. Here we describe practical steps designed to reliably detect contamination and uncover its origin, and also provide new, Galaxy-based, readily accessible computational tools and workflows for quality control. All results described in this report can be reproduced interactively on the web as described at http://usegalaxy.org/contamination.
UR  - http://www.scopus.com/inward/record.url?scp=84896460187&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84896460187&partnerID=8YFLogxK
U2  - 10.2144/000114146
DO  - 10.2144/000114146
M3  - Article
C2  - 24641477
AN  - SCOPUS:84896460187
VL  - 56
SP  - 134
EP  - 141
JO  - BioTechniques
JF  - BioTechniques
SN  - 0736-6205
IS  - 3
ER  -