TY - JOUR
T1 - Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach
AU - Dickins, Benjamin
AU - Rebolledo-Jaramillo, Boris
AU - Su, Marcia Shu Wei
AU - Paul, Ian M.
AU - Blankenberg, Daniel
AU - Stoler, Nicholas
AU - Makova, Kateryna D.
AU - Nekrutenko, Anton
PY - 2014/3
Y1 - 2014/3
N2 - Polymorphism discovery is a routine application of next-generation sequencing technology where multiple samples are sent to a service provider for library preparation, subsequent sequencing, and bioinformatic analyses. The decreasing cost and advances in multiplexing approaches have made it possible to analyze hundreds of samples at a reasonable cost. However, because of the manual steps involved in the initial processing of samples and handling of sequencing equipment, cross-contamination remains a significant challenge. It is especially problematic in cases where polymorphism frequencies do not adhere to diploid expectation, for example, heterogeneous tumor samples, organellar genomes, as well as during bacterial and viral sequencing. In these instances, low levels of contamination may be readily mistaken for polymorphisms, leading to false results. Here we describe practical steps designed to reliably detect contamination and uncover its origin, and also provide new, Galaxy-based, readily accessible computational tools and workflows for quality control. All results described in this report can be reproduced interactively on the web as described at http://usegalaxy.org/contamination.
UR - http://www.scopus.com/inward/record.url?scp=84896460187&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84896460187&partnerID=8YFLogxK
U2 - 10.2144/000114146
DO - 10.2144/000114146
M3 - Article
C2 - 24641477
AN - SCOPUS:84896460187
SN - 0736-6205
VL - 56
SP - 134
EP - 141
JO - BioTechniques
JF - BioTechniques
IS - 3
ER -