Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory

Getiria Onsongo, Jesse Erdmann, Michael D. Spears, John M. Chilton, Kenneth B. Beckman, Adam Hauge, Sophia Yohe, Matthew Schomaker, Matthew Bower, Kevin A.T. Silverstein, Bharat Thyagarajan

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Background: The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, but several challenges still limit the widespread adoption of NGS testing in clinical practice. One such challenge is developing a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive for most clinical diagnostics laboratories.

Findings: To address this challenge, our institution has developed a Galaxy-based data analysis pipeline that relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides the additional flexibility needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the use of an EBS disk to run a sample.

Conclusions: We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in the mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants have been confirmed using Sanger sequencing, further validating the software.

Original language: English (US)
Article number: 314
Journal: BMC Research Notes
Volume: 7
Issue number: 1
DOI: https://doi.org/10.1186/1756-0500-7-314
State: Published - May 23 2014

Fingerprint

Clinical laboratories
Molecular Pathology
Pipelines
Computational Biology
Costs and Cost Analysis
Galaxies
Bioinformatics
Cost Control
Costs
Software
Cloud computing
Mutation
Throughput
Testing

All Science Journal Classification (ASJC) codes

  • Medicine(all)
  • Biochemistry, Genetics and Molecular Biology(all)

Cite this

Onsongo, G., Erdmann, J., Spears, M. D., Chilton, J. M., Beckman, K. B., Hauge, A., ... Thyagarajan, B. (2014). Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory. BMC Research Notes, 7(1), [314]. https://doi.org/10.1186/1756-0500-7-314
Onsongo, Getiria ; Erdmann, Jesse ; Spears, Michael D. ; Chilton, John M. ; Beckman, Kenneth B. ; Hauge, Adam ; Yohe, Sophia ; Schomaker, Matthew ; Bower, Matthew ; Silverstein, Kevin A.T. ; Thyagarajan, Bharat. / Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory. In: BMC Research Notes. 2014 ; Vol. 7, No. 1.
@article{33b72ac640334ecdb66e019a4f255ed8,
title = "Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory",
abstract = "Background: The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, but several challenges still limit the widespread adoption of NGS testing in clinical practice. One such challenge is developing a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive for most clinical diagnostics laboratories. Findings: To address this challenge, our institution has developed a Galaxy-based data analysis pipeline that relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides the additional flexibility needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the use of an EBS disk to run a sample. Conclusions: We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100{\%} concordance in the mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants have been confirmed using Sanger sequencing, further validating the software.",
author = "Getiria Onsongo and Jesse Erdmann and Spears, {Michael D.} and Chilton, {John M.} and Beckman, {Kenneth B.} and Adam Hauge and Sophia Yohe and Matthew Schomaker and Matthew Bower and Silverstein, {Kevin A.T.} and Bharat Thyagarajan",
year = "2014",
month = "5",
day = "23",
doi = "10.1186/1756-0500-7-314",
language = "English (US)",
volume = "7",
journal = "BMC Research Notes",
issn = "1756-0500",
publisher = "BioMed Central",
number = "1",
}

Onsongo, G, Erdmann, J, Spears, MD, Chilton, JM, Beckman, KB, Hauge, A, Yohe, S, Schomaker, M, Bower, M, Silverstein, KAT & Thyagarajan, B 2014, 'Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory', BMC Research Notes, vol. 7, no. 1, 314. https://doi.org/10.1186/1756-0500-7-314

Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory. / Onsongo, Getiria; Erdmann, Jesse; Spears, Michael D.; Chilton, John M.; Beckman, Kenneth B.; Hauge, Adam; Yohe, Sophia; Schomaker, Matthew; Bower, Matthew; Silverstein, Kevin A.T.; Thyagarajan, Bharat.

In: BMC Research Notes, Vol. 7, No. 1, 314, 23.05.2014.

TY - JOUR

T1 - Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory

AU - Onsongo, Getiria

AU - Erdmann, Jesse

AU - Spears, Michael D.

AU - Chilton, John M.

AU - Beckman, Kenneth B.

AU - Hauge, Adam

AU - Yohe, Sophia

AU - Schomaker, Matthew

AU - Bower, Matthew

AU - Silverstein, Kevin A.T.

AU - Thyagarajan, Bharat

PY - 2014/5/23

Y1 - 2014/5/23

N2 - Background: The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, but several challenges still limit the widespread adoption of NGS testing in clinical practice. One such challenge is developing a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive for most clinical diagnostics laboratories. Findings: To address this challenge, our institution has developed a Galaxy-based data analysis pipeline that relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides the additional flexibility needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the use of an EBS disk to run a sample. Conclusions: We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in the mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants have been confirmed using Sanger sequencing, further validating the software.

UR - http://www.scopus.com/inward/record.url?scp=84901697346&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84901697346&partnerID=8YFLogxK

U2 - 10.1186/1756-0500-7-314

DO - 10.1186/1756-0500-7-314

M3 - Article

VL - 7

JO - BMC Research Notes

JF - BMC Research Notes

SN - 1756-0500

IS - 1

M1 - 314

ER -