High resolution clustering of Salmonella enterica serovar Montevideo strains using a next-generation sequencing approach

Marc W. Allard, Yan Luo, Errol Strain, Cong Li, Christine E. Keys, Insook Son, Robert Stones, Steven M. Musser, Eric Wayne Brown

Research output: Contribution to journal, Article

89 Citations (Scopus)

Abstract

Background: Next-Generation Sequencing (NGS) is increasingly being used as a molecular epidemiologic tool for discerning ancestry and traceback of the most complicated, difficult-to-resolve bacterial pathogens. Making a linkage between possible food sources and clinical isolates requires distinguishing the suspected pathogen from an environmental background and placing the variation observed into the wider context of variation occurring within a serovar and among other closely related foodborne pathogens. Equally important is the need to validate these high resolution molecular tools for use in molecular epidemiologic traceback. Such efforts include the examination of strain cluster stability as well as the cumulative genetic effects of sub-culturing on these clusters. Numerous isolates of S. Montevideo were shotgun sequenced, including diverse lineage representatives as well as numerous replicate clones, to determine how much variability is due to bias, sequencing error, and/or the culturing of isolates. All new draft genomes were compared to 34 S. Montevideo isolates previously published during an NGS-based molecular epidemiological case study.

Results: Intraserovar lineages of S. Montevideo differ by thousands of SNPs, only slightly fewer than the number of SNPs observed between S. Montevideo and other distinct serovars. Much less variability was discovered within an individual S. Montevideo clade implicated in a recent foodborne outbreak as well as among individual NGS replicates. These findings were similar to previous reports documenting homopolymeric and deletion error rates with the Roche 454 GS Titanium technology. In no case, however, did variability associated with sequencing methods or sample preparations create inconsistencies with our current phylogenetic results or the subsequent molecular epidemiological evidence gleaned from these data.

Conclusions: Implementation of a validated pipeline for NGS data acquisition and analysis provides highly reproducible results that are stable and predictable for molecular epidemiological applications. When draft genomes are collected at 15×-20× coverage and passed through a quality filter as part of a data analysis pipeline, including sub-passaged replicates defined by a few SNPs, they can be accurately placed in a phylogenetic context. This reproducibility applies to all levels within and between serovars of Salmonella, suggesting that investigators using these methods can have confidence in their conclusions.

Original language: English (US)
Article number: 32
Journal: BMC Genomics
Volume: 13
Issue number: 1
DOI: https://doi.org/10.1186/1471-2164-13-32
State: Published - Jan 19 2012

Fingerprint

Salmonella enterica
Single Nucleotide Polymorphism
Cluster Analysis
Genome
Titanium
Salmonella
Disease Outbreaks
Epidemiologic Studies
Clone Cells
Research Personnel
Technology
Food
Serogroup

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Genetics

Cite this

Allard, Marc W. ; Luo, Yan ; Strain, Errol ; Li, Cong ; Keys, Christine E. ; Son, Insook ; Stones, Robert ; Musser, Steven M. ; Brown, Eric Wayne. / High resolution clustering of Salmonella enterica serovar Montevideo strains using a next-generation sequencing approach. In: BMC genomics. 2012 ; Vol. 13, No. 1.
@article{3cc669fb9d2b4cf59cf86db2055ec020,
title = "High resolution clustering of Salmonella enterica serovar Montevideo strains using a next-generation sequencing approach",
author = "Allard, {Marc W.} and Yan Luo and Errol Strain and Cong Li and Keys, {Christine E.} and Insook Son and Robert Stones and Musser, {Steven M.} and Brown, {Eric Wayne}",
year = "2012",
month = "1",
day = "19",
doi = "10.1186/1471-2164-13-32",
language = "English (US)",
volume = "13",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "1",

}

Allard, MW, Luo, Y, Strain, E, Li, C, Keys, CE, Son, I, Stones, R, Musser, SM & Brown, EW 2012, 'High resolution clustering of Salmonella enterica serovar Montevideo strains using a next-generation sequencing approach', BMC genomics, vol. 13, no. 1, 32. https://doi.org/10.1186/1471-2164-13-32

TY - JOUR
T1 - High resolution clustering of Salmonella enterica serovar Montevideo strains using a next-generation sequencing approach
AU - Allard, Marc W.
AU - Luo, Yan
AU - Strain, Errol
AU - Li, Cong
AU - Keys, Christine E.
AU - Son, Insook
AU - Stones, Robert
AU - Musser, Steven M.
AU - Brown, Eric Wayne
PY - 2012/1/19
Y1 - 2012/1/19
UR - http://www.scopus.com/inward/record.url?scp=84862792042&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862792042&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-13-32
DO - 10.1186/1471-2164-13-32
M3 - Article
C2 - 22260654
AN - SCOPUS:84862792042
VL - 13
JO - BMC Genomics
JF - BMC Genomics
SN - 1471-2164
IS - 1
M1 - 32
ER -