Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry

Seongho Kim, Imhoi Koo, Aiqin Fang, Xiang Zhang

Research output: Contribution to journal › Article

20 Citations (Scopus)

Abstract

Background: Comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC × GC-MS) is a powerful technique that has gained increasing attention over the last two decades. GC × GC-MS provides greatly increased separation capacity, chemical selectivity, and sensitivity for complex sample analysis, and yields more accurate information about compound retention times and mass spectra. Despite these advantages, the retention times of the resolved peaks on the two-dimensional gas chromatographic columns are always shifted by experimental variation, which complicates data processing for metabolomics analysis. The retention time variation must therefore be corrected before multiple metabolic profiles obtained under different conditions can be compared.

Results: We developed novel peak alignment algorithms for both homogeneous data (acquired under identical experimental conditions) and heterogeneous data (acquired under different experimental conditions) from GC × GC-MS, using modified Smith-Waterman local alignment algorithms together with mass spectral similarity. Unlike algorithms reported in the literature, the proposed algorithms require neither the detection of landmark peaks nor a retention time transformation. In addition, an automated peak alignment software package was built by implementing a likelihood function for optimal peak alignment.

Conclusions: The proposed Smith-Waterman (SW) local alignment-based algorithms can align both homogeneous and heterogeneous data from multiple GC × GC-MS experiments without transforming retention times or selecting landmark peaks. An optimal version of the SW-based algorithms was also established, based on the associated likelihood function, for automatic peak alignment. In analyses of experimental data from a mixture of compound standards and from a metabolite extract of mouse plasma with spiked-in compound standards, the proposed algorithms outperformed the literature-reported alignment method.
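
To make the core idea in the abstract concrete, the following is a minimal sketch of a textbook Smith-Waterman local alignment applied to two peak lists, with mass spectral cosine similarity as the match score. It is not the authors' modified algorithm and does not include their likelihood function. The function names, the similarity threshold of 0.8, the linear gap penalty of 0.5, and the simplification that each peak list is a single sequence ordered by retention time are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def smith_waterman_peaks(spectra_a, spectra_b, threshold=0.8, gap=0.5):
    """Locally align two peak lists (hypothetical helper, not from the paper).

    Each list holds one mass spectrum (intensity vector) per peak, assumed
    to be ordered by retention time. The match score is the cosine
    similarity minus `threshold`, so similar spectra score positive and
    dissimilar spectra score negative. `gap` is a linear gap penalty.
    Returns (best_score, [(index_in_a, index_in_b), ...]).
    """
    n, m = len(spectra_a), len(spectra_b)
    score = [[cosine_similarity(spectra_a[i], spectra_b[j]) - threshold
              for j in range(m)] for i in range(n)]

    # Fill the dynamic-programming matrix; cells are floored at zero,
    # which is what makes the alignment local rather than global.
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0.0,
                H[i - 1][j - 1] + score[i - 1][j - 1],  # pair peak i with peak j
                H[i - 1][j] - gap,                      # peak i left unpaired
                H[i][j - 1] - gap,                      # peak j left unpaired
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = score[i - 1][j - 1]
        if H[i][j] == H[i - 1][j - 1] + s:
            if s > 0:  # report only pairs whose spectra are actually similar
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


# Toy data: the second and third peaks of run A reappear as the first
# and second peaks of run B, as if the peak table were shifted by one.
run_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
run_b = [[0, 1, 0], [0, 0, 1], [1, 1, 1]]
best_score, matched = smith_waterman_peaks(run_a, run_b)
print(matched)  # [(1, 0), (2, 1)]
```

Because the score matrix is floored at zero, unmatched peaks at either end of a run do not penalize the aligned region, which is why no landmark peaks are needed in this toy setting. A real GC × GC-MS peak table carries two retention times per peak, which this one-dimensional sketch ignores.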

Original language: English (US)
Article number: 235
Journal: BMC Bioinformatics
Volume: 12
DOI: 10.1186/1471-2105-12-235
State: Published - Jun 15 2011

Fingerprint

Gas Chromatography
Mass Spectrometry
Gas chromatography
Gas Chromatography-Mass Spectrometry
Mass spectrometry
Alignment
Likelihood Functions
Landmarks
Likelihood Function
Metabolomics
Metabolome
Metabolites
Selectivity
Software Package
Software packages
Software
Experiment
Gases
Mouse
Plasma

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
  • Structural Biology

Cite this

@article{6f811f7022c348a180e7bc5736c8d311,
title = "Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry",
abstract = "Background: Comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC × GC-MS) is a powerful technique which has gained increasing attention over the last two decades. The GC × GC-MS provides much increased separation capacity, chemical selectivity and sensitivity for complex sample analysis and brings more accurate information about compound retention times and mass spectra. Despite these advantages, the retention times of the resolved peaks on the two-dimensional gas chromatographic columns are always shifted due to experimental variations, introducing difficulty in the data processing for metabolomics analysis. Therefore, the retention time variation must be adjusted in order to compare multiple metabolic profiles obtained from different conditions. Results: We developed novel peak alignment algorithms for both homogeneous (acquired under the identical experimental conditions) and heterogeneous (acquired under the different experimental conditions) GC × GC-MS data using modified Smith-Waterman local alignment algorithms along with mass spectral similarity. Compared with literature reported algorithms, the proposed algorithms eliminated the detection of landmark peaks and the usage of retention time transformation. Furthermore, an automated peak alignment software package was established by implementing a likelihood function for optimal peak alignment. Conclusions: The proposed Smith-Waterman local alignment-based algorithms are capable of aligning both the homogeneous and heterogeneous data of multiple GC × GC-MS experiments without the transformation of retention times and the selection of landmark peaks. An optimal version of the SW-based algorithms was also established based on the associated likelihood function for the automatic peak alignment. The proposed alignment algorithms outperform the literature reported alignment method by analyzing the experiment data of a mixture of compound standards and a metabolite extract of mouse plasma with spiked-in compound standards.",
author = "Seongho Kim and Imhoi Koo and Aiqin Fang and Xiang Zhang",
year = "2011",
month = "6",
day = "15",
doi = "10.1186/1471-2105-12-235",
language = "English (US)",
volume = "12",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry. / Kim, Seongho; Koo, Imhoi; Fang, Aiqin; Zhang, Xiang.

In: BMC Bioinformatics, Vol. 12, 235, 15.06.2011.

TY - JOUR

T1 - Smith-Waterman peak alignment for comprehensive two-dimensional gas chromatography-mass spectrometry

AU - Kim, Seongho

AU - Koo, Imhoi

AU - Fang, Aiqin

AU - Zhang, Xiang

PY - 2011/6/15

Y1 - 2011/6/15

AB - Background: Comprehensive two-dimensional gas chromatography coupled with mass spectrometry (GC × GC-MS) is a powerful technique which has gained increasing attention over the last two decades. The GC × GC-MS provides much increased separation capacity, chemical selectivity and sensitivity for complex sample analysis and brings more accurate information about compound retention times and mass spectra. Despite these advantages, the retention times of the resolved peaks on the two-dimensional gas chromatographic columns are always shifted due to experimental variations, introducing difficulty in the data processing for metabolomics analysis. Therefore, the retention time variation must be adjusted in order to compare multiple metabolic profiles obtained from different conditions. Results: We developed novel peak alignment algorithms for both homogeneous (acquired under the identical experimental conditions) and heterogeneous (acquired under the different experimental conditions) GC × GC-MS data using modified Smith-Waterman local alignment algorithms along with mass spectral similarity. Compared with literature reported algorithms, the proposed algorithms eliminated the detection of landmark peaks and the usage of retention time transformation. Furthermore, an automated peak alignment software package was established by implementing a likelihood function for optimal peak alignment. Conclusions: The proposed Smith-Waterman local alignment-based algorithms are capable of aligning both the homogeneous and heterogeneous data of multiple GC × GC-MS experiments without the transformation of retention times and the selection of landmark peaks. An optimal version of the SW-based algorithms was also established based on the associated likelihood function for the automatic peak alignment. The proposed alignment algorithms outperform the literature reported alignment method by analyzing the experiment data of a mixture of compound standards and a metabolite extract of mouse plasma with spiked-in compound standards.

UR - http://www.scopus.com/inward/record.url?scp=79958771050&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79958771050&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-12-235

DO - 10.1186/1471-2105-12-235

M3 - Article

C2 - 21676240

AN - SCOPUS:79958771050

VL - 12

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 235

ER -