Constructing metabolic association networks using high-dimensional mass spectrometry data

Imhoi Koo, Xiaoli Wei, Xue Shi, Zhanxiang Zhou, Seongho Kim, Xiang Zhang

Research output: Contribution to journal (Article)

Abstract

The goal of metabolic association networks is to identify the topology of a metabolic network for a better understanding of molecular mechanisms. An accurate metabolic association network enables investigation of the functional behavior of metabolites in a cell or tissue. Gaussian graphical model (GGM)-based methods have been widely used in genomics to infer biological networks. However, the performance of the various GGM-based methods in constructing metabolic association networks remains unknown in metabolomics. This study compared the performance of principal component regression (PCR), independent component regression (ICR), shrinkage covariance estimate (SCE), partial least squares regression (PLSR), and extrinsic similarity (ES) methods in constructing metabolic association networks by estimating partial correlation coefficient matrices when the number of variables is larger than the sample size. To do this, the sample size and the network density (complexity) were treated as variables for network construction. Simulation studies show that, in terms of F1 scores, PCR and ICR are more stable with respect to the sample size and the network density than SCE and PLSR. These methods were further applied to the analysis of experimental metabolomics data acquired from a metabolite extract of mouse liver. For the simulated data, the proposed methods PCR and ICR outperform the other methods when the network density is large, while PLSR and SCE perform better when the network density is small. For the experimental metabolomics data, PCR and ICR discover more significant edges and perform better than PLSR and SCE when the discovered edges are evaluated against KEGG pathways. These results suggest that the metabolic network might be more complex and that PCR and ICR therefore have an advantage over PLSR and SCE in constructing metabolic association networks.
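
The workflow summarized in the abstract (estimate a partial correlation matrix when variables outnumber samples, call edges, score them with F1) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed shrinkage intensity `lam`, and the edge `threshold` are all hypothetical choices made for illustration, and the simple diagonal-target shrinkage shown here only stands in for the general idea behind a shrinkage covariance estimate.

```python
import numpy as np


def shrinkage_covariance(X, lam=0.2):
    """Shrink the sample covariance toward its own diagonal.

    lam is a fixed, hypothetical shrinkage intensity (assumption); any
    lam > 0 makes the estimate invertible even when variables outnumber
    samples, provided every variable has nonzero variance.
    """
    S = np.cov(X, rowvar=False)          # rows = samples, columns = variables
    target = np.diag(np.diag(S))         # diagonal shrinkage target
    return (1.0 - lam) * S + lam * target


def partial_correlations(cov):
    """Partial correlations from the precision matrix W = inv(cov):
    rho_ij = -W_ij / sqrt(W_ii * W_jj)."""
    omega = np.linalg.inv(cov)
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def edge_f1(pcor, true_adj, threshold=0.1):
    """F1 score of edges called by |partial correlation| > threshold,
    compared with a known adjacency matrix (upper triangle only)."""
    iu = np.triu_indices_from(pcor, k=1)
    pred = np.abs(pcor[iu]) > threshold
    truth = np.asarray(true_adj)[iu].astype(bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


# Usage with synthetic data: 20 samples, 50 variables (more variables than samples).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))
pcor = partial_correlations(shrinkage_covariance(X))
```

In a simulation study of the kind described, `true_adj` would be the adjacency matrix of the network used to generate the data, and `edge_f1` would be evaluated across different sample sizes and network densities.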

Original language: English (US)
Pages (from-to): 193-202
Number of pages: 10
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 138
DOI: 10.1016/j.chemolab.2014.07.002
State: Published - Jul 9 2014

Fingerprint

Mass spectrometry
Metabolites
Liver
Topology
Tissue
Metabolomics
Metabolic Networks and Pathways

All Science Journal Classification (ASJC) codes

  • Analytical Chemistry
  • Software
  • Computer Science Applications
  • Spectroscopy
  • Process Chemistry and Technology

Cite this

Koo, Imhoi ; Wei, Xiaoli ; Shi, Xue ; Zhou, Zhanxiang ; Kim, Seongho ; Zhang, Xiang. / Constructing metabolic association networks using high-dimensional mass spectrometry data. In: Chemometrics and Intelligent Laboratory Systems. 2014 ; Vol. 138. pp. 193-202.
@article{6f852acd263147b5b59ecda7332211b1,
title = "Constructing metabolic association networks using high-dimensional mass spectrometry data",
author = "Imhoi Koo and Xiaoli Wei and Xue Shi and Zhanxiang Zhou and Seongho Kim and Xiang Zhang",
year = "2014",
month = "7",
day = "9",
doi = "10.1016/j.chemolab.2014.07.002",
language = "English (US)",
volume = "138",
pages = "193--202",
journal = "Chemometrics and Intelligent Laboratory Systems",
issn = "0169-7439",
publisher = "Elsevier",

}

