Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data

Zhenqiu Liu, Dechang Chen, Li Sheng, Amy Y. Liu

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

The amount of metagenomic data is growing rapidly, while computational methods for metagenome analysis are still in their infancy. It is important to develop novel statistical learning tools for predicting associations between bacterial communities and disease phenotypes and for detecting differentially abundant features. In this study, we present a novel statistical learning method for simultaneous association prediction and feature selection with metagenomic samples from two or more treatment populations on the basis of count data. We developed a linear-programming-based support vector machine with L1 and joint L1,∞ penalties for binary and multiclass classification with metagenomic count data (metalinprog). We evaluated the performance of our method on several real and simulated datasets. The proposed method can simultaneously identify features and predict classes from metagenomic count data.
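
To make the binary case concrete, here is a minimal sketch of a generic L1-penalized linear SVM written as a linear program. It is not the authors' metalinprog code, and it does not cover the joint L1,∞ multiclass penalty. The function names `fit_l1_svm_lp` and `predict`, the penalty weight `lam`, the toy count matrix, and the `log1p` transform of the counts are all assumptions made for illustration. The weight vector is split as w = u - v with u, v >= 0 so that the L1 norm becomes linear, and the hinge loss is expressed with slack variables xi.

```python
import numpy as np
from scipy.optimize import linprog


def fit_l1_svm_lp(X, y, lam=0.1):
    """Fit a linear SVM with an L1 penalty by solving one linear program.

    Minimizes lam * ||w||_1 + sum_i xi_i
    subject to y_i * (x_i . w + b) >= 1 - xi_i and xi_i >= 0.
    X is an (n, p) feature matrix, y holds labels in {-1, +1}.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Decision vector z = [u (p), v (p), b (1), xi (n)], with w = u - v.
    c = np.concatenate([lam * np.ones(2 * p), [0.0], np.ones(n)])

    # Margin constraints rewritten in the A_ub @ z <= b_ub form:
    # -y_i * x_i . u + y_i * x_i . v - y_i * b - xi_i <= -1
    yx = y[:, None] * X
    A_ub = np.hstack([-yx, yx, -y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)

    # u, v, xi are non-negative; the intercept b is unconstrained.
    bounds = [(0, None)] * (2 * p) + [(None, None)] + [(0, None)] * n

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    w = res.x[:p] - res.x[p:2 * p]
    b = res.x[2 * p]
    return w, b


def predict(X, w, b):
    """Return class labels in {-1, +1} from the sign of the decision value."""
    return np.where(np.asarray(X, dtype=float) @ w + b >= 0, 1, -1)


# Hypothetical toy counts: 6 samples x 3 features; only feature 0 differs by class.
X_counts = np.array([
    [50, 3, 7],
    [40, 5, 6],
    [60, 4, 8],
    [2, 4, 7],
    [1, 3, 6],
    [3, 5, 8],
])
y = np.array([1, 1, 1, -1, -1, -1])

# Assumed preprocessing: log(1 + count) to tame the scale of raw counts.
X_log = np.log1p(X_counts)
w, b = fit_l1_svm_lp(X_log, y, lam=0.1)
print("weights:", np.round(w, 4), "intercept:", round(float(b), 4))
print("predictions:", predict(X_log, w, b))
```

On this toy input the L1 penalty drives the weights of the two uninformative features to zero, which is the sense in which a single fit performs feature selection and class prediction together. Larger values of `lam` shrink more weights to zero at the cost of more margin violations.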

Original language: English (US)
Article number: e53253
Journal: PLoS One
Volume: 8
Issue number: 3
DOI: 10.1371/journal.pone.0053253
State: Published - Mar 26 2013

Fingerprint

Metagenomics
Feature extraction
Prediction
Computational methods
Linear programming
Support vector machines
Metagenome
Learning
Infancy
Methodology
Bacterial communities
Joints
Phenotype
Population

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

Liu, Zhenqiu ; Chen, Dechang ; Sheng, Li ; Liu, Amy Y. / Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data. In: PloS one. 2013 ; Vol. 8, No. 3.
@article{af24678ea9ed40598eba795e9dd9814b,
title = "Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data",
author = "Zhenqiu Liu and Dechang Chen and Li Sheng and Liu, {Amy Y.}",
year = "2013",
month = "3",
day = "26",
doi = "10.1371/journal.pone.0053253",
language = "English (US)",
volume = "8",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "3",

}

TY - JOUR

T1 - Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data

AU - Liu, Zhenqiu

AU - Chen, Dechang

AU - Sheng, Li

AU - Liu, Amy Y.

PY - 2013/3/26

Y1 - 2013/3/26

UR - http://www.scopus.com/inward/record.url?scp=84875440137&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84875440137&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0053253

DO - 10.1371/journal.pone.0053253

M3 - Article

C2 - 23555553

AN - SCOPUS:84875440137

VL - 8

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 3

M1 - e53253

ER -