PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data

Yasser El-Manzalawy, Elyse E. Munoz, Scott E. Lindner, Vasant Honavar

Research output: Contribution to journal › Article

Abstract

Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i.e., proteins mistakenly identified as surface exposed) as well as false negatives (i.e., SEPs not detected due to low expression or other technical limitations). We propose a framework called PlasmoSEP for the reliable identification of SEPs using a novel semisupervised learning algorithm that combines SEPs identified by high-throughput experiments and expert annotation of high-throughput data to augment labeled data for training a predictive model. Our experiments using high-throughput data from the Plasmodium falciparum surface-exposed proteome provide several novel high-confidence predictions of SEPs in P. falciparum and also confirm expert annotations for several others. Furthermore, PlasmoSEP predicts that 25 of 37 experimentally identified SEPs in Plasmodium yoelii salivary gland sporozoites are likely to be SEPs. Finally, PlasmoSEP predicts several novel SEPs in P. yoelii and Plasmodium vivax malaria parasites that can be validated for further vaccine studies. Our computational framework can be easily adapted to improve the interpretation of data from high-throughput studies.
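
The abstract names semisupervised self-training but gives no algorithmic detail, so the following is only a generic sketch of self-training, not PlasmoSEP's actual method. It assumes proteins are already encoded as numeric feature vectors, and it uses a nearest-centroid classifier with a hypothetical `margin` confidence threshold purely for illustration. Every name in it (`centroid`, `self_train`, `margin`, `max_rounds`) is an assumption, not something taken from the paper.

```python
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def self_train(positives, negatives, unlabeled, margin=0.5, max_rounds=10):
    """Generic self-training loop with a nearest-centroid classifier.

    In each round, an unlabeled point is pseudo-labeled only when it is
    closer to one class centroid than to the other by at least `margin`.
    Confident points join the training set; the rest stay in the pool.
    The loop stops when a round adds nothing or max_rounds is reached.
    """
    pos, neg, pool = list(positives), list(negatives), list(unlabeled)
    for _ in range(max_rounds):
        # Centroids are fixed for the duration of one round.
        c_pos, c_neg = centroid(pos), centroid(neg)
        remaining, added = [], 0
        for x in pool:
            gap = dist(x, c_neg) - dist(x, c_pos)  # > 0: closer to positives
            if gap >= margin:
                pos.append(x)
                added += 1
            elif gap <= -margin:
                neg.append(x)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break
    return pos, neg, pool


# Toy data (invented for illustration): two labeled points per class.
labeled_pos = [(0.9, 0.8), (0.8, 0.9)]
labeled_neg = [(0.1, 0.2), (0.2, 0.1)]
candidates = [(0.85, 0.95), (0.15, 0.05), (0.5, 0.5)]

pos, neg, pool = self_train(labeled_pos, labeled_neg, candidates)
print(len(pos), len(neg), pool)
```

On this toy input the two candidates lying near a class centroid are pseudo-labeled, while the ambiguous midpoint stays unlabeled. In a setting like the one the abstract describes, the initial positive set would presumably come from experimentally identified and expert-annotated SEPs, but how the paper actually builds and augments its labeled data is not stated here.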

Original language: English (US)
Pages (from-to): 2967-2976
Number of pages: 10
Journal: Proteomics
Volume: 16
Issue number: 23
DOI: 10.1002/pmic.201600249
State: Published - Dec 1 2016

Fingerprint

Malaria
Membrane Proteins
Parasites
Proteins
Throughput
Plasmodium yoelii
Proteome
Plasmodium falciparum
Vivax Malaria
Sporozoites
Subunit Vaccines
Salivary Glands
Vaccines
Learning algorithms
Experiments

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology

Cite this

@article{5b03c70644e540e18f5b978fa4c61c0c,
title = "PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data",
abstract = "Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i.e., proteins mistakenly identified as surface exposed) as well as false negatives (i.e., SEPs not detected due to low expression or other technical limitations). We propose a framework called PlasmoSEP for the reliable identification of SEPs using a novel semisupervised learning algorithm that combines SEPs identified by high-throughput experiments and expert annotation of high-throughput data to augment labeled data for training a predictive model. Our experiments using high-throughput data from the Plasmodium falciparum surface-exposed proteome provide several novel high-confidence predictions of SEPs in P. falciparum and also confirm expert annotations for several others. Furthermore, PlasmoSEP predicts that 25 of 37 experimentally identified SEPs in Plasmodium yoelii salivary gland sporozoites are likely to be SEPs. Finally, PlasmoSEP predicts several novel SEPs in P. yoelii and Plasmodium vivax malaria parasites that can be validated for further vaccine studies. Our computational framework can be easily adapted to improve the interpretation of data from high-throughput studies.",
author = "El-Manzalawy, Yasser and Munoz, {Elyse E.} and Lindner, {Scott E.} and Honavar, Vasant",
year = "2016",
month = "12",
day = "1",
doi = "10.1002/pmic.201600249",
language = "English (US)",
volume = "16",
pages = "2967--2976",
journal = "Proteomics",
issn = "1615-9853",
publisher = "Wiley-VCH Verlag",
number = "23"
}

PlasmoSEP: Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data. / El-Manzalawy, Yasser; Munoz, Elyse E.; Lindner, Scott E.; Honavar, Vasant.

In: Proteomics, Vol. 16, No. 23, 01.12.2016, p. 2967-2976.

TY - JOUR

T1 - PlasmoSEP

T2 - Predicting surface-exposed proteins on the malaria parasite using semisupervised self-training and expert-annotated data

AU - El-Manzalawy, Yasser

AU - Munoz, Elyse E.

AU - Lindner, Scott E.

AU - Honavar, Vasant

PY - 2016/12/1

Y1 - 2016/12/1

N2 - Accurate and comprehensive identification of surface-exposed proteins (SEPs) in parasites is a key step in developing novel subunit vaccines. However, the reliability of MS-based high-throughput methods for proteome-wide mapping of SEPs continues to be limited due to high rates of false positives (i.e., proteins mistakenly identified as surface exposed) as well as false negatives (i.e., SEPs not detected due to low expression or other technical limitations). We propose a framework called PlasmoSEP for the reliable identification of SEPs using a novel semisupervised learning algorithm that combines SEPs identified by high-throughput experiments and expert annotation of high-throughput data to augment labeled data for training a predictive model. Our experiments using high-throughput data from the Plasmodium falciparum surface-exposed proteome provide several novel high-confidence predictions of SEPs in P. falciparum and also confirm expert annotations for several others. Furthermore, PlasmoSEP predicts that 25 of 37 experimentally identified SEPs in Plasmodium yoelii salivary gland sporozoites are likely to be SEPs. Finally, PlasmoSEP predicts several novel SEPs in P. yoelii and Plasmodium vivax malaria parasites that can be validated for further vaccine studies. Our computational framework can be easily adapted to improve the interpretation of data from high-throughput studies.

UR - http://www.scopus.com/inward/record.url?scp=85001968740&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85001968740&partnerID=8YFLogxK

U2 - 10.1002/pmic.201600249

DO - 10.1002/pmic.201600249

M3 - Article

VL - 16

SP - 2967

EP - 2976

JO - Proteomics

JF - Proteomics

SN - 1615-9853

IS - 23

ER -