Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling

Cornelia Caragea, Jivko Sinapov, Drena Dobbs, Vasant Honavar

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Background: Identification of functionally important sites in biomolecular sequences has broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks. Experimental determination of such sites lags far behind the number of known biomolecular sequences. Hence, there is a need to develop reliable computational methods for identifying functionally important sites from biomolecular sequences.

Results: We present a mixture of experts approach to biomolecular sequence labeling that takes into account the global similarity between biomolecular sequences. Our approach combines unsupervised and supervised learning techniques. Given a set of sequences and a similarity measure defined on pairs of sequences, we learn a mixture of experts model by using spectral clustering to learn the hierarchical structure of the model and by using Bayesian techniques to combine the predictions of the experts. We evaluate our approach on two biomolecular sequence labeling problems: RNA-protein and DNA-protein interface prediction. The results of our experiments show that global sequence similarity can be exploited to improve the performance of classifiers trained to label biomolecular sequence data.

Conclusion: The mixture of experts model helps improve the performance of machine learning methods for identifying functionally important sites in biomolecular sequences.
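The combination step described in the Results can be illustrated with a small sketch. It is not the authors' implementation, and it simplifies heavily: the model is flat rather than hierarchical, the clusters are assigned by hand instead of by spectral clustering, a k-mer cosine similarity stands in for the global similarity measure, and similarity-normalised weights stand in for the Bayesian combination. All function names, sequences, and probabilities below are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=2):
    """Count overlapping k-mers; a crude stand-in for a global similarity measure."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count profiles."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def gating_weights(seq, clusters, k=2):
    """Weight each cluster by the query's mean similarity to its members (weights sum to 1)."""
    query = kmer_profile(seq, k)
    scores = [
        sum(cosine_similarity(query, kmer_profile(m, k)) for m in members) / len(members)
        for members in clusters
    ]
    total = sum(scores)
    if total == 0:
        # No similarity to any cluster: fall back to uniform weights.
        return [1.0 / len(clusters)] * len(clusters)
    return [s / total for s in scores]


def combine_experts(weights, expert_probs):
    """Per-residue mixture: p(y=1 | x) = sum_k w_k * p_k(y=1 | x)."""
    length = len(expert_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, expert_probs))
        for i in range(length)
    ]


# Hypothetical clusters of training sequences, one expert per cluster.
clusters = [["MKKLAV", "MKKLLV"], ["GSGSGS", "GSGTGS"]]
query = "MKKLAI"
weights = gating_weights(query, clusters)

# Hypothetical per-residue interface probabilities from each expert for the query.
expert_probs = [
    [0.9, 0.8, 0.7, 0.2, 0.1, 0.1],
    [0.3, 0.3, 0.3, 0.3, 0.3, 0.3],
]
combined = combine_experts(weights, expert_probs)
print(weights, combined)
```

In this sketch the expert trained on the cluster most similar to the query sequence dominates the combined per-residue prediction, which is the intuition behind exploiting global sequence similarity.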

Original language: English (US)
Article number: S4
Journal: BMC Bioinformatics
Volume: 10
Issue number: SUPPL. 4
DOI: 10.1186/1471-2105-10-S4-S4
State: Published - Apr 29 2009

Fingerprint

Mixture of Experts
Labeling
Drug Design
Proteins
Cluster Analysis
Signal Transduction
Unsupervised learning
Supervised learning
Learning
RNA
Computational methods
Learning systems
Labels
DNA
Classifiers
Model
Pharmaceutical Preparations
Similarity
Experiments

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Caragea, Cornelia ; Sinapov, Jivko ; Dobbs, Drena ; Honavar, Vasant. / Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling. In: BMC bioinformatics. 2009 ; Vol. 10, No. SUPPL. 4.
@article{5a1674c0b63346ee9ac9d5effae514e1,
title = "Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling",
author = "Cornelia Caragea and Jivko Sinapov and Drena Dobbs and Vasant Honavar",
year = "2009",
month = "4",
day = "29",
doi = "10.1186/1471-2105-10-S4-S4",
language = "English (US)",
volume = "10",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "SUPPL. 4",

}

Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling. / Caragea, Cornelia; Sinapov, Jivko; Dobbs, Drena; Honavar, Vasant.

In: BMC bioinformatics, Vol. 10, No. SUPPL. 4, S4, 29.04.2009.

TY - JOUR

T1 - Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling

AU - Caragea, Cornelia

AU - Sinapov, Jivko

AU - Dobbs, Drena

AU - Honavar, Vasant

PY - 2009/4/29

Y1 - 2009/4/29

UR - http://www.scopus.com/inward/record.url?scp=65449119563&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=65449119563&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-10-S4-S4

DO - 10.1186/1471-2105-10-S4-S4

M3 - Article

C2 - 19426452

AN - SCOPUS:65449119563

VL - 10

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - SUPPL. 4

M1 - S4

ER -