Gene Function Prediction and Functional Network

The Role of Gene Ontology

Erliang Zeng, Chris Ding, Kalai Mathee, Lisa Schneper, Giri Narasimhan

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Almost every cellular process requires the interaction of pairs or larger complexes of proteins. The organization of genes into networks has played an important role in characterizing the functions of individual genes and the interplay between various cellular processes. The Gene Ontology (GO) project has integrated information from multiple data sources to annotate genes to specific biological processes. Recently, the semantic similarity (SS) between GO terms has been investigated and used to derive semantic similarity between genes. Such semantic similarity provides a new perspective for predicting protein functions and generating functional gene networks. In this chapter, we investigate the semantic similarity between genes and its applications. We propose a novel method to evaluate the support for protein-protein interaction (PPI) data based on GO information. When the semantic similarity between genes is computed from GO information using Resnik's formula, our results show that the PPI data can be modeled as a mixture model predicated on the assumption that true protein-protein interactions have higher support than the false positives in the data. Semantic similarity between genes thus serves as a metric of support for PPI data. Taking this one step further, we propose new function prediction approaches that use this support metric; these approaches outperform their conventional counterparts. New evaluation methods are also proposed. In another application, we present a novel approach to automatically generate a functional network of yeast genes using GO annotations. A semantic similarity (SS) score is calculated between pairs of genes and is then used to predict linkages between genes, generating a functional network. Functional networks predicted by SS and by other methods are compared. The network predicted by SS scores outperforms those generated by other methods in the following respects: automatic removal of functional bias in network training reference sets, improved precision and recall across the network, and higher correlation between a gene's lethality and its centrality in the network. We illustrate that the resulting network can be used to generate coherent functional modules and their associations. We conclude that semantic similarity between genes determined from GO information can be used to generate a functional network of yeast genes that is comparable to, or better than, networks based directly on integrated heterogeneous genomic and proteomic data.
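
As a rough illustration of the idea in the abstract, the sketch below computes Resnik's similarity between two terms as the information content, IC(t) = -log p(t), of their most informative common ancestor, and then links gene pairs whose score clears a cutoff. It is not the chapter's implementation: the toy ontology, the gene names, the annotations, the use of the maximum over term pairs as the gene-level score, and the threshold value are all assumptions made for this example.

```python
import math
from itertools import combinations

# Hypothetical toy ontology: term -> set of parent terms (is_a edges).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "transport": {"root"},
    "glycolysis": {"metabolism"},
    "lipid_synthesis": {"metabolism"},
}

# Hypothetical direct annotations: gene -> set of terms.
ANNOTATIONS = {
    "geneA": {"glycolysis"},
    "geneB": {"glycolysis", "lipid_synthesis"},
    "geneC": {"lipid_synthesis"},
    "geneD": {"transport"},
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t), where p(t) is the fraction of genes
    annotated to t directly or through one of its descendants."""
    n_genes = len(ANNOTATIONS)
    counts = {term: 0 for term in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set().union(*(ancestors(t) for t in terms))
        for term in covered:
            counts[term] += 1
    return {t: -math.log(c / n_genes) for t, c in counts.items() if c > 0}


IC = information_content()


def resnik(term1, term2):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term1) & ancestors(term2)
    return max(IC[t] for t in common)


def gene_similarity(gene1, gene2):
    """Gene-level score: maximum over all term pairs (an assumption here)."""
    return max(
        resnik(a, b)
        for a in ANNOTATIONS[gene1]
        for b in ANNOTATIONS[gene2]
    )


def functional_network(threshold):
    """Link every gene pair whose similarity reaches the threshold."""
    return {
        (g1, g2)
        for g1, g2 in combinations(sorted(ANNOTATIONS), 2)
        if gene_similarity(g1, g2) >= threshold
    }


if __name__ == "__main__":
    for g1, g2 in combinations(sorted(ANNOTATIONS), 2):
        print(g1, g2, round(gene_similarity(g1, g2), 3))
    print(sorted(functional_network(threshold=0.5)))
```

In this toy setting, genes that share a specific term score higher than genes that share only a broad ancestor, which is the property that lets the score act as a support measure for a candidate interaction. Other ways of aggregating term-pair scores into a gene-level score (for example, averaging) are equally plausible and would change the resulting edges.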

Original language: English (US)
Title of host publication: Data Mining
Subtitle of host publication: Foundations and Intelligent Paradigms: Volume 3: Medical, Health, Social, Biological and other Applications
Editors: Dawn Holmes, Lakhmi Jain
Pages: 123-162
Number of pages: 40
DOIs: https://doi.org/10.1007/978-3-642-23151-3_7
State: Published - Dec 1 2012

Publication series

Name: Intelligent Systems Reference Library
Volume: 25
ISSN (Print): 1868-4394
ISSN (Electronic): 1868-4408

Fingerprint

Ontology
Genes
Semantics
Prediction
Proteins
Yeast
Interaction
Semantic similarity
Organization
Trend

All Science Journal Classification (ASJC) codes

  • Computer Science (all)
  • Information Systems and Management
  • Library and Information Sciences

Cite this

Zeng, E., Ding, C., Mathee, K., Schneper, L., & Narasimhan, G. (2012). Gene Function Prediction and Functional Network: The Role of Gene Ontology. In D. Holmes, & L. Jain (Eds.), Data Mining: Foundations and Intelligent Paradigms: Volume 3: Medical, Health, Social, Biological and other Applications (pp. 123-162). (Intelligent Systems Reference Library; Vol. 25). https://doi.org/10.1007/978-3-642-23151-3_7
Zeng, Erliang ; Ding, Chris ; Mathee, Kalai ; Schneper, Lisa ; Narasimhan, Giri. / Gene Function Prediction and Functional Network : The Role of Gene Ontology. Data Mining: Foundations and Intelligent Paradigms: Volume 3: Medical, Health, Social, Biological and other Applications. editor / Dawn Holmes ; Lakhmi Jain. 2012. pp. 123-162 (Intelligent Systems Reference Library).
@inbook{a3b82c1638dd4c668913edb189bcf105,
title = "Gene Function Prediction and Functional Network: The Role of Gene Ontology",
author = "Erliang Zeng and Chris Ding and Kalai Mathee and Lisa Schneper and Giri Narasimhan",
year = "2012",
month = "12",
day = "1",
doi = "10.1007/978-3-642-23151-3_7",
language = "English (US)",
isbn = "9783642231506",
series = "Intelligent Systems Reference Library",
pages = "123--162",
editor = "Dawn Holmes and Lakhmi Jain",
booktitle = "Data Mining",

}

Zeng, E, Ding, C, Mathee, K, Schneper, L & Narasimhan, G 2012, Gene Function Prediction and Functional Network: The Role of Gene Ontology. in D Holmes & L Jain (eds), Data Mining: Foundations and Intelligent Paradigms: Volume 3: Medical, Health, Social, Biological and other Applications. Intelligent Systems Reference Library, vol. 25, pp. 123-162. https://doi.org/10.1007/978-3-642-23151-3_7

Gene Function Prediction and Functional Network : The Role of Gene Ontology. / Zeng, Erliang; Ding, Chris; Mathee, Kalai; Schneper, Lisa; Narasimhan, Giri.

Data Mining: Foundations and Intelligent Paradigms: Volume 3: Medical, Health, Social, Biological and other Applications. ed. / Dawn Holmes; Lakhmi Jain. 2012. p. 123-162 (Intelligent Systems Reference Library; Vol. 25).

TY - CHAP

T1 - Gene Function Prediction and Functional Network

T2 - The Role of Gene Ontology

AU - Zeng, Erliang

AU - Ding, Chris

AU - Mathee, Kalai

AU - Schneper, Lisa

AU - Narasimhan, Giri

PY - 2012/12/1

Y1 - 2012/12/1

UR - http://www.scopus.com/inward/record.url?scp=84885605009&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84885605009&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-23151-3_7

DO - 10.1007/978-3-642-23151-3_7

M3 - Chapter

SN - 9783642231506

T3 - Intelligent Systems Reference Library

SP - 123

EP - 162

BT - Data Mining

A2 - Holmes, Dawn

A2 - Jain, Lakhmi

ER -

Zeng E, Ding C, Mathee K, Schneper L, Narasimhan G. Gene Function Prediction and Functional Network: The Role of Gene Ontology. In Holmes D, Jain L, editors, Data Mining: Foundations and Intelligent Paradigms: Volume 3: Medical, Health, Social, Biological and other Applications. 2012. p. 123-162. (Intelligent Systems Reference Library). https://doi.org/10.1007/978-3-642-23151-3_7