Comparison and evaluation of network clustering algorithms applied to genetic interaction networks

Lin Hou, Lin Wang, Arthur Berg, Minping Qian, Yunping Zhu, Fangting Li, Minghua Deng

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

The goal of network clustering algorithms is to detect dense clusters in a network, which provides a first step towards understanding large-scale biological networks. With numerous recent advances in biotechnology, large-scale genetic interaction data are widely available, but there is limited understanding of which clustering algorithms are most effective. To address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms for analyzing genetic interaction networks, and investigated the factors that influence the choice of algorithm. The algorithms considered in this comparison are hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and the variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used in this comparison. The accuracy of the algorithms is measured by the Jaccard index, comparing predicted gene modules with benchmark gene sets. The results suggest that the best choice depends on the network topology and the evaluation criteria. Hierarchical clustering proved best at predicting protein complexes, Bayesian discriminant analysis based community detection proved best on epistatic miniarray profile (EMAP) datasets, and the variational Bayes approach to modularity was noticeably better than the other algorithms on the genome-scale networks.
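
As a rough illustration of the accuracy measure named in the abstract, the Jaccard index of two gene sets A and B is |A ∩ B| / |A ∪ B|. The sketch below is not the authors' evaluation code: the best-match scoring in `best_match_scores`, the convention for two empty sets, and the gene names are all assumptions made for the example.

```python
from typing import Iterable, List, Set


def jaccard_index(a: Iterable[str], b: Iterable[str]) -> float:
    """Return |A ∩ B| / |A ∪ B| for two gene sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumed convention: two empty sets score 0.0.
        return 0.0
    return len(set_a & set_b) / len(union)


def best_match_scores(
    predicted: List[Set[str]], benchmarks: List[Set[str]]
) -> List[float]:
    """Score each predicted module by its best Jaccard match.

    Hypothetical scoring rule for illustration only; the abstract
    does not say how module-level scores were aggregated.
    """
    return [
        max((jaccard_index(module, ref) for ref in benchmarks), default=0.0)
        for module in predicted
    ]


# Made-up gene identifiers, purely illustrative.
predicted_modules = [{"g1", "g2", "g3"}, {"g4", "g5"}]
benchmark_sets = [{"g1", "g2", "g3", "g6"}, {"g5", "g7"}]

scores = best_match_scores(predicted_modules, benchmark_sets)
print(scores)
```

A score of 1.0 means a predicted module coincides exactly with a benchmark set, and 0.0 means it shares no genes with any of them.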

Original language: English (US)
Pages (from-to): 2150-2161
Number of pages: 12
Journal: Frontiers in Bioscience - Elite
Volume: 4 E
Issue number: 6
State: Published - Jan 1 2012

Fingerprint

Clustering algorithms
Cluster Analysis
Genes
Discriminant analysis
Bayes Theorem
Biotechnology
Benchmarking
Topology
Gene Regulatory Networks
Genome
Proteins

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)

Cite this

Hou, Lin ; Wang, Lin ; Berg, Arthur ; Qian, Minping ; Zhu, Yunping ; Li, Fangting ; Deng, Minghua. / Comparison and evaluation of network clustering algorithms applied to genetic interaction networks. In: Frontiers in Bioscience - Elite. 2012 ; Vol. 4 E, No. 6. pp. 2150-2161.
@article{b622310548354e22817fba329594bb56,
title = "Comparison and evaluation of network clustering algorithms applied to genetic interaction networks",
author = "Lin Hou and Lin Wang and Arthur Berg and Minping Qian and Yunping Zhu and Fangting Li and Minghua Deng",
year = "2012",
month = "1",
day = "1",
language = "English (US)",
volume = "4 E",
pages = "2150--2161",
journal = "Frontiers in Bioscience - Elite",
issn = "1945-0494",
publisher = "Frontiers in Bioscience",
number = "6",

}

TY - JOUR

T1 - Comparison and evaluation of network clustering algorithms applied to genetic interaction networks

AU - Hou, Lin

AU - Wang, Lin

AU - Berg, Arthur

AU - Qian, Minping

AU - Zhu, Yunping

AU - Li, Fangting

AU - Deng, Minghua

PY - 2012/1/1

Y1 - 2012/1/1

UR - http://www.scopus.com/inward/record.url?scp=84860862492&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84860862492&partnerID=8YFLogxK

M3 - Article

C2 - 22202027

AN - SCOPUS:84860862492

VL - 4 E

SP - 2150

EP - 2161

JO - Frontiers in Bioscience - Elite

JF - Frontiers in Bioscience - Elite

SN - 1945-0494

IS - 6

ER -