Microarray data mining: A novel optimization-based approach to uncover biologically coherent structures

Meng P. Tan, Erin N. Smith, James R. Broach, Christodoulos A. Floudas

Research output: Contribution to journal (Article)

25 Citations (Scopus)

Abstract

Background: DNA microarray technology allows for the measurement of genome-wide expression patterns. Within the resultant mass of data lies the problem of analyzing and presenting information on this genomic scale, and a first step towards the rapid and comprehensive interpretation of this data is gene clustering with respect to the expression patterns. Classifying genes into clusters can lead to interesting biological insights. In this study, we describe an iterative clustering approach to uncover biologically coherent structures from DNA microarray data based on a novel clustering algorithm EP_GOS_Clust.

Results: We apply our proposed iterative algorithm to three sets of experimental DNA microarray data from experiments with the yeast Saccharomyces cerevisiae and show that the proposed iterative approach improves biological coherence. Comparison with other clustering techniques suggests that our iterative algorithm provides superior performance with regard to biological coherence. An important consequence of our approach is that an increasing proportion of genes find membership in clusters of high biological coherence and that the average cluster specificity improves.

Conclusion: The results from these clustering experiments provide a robust basis for extracting motifs and trans-acting factors that determine particular patterns of expression. In addition, the biological coherence of the clusters is iteratively assessed independently of the clustering. Thus, this method will not be severely impacted by functional annotations that are missing, inaccurate, or sparse.

Original language: English (US)
Article number: 268
Journal: BMC Bioinformatics
Volume: 9
DOI: 10.1186/1471-2105-9-268
State: Published - Jun 6 2008

Fingerprint

Coherent Structures
Data Mining
Microarrays
Microarray Data
Cluster Analysis
Data mining
Genes
DNA Microarray
Clustering
DNA
Optimization
Oligonucleotide Array Sequence Analysis
Yeast
Gene
Iterative Algorithm
Trans-Activators
Clustering algorithms
Saccharomyces Cerevisiae
Inaccurate
Experiments

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

@article{26bafa0a652e47b6af14a42942c8c5a2,
title = "Microarray data mining: A novel optimization-based approach to uncover biologically coherent structures",
author = "Tan, {Meng P.} and Smith, {Erin N.} and Broach, {James R.} and Floudas, {Christodoulos A.}",
year = "2008",
month = "6",
day = "6",
doi = "10.1186/1471-2105-9-268",
language = "English (US)",
volume = "9",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

Microarray data mining: A novel optimization-based approach to uncover biologically coherent structures. / Tan, Meng P.; Smith, Erin N.; Broach, James R.; Floudas, Christodoulos A.

In: BMC Bioinformatics, Vol. 9, 268, 06.06.2008.

TY - JOUR

T1 - Microarray data mining

T2 - A novel optimization-based approach to uncover biologically coherent structures

AU - Tan, Meng P.

AU - Smith, Erin N.

AU - Broach, James R.

AU - Floudas, Christodoulos A.

PY - 2008/6/6

Y1 - 2008/6/6

UR - http://www.scopus.com/inward/record.url?scp=46649092734&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=46649092734&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-9-268

DO - 10.1186/1471-2105-9-268

M3 - Article

C2 - 18538024

AN - SCOPUS:46649092734

VL - 9

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 268

ER -