Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis

Jia Li, Hongyuan Zha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Citations (Scopus)

Abstract

In many applications of supervised learning, automatic feature clustering is often desirable for a better understanding of the interaction among the various features as well as the interplay between the features and the class labels. In addition, for high-dimensional data sets, feature clustering has the potential for improvement in classification accuracy and reduction in computational complexity. In this paper, a method is developed for simultaneous classification and feature clustering by extending discriminant vector quantization (DVQ), a prototype classification method derived from the principle of minimum description length using source coding techniques. The method incorporates feature clustering with classification performed by fusing features in the same clusters. To illustrate its effectiveness, the method has been applied to microarray gene expression data for human lymphoma classification. It is demonstrated that incorporating feature clustering improves classification accuracy, and the clusters generated match well with biologically meaningful gene expression signature groups.

Original language: English (US)
Title of host publication: Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 246-255
Number of pages: 10
ISBN (Electronic): 076951653X, 9780769516530
DOI: https://doi.org/10.1109/CSB.2002.1039347
State: Published - Jan 1 2002
Event: 1st International IEEE Computer Society Bioinformatics Conference, CSB 2002 - Stanford, United States
Duration: Aug 14 2002 to Aug 16 2002

Other

Other: 1st International IEEE Computer Society Bioinformatics Conference, CSB 2002
Country: United States
City: Stanford
Period: 8/14/02 to 8/16/02

Fingerprint

Vector quantization
Microarray Analysis
Microarrays
Cluster Analysis
Gene expression
Supervised learning
Transcriptome
Labels
Computational complexity
Lymphoma
Learning

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Biomedical Engineering
  • Health Informatics

Cite this

Li, J., & Zha, H. (2002). Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis. In Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002 (pp. 246-255). [1039347] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CSB.2002.1039347
Li, Jia ; Zha, Hongyuan. / Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis. Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002. Institute of Electrical and Electronics Engineers Inc., 2002. pp. 246-255
@inproceedings{d005647ada7e4a23947eac19c4c88c53,
title = "Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis",
abstract = "In many applications of supervised learning, automatic feature clustering is often desirable for a better understanding of the interaction among the various features as well as the interplay between the features and the class labels. In addition, for high-dimensional data sets, feature clustering has the potential for improvement in classification accuracy and reduction in computational complexity. In this paper, a method is developed for simultaneous classification and feature clustering by extending discriminant vector quantization (DVQ), a prototype classification method derived from the principle of minimum description length using source coding techniques. The method incorporates feature clustering with classification performed by fusing features in the same clusters. To illustrate its effectiveness, the method has been applied to microarray gene expression data for human lymphoma classification. It is demonstrated that incorporating feature clustering improves classification accuracy, and the clusters generated match well with biologically meaningful gene expression signature groups.",
author = "Jia Li and Hongyuan Zha",
year = "2002",
month = "1",
day = "1",
doi = "10.1109/CSB.2002.1039347",
language = "English (US)",
pages = "246--255",
booktitle = "Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

Li, J & Zha, H 2002, Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis. in Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002., 1039347, Institute of Electrical and Electronics Engineers Inc., pp. 246-255, 1st International IEEE Computer Society Bioinformatics Conference, CSB 2002, Stanford, United States, 8/14/02. https://doi.org/10.1109/CSB.2002.1039347

Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis. / Li, Jia; Zha, Hongyuan.

Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002. Institute of Electrical and Electronics Engineers Inc., 2002. p. 246-255 1039347.

TY - GEN

T1 - Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis

AU - Li, Jia

AU - Zha, Hongyuan

PY - 2002/1/1

Y1 - 2002/1/1

N2 - In many applications of supervised learning, automatic feature clustering is often desirable for a better understanding of the interaction among the various features as well as the interplay between the features and the class labels. In addition, for high-dimensional data sets, feature clustering has the potential for improvement in classification accuracy and reduction in computational complexity. In this paper, a method is developed for simultaneous classification and feature clustering by extending discriminant vector quantization (DVQ), a prototype classification method derived from the principle of minimum description length using source coding techniques. The method incorporates feature clustering with classification performed by fusing features in the same clusters. To illustrate its effectiveness, the method has been applied to microarray gene expression data for human lymphoma classification. It is demonstrated that incorporating feature clustering improves classification accuracy, and the clusters generated match well with biologically meaningful gene expression signature groups.

UR - http://www.scopus.com/inward/record.url?scp=18244369245&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=18244369245&partnerID=8YFLogxK

U2 - 10.1109/CSB.2002.1039347

DO - 10.1109/CSB.2002.1039347

M3 - Conference contribution

SP - 246

EP - 255

BT - Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Li J, Zha H. Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis. In Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002. Institute of Electrical and Electronics Engineers Inc. 2002. p. 246-255. 1039347 https://doi.org/10.1109/CSB.2002.1039347