TY - GEN
T1 - Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis
AU - Li, Jia
AU - Zha, Hongyuan
PY - 2002/1/1
Y1 - 2002/1/1
N2 - In many applications of supervised learning, automatic feature clustering is often desirable for a better understanding of the interaction among the various features as well as the interplay between the features and the class labels. In addition, for high-dimensional data sets, feature clustering has the potential to improve classification accuracy and reduce computational complexity. In this paper, a method is developed for simultaneous classification and feature clustering by extending discriminant vector quantization (DVQ), a prototype classification method derived from the principle of minimum description length using source coding techniques. The method incorporates feature clustering with classification performed by fusing features in the same clusters. To illustrate its effectiveness, the method has been applied to microarray gene expression data for human lymphoma classification. It is demonstrated that incorporating feature clustering improves classification accuracy, and the clusters generated match well with biologically meaningful gene expression signature groups.
AB - In many applications of supervised learning, automatic feature clustering is often desirable for a better understanding of the interaction among the various features as well as the interplay between the features and the class labels. In addition, for high-dimensional data sets, feature clustering has the potential to improve classification accuracy and reduce computational complexity. In this paper, a method is developed for simultaneous classification and feature clustering by extending discriminant vector quantization (DVQ), a prototype classification method derived from the principle of minimum description length using source coding techniques. The method incorporates feature clustering with classification performed by fusing features in the same clusters. To illustrate its effectiveness, the method has been applied to microarray gene expression data for human lymphoma classification. It is demonstrated that incorporating feature clustering improves classification accuracy, and the clusters generated match well with biologically meaningful gene expression signature groups.
UR - http://www.scopus.com/inward/record.url?scp=18244369245&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=18244369245&partnerID=8YFLogxK
U2 - 10.1109/CSB.2002.1039347
DO - 10.1109/CSB.2002.1039347
M3 - Conference contribution
C2 - 15838141
AN - SCOPUS:18244369245
T3 - Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002
SP - 246
EP - 255
BT - Proceedings - IEEE Computer Society Bioinformatics Conference, CSB 2002
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1st International IEEE Computer Society Bioinformatics Conference, CSB 2002
Y2 - 14 August 2002 through 16 August 2002
ER -