Kernel PCA for feature extraction with information complexity

Zhenqiu Liu, Hamparsum Bozdogan

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Scopus citations


In this paper, we deal with modelling or extracting information from an unlabelled data sample. In many real-world applications, appropriate preprocessing transformations of high-dimensional input data can increase the overall performance of algorithms. Feature extraction tries to find a compact description of the interesting features of the data. This can be useful for visualizing higher-dimensional data in two or three dimensions, or for data compression. It can also be applied as a preprocessing step that reduces the dimension of the data to be handled by a subsequent model. In this paper, we mainly concentrate on kernel PCA for feature selection in a higher-dimensional feature space. We first introduce the usefulness of the EM algorithm for standard PCA. We then present kernel PCA, a nonlinear extension of PCA based on the kernel transformation (Schölkopf, Smola, and Müller 1997). It requires the eigenvalue decomposition of a so-called kernel matrix of size N×N. In this contribution, we propose an expectation maximization approach for performing kernel principal component analysis. Moreover, we introduce an online EM algorithm for PCA. We show this to be a computationally efficient method, especially when the number of data points is large. The information criteria of Bozdogan, together with others, are used to decide the number of eigenvalues to retain.
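As a point of reference for the abstract's remark that kernel PCA requires the eigenvalue decomposition of an N×N kernel matrix, the following is a minimal sketch of that direct route. It is not the EM approach the chapter proposes. The RBF kernel, the `gamma` parameter, and the function names `rbf_kernel_matrix` and `kernel_pca` are assumptions chosen for illustration and do not come from the chapter.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma=1.0):
    """N x N kernel matrix for an RBF kernel (the kernel choice is an assumption)."""
    sq_norms = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances between the rows of X.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


def kernel_pca(X, n_components=2, gamma=1.0):
    """Kernel PCA by direct eigendecomposition of the centered kernel matrix."""
    N = X.shape[0]
    K = rbf_kernel_matrix(X, gamma)

    # Center the data implicitly in feature space.
    one_n = np.full((N, N), 1.0 / N)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Projection of training point i onto component k is sqrt(lambda_k) * v_k[i].
    projections = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return projections, eigvals
```

Calling `kernel_pca(X, n_components=2)` on an array `X` of shape `(N, d)` returns an `(N, 2)` array of projected points and the two leading eigenvalues. The full decomposition costs on the order of N³ operations, which is the expense the abstract has in mind when the number of data points is large.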

Original language: English (US)
Title of host publication: Statistical Data Mining and Knowledge Discovery
Publisher: CRC Press
Number of pages: 14
ISBN (Electronic): 9780203497159
ISBN (Print): 9781584883449
State: Published - Jan 1 2003

All Science Journal Classification (ASJC) codes

  • Economics, Econometrics and Finance (all)
  • Business, Management and Accounting (all)
  • Computer Science (all)

