Clustering based on a multilayer mixture model

Research output: Contribution to journal (Article)

37 Scopus citations

Abstract

In model-based clustering, the density of each cluster is usually assumed to be a certain basic parametric distribution, for example, the normal distribution. In practice, it is often difficult to decide which parametric distribution is suitable to characterize a cluster, especially for multivariate data. Moreover, the densities of individual clusters may be multimodal themselves, and therefore cannot be accurately modeled by basic parametric distributions. This article explores a clustering approach that models each cluster by a mixture of normals. The resulting overall model is a multilayer mixture of normals. Algorithms to estimate the model and perform clustering are developed based on the classification maximum likelihood (CML) and mixture maximum likelihood (MML) criteria. BIC and ICL-BIC are examined for choosing the number of normal components per cluster. Experiments on both simulated and real data are presented.
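The two-layer structure described in the abstract can be illustrated with a minimal sketch: each cluster has its own mixture of normals, and the overall density is a mixture of those cluster densities. Everything specific here is an assumption made for illustration: the one-dimensional setting, the parameter values in `CLUSTERS`, the function names, and the rule of assigning a point to the cluster with the largest prior-weighted density. The sketch only evaluates a fixed model; it does not implement the CML or MML estimation algorithms or the BIC / ICL-BIC selection discussed in the article.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical model: each cluster is (cluster prior, list of components),
# and each component is (weight within the cluster, mean, standard deviation).
CLUSTERS = [
    (0.5, [(0.5, -4.0, 1.0), (0.5, -1.0, 1.0)]),  # cluster 0: bimodal
    (0.5, [(1.0, 5.0, 1.0)]),                     # cluster 1: single normal
]


def cluster_density(x, components):
    """Lower layer: density of one cluster as a mixture of normals."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)


def mixture_density(x, clusters=CLUSTERS):
    """Upper layer: overall density as a mixture of cluster densities."""
    return sum(prior * cluster_density(x, comps) for prior, comps in clusters)


def assign_cluster(x, clusters=CLUSTERS):
    """Assumed rule: pick the cluster with the largest prior-weighted density."""
    scores = [prior * cluster_density(x, comps) for prior, comps in clusters]
    return max(range(len(scores)), key=scores.__getitem__)
```

With these illustrative parameters, points near either mode of the bimodal cluster (around -4 or -1) are assigned to the same cluster, which a single normal per cluster would not model accurately.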

Original language: English (US)
Pages (from-to): 547-568
Number of pages: 22
Journal: Journal of Computational and Graphical Statistics
Volume: 14
Issue number: 3
State: Published - Sep 1 2005

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Discrete Mathematics and Combinatorics
  • Statistics, Probability and Uncertainty
