A deterministic, annealing-based approach for learning and model selection in finite mixture models

    Research output: Contribution to journal › Article

    3 Citations (Scopus)

    Abstract

    We address the longstanding problem of learning and model selection in finite mixtures. A common approach is to generate candidate solutions with varying numbers of components (via the Expectation-Maximization (EM) algorithm) and then select the best model according to a cost criterion such as the Bayesian Information Criterion (BIC). A recent alternative uses component-wise EM (CEM) and, further, integrates model selection within CEM. Both approaches are susceptible to finding poor solutions: the first because of EM's sensitivity to initialization, and the second because of the sequential (greedy) nature of CEM. Deterministic annealing for clustering (DA) and for mixture modeling (DAEM) offers a way to avoid such local optima; however, these methods do not encompass model selection. We propose a new technique that combines the positive attributes of all these methods: it integrates learning and model selection, performs batch optimization over components, and has the character of DA, with optimization carried out over a sequence of decreasing temperatures. Unlike standard DA, which reduces the partition entropy as the temperature is lowered, our approach reduces the entropy of binary random variables that indicate whether each component is active or inactive. At low temperature, the method achieves explicit model-order selection. Experiments demonstrate favorable performance of our method compared with several alternatives. We also give an interesting stochastic generative-model interpretation of our method.
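    The DAEM baseline contrasted in the abstract can be illustrated with a short sketch: ordinary EM for a Gaussian mixture, except that the E-step responsibilities are tempered by an inverse temperature β that is annealed toward 1, which flattens the posterior early on and reduces sensitivity to initialization. This is a minimal 1-D illustration of deterministic annealing for mixtures, not the paper's proposed method (which additionally anneals activation variables for model selection); the function name, temperature schedule, and jitter step are illustrative choices of our own.

```python
import numpy as np

def daem_gmm_1d(x, k=2, betas=(0.2, 0.4, 0.6, 0.8, 1.0), iters=40, seed=0):
    """DAEM-style EM for a 1-D Gaussian mixture.

    Responsibilities are tempered as r_ik ∝ (pi_k N(x_i; mu_k, var_k))^beta,
    with beta raised toward 1 over an annealing schedule."""
    rng = np.random.default_rng(seed)
    n = len(x)
    mu = rng.choice(x, size=k, replace=False).astype(float)
    var = np.full(k, np.var(x))
    pi = np.full(k, 1.0 / k)
    for beta in betas:
        for _ in range(iters):
            # E-step: tempered log-responsibilities (rows normalized below)
            logp = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                    - 0.5 * (x[:, None] - mu) ** 2 / var)
            logp *= beta
            logp -= logp.max(axis=1, keepdims=True)   # numerical stability
            r = np.exp(logp)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: standard weighted updates
            nk = r.sum(axis=0) + 1e-12
            mu = (r * x[:, None]).sum(axis=0) / nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
            pi = nk / n
        # tiny jitter so near-identical components can separate as beta grows
        mu = mu + rng.normal(0.0, 1e-3, k)
    return pi, mu, var

# toy data: two well-separated clusters
x = np.concatenate([np.random.default_rng(1).normal(-5, 1, 200),
                    np.random.default_rng(2).normal(5, 1, 200)])
pi, mu, var = daem_gmm_1d(x, k=2)
```

    At high temperature (small β) the responsibilities are nearly uniform and all components collapse toward the global statistics; as β → 1 the tempered posterior sharpens and components split apart, which is the annealing behavior the abstract builds on.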

    Original language: English (US)
    Journal: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
    Volume: 5
    State: Published - 2004

    All Science Journal Classification (ASJC) codes

    • Electrical and Electronic Engineering
    • Signal Processing
    • Acoustics and Ultrasonics

    Cite this

    @article{2050a56bfcc3417a8f49339d6be0e93c,
      title = "A deterministic, annealing-based approach for learning and model selection in finite mixture models",
      author = "Qi Zhao and Miller, {David Jonathan}",
      year = "2004",
      language = "English (US)",
      volume = "5",
      journal = "Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing",
      issn = "0736-7791",
      publisher = "Institute of Electrical and Electronics Engineers Inc.",
    }
