Mixture modeling with pairwise, instance-level class constraints

    Research output: Contribution to journal › Article › peer-review

    21 Scopus citations

    Abstract

    The goal of semisupervised clustering/mixture modeling is to learn the underlying groups comprising a given data set when there is also some form of instance-level supervision available, usually in the form of labels or pairwise sample constraints. Most prior work with constraints assumes the number of classes is known, with each learned cluster assumed to be a class and, hence, subject to the given class constraints. When the number of classes is unknown or when the one-cluster-per-class assumption is not valid, the use of constraints may actually be deleterious to learning the ground-truth data groups. We address this by (1) allowing allocation of multiple mixture components to individual classes and (2) estimating both the number of components and the number of classes. We also address new class discovery, with components void of constraints treated as putative unknown classes. For both real-world and synthetic data, our method is shown to accurately estimate the number of classes and to give favorable comparison with the recent approach of Shental, Bar-Hillel, Hertz, and Weinshall (2003).
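    As an illustration of point (1) in the abstract, the following minimal sketch shows why letting several mixture components share one class can matter when checking pairwise constraints. It is not the authors' algorithm: the function name `count_violations`, the must-link/cannot-link encoding, and the toy data are assumptions made for this example only.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def count_violations(
    component_of: List[int],
    class_of_component: Dict[int, int],
    must_link: List[Pair],
    cannot_link: List[Pair],
) -> int:
    """Count pairwise constraints violated by a hard clustering.

    component_of[i] is the mixture component assigned to sample i.
    class_of_component maps each component to a class, so several
    components may share a class (hypothetical encoding).
    """
    def cls(i: int) -> int:
        return class_of_component[component_of[i]]

    # A must-link pair is violated when the two samples land in different classes.
    violated = sum(1 for i, j in must_link if cls(i) != cls(j))
    # A cannot-link pair is violated when the two samples land in the same class.
    violated += sum(1 for i, j in cannot_link if cls(i) == cls(j))
    return violated


# Toy data: four samples spread over three components.
component_of = [0, 1, 2, 2]
must_link = [(0, 1), (2, 3)]
cannot_link = [(1, 2)]

# Components 0 and 1 both belong to class 0; component 2 is class 1.
shared = count_violations(component_of, {0: 0, 1: 0, 2: 1}, must_link, cannot_link)

# One-cluster-per-class assumption: every component is its own class.
one_to_one = count_violations(component_of, {0: 0, 1: 1, 2: 2}, must_link, cannot_link)
```

    In this toy case the shared mapping satisfies every constraint, while the one-cluster-per-class mapping violates the must-link pair (0, 1), because samples 0 and 1 sit in different components.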

    Original language: English (US)
    Pages (from-to): 2482-2507
    Number of pages: 26
    Journal: Neural Computation
    Volume: 17
    Issue number: 11
    State: Published - Nov 2005

    All Science Journal Classification (ASJC) codes

    • Arts and Humanities (miscellaneous)
    • Cognitive Neuroscience
