Sparse topic models by parameter sharing

Hossein Soleimani, David Jonathan Miller

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Abstract

    We propose a sparse Bayesian topic model, based on parameter sharing, for modeling text corpora. In Latent Dirichlet Allocation (LDA), each topic models all words, even though many words are not topic-specific, i.e. have similar occurrence frequencies across different topics. We propose a sparser approach by introducing a universal shared model, used by each topic to model the subset of words that are not topic-specific. A Bernoulli random variable is associated with each word under every topic, determining whether that word is modeled topic-specifically, with a free parameter, or by the shared model, with a common parameter. Results of our experiments show that our model achieves sparser topic presence in documents and higher test likelihood than LDA.
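The switching mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_topic_word_probs`, the array names, the gamma and Bernoulli draws used to make toy inputs, and the row-wise renormalization are all assumptions made for illustration, since the abstract does not specify how the parameters are constrained or estimated.

```python
import numpy as np


def build_topic_word_probs(switches, free_params, shared_params):
    """Combine topic-specific and shared word weights into per-topic distributions.

    switches:      (K, V) array of 0/1; 1 means word v is modeled
                   topic-specifically under topic k, 0 means it uses the shared model.
    free_params:   (K, V) positive weights, used only where switches == 1.
    shared_params: (V,) positive weights, used wherever switches == 0.

    Renormalizing each row is an assumption of this sketch, not a detail
    taken from the paper.
    """
    switches = np.asarray(switches, dtype=bool)
    free_params = np.asarray(free_params, dtype=float)
    shared_params = np.asarray(shared_params, dtype=float)
    # Pick the free weight where the switch is on, else the common weight.
    weights = np.where(switches, free_params, shared_params[None, :])
    # Normalize so each topic is a probability distribution over words.
    return weights / weights.sum(axis=1, keepdims=True)


# Toy inputs (hypothetical sizes and priors, chosen only for the demo).
rng = np.random.default_rng(0)
K, V = 3, 8
switches = rng.binomial(1, 0.3, size=(K, V))
free_params = rng.gamma(1.0, 1.0, size=(K, V))
shared_params = rng.gamma(1.0, 1.0, size=V)

beta = build_topic_word_probs(switches, free_params, shared_params)
n_free = int(switches.sum())  # number of topic-specific word parameters in use
print("topic-word matrix shape:", beta.shape)
print("topic-specific parameters:", n_free, "of", K * V)
```

In this sketch, sparsity shows up as a small `n_free`: a topic spends free parameters only on words whose switch is on, and every other word falls back to the one shared weight vector instead of getting its own entry per topic.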

    Original language: English (US)
    Title of host publication: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
    Editors: Tulay Adali, Jan Larsen, Mamadou Mboup, Eric Moreau
    Publisher: IEEE Computer Society
    ISBN (Electronic): 9781479936946
    DOI: https://doi.org/10.1109/MLSP.2014.6958911
    State: Published - Jan 1 2014
    Event: 2014 24th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2014 - Reims, France
    Duration: Sep 21 2014 to Sep 24 2014

    Other

    Other: 2014 24th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2014
    Country: France
    City: Reims
    Period: 9/21/14 to 9/24/14

    Fingerprint

    Random variables
    Experiments

    All Science Journal Classification (ASJC) codes

    • Human-Computer Interaction
    • Signal Processing

    Cite this

    Soleimani, H., & Miller, D. J. (2014). Sparse topic models by parameter sharing. In T. Adali, J. Larsen, M. Mboup, & E. Moreau (Eds.), IEEE International Workshop on Machine Learning for Signal Processing, MLSP [6958911] IEEE Computer Society. https://doi.org/10.1109/MLSP.2014.6958911

