Composite Likelihood Inference in a Discrete Latent Variable Model for Two-Way “Clustering-by-Segmentation” Problems

Francesco Bartolucci, Francesca Chiaromonte, Prabhani Kuruppumullage Don, Bruce G. Lindsay

Research output: Contribution to journal › Article

Abstract

We consider a discrete latent variable model for two-way data arrays, which allows one to simultaneously produce clusters along one of the data dimensions (e.g., exchangeable observational units or features) and contiguous groups, or segments, along the other (e.g., consecutively ordered times or locations). The model relies on a hidden Markov structure but, given its complexity, cannot be estimated by full maximum likelihood. Therefore, we introduce a composite likelihood methodology based on considering different subsets of the data. The proposed approach is illustrated by simulation, and with an application to genomic data.
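As a rough illustration of the general idea of a composite likelihood built from data subsets, the sketch below sums hidden Markov log-likelihoods computed separately for each row of a two-way array. It is not the authors' estimator: the Gaussian emissions, the omission of row clusters, the choice of single rows as subsets, and all names (`hmm_loglik`, `composite_loglik`, `init`, `trans`, `means`, `sd`) are assumptions made for this example only.

```python
import numpy as np

def hmm_loglik(y, init, trans, means, sd):
    """Log-likelihood of one sequence under a Gaussian-emission HMM,
    computed with the scaled forward recursion (illustrative only)."""
    y = np.asarray(y, dtype=float)
    # Emission densities, shape (T, K): density of y[t] under each hidden state.
    dens = np.exp(-0.5 * ((y[:, None] - means[None, :]) / sd) ** 2)
    dens /= sd * np.sqrt(2.0 * np.pi)
    alpha = init * dens[0]
    loglik = 0.0
    for t in range(len(y)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emissions.
            alpha = (alpha @ trans) * dens[t]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return loglik

def composite_loglik(Y, init, trans, means, sd):
    """Assumed composite log-likelihood: each row is treated as its own
    data subset and the per-row log-likelihoods are simply added."""
    return sum(hmm_loglik(row, init, trans, means, sd) for row in Y)

# Hypothetical two-state example on a small 2 x 4 array.
init = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
means = np.array([0.0, 3.0])
Y = np.array([[0.1, -0.2, 2.9, 3.1],
              [0.3, 0.0, 0.2, 2.8]])
cl = composite_loglik(Y, init, trans, means, sd=1.0)
```

Each term is cheap to evaluate because it involves only one row at a time; the price is that dependence between rows is ignored in the objective, which is the usual trade-off of composite likelihood methods.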

Original language: English (US)
Pages (from-to): 388-402
Number of pages: 15
Journal: Journal of Computational and Graphical Statistics
Volume: 26
Issue number: 2
DOI: 10.1080/10618600.2016.1172018
State: Published - Apr 3 2017

Fingerprint

Composite Likelihood
Latent Variable Models
Likelihood Inference
Discrete Variables
Discrete Model
Segmentation
Clustering
Maximum Likelihood
Genomics
Unit
Subset
Methodology
Inference
Simulation

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Discrete Mathematics and Combinatorics

Cite this

@article{b89c7119a807422a9f07c6685173cc3b,
title = "Composite Likelihood Inference in a Discrete Latent Variable Model for Two-Way “Clustering-by-Segmentation” Problems",
abstract = "We consider a discrete latent variable model for two-way data arrays, which allows one to simultaneously produce clusters along one of the data dimensions (e.g., exchangeable observational units or features) and contiguous groups, or segments, along the other (e.g., consecutively ordered times or locations). The model relies on a hidden Markov structure but, given its complexity, cannot be estimated by full maximum likelihood. Therefore, we introduce a composite likelihood methodology based on considering different subsets of the data. The proposed approach is illustrated by simulation, and with an application to genomic data.",
author = "Francesco Bartolucci and Francesca Chiaromonte and Don, {Prabhani Kuruppumullage} and Lindsay, {Bruce G.}",
year = "2017",
month = "4",
day = "3",
doi = "10.1080/10618600.2016.1172018",
language = "English (US)",
volume = "26",
pages = "388--402",
journal = "Journal of Computational and Graphical Statistics",
issn = "1061-8600",
publisher = "American Statistical Association",
number = "2"
}

Composite Likelihood Inference in a Discrete Latent Variable Model for Two-Way “Clustering-by-Segmentation” Problems. / Bartolucci, Francesco; Chiaromonte, Francesca; Don, Prabhani Kuruppumullage; Lindsay, Bruce G.

In: Journal of Computational and Graphical Statistics, Vol. 26, No. 2, 03.04.2017, p. 388-402.

TY - JOUR

T1 - Composite Likelihood Inference in a Discrete Latent Variable Model for Two-Way “Clustering-by-Segmentation” Problems

AU - Bartolucci, Francesco

AU - Chiaromonte, Francesca

AU - Don, Prabhani Kuruppumullage

AU - Lindsay, Bruce G.

PY - 2017/4/3

Y1 - 2017/4/3

AB - We consider a discrete latent variable model for two-way data arrays, which allows one to simultaneously produce clusters along one of the data dimensions (e.g., exchangeable observational units or features) and contiguous groups, or segments, along the other (e.g., consecutively ordered times or locations). The model relies on a hidden Markov structure but, given its complexity, cannot be estimated by full maximum likelihood. Therefore, we introduce a composite likelihood methodology based on considering different subsets of the data. The proposed approach is illustrated by simulation, and with an application to genomic data.

UR - http://www.scopus.com/inward/record.url?scp=85018706333&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018706333&partnerID=8YFLogxK

U2 - 10.1080/10618600.2016.1172018

DO - 10.1080/10618600.2016.1172018

M3 - Article

AN - SCOPUS:85018706333

VL - 26

SP - 388

EP - 402

JO - Journal of Computational and Graphical Statistics

JF - Journal of Computational and Graphical Statistics

SN - 1061-8600

IS - 2

ER -