Goodness-of-Fit Testing for Latent Class Models

Linda Marie Collins, Penny L. Fidler, Stuart E. Wugalter, Jeffrey D. Long

Research output: Contribution to journal › Article

108 Citations (Scopus)

Abstract

Latent class models with sparse contingency tables can present problems for model comparison and selection, because under these conditions the distributions of goodness-of-fit indices are often unknown. This causes inaccuracies both in hypothesis testing and in model comparisons based on normed indices. In order to assess the extent of this problem, we carried out a simulation investigating the distributions of the likelihood ratio statistic G², the Pearson statistic X², and a new goodness-of-fit index suggested by Read and Cressie (1988). There were substantial deviations between the expectation of the chi-squared distribution and the means of the G² and Read and Cressie distributions. In general, the mean of the distribution of a statistic was closer to the expectation of the chi-squared distribution when the average cell expectation was large, there were fewer indicator items, and the latent class measurement parameters were less extreme. It was found that the mean of the X² distribution is generally closer to the expectation of the chi-squared distribution than are the means of the other two indices we examined, but the standard deviation of the X² distribution is considerably larger than that of the other two indices and larger than the standard deviation of the chi-squared distribution. We argue that a possible solution is to forgo reliance on theoretical distributions for expectations and quantiles of goodness-of-fit statistics. Instead, Monte Carlo sampling (Noreen, 1989) can be used to arrive at an empirical central or noncentral distribution.
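The Monte Carlo idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation design: the two-class parameter values, the sample size, the number of replications, and all function names are hypothetical, chosen only to produce a sparse table (64 cells, 100 respondents). The sketch also treats the model parameters as known, whereas a full analysis would re-estimate the latent class model on every simulated data set. The three statistics are computed through the Cressie-Read power-divergence family, where lambda = 1 gives Pearson's statistic, lambda approaching 0 gives the likelihood ratio statistic, and lambda = 2/3 is the value recommended by Read and Cressie (1988).

```python
from itertools import product

import numpy as np


def latent_class_cell_probs(class_probs, item_probs):
    """Cell probabilities for binary items under local independence.

    class_probs: shape (C,), latent class prevalences.
    item_probs:  shape (C, J), P(item j = 1 | class c).
    Returns a vector of length 2**J, one entry per response pattern.
    """
    class_probs = np.asarray(class_probs, dtype=float)
    item_probs = np.asarray(item_probs, dtype=float)
    n_items = item_probs.shape[1]
    patterns = np.array(list(product([0, 1], repeat=n_items)))  # (2**J, J)
    # P(pattern | class): product over items of rho or (1 - rho).
    per_item = np.where(patterns[:, None, :] == 1,
                        item_probs[None, :, :],
                        1.0 - item_probs[None, :, :])           # (2**J, C, J)
    cond = per_item.prod(axis=2)                                # (2**J, C)
    return cond @ class_probs


def power_divergence(observed, expected, lam):
    """Cressie-Read power-divergence statistic (equal table totals assumed)."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    if abs(lam) < 1e-12:
        # Likelihood ratio statistic; empty cells contribute zero.
        nz = observed > 0
        return 2.0 * np.sum(observed[nz] * np.log(observed[nz] / expected[nz]))
    terms = observed * ((observed / expected) ** lam - 1.0)
    return 2.0 / (lam * (lam + 1.0)) * np.sum(terms)


def monte_carlo_reference(cell_probs, n, lam, n_rep=2000, seed=0):
    """Empirical null distribution of the statistic for samples of size n."""
    rng = np.random.default_rng(seed)
    cell_probs = np.asarray(cell_probs, dtype=float)
    expected = n * cell_probs
    tables = rng.multinomial(n, cell_probs, size=n_rep)
    return np.array([power_divergence(t, expected, lam) for t in tables])


def empirical_p_value(stat, reference):
    """Share of simulated statistics at least as large as the observed one."""
    return (1 + np.sum(reference >= stat)) / (1 + len(reference))


if __name__ == "__main__":
    # Hypothetical two-class model with six dichotomous indicators.
    probs = latent_class_cell_probs(
        class_probs=[0.6, 0.4],
        item_probs=[[0.9] * 6, [0.1] * 6],
    )
    df = probs.size - 1  # chi-squared mean when all parameters are known
    for name, lam in [("G2", 0.0), ("Read-Cressie", 2.0 / 3.0), ("X2", 1.0)]:
        ref = monte_carlo_reference(probs, n=100, lam=lam)
        print(f"{name}: mean={ref.mean():.1f} sd={ref.std(ddof=1):.1f} "
              f"(chi-squared: mean={df}, sd={np.sqrt(2 * df):.1f})")
```

Comparing the printed means and standard deviations with the chi-squared values shows how far each statistic drifts from its theoretical reference under sparseness. `empirical_p_value` then gives a test based on the simulated distribution rather than on chi-squared quantiles.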

Original language: English (US)
Pages (from-to): 375-389
Number of pages: 15
Journal: Multivariate Behavioral Research
Volume: 28
Issue number: 3
DOI: 10.1207/s15327906mbr2803_4
State: Published - Jul 1 1993

Fingerprint

Latent Class Model
Goodness of fit
Chi-squared distribution
Testing
Model Comparison
Standard deviation
Statistic
Latent Class
Monte Carlo Sampling
Likelihood Ratio Statistic
Goodness
Contingency Table
Hypothesis Testing
Quantile
Model Selection
Extremes
Deviation
Statistics
Unknown
Cell

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Experimental and Cognitive Psychology
  • Arts and Humanities (miscellaneous)

Cite this

Collins, Linda Marie ; Fidler, Penny L. ; Wugalter, Stuart E. ; Long, Jeffrey D. / Goodness-of-Fit Testing for Latent Class Models. In: Multivariate Behavioral Research. 1993 ; Vol. 28, No. 3. pp. 375-389.
@article{1c9cee565f684eaa83a4f849375ee54b,
title = "Goodness-of-Fit Testing for Latent Class Models",
abstract = "Latent class models with sparse contingency tables can present problems for model comparison and selection, because under these conditions the distributions of goodness-of-fit indices are often unknown. This causes inaccuracies both in hypothesis testing and in model comparisons based on normed indices. In order to assess the extent of this problem, we carried out a simulation investigating the distributions of the likelihood ratio statistic G2, the Pearson statistic X2, and a new goodness-of-fit index suggested by Read and Cressie (1988). There were substantial deviations between the expectation of the chi-squared distribution and the means of the G2 and Read and Cressie distributions. In general, the mean of the distribution of a statistic was closer to the expectation of the chi-squared distribution when the average cell expectation was large, there were fewer indicator items, and the latent class measurement parameters were less extreme. It was found that the mean of the X2 distribution is generally closer to the expectation of the chi-squared distribution than are the means of the other two indices we examined, but the standard deviation of the X2 distribution is considerably larger than that of the other two indices and larger than the standard deviation of the chi-squared distribution. We argue that a possible solution is to forgo reliance on theoretical distributions for expectations and quantiles of goodness-of-fit statistics. Instead, Monte Carlo sampling (Noreen, 1989) can be used to arrive at an empirical central or noncentral distribution.",
author = "Collins, {Linda Marie} and Fidler, {Penny L.} and Wugalter, {Stuart E.} and Long, {Jeffrey D.}",
year = "1993",
month = "7",
day = "1",
doi = "10.1207/s15327906mbr2803_4",
language = "English (US)",
volume = "28",
pages = "375--389",
journal = "Multivariate Behavioral Research",
issn = "0027-3171",
publisher = "Psychology Press Ltd",
number = "3",
}

TY - JOUR

T1 - Goodness-of-Fit Testing for Latent Class Models

AU - Collins, Linda Marie

AU - Fidler, Penny L.

AU - Wugalter, Stuart E.

AU - Long, Jeffrey D.

PY - 1993/7/1

Y1 - 1993/7/1

AB - Latent class models with sparse contingency tables can present problems for model comparison and selection, because under these conditions the distributions of goodness-of-fit indices are often unknown. This causes inaccuracies both in hypothesis testing and in model comparisons based on normed indices. In order to assess the extent of this problem, we carried out a simulation investigating the distributions of the likelihood ratio statistic G2, the Pearson statistic X2, and a new goodness-of-fit index suggested by Read and Cressie (1988). There were substantial deviations between the expectation of the chi-squared distribution and the means of the G2 and Read and Cressie distributions. In general, the mean of the distribution of a statistic was closer to the expectation of the chi-squared distribution when the average cell expectation was large, there were fewer indicator items, and the latent class measurement parameters were less extreme. It was found that the mean of the X2 distribution is generally closer to the expectation of the chi-squared distribution than are the means of the other two indices we examined, but the standard deviation of the X2 distribution is considerably larger than that of the other two indices and larger than the standard deviation of the chi-squared distribution. We argue that a possible solution is to forgo reliance on theoretical distributions for expectations and quantiles of goodness-of-fit statistics. Instead, Monte Carlo sampling (Noreen, 1989) can be used to arrive at an empirical central or noncentral distribution.

UR - http://www.scopus.com/inward/record.url?scp=21344496087&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=21344496087&partnerID=8YFLogxK

U2 - 10.1207/s15327906mbr2803_4

DO - 10.1207/s15327906mbr2803_4

M3 - Article

AN - SCOPUS:21344496087

VL - 28

SP - 375

EP - 389

JO - Multivariate Behavioral Research

JF - Multivariate Behavioral Research

SN - 0027-3171

IS - 3

ER -