Latent class models with sparse contingency tables can present problems for model comparison and selection, because under these conditions the distributions of goodness-of-fit indices are often unknown. This causes inaccuracies both in hypothesis testing and in model comparisons based on normed indices. To assess the extent of this problem, we carried out a simulation investigating the distributions of the likelihood ratio statistic G², the Pearson statistic X², and a new goodness-of-fit index suggested by Read and Cressie (1988). The means of the G² and Read and Cressie distributions deviated substantially from the expectation of the chi-squared distribution. In general, the mean of a statistic's distribution was closer to the expectation of the chi-squared distribution when the average cell expectation was large, there were fewer indicator items, and the latent class measurement parameters were less extreme. The mean of the X² distribution was generally closer to the expectation of the chi-squared distribution than were the means of the other two indices, but the standard deviation of the X² distribution was considerably larger than that of the other two indices and larger than the standard deviation of the chi-squared distribution. We argue that a possible solution is to forgo reliance on theoretical distributions for expectations and quantiles of goodness-of-fit statistics. Instead, Monte Carlo sampling (Noreen, 1989) can be used to arrive at an empirical central or noncentral distribution.
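The Monte Carlo idea can be illustrated with a minimal sketch. It is not the authors' simulation design. It assumes binary indicator items and a latent class model whose parameters are treated as known, so no model is refit to each simulated table; a full parametric bootstrap would re-estimate the model on every replicate. The function names, the parameter values, the sample size, and the Read and Cressie power parameter of 2/3 are all illustrative assumptions.

```python
import itertools
import numpy as np


def cell_probabilities(class_probs, item_probs):
    """Cell probabilities of the 2^J table for a latent class model.

    class_probs: shape (C,), latent class proportions.
    item_probs:  shape (C, J), P(item = 1 | class) for binary items.
    """
    n_items = item_probs.shape[1]
    # All 2^J response patterns, one row per cell of the table.
    patterns = np.array(list(itertools.product([0, 1], repeat=n_items)))
    # P(pattern | class) under local independence, shape (C, 2^J).
    cond = np.prod(
        np.where(patterns[None, :, :] == 1,
                 item_probs[:, None, :],
                 1.0 - item_probs[:, None, :]),
        axis=2,
    )
    return class_probs @ cond


def fit_statistics(observed, expected, lam=2.0 / 3.0):
    """Return (G2, X2, power-divergence statistic) for one table."""
    pos = observed > 0  # empty cells contribute 0 to G2 and the power divergence
    ratio = observed[pos] / expected[pos]
    g2 = 2.0 * np.sum(observed[pos] * np.log(ratio))
    x2 = np.sum((observed - expected) ** 2 / expected)
    rc = 2.0 / (lam * (lam + 1.0)) * np.sum(observed[pos] * (ratio ** lam - 1.0))
    return g2, x2, rc


def monte_carlo_reference(cell_probs, n, n_reps, seed):
    """Simulate tables from the model and collect the three statistics."""
    rng = np.random.default_rng(seed)
    expected = n * cell_probs
    tables = rng.multinomial(n, cell_probs, size=n_reps)
    return np.array([fit_statistics(t, expected) for t in tables])


# Hypothetical setup: two classes, six items, fairly extreme item
# probabilities, and a small sample, giving a sparse 64-cell table.
class_probs = np.array([0.5, 0.5])
item_probs = np.array([[0.9] * 6, [0.1] * 6])
probs = cell_probabilities(class_probs, item_probs)
stats = monte_carlo_reference(probs, n=100, n_reps=2000, seed=0)

df = probs.size - 1  # degrees of freedom when no parameters are estimated
empirical_means = stats.mean(axis=0)           # compare with df
empirical_sds = stats.std(axis=0, ddof=1)      # compare with sqrt(2 * df)
empirical_crit = np.quantile(stats, 0.95, axis=0)  # empirical 5% critical values
print("df:", df)
print("means (G2, X2, RC):", empirical_means)
print("SDs (G2, X2, RC):", empirical_sds)
print("95th percentiles (G2, X2, RC):", empirical_crit)
```

In this sketch an observed statistic would be compared with the empirical quantiles in `empirical_crit` rather than with a chi-squared critical value. Simulating from a different, data-generating model while computing expectations under the fitted one would give the corresponding empirical noncentral distribution.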
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Experimental and Cognitive Psychology
- Arts and Humanities (miscellaneous)