Goodness-of-fit testing: The thresholding approach

Min Hee Kim, Michael G. Akritas

Research output: Contribution to journal (Article)

Abstract

The classical Pearson's chi-square test for goodness-of-fit has found extensive applications in areas such as contingency tables and, recently, multiple testing. Mann and Wald [(1942), 'On the Choice of the Number of Class Intervals in the Application of the Chi Square Test', The Annals of Mathematical Statistics, 13, 306-317] were the first to establish the power advantages of letting the number $n_{\mathrm{bin}}$ of bins tend to infinity with $n$, and found $n_{\mathrm{bin}} = n^{2/5}$ to be the optimal rate. For a corresponding development in the area of contingency tables, see Holst [(1972), 'Asymptotic Normality and Efficiency for Certain Goodness-of-Fit Tests', Biometrika, 59, 137-145], Morris [(1975), 'Central Limit Theorems for Multinomial Sums', The Annals of Statistics, 3, 165-188], and Koehler and Larntz [(1980), 'An Empirical Investigation of Goodness-of-Fit Statistics for Sparse Multinomials', Journal of the American Statistical Association, 75, 336-344]. In this paper, we consider the use of thresholding methods to further improve on the power of Pearson's chi-square test. An alternative statistic, based on the cell averages, is also studied. The Fourier or wavelet transformation is used to ensure power enhancement in both high- and low-signal-to-noise ratio alternatives. Simulations suggest that application of order thresholding (Kim, M.H., and Akritas, M.G. (2010), 'Order Thresholding', The Annals of Statistics, 38, 2314-2350) achieves accurate type I error rates, and competitive power.
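As a rough illustration of the baseline the abstract starts from, the sketch below computes Pearson's chi-square statistic with a bin count that grows like n^(2/5). It is not the authors' thresholding procedure. The function names `mann_wald_bins` and `pearson_chi_square_uniform` are hypothetical, the null hypothesis is assumed to be Uniform(0, 1) so that equal-width bins are equiprobable, and the constant in front of the n^(2/5) rate is assumed to be 1, since the abstract gives only the rate.

```python
def mann_wald_bins(n: int) -> int:
    """Bin count growing at the n^(2/5) rate (constant factor assumed to be 1)."""
    return max(2, round(n ** 0.4))


def pearson_chi_square_uniform(sample, n_bin: int) -> float:
    """Pearson's statistic for H0: sample ~ Uniform(0, 1), using equal-width bins."""
    n = len(sample)
    counts = [0] * n_bin
    for x in sample:
        if not 0.0 <= x <= 1.0:
            raise ValueError("observations must lie in [0, 1]")
        # Map x to its bin; x == 1.0 is placed in the last bin.
        counts[min(int(x * n_bin), n_bin - 1)] += 1
    expected = n / n_bin  # every bin has the same expected count under H0
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    import random

    random.seed(0)
    data = [random.random() for _ in range(1000)]
    k = mann_wald_bins(len(data))
    print(k, pearson_chi_square_uniform(data, k))
```

Under the null hypothesis and for a fixed bin count, the statistic is approximately chi-square distributed with `n_bin - 1` degrees of freedom. The papers cited in the abstract study what happens when the bin count grows with n.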

Original language: English (US)
Pages (from-to): 119-138
Number of pages: 20
Journal: Journal of Nonparametric Statistics
Volume: 24
Issue number: 1
DOI: 10.1080/10485252.2011.606367
State: Published - Mar 1 2012

Fingerprint

Thresholding
Goodness of fit
Chi-squared test
Statistics
Testing
Contingency Table
Wavelet Transformation
Asymptotic Efficiency
Multiple Testing
Optimal Rates
Type I Error Rate
Fourier Transformation
Alternatives
Goodness of Fit Test
Asymptotic Normality
Central limit theorem
Statistic
Enhancement
Infinity
Tend

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

@article{795482925d3c45baba622ba7e42e85ce,
title = "Goodness-of-fit testing: The thresholding approach",
abstract = "The classical Pearson's chi-square test for goodness-of-fit has found extensive applications in areas such as contingency tables and, recently, multiple testing. Mann and Wald [(1942), 'On the Choice of the Number of Class Intervals in the Application of the Chi Square Test', The Annals of Mathematical Statistics, 13, 306-317] were the first to establish the power advantages of letting the number $n_{\mathrm{bin}}$ of bins tend to infinity with $n$, and found $n_{\mathrm{bin}} = n^{2/5}$ to be the optimal rate. For a corresponding development in the area of contingency tables, see Holst [(1972), 'Asymptotic Normality and Efficiency for Certain Goodness-of-Fit Tests', Biometrika, 59, 137-145], Morris [(1975), 'Central Limit Theorems for Multinomial Sums', The Annals of Statistics, 3, 165-188], and Koehler and Larntz [(1980), 'An Empirical Investigation of Goodness-of-Fit Statistics for Sparse Multinomials', Journal of the American Statistical Association, 75, 336-344]. In this paper, we consider the use of thresholding methods to further improve on the power of Pearson's chi-square test. An alternative statistic, based on the cell averages, is also studied. The Fourier or wavelet transformation is used to ensure power enhancement in both high- and low-signal-to-noise ratio alternatives. Simulations suggest that application of order thresholding (Kim, M.H., and Akritas, M.G. (2010), 'Order Thresholding', The Annals of Statistics, 38, 2314-2350) achieves accurate type I error rates, and competitive power.",
author = "Kim, {Min Hee} and Akritas, {Michael G.}",
year = "2012",
month = "3",
day = "1",
doi = "10.1080/10485252.2011.606367",
language = "English (US)",
volume = "24",
pages = "119--138",
journal = "Journal of Nonparametric Statistics",
issn = "1048-5252",
publisher = "Taylor and Francis Ltd.",
number = "1",

}

Goodness-of-fit testing: The thresholding approach. / Kim, Min Hee; Akritas, Michael G.

In: Journal of Nonparametric Statistics, Vol. 24, No. 1, 01.03.2012, p. 119-138.

TY - JOUR

T1 - Goodness-of-fit testing

T2 - The thresholding approach

AU - Kim, Min Hee

AU - Akritas, Michael G.

PY - 2012/3/1

Y1 - 2012/3/1

N2 - The classical Pearson's chi-square test for goodness-of-fit has found extensive applications in areas such as contingency tables and, recently, multiple testing. Mann and Wald [(1942), 'On the Choice of the Number of Class Intervals in the Application of the Chi Square Test', The Annals of Mathematical Statistics, 13, 306-317] were the first to establish the power advantages of letting the number $n_{\mathrm{bin}}$ of bins tend to infinity with $n$, and found $n_{\mathrm{bin}} = n^{2/5}$ to be the optimal rate. For a corresponding development in the area of contingency tables, see Holst [(1972), 'Asymptotic Normality and Efficiency for Certain Goodness-of-Fit Tests', Biometrika, 59, 137-145], Morris [(1975), 'Central Limit Theorems for Multinomial Sums', The Annals of Statistics, 3, 165-188], and Koehler and Larntz [(1980), 'An Empirical Investigation of Goodness-of-Fit Statistics for Sparse Multinomials', Journal of the American Statistical Association, 75, 336-344]. In this paper, we consider the use of thresholding methods to further improve on the power of Pearson's chi-square test. An alternative statistic, based on the cell averages, is also studied. The Fourier or wavelet transformation is used to ensure power enhancement in both high- and low-signal-to-noise ratio alternatives. Simulations suggest that application of order thresholding (Kim, M.H., and Akritas, M.G. (2010), 'Order Thresholding', The Annals of Statistics, 38, 2314-2350) achieves accurate type I error rates, and competitive power.

AB - The classical Pearson's chi-square test for goodness-of-fit has found extensive applications in areas such as contingency tables and, recently, multiple testing. Mann and Wald [(1942), 'On the Choice of the Number of Class Intervals in the Application of the Chi Square Test', The Annals of Mathematical Statistics, 13, 306-317] were the first to establish the power advantages of letting the number $n_{\mathrm{bin}}$ of bins tend to infinity with $n$, and found $n_{\mathrm{bin}} = n^{2/5}$ to be the optimal rate. For a corresponding development in the area of contingency tables, see Holst [(1972), 'Asymptotic Normality and Efficiency for Certain Goodness-of-Fit Tests', Biometrika, 59, 137-145], Morris [(1975), 'Central Limit Theorems for Multinomial Sums', The Annals of Statistics, 3, 165-188], and Koehler and Larntz [(1980), 'An Empirical Investigation of Goodness-of-Fit Statistics for Sparse Multinomials', Journal of the American Statistical Association, 75, 336-344]. In this paper, we consider the use of thresholding methods to further improve on the power of Pearson's chi-square test. An alternative statistic, based on the cell averages, is also studied. The Fourier or wavelet transformation is used to ensure power enhancement in both high- and low-signal-to-noise ratio alternatives. Simulations suggest that application of order thresholding (Kim, M.H., and Akritas, M.G. (2010), 'Order Thresholding', The Annals of Statistics, 38, 2314-2350) achieves accurate type I error rates, and competitive power.

UR - http://www.scopus.com/inward/record.url?scp=84863127918&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863127918&partnerID=8YFLogxK

U2 - 10.1080/10485252.2011.606367

DO - 10.1080/10485252.2011.606367

M3 - Article

AN - SCOPUS:84863127918

VL - 24

SP - 119

EP - 138

JO - Journal of Nonparametric Statistics

JF - Journal of Nonparametric Statistics

SN - 1048-5252

IS - 1

ER -