An Optimal Semiparametric Method for Two-group Classification

Seungchul Baek, Osamu Komori, Yanyuan Ma

Research output: Contribution to journal › Article › peer-review

Abstract

In classical discriminant analysis, when the two groups are assumed to follow multivariate normal distributions with equal variance–covariance matrices, the classical linear discriminant function is optimal in the sense that it maximizes the standardized difference between the two group means. However, in a typical case-control study, the distributional assumption for the case group often needs to be relaxed in practice. Komori et al. (Generalized t-statistic for two-group classification. Biometrics 2015, 71: 404–416) proposed the generalized t-statistic to obtain a linear discriminant function that allows for heterogeneity of the case group. Their procedure has an optimality property within the class under consideration. We study the problem further and show that additional improvement is achievable. The approach we propose does not require a parametric distributional assumption on the case group. We further show that the new estimator is efficient, in that no further improvement in the efficiency of estimating the linear discriminant function is possible. We conduct simulation studies and analyze real data examples to illustrate the finite sample performance of the method and the gain it produces in comparison with existing methods.
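As a rough illustration of the classical baseline mentioned in the abstract (not the semiparametric estimator proposed in the paper, nor the generalized t-statistic of Komori et al.), the sketch below computes the textbook linear discriminant direction, the inverse pooled covariance applied to the difference in group means, and the standardized difference it attains. The function names, the two-feature restriction, and the simulated data are all assumptions made for the example.

```python
import random

def mean_vec(rows):
    """Column means of a list of 2-feature observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, m):
    """2x2 sum of outer products of the centered observations."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def lda_direction(controls, cases):
    """Classical direction: inverse pooled covariance times mean difference."""
    m0, m1 = mean_vec(controls), mean_vec(cases)
    s0, s1 = scatter(controls, m0), scatter(cases, m1)
    dof = len(controls) + len(cases) - 2
    pooled = [[(s0[i][j] + s1[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    beta = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return beta, pooled, diff

def standardized_difference(beta, pooled, diff):
    """Mean difference of the projected scores divided by their pooled SD."""
    num = beta[0] * diff[0] + beta[1] * diff[1]
    var = sum(beta[i] * pooled[i][j] * beta[j]
              for i in range(2) for j in range(2))
    return num / var ** 0.5

# Simulated data (an assumption for illustration): both groups are normal
# with identity covariance, and the case group has a shifted mean.
rng = random.Random(0)
controls = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
cases = [[rng.gauss(1.0, 1.0), rng.gauss(0.5, 1.0)] for _ in range(200)]

beta, pooled, diff = lda_direction(controls, cases)
best = standardized_difference(beta, pooled, diff)
```

For a fixed pooled covariance and mean difference, any other direction passed to `standardized_difference` should give a value no larger than `best`. This is the sense in which the classical rule is optimal under the equal-covariance normal model.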

Original language: English (US)
Pages (from-to): 806-846
Number of pages: 41
Journal: Scandinavian Journal of Statistics
Volume: 45
Issue number: 3
State: Published - Sep 2018

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
