Feature Screening for Ultrahigh Dimensional Categorical Data With Applications

Danyang Huang, Runze Li, Hansheng Wang

Research output: Contribution to journal (Article)

20 Scopus citations

Abstract

Ultrahigh dimensional data with both categorical responses and categorical covariates are frequently encountered in the analysis of big data, for which feature screening has become an indispensable statistical tool. We propose a Pearson chi-square based feature screening procedure for a categorical response with ultrahigh dimensional categorical covariates. The proposed procedure can be applied directly to detect important interaction effects. We further show that the proposed procedure possesses the screening consistency property in the terminology of Fan and Lv (2008). We investigate the finite sample performance of the proposed procedure through Monte Carlo simulation studies and illustrate the method with two empirical datasets.
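The screening idea in the abstract can be illustrated with a minimal sketch, assuming the marginal utility of each covariate is the Pearson chi-square statistic of its contingency table with the response, divided by the sample size, and that the covariates with the largest values are retained. The function names `chi2_utility` and `screen`, the cutoff `d`, and the simulated data are hypothetical and are not taken from the article; the exact statistic and model-size rule used by the authors may differ.

```python
import numpy as np


def chi2_utility(x, y):
    """Pearson chi-square statistic divided by n for two categorical vectors."""
    # Recode arbitrary category labels as integers 0..K-1.
    x_codes = np.unique(np.asarray(x).ravel(), return_inverse=True)[1]
    y_codes = np.unique(np.asarray(y).ravel(), return_inverse=True)[1]
    n = len(y_codes)

    # Build the contingency table of joint counts.
    table = np.zeros((x_codes.max() + 1, y_codes.max() + 1))
    np.add.at(table, (x_codes, y_codes), 1)

    # Compare joint proportions with the product of the marginals.
    joint = table / n
    expected = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    return float(((joint - expected) ** 2 / expected).sum())


def screen(X, y, d):
    """Rank the columns of X by chi2_utility and return the top-d indices."""
    stats = np.array([chi2_utility(X[:, k], y) for k in range(X.shape[1])])
    keep = np.argsort(-stats)[:d]  # indices sorted by decreasing utility
    return keep, stats


# Hypothetical usage on simulated data (assumption: 3-level covariates,
# binary response, and column 0 made fully informative on purpose).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
X = rng.integers(0, 3, size=(500, 200))
X[:, 0] = y
keep, stats = screen(X, y, d=5)
print(keep)
```

Because each covariate is scored marginally, the cost grows linearly in the number of covariates, which is what makes this kind of screening practical when the dimension far exceeds the sample size.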

Original language: English (US)
Pages (from-to): 237-244
Number of pages: 8
Journal: Journal of Business and Economic Statistics
Volume: 32
Issue number: 2
State: Published - Apr 3 2014

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Social Sciences (miscellaneous)
  • Economics and Econometrics
  • Statistics, Probability and Uncertainty
