Assessing influence in variable selection problems

Christian Léger, Naomi Altman

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

Variable selection techniques are often used in combination with multiple linear regression to produce a parsimonious model that fits the data well. It is clearly undesirable for the final model to depend strongly on the inclusion of a few influential cases in the data set. This article discusses a measure of influence of single cases on the final model, based on a similar measure used in ordinary multiple regression. When variables are selected objectively, deletion of individual cases can strongly affect the choice of model. The influence of individual cases on the parameters of the selected model is often assessed as part of the model building process. However, such conditional measures fail to evaluate the influence of the cases on the variable selection process. Modern computing environments make it feasible to use an unconditional criterion to determine the influence of each case on the selection procedure. A number of examples are discussed to illustrate the differences between these approaches. Heuristics are developed to explain the examples. We conclude that, although the conditional approach gives valuable information about the selected model, the use of the unconditional approach can lead to greater insight about the influence of individual observations on the process of model selection.
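
The contrast between the conditional and unconditional approaches can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the selection method or the exact influence measure, so the code below assumes exhaustive subset selection by BIC and a Cook's-distance-style measure on the fitted values. All function names (fit_ols, predict, select_subset, influence) are hypothetical. The conditional measure deletes a case and refits the already selected model; the unconditional measure deletes a case and reruns the whole selection before refitting.

```python
import itertools
import numpy as np

def fit_ols(X, y, cols):
    """Least-squares fit of y on an intercept plus the columns in `cols`."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(X, cols, beta):
    """Fitted values for every row of X under the model (cols, beta)."""
    A = np.column_stack([np.ones(X.shape[0])] + [X[:, j] for j in cols])
    return A @ beta

def select_subset(X, y):
    """Exhaustive search for the subset minimising BIC.

    BIC is an assumed stand-in; any objective selection rule could be used.
    """
    n, k = X.shape
    best, best_bic = (), np.inf
    for size in range(k + 1):
        for cols in itertools.combinations(range(k), size):
            rss = np.sum((y - predict(X, cols, fit_ols(X, y, cols))) ** 2)
            bic = n * np.log(rss / n) + (size + 1) * np.log(n)
            if bic < best_bic:
                best, best_bic = cols, bic
    return best

def influence(X, y):
    """Return (selected columns, conditional, unconditional, model_changed)."""
    n = len(y)
    cols = select_subset(X, y)                 # model chosen on the full data
    fitted = predict(X, cols, fit_ols(X, y, cols))
    p = len(cols) + 1                          # parameters, incl. intercept
    s2 = np.sum((y - fitted) ** 2) / (n - p)   # full-data residual variance
    cond, uncond = np.empty(n), np.empty(n)
    changed = np.zeros(n, dtype=bool)
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        # Conditional: the selected variables are held fixed, only refit.
        fit_c = predict(X, cols, fit_ols(Xi, yi, cols))
        cond[i] = np.sum((fitted - fit_c) ** 2) / (p * s2)
        # Unconditional: selection is rerun without case i, then refit.
        cols_i = select_subset(Xi, yi)
        fit_u = predict(X, cols_i, fit_ols(Xi, yi, cols_i))
        uncond[i] = np.sum((fitted - fit_u) ** 2) / (p * s2)
        changed[i] = cols_i != cols
    return cols, cond, uncond, changed
```

Calling influence(X, y) on an n-by-k predictor array and a length-n response returns both measures for every case. When deleting a case leaves the selected subset unchanged, the two measures coincide; they differ only for cases whose deletion changes the chosen variables, which are the cases a purely conditional analysis cannot flag. The exhaustive search visits 2^k subsets for each of the n deletions, so this sketch is practical only for a small number of candidate variables.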

Original language: English (US)
Pages (from-to): 547-556
Number of pages: 10
Journal: Journal of the American Statistical Association
Volume: 88
Issue number: 422
State: Published - Jun 1993

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

