Analysis with missing data in drug prevention research.

J. W. Graham, S. M. Hofer, A. M. Piccinin

Research output: Contribution to journal › Review article

89 Citations (Scopus)

Abstract

Missing data problems have been a thorn in the side of prevention researchers for years. Although some solutions for these problems have been available in the statistical literature, these solutions have not found their way into mainstream prevention research. This chapter is meant to serve as an introduction to the systematic application of the missing data analysis solutions presented recently by Little and Rubin (1987) and others. The chapter does not describe a complete strategy, but it is relevant for (1) missing data analysis with continuous (but not categorical) data, (2) data that are reasonably normally distributed, and (3) solutions for missing data problems for analyses related to the general linear model, in particular analyses that use (or can use) a covariance matrix as input. The examples in the chapter come from drug prevention research. The chapter discusses (1) the problem of wanting to ask respondents more questions than most individuals can answer; (2) the problem of attrition and some solutions; and (3) the problem of special measurement procedures that are too expensive or time-consuming to obtain for all subjects.
The authors end with several conclusions: (1) whenever possible, researchers should use the Expectation-Maximization (EM) algorithm (or another maximum likelihood procedure, including the multiple-group structural equation modeling procedure, or, where appropriate, multiple imputation) for analyses involving missing data (the chapter provides concrete examples); (2) if researchers must use other analyses, they should keep in mind that those analyses produce biased results and should not be relied upon for final analyses; (3) when data are missing, the appropriate missing data analysis procedures do not generate something out of nothing but do make the most of the data available; (4) when data are missing, researchers should work hard (especially when planning a study) to find the cause of missingness and include that cause in the analysis models; and (5) researchers should sample the cases originally missing (whenever possible) and adjust the EM algorithm parameter estimates accordingly.
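
The abstract recommends the EM algorithm for normally distributed continuous data when the downstream analysis takes a covariance matrix as input. As a rough illustration of that idea, and not the chapter's own software or worked examples, the following minimal sketch estimates a mean vector and covariance matrix from data with missing values. It assumes multivariate normality and missingness that is ignorable given the observed variables; the function name `em_mvn` and its parameters are hypothetical.

```python
import numpy as np


def em_mvn(X, n_iter=200, tol=1e-8):
    """EM estimates of the mean and covariance of multivariate-normal data.

    Missing entries of X are coded as NaN. Hypothetical helper for
    illustration only; assumes each variable is observed at least once.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Starting values: available-case means and a diagonal covariance.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            x = X[i].copy()
            c = np.zeros((p, p))  # conditional covariance of the missing part
            if m.any():
                if o.any():
                    s_oo = sigma[np.ix_(o, o)]
                    s_mo = sigma[np.ix_(m, o)]
                    # Regression weights of missing on observed variables.
                    b = np.linalg.solve(s_oo, s_mo.T).T
                    # E-step: conditional mean and covariance of missing values.
                    x[m] = mu[m] + b @ (x[o] - mu[o])
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)] - b @ s_mo.T
                else:
                    # Nothing observed in this row: fall back to current estimates.
                    x[m] = mu[m]
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)]
            sum_x += x
            sum_xx += np.outer(x, x) + c
        # M-step: update the parameters from the expected sufficient statistics.
        new_mu = sum_x / n
        new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma


if __name__ == "__main__":
    # Simulated two-variable data with attrition on the second variable.
    rng = np.random.default_rng(0)
    data = rng.multivariate_normal([0.0, 1.0], [[1.0, 0.6], [0.6, 1.0]], size=200)
    data[:60, 1] = np.nan
    mean_hat, cov_hat = em_mvn(data)
    print(mean_hat)
    print(cov_hat)
```

The returned covariance matrix could then be passed to any procedure that accepts a covariance matrix as input. Standard errors from such a procedure would need separate treatment, because the sketch returns point estimates only.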

Original language: English (US)
Pages (from-to): 13-63
Number of pages: 51
Journal: NIDA Research Monograph Series
Volume: 142
State: Published - 1994

Fingerprint

Research Personnel
Research
Pharmaceutical Preparations
Linear Models

All Science Journal Classification (ASJC) codes

  • Medicine (miscellaneous)

Cite this

Graham, J. W., Hofer, S. M., & Piccinin, A. M. (1994). Analysis with missing data in drug prevention research. NIDA Research Monograph Series, 142, 13-63.
@article{41cff7e168df4059880eb006cd179a6a,
title = "Analysis with missing data in drug prevention research.",
author = "Graham, {J. W.} and Hofer, {S. M.} and Piccinin, {A. M.}",
year = "1994",
language = "English (US)",
volume = "142",
pages = "13--63",
journal = "NIDA Research Monograph Series",
issn = "1046-9516",
publisher = "National Institute on Drug Abuse",

}

TY - JOUR

T1 - Analysis with missing data in drug prevention research.

AU - Graham, J. W.

AU - Hofer, S. M.

AU - Piccinin, A. M.

PY - 1994

Y1 - 1994

UR - http://www.scopus.com/inward/record.url?scp=0028671841&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0028671841&partnerID=8YFLogxK

M3 - Review article

C2 - 9243532

AN - SCOPUS:0028671841

VL - 142

SP - 13

EP - 63

JO - NIDA Research Monograph Series

JF - NIDA Research Monograph Series

SN - 1046-9516

ER -