Statistical models for longitudinal zero-inflated count data with applications to the substance abuse field

Anne Buu, Runze Li, Xianming Tan, Robert A. Zucker

Research output: Contribution to journal › Article

38 Citations (Scopus)

Abstract

This study fills in the current knowledge gaps in statistical analysis of longitudinal zero-inflated count data by providing a comprehensive review and comparison of the hurdle and zero-inflated Poisson models in terms of the conceptual framework, computational advantage, and performance under different real data situations. The design of simulations represents the special features of a well-known longitudinal study of alcoholism so that the results can be generalizable to the substance abuse field. When the hurdle model is more natural under the conceptual framework of the data, the zero-inflated Poisson model tends to produce inaccurate estimates. Model performance improves with larger sample sizes, lower proportions of missing data, and lower correlations between covariates. The simulation also shows that the computational strength of the hurdle model disappears when random effects are included.
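The two model families compared in the abstract can be illustrated with their standard textbook probability mass functions. The sketch below is not the authors' implementation: it omits covariates and random effects, and the names `zip_pmf`, `hurdle_pmf`, `pi`, `p0`, and `lam` are assumptions introduced here for illustration. In the zero-inflated Poisson (ZIP) model, zeros come from a mixture of a structural-zero class and an ordinary Poisson count; in the hurdle model, all zeros come from one binary process and positive counts follow a zero-truncated Poisson.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi the count is a structural
    zero; otherwise it is drawn from Poisson(lam), which can also be zero."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, p0, lam):
    """Hurdle Poisson: a zero occurs with probability p0; once the hurdle is
    crossed, the count follows a Poisson(lam) truncated at zero."""
    if k == 0:
        return p0
    return (1.0 - p0) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    pi, lam = 0.3, 2.0            # illustrative values, not from the paper
    p0 = zip_pmf(0, pi, lam)      # match the hurdle zero probability to the ZIP
    for k in range(5):
        print(k, round(zip_pmf(k, pi, lam), 4), round(hurdle_pmf(k, p0, lam), 4))
```

Without covariates, matching the zero probability in this way makes the two distributions coincide, since 1 - p0 = (1 - pi)(1 - exp(-lam)). The models differ in how zeros are attributed and in how covariates enter each component, which is where the choice of conceptual framework described in the abstract matters.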

Original language: English (US)
Pages (from-to): 4074-4086
Number of pages: 13
Journal: Statistics in Medicine
Volume: 31
Issue number: 29
DOI: 10.1002/sim.5510
State: Published - Dec 20 2012

Fingerprint

Count Data
Statistical Models
Sample Size
Alcoholism
Statistical Model
Substance-Related Disorders
Longitudinal Studies
Poisson Model
Zero
Longitudinal Study
Performance Model
Inaccurate
Random Effects
Missing Data
Statistical Analysis
Covariates
Simulation
Proportion
Tend
Model

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Statistics and Probability

Cite this

Buu, Anne; Li, Runze; Tan, Xianming; Zucker, Robert A. / Statistical models for longitudinal zero-inflated count data with applications to the substance abuse field. In: Statistics in Medicine. 2012; Vol. 31, No. 29. pp. 4074-4086.
@article{1f264f8e6cac40eaa914e96ab64d8090,
title = "Statistical models for longitudinal zero-inflated count data with applications to the substance abuse field",
abstract = "This study fills in the current knowledge gaps in statistical analysis of longitudinal zero-inflated count data by providing a comprehensive review and comparison of the hurdle and zero-inflated Poisson models in terms of the conceptual framework, computational advantage, and performance under different real data situations. The design of simulations represents the special features of a well-known longitudinal study of alcoholism so that the results can be generalizable to the substance abuse field. When the hurdle model is more natural under the conceptual framework of the data, the zero-inflated Poisson model tends to produce inaccurate estimates. Model performance improves with larger sample sizes, lower proportions of missing data, and lower correlations between covariates. The simulation also shows that the computational strength of the hurdle model disappears when random effects are included.",
author = "Buu, Anne and Li, Runze and Tan, Xianming and Zucker, {Robert A.}",
year = "2012",
month = "12",
day = "20",
doi = "10.1002/sim.5510",
language = "English (US)",
volume = "31",
pages = "4074--4086",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "29",

}

Statistical models for longitudinal zero-inflated count data with applications to the substance abuse field. / Buu, Anne; Li, Runze; Tan, Xianming; Zucker, Robert A.

In: Statistics in Medicine, Vol. 31, No. 29, 20.12.2012, p. 4074-4086.

TY - JOUR

T1 - Statistical models for longitudinal zero-inflated count data with applications to the substance abuse field

AU - Buu, Anne

AU - Li, Runze

AU - Tan, Xianming

AU - Zucker, Robert A.

PY - 2012/12/20

Y1 - 2012/12/20

AB - This study fills in the current knowledge gaps in statistical analysis of longitudinal zero-inflated count data by providing a comprehensive review and comparison of the hurdle and zero-inflated Poisson models in terms of the conceptual framework, computational advantage, and performance under different real data situations. The design of simulations represents the special features of a well-known longitudinal study of alcoholism so that the results can be generalizable to the substance abuse field. When the hurdle model is more natural under the conceptual framework of the data, the zero-inflated Poisson model tends to produce inaccurate estimates. Model performance improves with larger sample sizes, lower proportions of missing data, and lower correlations between covariates. The simulation also shows that the computational strength of the hurdle model disappears when random effects are included.

UR - http://www.scopus.com/inward/record.url?scp=84870057268&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84870057268&partnerID=8YFLogxK

U2 - 10.1002/sim.5510

DO - 10.1002/sim.5510

M3 - Article

C2 - 22826194

AN - SCOPUS:84870057268

VL - 31

SP - 4074

EP - 4086

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 29

ER -