Dimension reduction and estimation in the secondary analysis of case-control studies

Liang Liang, Raymond Carroll, Yanyuan Ma

Research output: Contribution to journal (article)

Abstract

Studying the relationship between covariates based on retrospective data is the main purpose of secondary analysis, an area of increasing interest. We examine the secondary analysis problem when multiple covariates are available, while only a regression mean model is specified. Despite the completely parametric modeling of the regression mean function, the case-control nature of the data requires special treatment, and semiparametric efficient estimation generates various nonparametric estimation problems with multivariate covariates. We devise a dimension reduction approach that fits with the specified primary and secondary models in the original problem setting, and use reweighting to adjust for the case-control nature of the data, even when the disease rate in the source population is unknown. The resulting estimator is both locally efficient and robust against the misspecification of the regression error distribution, which can be heteroscedastic as well as non-Gaussian. We demonstrate the advantage of our method over several existing methods, both analytically and numerically.
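
The reweighting idea mentioned in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' estimator: it assumes the population disease rate is known (the paper also covers the unknown case), uses plain inverse-probability weights with a single covariate and a linear mean model, and every name and parameter value in it is hypothetical. It only shows why a regression fitted to pooled case-control data can be biased and how weighting back to the source population can reduce that bias.

```python
import math
import random

# Hypothetical toy setup (not taken from the paper):
# secondary model: Y = BETA0 + BETA1 * X + error
# primary model:   logit P(D = 1 | X, Y) = ALPHA0 + X + Y
BETA0, BETA1 = 0.0, 0.5
ALPHA0 = -5.0


def simulate_population(n, rng):
    """Draw (x, y, d) triples from the assumed source population."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = BETA0 + BETA1 * x + rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(ALPHA0 + x + y)))
        d = 1 if rng.random() < p else 0
        rows.append((x, y, d))
    return rows


def weighted_slope(rows, weights):
    """Weighted least-squares slope of y on x (model with intercept)."""
    sw = sum(weights)
    xbar = sum(w * r[0] for w, r in zip(weights, rows)) / sw
    ybar = sum(w * r[1] for w, r in zip(weights, rows)) / sw
    sxy = sum(w * (r[0] - xbar) * (r[1] - ybar) for w, r in zip(weights, rows))
    sxx = sum(w * (r[0] - xbar) ** 2 for w, r in zip(weights, rows))
    return sxy / sxx


rng = random.Random(2018)
population = simulate_population(200_000, rng)
cases = [r for r in population if r[2] == 1]
controls = [r for r in population if r[2] == 0]

# Assumption: the disease rate is treated as known in this sketch.
disease_rate = len(cases) / len(population)

# Case-control sampling: cases are heavily over-represented.
n_cases = min(2000, len(cases))
n_controls = min(2000, len(controls))
sample = rng.sample(cases, n_cases) + rng.sample(controls, n_controls)
n = len(sample)

# Naive fit: ignores the sampling design.
naive = weighted_slope(sample, [1.0] * n)

# Reweighted fit: each subject is weighted by
# (population share of its disease group) / (sample share of that group).
w_case = disease_rate / (n_cases / n)
w_control = (1.0 - disease_rate) / (n_controls / n)
weights = [w_case if r[2] == 1 else w_control for r in sample]
weighted = weighted_slope(sample, weights)

print(f"true slope     : {BETA1}")
print(f"naive slope    : {naive:.3f}")
print(f"weighted slope : {weighted:.3f}")
```

In this setup the weights make the case-control sample mimic the source population, so the weighted slope should land closer to the true value than the naive one. Such simple weighting discards most of the information carried by the cases, which is one motivation for the more efficient estimators the abstract describes.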

Original language: English (US)
Pages (from-to): 1782-1821
Number of pages: 40
Journal: Electronic Journal of Statistics
Volume: 12
Issue number: 1
DOI: 10.1214/18-EJS1446
State: Published - Jan 1 2018

Fingerprint

Case-control Study
Dimension Reduction
Covariates
Regression
Parametric Modeling
Semiparametric Estimation
Efficient Estimation
Misspecification
Nonparametric Estimation
Estimator

All Science Journal Classification (ASJC) codes

  • Statistics and Probability

Cite this

@article{6af9e71f41ac4bf9a30bb7326d15ef03,
title = "Dimension reduction and estimation in the secondary analysis of case-control studies",
author = "Liang Liang and Raymond Carroll and Yanyuan Ma",
year = "2018",
month = "1",
day = "1",
doi = "10.1214/18-EJS1446",
language = "English (US)",
volume = "12",
pages = "1782--1821",
journal = "Electronic Journal of Statistics",
issn = "1935-7524",
publisher = "Institute of Mathematical Statistics",
number = "1",

}

Dimension reduction and estimation in the secondary analysis of case-control studies. / Liang, Liang; Carroll, Raymond; Ma, Yanyuan.

In: Electronic Journal of Statistics, Vol. 12, No. 1, 01.01.2018, p. 1782-1821.

TY - JOUR

T1 - Dimension reduction and estimation in the secondary analysis of case-control studies

AU - Liang, Liang

AU - Carroll, Raymond

AU - Ma, Yanyuan

PY - 2018/1/1

Y1 - 2018/1/1

UR - http://www.scopus.com/inward/record.url?scp=85048477782&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048477782&partnerID=8YFLogxK

U2 - 10.1214/18-EJS1446

DO - 10.1214/18-EJS1446

M3 - Article

C2 - 30100949

AN - SCOPUS:85048477782

VL - 12

SP - 1782

EP - 1821

JO - Electronic Journal of Statistics

JF - Electronic Journal of Statistics

SN - 1935-7524

IS - 1

ER -