An application of conditional logistic regression and multifactor dimensionality reduction for detecting gene-gene interactions on risk of myocardial infarction

The importance of model validation

Christopher S. Coffey, Patricia R. Hebert, Marylyn Deriggi Ritchie, Harlan M. Krumholz, J. Michael Gaziano, Paul M. Ridker, Nancy J. Brown, Douglas E. Vaughan, Jason H. Moore

Research output: Contribution to journal › Article

109 Citations (Scopus)

Abstract

Background: To examine interactions among the angiotensin converting enzyme (ACE) insertion/deletion, plasminogen activator inhibitor-1 (PAI-1) 4G/5G, and tissue plasminogen activator (t-PA) insertion/deletion gene polymorphisms on risk of myocardial infarction using data from 343 matched case-control pairs from the Physicians' Health Study. We examined the data using both conditional logistic regression and the multifactor dimensionality reduction (MDR) method. One advantage of the MDR method is that it provides an internal prediction error for validation. We summarize our use of this internal prediction error for model validation. Results: The overall results for the two methods were consistent, with both suggesting an interaction between the ACE I/D and PAI-1 4G/5G polymorphisms. However, using ten-fold cross-validation, the 46% prediction error for the final MDR model was not significantly lower than that expected by chance. Conclusions: The significant interaction initially observed does not validate and may represent a type I error. As data-driven analytic methods continue to be developed and used to examine complex genetic interactions, it will become increasingly important to stress model validation in order to ensure that significant effects represent true relationships rather than chance findings.
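To make the idea of an internal prediction error concrete, the following is a minimal sketch of MDR-style classification with k-fold cross-validation. It is not the authors' implementation. The function names (`mdr_fit`, `mdr_predict`, `cv_prediction_error`), the rule that a genotype cell is high-risk when its case:control ratio meets or exceeds the overall ratio in the training data, and the toy data are all assumptions made for illustration. The sketch also ignores the matched-pair design of the study.

```python
import random
from collections import defaultdict


def mdr_fit(genotypes, labels):
    """Return the set of multilocus genotype cells labelled high-risk.

    Assumption: a cell is high-risk when its case:control ratio meets or
    exceeds the overall case:control ratio of the training data.
    """
    cases = defaultdict(int)
    controls = defaultdict(int)
    for g, y in zip(genotypes, labels):
        if y == 1:
            cases[g] += 1
        else:
            controls[g] += 1
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    threshold = n_cases / n_controls if n_controls else 1.0
    # Cross-multiplied comparison avoids dividing by a zero control count.
    return {g for g in set(cases) | set(controls)
            if cases[g] >= threshold * controls[g]}


def mdr_predict(high_risk, genotype):
    """Predict case (1) for high-risk cells; unseen cells default to control (0)."""
    return 1 if genotype in high_risk else 0


def cv_prediction_error(genotypes, labels, k=10, seed=0):
    """Average misclassification rate over k held-out folds."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = mdr_fit([genotypes[i] for i in train],
                        [labels[i] for i in train])
        wrong = sum(mdr_predict(model, genotypes[i]) != labels[i] for i in fold)
        errors.append(wrong / len(fold))
    return sum(errors) / k


# Toy, perfectly separable data (hypothetical, not from the study).
toy_genotypes = [(0, 0)] * 20 + [(1, 1)] * 20
toy_labels = [1] * 20 + [0] * 20
toy_error = cv_prediction_error(toy_genotypes, toy_labels, k=10)
```

With balanced cases and controls, a model with no predictive value would be expected to misclassify about half of the held-out subjects, which is why a cross-validated error near 50% gives little support to a candidate interaction.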

Original language: English (US)
Article number: 49
Journal: BMC Bioinformatics
Volume: 5
DOI: 10.1186/1471-2105-5-49
State: Published - Apr 30 2004

Fingerprint

Multifactor Dimensionality Reduction
Conditional Logistic Regression
Myocardial Infarction
Model Validation
Dimensionality Reduction
Logistics
Prediction Error
Genes
Logistic Models
Angiotensin
Gene
Plasminogen Activator Inhibitor 1
Peptidyl-Dipeptidase A
Polymorphism
Reduction Method
Interaction
Deletion
Inhibitor
Insertion
Enzymes

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Coffey, Christopher S. ; Hebert, Patricia R. ; Ritchie, Marylyn Deriggi ; Krumholz, Harlan M. ; Gaziano, J. Michael ; Ridker, Paul M. ; Brown, Nancy J. ; Vaughan, Douglas E. ; Moore, Jason H. / An application of conditional logistic regression and multifactor dimensionality reduction for detecting gene-gene interactions on risk of myocardial infarction : The importance of model validation. In: BMC bioinformatics. 2004 ; Vol. 5.
@article{477b4376e7854a5bac781205fa09cf19,
title = "An application of conditional logistic regression and multifactor dimensionality reduction for detecting gene-gene interactions on risk of myocardial infarction: The importance of model validation",
abstract = "Background: To examine interactions among the angiotensin converting enzyme (ACE) insertion/deletion, plasminogen activator inhibitor-1 (PAI-1) 4G/5G, and tissue plasminogen activator (t-PA) insertion/deletion gene polymorphisms on risk of myocardial infarction using data from 343 matched case-control pairs from the Physicians Health Study. We examined the data using both conditional logistic regression and the multifactor dimensionality reduction (MDR) method. One advantage of the MDR method is that it provides an internal prediction error for validation. We summarize our use of this internal prediction error for model validation. Results: The overall results for the two methods were consistent, with both suggesting an interaction between the ACE I/D and PAI-1 4G/5G polymorphisms. However, using ten-fold cross validation, the 46{\%} prediction error for the final MDR model was not significantly lower than that expected by chance. Conclusions: The significant interaction initially observed does not validate and may represent a type I error. As data-driven analytic methods continue to be developed and used to examine complex genetic interactions, it will become increasingly important to stress model validation in order to ensure that significant effects represent true relationships rather than chance findings.",
author = "Coffey, {Christopher S.} and Hebert, {Patricia R.} and Ritchie, {Marylyn Deriggi} and Krumholz, {Harlan M.} and Gaziano, {J. Michael} and Ridker, {Paul M.} and Brown, {Nancy J.} and Vaughan, {Douglas E.} and Moore, {Jason H.}",
year = "2004",
month = "4",
day = "30",
doi = "10.1186/1471-2105-5-49",
language = "English (US)",
volume = "5",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - An application of conditional logistic regression and multifactor dimensionality reduction for detecting gene-gene interactions on risk of myocardial infarction

T2 - The importance of model validation

AU - Coffey, Christopher S.

AU - Hebert, Patricia R.

AU - Ritchie, Marylyn Deriggi

AU - Krumholz, Harlan M.

AU - Gaziano, J. Michael

AU - Ridker, Paul M.

AU - Brown, Nancy J.

AU - Vaughan, Douglas E.

AU - Moore, Jason H.

PY - 2004/4/30

Y1 - 2004/4/30

AB - Background: To examine interactions among the angiotensin converting enzyme (ACE) insertion/deletion, plasminogen activator inhibitor-1 (PAI-1) 4G/5G, and tissue plasminogen activator (t-PA) insertion/deletion gene polymorphisms on risk of myocardial infarction using data from 343 matched case-control pairs from the Physicians Health Study. We examined the data using both conditional logistic regression and the multifactor dimensionality reduction (MDR) method. One advantage of the MDR method is that it provides an internal prediction error for validation. We summarize our use of this internal prediction error for model validation. Results: The overall results for the two methods were consistent, with both suggesting an interaction between the ACE I/D and PAI-1 4G/5G polymorphisms. However, using ten-fold cross validation, the 46% prediction error for the final MDR model was not significantly lower than that expected by chance. Conclusions: The significant interaction initially observed does not validate and may represent a type I error. As data-driven analytic methods continue to be developed and used to examine complex genetic interactions, it will become increasingly important to stress model validation in order to ensure that significant effects represent true relationships rather than chance findings.

UR - http://www.scopus.com/inward/record.url?scp=2942517830&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=2942517830&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-5-49

DO - 10.1186/1471-2105-5-49

M3 - Article

VL - 5

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 49

ER -