Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction

William S. Bush, Todd L. Edwards, Scott M. Dudek, Brett A. McKinney, Marylyn D. Ritchie

Research output: Contribution to journal (Article)

42 Citations (Scopus)

Abstract

Background: Multifactor Dimensionality Reduction (MDR) was introduced previously as a non-parametric statistical method for detecting gene-gene interactions. MDR performs a dimensional reduction by assigning multi-locus genotypes to either high- or low-risk groups and measuring the percentage of cases and controls incorrectly labelled by this classification, known as the classification error. The combination of variables that produces the lowest classification error is selected as the best, or most fit, model. The correctly and incorrectly labelled cases and controls can be expressed as a two-way contingency table. We sought to improve the ability of MDR to detect gene-gene interactions by replacing classification error with a different measure of model quality.

Results: In this study, we compare the detection and power of MDR using a variety of measures for two-way contingency table analysis. We simulated 40 genetic models, varying the number of disease loci in the model (2-5), the allele frequencies of the disease loci (0.2/0.8 or 0.4/0.6) and the broad-sense heritability of the model (0.05-0.3). Across all models, detection using NMI was 65.36% and specific detection was 59.4%, compared with 62% detection and 52.2% specific detection using classification error.

Conclusion: Of the 10 measures evaluated, the likelihood ratio and normalized mutual information (NMI) consistently improve the detection and power of MDR in simulated data relative to classification error. These measures also reduce the inclusion of spurious variables in a multi-locus model. Thus MDR, which has already been demonstrated to be a powerful tool for detecting gene-gene interactions, can be improved by using alternative fitness functions.
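To make the three measures named in the abstract concrete, the following is a minimal Python sketch that scores a single two-way contingency table. It is illustrative only and is not the authors' implementation. The function name `table_measures`, the argument names, and the table layout (rows are true case/control status, columns are the high-/low-risk label) are assumptions. The likelihood ratio is computed as the standard G statistic, and NMI is taken here as mutual information divided by the entropy of the true status, which is one common normalization and may differ from the definition used in the paper.

```python
import math


def table_measures(tp, fn, fp, tn):
    """Score a 2x2 table of true status (rows) against risk label (columns).

    tp: cases labelled high-risk      fn: cases labelled low-risk
    fp: controls labelled high-risk   tn: controls labelled low-risk
    All names and the NMI normalization are illustrative assumptions.
    """
    table = [[tp, fn], [fp, tn]]
    n = tp + fn + fp + tn
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]

    # Classification error: fraction of individuals labelled incorrectly.
    error = (fn + fp) / n

    # Likelihood ratio (G statistic): 2 * sum(O * ln(O / E)) over non-empty
    # cells, where E is the count expected under independence.
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * math.log(observed / expected)

    # Mutual information in nats equals G / (2n); normalize by the entropy
    # of the true status so a perfect classifier scores 1 (assumed choice).
    mutual_info = g / (2.0 * n)
    status_entropy = -sum(
        (r / n) * math.log(r / n) for r in row_totals if r > 0
    )
    nmi = mutual_info / status_entropy if status_entropy > 0 else 0.0

    return {
        "classification_error": error,
        "likelihood_ratio": g,
        "nmi": nmi,
    }


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    print(table_measures(tp=70, fn=30, fp=40, tn=60))
```

In an MDR-style search, a function like this would be evaluated once per candidate combination of loci, with the model chosen by the lowest classification error or, alternatively, by the highest likelihood ratio or NMI.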

Original language: English (US)
Article number: 238
Journal: BMC Bioinformatics
Volume: 9
DOI: 10.1186/1471-2105-9-238
State: Published - May 16 2008

Fingerprint

Multifactor Dimensionality Reduction
Contingency Table
Dimensionality Reduction
Genes
Gene
Alternatives
Locus
Mutual Information
Model
Interaction
Heritability
Genetic Models
Dimensional Reduction
Gene Frequency
Nonparametric Methods
Likelihood Ratio
Fitness Function
Genotype
Statistical method
Percentage

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Bush, William S. ; Edwards, Todd L. ; Dudek, Scott M. ; McKinney, Brett A. ; Ritchie, Marylyn D. / Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction. In: BMC bioinformatics. 2008 ; Vol. 9.
@article{dd6c961adace4d35b4d2bb9f852b8a59,
title = "Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction",
author = "Bush, {William S.} and Edwards, {Todd L.} and Dudek, {Scott M.} and McKinney, {Brett A.} and Ritchie, {Marylyn D.}",
year = "2008",
month = "5",
day = "16",
doi = "10.1186/1471-2105-9-238",
language = "English (US)",
volume = "9",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR
T1 - Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction
AU - Bush, William S.
AU - Edwards, Todd L.
AU - Dudek, Scott M.
AU - McKinney, Brett A.
AU - Ritchie, Marylyn D.
PY - 2008/5/16
Y1 - 2008/5/16
UR - http://www.scopus.com/inward/record.url?scp=44949257101&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=44949257101&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-9-238
DO - 10.1186/1471-2105-9-238
M3 - Article
C2 - 18485205
AN - SCOPUS:44949257101
VL - 9
JO - BMC Bioinformatics
JF - BMC Bioinformatics
SN - 1471-2105
M1 - 238
ER -