Rule based classifier for the analysis of gene-gene and gene-environment interactions in genetic association studies

Thorsten Lehr, Jing Yuan, Dirk Zeumer, Supriya Jayadev, Marylyn D. Ritchie

Research output: Contribution to journal, Article, peer-review

15 Scopus citations


Background: Several methods have been presented for the analysis of complex interactions between genetic polymorphisms and/or environmental factors. Alternative methods are still needed, because no single method performs well in all scenarios. The aim of this work was to evaluate the performance of three selected rule based classifier algorithms, RIPPER, RIDOR and PART, for the analysis of genetic association studies.

Methods: Overall, 42 datasets were simulated with three different case-control models and a varying number of subjects (300, 600), number of SNPs (500, 1500, 3000) and level of noise (5%, 10%, 20%). The algorithms were applied to each dataset with a set of algorithm-specific settings. Results were further investigated at (a) the model, (b) the rule, and (c) the attribute level. Data analysis was performed using WEKA, SAS and PERL.

Results: The RIPPER algorithm discovered the true case-control model at least once in more than 33% of the datasets. The RIDOR and PART algorithms performed poorly for model detection. The RIPPER, RIDOR and PART algorithms discovered the true case-control rules in more than 83%, 83% and 44% of the datasets, respectively. All three algorithms were able to detect the attributes used in the respective case-control models in most datasets.

Conclusions: The current analyses substantiate the utility of rule based classifiers such as RIPPER, RIDOR and PART for the detection of gene-gene and gene-environment interactions in genetic association studies. These classifiers could provide a valuable new method, complementing existing approaches, for the analysis of genetic association studies. One advantage is that they can handle both categorical and continuous variables. Further, because the outputs of the analyses are easy to interpret, the rule based classifier approach could quickly generate testable hypotheses for additional evaluation. Since the algorithms are computationally inexpensive, they may serve as valuable tools for preselecting attributes to be used in more complex, computationally intensive approaches. Whether used in isolation or in conjunction with other tools, rule based classifiers are an important addition to the armamentarium of tools available for analyses of complex genetic association studies.
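To make the idea of an interpretable rule list concrete, the following Python sketch simulates a small case-control dataset and applies an ordered rule list of the general kind a rule based classifier produces. It is an illustration under stated assumptions, not the study's simulation or the RIPPER, RIDOR or PART algorithms themselves: the function names, the genotype coding (0/1/2), the two-SNP interaction model and the label-flipping noise scheme are all hypothetical and chosen only for the example.

```python
import random


def simulate(n_subjects, n_snps, noise, seed=0):
    """Simulate genotypes (coded 0/1/2) and a case/control label.

    Assumption for illustration only: a subject is a case when SNP 0 is
    heterozygous AND SNP 1 is homozygous variant; a fraction `noise` of
    labels is then flipped at random.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_subjects):
        genotypes = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
        is_case = genotypes[0] == 1 and genotypes[1] == 2
        if rng.random() < noise:
            is_case = not is_case  # label noise
        rows.append((genotypes, is_case))
    return rows


# An ordered rule list: the first matching rule decides the class and the
# last rule acts as the default. This rule list is written by hand to match
# the hypothetical model above; it is not learned from the data.
RULES = [
    (lambda g: g[0] == 1 and g[1] == 2, True),   # SNP0 = 1 and SNP1 = 2 => case
    (lambda g: True, False),                      # otherwise => control
]


def classify(genotypes, rules=RULES):
    """Return the label of the first rule whose condition matches."""
    for condition, label in rules:
        if condition(genotypes):
            return label
    raise ValueError("rule list has no default rule")


def accuracy(rows, rules=RULES):
    """Fraction of subjects whose label agrees with the rule list."""
    hits = sum(classify(g, rules) == label for g, label in rows)
    return hits / len(rows)


if __name__ == "__main__":
    for noise in (0.05, 0.10, 0.20):
        data = simulate(n_subjects=300, n_snps=500, noise=noise)
        print(f"noise={noise:.2f} accuracy={accuracy(data):.3f}")
```

Because each rule is a readable condition on named attributes, a rule list like this can be inspected directly, which is the interpretability property the conclusions refer to.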

Original language: English (US)
Article number: 4
Journal: BioData Mining
Issue number: 1
State: Published - 2011

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology
  • Genetics
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

