Comparison of approaches for machine-learning optimization of neural networks for detecting gene-gene interactions in genetic epidemiology

Alison A. Motsinger-Reif, Scott M. Dudek, Lance W. Hahn, Marylyn D. Ritchie

Research output: Contribution to journal, Article

72 Scopus citations

Abstract

The detection of genotypes that predict common, complex disease is a challenge for human geneticists. The phenomenon of epistasis, or gene-gene interactions, is particularly problematic for traditional statistical techniques. Additionally, the explosion of genetic information makes exhaustive searches of multilocus combinations computationally infeasible. To address these challenges, neural networks (NN), a pattern recognition method, have been used. One limitation of the NN approach is that its success depends on the architecture of the network. To address this limitation, machine-learning approaches have been suggested to evolve the best NN architecture for a particular data set. In this study we provide a detailed technical description of the use of grammatical evolution to optimize neural networks (GENN) for use in genetic association studies. We compare the performance of GENN to that of a previous machine-learning NN application, genetic programming neural networks, in both simulated and real data. We show that GENN greatly outperforms genetic programming neural networks in data sets with a large number of single nucleotide polymorphisms. Additionally, we demonstrate that GENN has high power to detect disease-risk loci in a range of high-order epistatic models. Finally, we demonstrate the scalability of the GENN method with increasing numbers of variables, up to 500,000 single nucleotide polymorphisms.

Original language: English (US)
Pages (from-to): 325-340
Number of pages: 16
Journal: Genetic Epidemiology
Volume: 32
Issue number: 4
State: Published - May 1, 2008

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Genetics (clinical)
