Background: With the advent of increasingly efficient means to obtain genetic information, a great insurgence of data has resulted, leading to the need for methods for analyzing this data beyond that of traditional parametric statistical approaches. Recently we introduced Grammatical Evolution Neural Network (GENN), a machine-learning approach to detect gene-gene or gene-environment interactions, also known as epistasis, in high dimensional genetic epidemiological data. GENN has been shown to be highly successful in a range of simulated data, but the impact of error common to real data is unknown. In the current study, we examine the power of GENN to detect interesting interactions in the presence of noise due to genotyping error, missing data, phenocopy, and genetic heterogeneity. Additionally, we compare the performance of GENN to that of another computational method - Multifactor Dimensionality Reduction (MDR). Findings: GENN is extremely robust to missing data and genotyping error. Phenocopy in a dataset reduces the power of both GENN and MDR. GENN is reasonably robust to genetic heterogeneity and find that in some cases GENN has substantially higher power than MDR to detect functional loci in the presence of genetic heterogeneity. Conclusion: GENN is a promising method to detect gene-gene interaction, even in the presence of common types of error found in real data.
All Science Journal Classification (ASJC) codes
- Biochemistry, Genetics and Molecular Biology(all)