ATHENA: Identifying interactions between different levels of genomic data associated with cancer clinical outcomes using grammatical evolution neural network

Dokyoon Kim, Ruowang Li, Scott M. Dudek, Marylyn Deriggi Ritchie

Research output: Contribution to journalArticle

24 Citations (Scopus)

Abstract

Background: Gene expression profiles have been broadly used in cancer research as a diagnostic or prognostic signature for the clinical outcome prediction such as stage, grade, metastatic status, recurrence, and patient survival, as well as to potentially improve patient management. However, emerging evidence shows that gene expression-based prediction varies between independent data sets. One possible explanation of this effect is that previous studies were focused on identifying genes with large main effects associated with clinical outcomes. Thus, non-linear interactions without large individual main effects would be missed. The other possible explanation is that gene expression as a single level of genomic data is insufficient to explain the clinical outcomes of interest since cancer can be dysregulated by multiple alterations through genome, epigenome, transcriptome, and proteome levels. In order to overcome the variability of diagnostic or prognostic predictors from gene expression alone and to increase its predictive power, we need to integrate multi-levels of genomic data and identify interactions between them associated with clinical outcomes. Results: Here, we proposed an integrative framework for identifying interactions within/between multi-levels of genomic data associated with cancer clinical outcomes using the Grammatical Evolution Neural Networks (GENN). In order to demonstrate the validity of the proposed framework, ovarian cancer data from TCGA was used as a pilot task. We found not only interactions within a single genomic level but also interactions between multi-levels of genomic data associated with survival in ovarian cancer. Notably, the integration model from different levels of genomic data achieved 72.89% balanced accuracy and outperformed the top models with any single level of genomic data. Conclusions: Understanding the underlying tumorigenesis and progression in ovarian cancer through the global view of interactions within/between different levels of genomic data is expected to provide guidance for improved prognostic biomarkers and individualized therapies.

Original languageEnglish (US)
Article number23
JournalBioData Mining
Volume6
Issue number1
DOIs
StatePublished - Dec 20 2013

Fingerprint

Grammatical Evolution
Gene expression
Ovarian Neoplasms
Genomics
Cancer
Neural Networks
Neural networks
Transcriptome
Gene Expression
Interaction
Ovarian Cancer
Neoplasms
Survival
Genes
Proteome
Carcinogenesis
Biomarkers
Main Effect
Genome
Recurrence

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology
  • Genetics
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

@article{d4728d0071b048db8fc1c6bc9788d0e1,
title = "ATHENA: Identifying interactions between different levels of genomic data associated with cancer clinical outcomes using grammatical evolution neural network",
abstract = "Background: Gene expression profiles have been broadly used in cancer research as a diagnostic or prognostic signature for the clinical outcome prediction such as stage, grade, metastatic status, recurrence, and patient survival, as well as to potentially improve patient management. However, emerging evidence shows that gene expression-based prediction varies between independent data sets. One possible explanation of this effect is that previous studies were focused on identifying genes with large main effects associated with clinical outcomes. Thus, non-linear interactions without large individual main effects would be missed. The other possible explanation is that gene expression as a single level of genomic data is insufficient to explain the clinical outcomes of interest since cancer can be dysregulated by multiple alterations through genome, epigenome, transcriptome, and proteome levels. In order to overcome the variability of diagnostic or prognostic predictors from gene expression alone and to increase its predictive power, we need to integrate multi-levels of genomic data and identify interactions between them associated with clinical outcomes. Results: Here, we proposed an integrative framework for identifying interactions within/between multi-levels of genomic data associated with cancer clinical outcomes using the Grammatical Evolution Neural Networks (GENN). In order to demonstrate the validity of the proposed framework, ovarian cancer data from TCGA was used as a pilot task. We found not only interactions within a single genomic level but also interactions between multi-levels of genomic data associated with survival in ovarian cancer. Notably, the integration model from different levels of genomic data achieved 72.89{\%} balanced accuracy and outperformed the top models with any single level of genomic data. Conclusions: Understanding the underlying tumorigenesis and progression in ovarian cancer through the global view of interactions within/between different levels of genomic data is expected to provide guidance for improved prognostic biomarkers and individualized therapies.",
author = "Dokyoon Kim and Ruowang Li and Dudek, {Scott M.} and Ritchie, {Marylyn Deriggi}",
year = "2013",
month = "12",
day = "20",
doi = "10.1186/1756-0381-6-23",
language = "English (US)",
volume = "6",
journal = "BioData Mining",
issn = "1756-0381",
publisher = "BioMed Central",
number = "1",

}

ATHENA : Identifying interactions between different levels of genomic data associated with cancer clinical outcomes using grammatical evolution neural network. / Kim, Dokyoon; Li, Ruowang; Dudek, Scott M.; Ritchie, Marylyn Deriggi.

In: BioData Mining, Vol. 6, No. 1, 23, 20.12.2013.

Research output: Contribution to journalArticle

TY - JOUR

T1 - ATHENA

T2 - Identifying interactions between different levels of genomic data associated with cancer clinical outcomes using grammatical evolution neural network

AU - Kim, Dokyoon

AU - Li, Ruowang

AU - Dudek, Scott M.

AU - Ritchie, Marylyn Deriggi

PY - 2013/12/20

Y1 - 2013/12/20

N2 - Background: Gene expression profiles have been broadly used in cancer research as a diagnostic or prognostic signature for the clinical outcome prediction such as stage, grade, metastatic status, recurrence, and patient survival, as well as to potentially improve patient management. However, emerging evidence shows that gene expression-based prediction varies between independent data sets. One possible explanation of this effect is that previous studies were focused on identifying genes with large main effects associated with clinical outcomes. Thus, non-linear interactions without large individual main effects would be missed. The other possible explanation is that gene expression as a single level of genomic data is insufficient to explain the clinical outcomes of interest since cancer can be dysregulated by multiple alterations through genome, epigenome, transcriptome, and proteome levels. In order to overcome the variability of diagnostic or prognostic predictors from gene expression alone and to increase its predictive power, we need to integrate multi-levels of genomic data and identify interactions between them associated with clinical outcomes. Results: Here, we proposed an integrative framework for identifying interactions within/between multi-levels of genomic data associated with cancer clinical outcomes using the Grammatical Evolution Neural Networks (GENN). In order to demonstrate the validity of the proposed framework, ovarian cancer data from TCGA was used as a pilot task. We found not only interactions within a single genomic level but also interactions between multi-levels of genomic data associated with survival in ovarian cancer. Notably, the integration model from different levels of genomic data achieved 72.89% balanced accuracy and outperformed the top models with any single level of genomic data. Conclusions: Understanding the underlying tumorigenesis and progression in ovarian cancer through the global view of interactions within/between different levels of genomic data is expected to provide guidance for improved prognostic biomarkers and individualized therapies.

AB - Background: Gene expression profiles have been broadly used in cancer research as a diagnostic or prognostic signature for the clinical outcome prediction such as stage, grade, metastatic status, recurrence, and patient survival, as well as to potentially improve patient management. However, emerging evidence shows that gene expression-based prediction varies between independent data sets. One possible explanation of this effect is that previous studies were focused on identifying genes with large main effects associated with clinical outcomes. Thus, non-linear interactions without large individual main effects would be missed. The other possible explanation is that gene expression as a single level of genomic data is insufficient to explain the clinical outcomes of interest since cancer can be dysregulated by multiple alterations through genome, epigenome, transcriptome, and proteome levels. In order to overcome the variability of diagnostic or prognostic predictors from gene expression alone and to increase its predictive power, we need to integrate multi-levels of genomic data and identify interactions between them associated with clinical outcomes. Results: Here, we proposed an integrative framework for identifying interactions within/between multi-levels of genomic data associated with cancer clinical outcomes using the Grammatical Evolution Neural Networks (GENN). In order to demonstrate the validity of the proposed framework, ovarian cancer data from TCGA was used as a pilot task. We found not only interactions within a single genomic level but also interactions between multi-levels of genomic data associated with survival in ovarian cancer. Notably, the integration model from different levels of genomic data achieved 72.89% balanced accuracy and outperformed the top models with any single level of genomic data. Conclusions: Understanding the underlying tumorigenesis and progression in ovarian cancer through the global view of interactions within/between different levels of genomic data is expected to provide guidance for improved prognostic biomarkers and individualized therapies.

UR - http://www.scopus.com/inward/record.url?scp=84890485602&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84890485602&partnerID=8YFLogxK

U2 - 10.1186/1756-0381-6-23

DO - 10.1186/1756-0381-6-23

M3 - Article

AN - SCOPUS:84890485602

VL - 6

JO - BioData Mining

JF - BioData Mining

SN - 1756-0381

IS - 1

M1 - 23

ER -