Knowledge boosting

A graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction

Dokyoon Kim, Je Gun Joung, Kyung Ah Sohn, Hyunjung Shin, Yu Rang Park, Marylyn Deriggi Ritchie, Ju Han Kim

Research output: Contribution to journal, Article

32 Citations (Scopus)

Abstract

Objective: Cancer can involve gene dysregulation via multiple mechanisms, so no single level of genomic data fully elucidates tumor behavior due to the presence of numerous genomic variations within or between levels in a biological system. We have previously proposed a graph-based integration approach that combines multi-omics data including copy number alteration, methylation, miRNA, and gene expression data for predicting clinical outcome in cancer. However, genomic features likely interact with other genomic features in complex signaling or regulatory networks, since cancer is caused by alterations in pathways or complete processes.

Methods: Here we propose a new graph-based framework for integrating multi-omics data and genomic knowledge to improve power in predicting clinical outcomes and elucidate interplay between different levels. To highlight the validity of our proposed framework, we used an ovarian cancer dataset from The Cancer Genome Atlas for predicting stage, grade, and survival outcomes.

Results: Integrating multi-omics data with genomic knowledge to construct pre-defined features resulted in higher performance in clinical outcome prediction and higher stability. For the grade outcome, the model with gene expression data produced an area under the receiver operating characteristic curve (AUC) of 0.7866. However, models of the integration with pathway, Gene Ontology, chromosomal gene set, and motif gene set consistently outperformed the model with genomic data only, attaining AUCs of 0.7873, 0.8433, 0.8254, and 0.8179, respectively.

Conclusions: Integrating multi-omics data and genomic knowledge to improve understanding of molecular pathogenesis and underlying biology in cancer should improve diagnostic and prognostic indicators and the effectiveness of therapies.
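The results above are reported as AUC values. As a reading aid, the following is a minimal sketch of how an AUC can be computed from binary labels and model scores, using the pairwise-ranking definition (the probability that a randomly chosen positive case is scored above a randomly chosen negative case, with ties counted as one half). The function name `auc` and the toy labels and scores are assumptions made for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (hypothetical, not from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.4f}")
```

The quadratic pairwise loop is chosen for clarity; for datasets of realistic size a rank-based or library implementation would be the more usual choice.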

Original language: English (US)
Pages (from-to): 109-120
Number of pages: 12
Journal: Journal of the American Medical Informatics Association
Volume: 22
Issue number: 1
DOI: 10.1136/amiajnl-2013-002481
State: Published - Jan 1 2015

Fingerprint

Neoplasms
Area Under Curve
Genes
Gene Expression
Gene Ontology
Atlases
MicroRNAs
ROC Curve
Ovarian Neoplasms
Methylation
Genome
Therapeutics
Datasets

All Science Journal Classification (ASJC) codes

  • Health Informatics

Cite this

Kim, Dokyoon; Joung, Je Gun; Sohn, Kyung Ah; Shin, Hyunjung; Park, Yu Rang; Ritchie, Marylyn Deriggi; Kim, Ju Han. / Knowledge boosting: A graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction. In: Journal of the American Medical Informatics Association. 2015; Vol. 22, No. 1. pp. 109-120.
@article{fef3ff1252714b8eb919d2ab7b18f48c,
title = "Knowledge boosting: A graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction",
abstract = "Objective: Cancer can involve gene dysregulation via multiple mechanisms, so no single level of genomic data fully elucidates tumor behavior due to the presence of numerous genomic variations within or between levels in a biological system. We have previously proposed a graph-based integration approach that combines multi-omics data including copy number alteration, methylation, miRNA, and gene expression data for predicting clinical outcome in cancer. However, genomic features likely interact with other genomic features in complex signaling or regulatory networks, since cancer is caused by alterations in pathways or complete processes. Methods: Here we propose a new graph-based framework for integrating multi-omics data and genomic knowledge to improve power in predicting clinical outcomes and elucidate interplay between different levels. To highlight the validity of our proposed framework, we used an ovarian cancer dataset from The Cancer Genome Atlas for predicting stage, grade, and survival outcomes. Results: Integrating multi-omics data with genomic knowledge to construct pre-defined features resulted in higher performance in clinical outcome prediction and higher stability. For the grade outcome, the model with gene expression data produced an area under the receiver operating characteristic curve (AUC) of 0.7866. However, models of the integration with pathway, Gene Ontology, chromosomal gene set, and motif gene set consistently outperformed the model with genomic data only, attaining AUCs of 0.7873, 0.8433, 0.8254, and 0.8179, respectively. Conclusions: Integrating multi-omics data and genomic knowledge to improve understanding of molecular pathogenesis and underlying biology in cancer should improve diagnostic and prognostic indicators and the effectiveness of therapies.",
author = "Kim, Dokyoon and Joung, {Je Gun} and Sohn, {Kyung Ah} and Shin, Hyunjung and Park, {Yu Rang} and Ritchie, {Marylyn Deriggi} and Kim, {Ju Han}",
year = "2015",
month = "1",
day = "1",
doi = "10.1136/amiajnl-2013-002481",
language = "English (US)",
volume = "22",
pages = "109--120",
journal = "Journal of the American Medical Informatics Association : JAMIA",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "1",

}

TY - JOUR

T1 - Knowledge boosting

T2 - A graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction

AU - Kim, Dokyoon

AU - Joung, Je Gun

AU - Sohn, Kyung Ah

AU - Shin, Hyunjung

AU - Park, Yu Rang

AU - Ritchie, Marylyn Deriggi

AU - Kim, Ju Han

PY - 2015/1/1

Y1 - 2015/1/1

AB - Objective: Cancer can involve gene dysregulation via multiple mechanisms, so no single level of genomic data fully elucidates tumor behavior due to the presence of numerous genomic variations within or between levels in a biological system. We have previously proposed a graph-based integration approach that combines multi-omics data including copy number alteration, methylation, miRNA, and gene expression data for predicting clinical outcome in cancer. However, genomic features likely interact with other genomic features in complex signaling or regulatory networks, since cancer is caused by alterations in pathways or complete processes. Methods: Here we propose a new graph-based framework for integrating multi-omics data and genomic knowledge to improve power in predicting clinical outcomes and elucidate interplay between different levels. To highlight the validity of our proposed framework, we used an ovarian cancer dataset from The Cancer Genome Atlas for predicting stage, grade, and survival outcomes. Results: Integrating multi-omics data with genomic knowledge to construct pre-defined features resulted in higher performance in clinical outcome prediction and higher stability. For the grade outcome, the model with gene expression data produced an area under the receiver operating characteristic curve (AUC) of 0.7866. However, models of the integration with pathway, Gene Ontology, chromosomal gene set, and motif gene set consistently outperformed the model with genomic data only, attaining AUCs of 0.7873, 0.8433, 0.8254, and 0.8179, respectively. Conclusions: Integrating multi-omics data and genomic knowledge to improve understanding of molecular pathogenesis and underlying biology in cancer should improve diagnostic and prognostic indicators and the effectiveness of therapies.

UR - http://www.scopus.com/inward/record.url?scp=84929519805&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84929519805&partnerID=8YFLogxK

U2 - 10.1136/amiajnl-2013-002481

DO - 10.1136/amiajnl-2013-002481

M3 - Article

VL - 22

SP - 109

EP - 120

JO - Journal of the American Medical Informatics Association : JAMIA

JF - Journal of the American Medical Informatics Association : JAMIA

SN - 1067-5027

IS - 1

ER -