Identifying topologically associating domains and subdomains by Gaussian Mixture model and Proportion test

Wenbao Yu, Bing He, Kai Tan

Research output: Contribution to journal › Article › peer-review

26 Scopus citations

Abstract

The spatial organization of the genome plays a critical role in regulating gene expression. Recent chromatin interaction mapping studies have revealed that topologically associating domains and subdomains are fundamental building blocks of the three-dimensional genome. Identifying such hierarchical structures is a critical step toward understanding the three-dimensional structure-function relationship of the genome. Existing computational algorithms lack statistical assessment of domain predictions and are computationally inefficient for high-resolution Hi-C data. We introduce the Gaussian Mixture model And Proportion test (GMAP) algorithm to address these challenges. Using simulated and experimental Hi-C data, we show that domains identified by GMAP are more consistent with multiple lines of supporting evidence than those identified by three state-of-the-art methods. Application of GMAP to normal and cancer cells reveals several unique features of subdomain boundaries as compared to domain boundaries, including their higher dynamics across cell types and enrichment for somatic mutations in cancer.

Original language: English (US)
Article number: 535
Journal: Nature Communications
Volume: 8
Issue number: 1
DOIs
State: Published - Dec 1 2017

All Science Journal Classification (ASJC) codes

  • Chemistry (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Physics and Astronomy (all)
