HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient

Tao Yang, Feipeng Zhang, Galip Gürkan Yardımci, Fan Song, Ross C. Hardison, William Stafford Noble, Feng Yue, Qunhua Li

Research output: Contribution to journal, Article

29 Citations (Scopus)

Abstract

Hi-C is a powerful technology for studying genome-wide chromatin interactions. However, current methods for assessing Hi-C data reproducibility can produce misleading results because they ignore spatial features in Hi-C data, such as domain structure and distance dependence. We present HiCRep, a framework for assessing the reproducibility of Hi-C data that systematically accounts for these features. In particular, we introduce a novel similarity measure, the stratum-adjusted correlation coefficient (SCC), for quantifying the similarity between Hi-C interaction matrices. Besides providing a statistically sound and reliable evaluation of reproducibility, SCC can be used to quantify differences between Hi-C contact matrices and to determine the optimal sequencing depth for a desired resolution. The measure consistently shows higher accuracy than existing approaches in distinguishing subtle differences in reproducibility and in depicting interrelationships of cell lineages. The proposed measure is straightforward to interpret and easy to compute, making it well suited for standardized, interpretable, automatable, and scalable quality control. The freely available R package HiCRep implements our approach.
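To make the idea of a stratum-adjusted correlation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the HiCRep R package's implementation: the function names are hypothetical, each stratum is taken to be one diagonal of the contact matrix (all bin pairs at the same genomic distance), and each stratum's Pearson correlation is weighted by its size times the spread of its values. The weighting is an assumption for illustration and should be checked against the paper, and any preprocessing the authors apply is omitted.

```python
from math import sqrt


def pearson_and_scale(xs, ys):
    """Return (Pearson r, sqrt(var_x * var_y)) for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    scale = sqrt(var_x * var_y)
    # A stratum with no variation carries no correlation information.
    r = cov / scale if scale > 0 else 0.0
    return r, scale


def stratum_weighted_correlation(a, b, max_offset):
    """Weighted average of per-diagonal Pearson correlations.

    a, b: square contact matrices given as lists of lists.
    max_offset: largest diagonal offset (distance in bins) to include.
    The weight n_k * scale_k is an assumption made for this sketch.
    """
    n = len(a)
    numerator = 0.0
    denominator = 0.0
    for k in range(1, max_offset + 1):
        # Stratum k: all bin pairs separated by exactly k bins.
        xs = [a[i][i + k] for i in range(n - k)]
        ys = [b[i][i + k] for i in range(n - k)]
        if len(xs) < 2:
            continue  # a correlation needs at least two points
        r, scale = pearson_and_scale(xs, ys)
        weight = len(xs) * scale
        numerator += weight * r
        denominator += weight
    return numerator / denominator if denominator > 0 else float("nan")


if __name__ == "__main__":
    # Toy 4x4 symmetric "contact matrices" with made-up counts.
    rep1 = [[0, 5, 2, 1], [5, 0, 6, 3], [2, 6, 0, 4], [1, 3, 4, 0]]
    rep2 = [[0, 4, 2, 1], [4, 0, 7, 2], [2, 7, 0, 3], [1, 2, 3, 0]]
    print(stratum_weighted_correlation(rep1, rep2, max_offset=3))
```

Correlating within each distance stratum keeps the strong decay of contact frequency with genomic distance from dominating the score, which is the distance-dependence problem the abstract attributes to existing methods.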

Original language: English (US)
Pages (from-to): 1939-1949
Number of pages: 11
Journal: Genome Research
Volume: 27
Issue number: 11
DOI: 10.1101/gr.220640.117
State: Published - Nov 2017

Fingerprint

Cell Lineage
Quality Control
Chromatin
Genome
Technology

All Science Journal Classification (ASJC) codes

  • Genetics
  • Genetics (clinical)

Cite this

Yang, Tao; Zhang, Feipeng; Yardımcı, Galip Gürkan; Song, Fan; Hardison, Ross C.; Noble, William Stafford; Yue, Feng; Li, Qunhua. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. In: Genome Research. 2017; Vol. 27, No. 11. pp. 1939-1949.
@article{7eec3c7fd21b4c258f9560f092071c99,
title = "HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient",
abstract = "Hi-C is a powerful technology for studying genome-wide chromatin interactions. However, current methods for assessing Hi-C data reproducibility can produce misleading results because they ignore spatial features in Hi-C data, such as domain structure and distance dependence. We present HiCRep, a framework for assessing the reproducibility of Hi-C data that systematically accounts for these features. In particular, we introduce a novel similarity measure, the stratum adjusted correlation coefficient (SCC), for quantifying the similarity between Hi-C interaction matrices. Not only does it provide a statistically sound and reliable evaluation of reproducibility, SCC can also be used to quantify differences between Hi-C contact matrices and to determine the optimal sequencing depth for a desired resolution. The measure consistently shows higher accuracy than existing approaches in distinguishing subtle differences in reproducibility and depicting interrelationships of cell lineages. The proposed measure is straightforward to interpret and easy to compute, making it well-suited for providing standardized, interpretable, automatable, and scalable quality control. The freely available R package HiCRep implements our approach.",
author = "Tao Yang and Feipeng Zhang and Yardımcı, {Galip G{\"u}rkan} and Fan Song and Hardison, {Ross C.} and Noble, {William Stafford} and Feng Yue and Qunhua Li",
year = "2017",
month = "11",
doi = "10.1101/gr.220640.117",
language = "English (US)",
volume = "27",
pages = "1939--1949",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "11",

}

TY - JOUR

T1 - HiCRep

T2 - assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient

AU - Yang, Tao

AU - Zhang, Feipeng

AU - Yardımcı, Galip Gürkan

AU - Song, Fan

AU - Hardison, Ross C.

AU - Noble, William Stafford

AU - Yue, Feng

AU - Li, Qunhua

PY - 2017/11

Y1 - 2017/11

AB - Hi-C is a powerful technology for studying genome-wide chromatin interactions. However, current methods for assessing Hi-C data reproducibility can produce misleading results because they ignore spatial features in Hi-C data, such as domain structure and distance dependence. We present HiCRep, a framework for assessing the reproducibility of Hi-C data that systematically accounts for these features. In particular, we introduce a novel similarity measure, the stratum adjusted correlation coefficient (SCC), for quantifying the similarity between Hi-C interaction matrices. Not only does it provide a statistically sound and reliable evaluation of reproducibility, SCC can also be used to quantify differences between Hi-C contact matrices and to determine the optimal sequencing depth for a desired resolution. The measure consistently shows higher accuracy than existing approaches in distinguishing subtle differences in reproducibility and depicting interrelationships of cell lineages. The proposed measure is straightforward to interpret and easy to compute, making it well-suited for providing standardized, interpretable, automatable, and scalable quality control. The freely available R package HiCRep implements our approach.

UR - http://www.scopus.com/inward/record.url?scp=85041538366&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041538366&partnerID=8YFLogxK

U2 - 10.1101/gr.220640.117

DO - 10.1101/gr.220640.117

M3 - Article

C2 - 28855260

AN - SCOPUS:85041538366

VL - 27

SP - 1939

EP - 1949

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 11

ER -