Motivation Regulatory proteins associate with the genome either by directly binding cognate DNA motifs or via protein-protein interactions with other regulators. Each recruitment mechanism may be associated with distinct motifs and may also result in distinct characteristic patterns in high-resolution protein-DNA binding assays. For example, the ChIP-exo protocol precisely characterizes protein-DNA crosslinking patterns by combining chromatin immunoprecipitation (ChIP) with 5′ → 3′ exonuclease digestion. Since different regulatory complexes will result in different protein-DNA crosslinking signatures, analysis of ChIP-exo tag enrichment patterns should enable detection of multiple protein-DNA binding modes for a given regulatory protein. However, current ChIP-exo analysis methods either treat all binding events as being of a uniform type or rely on motifs to cluster binding events into subtypes. Results To systematically detect multiple protein-DNA interaction modes in a single ChIP-exo experiment, we introduce the ChIP-exo mixture model (ChExMix). ChExMix probabilistically models the genomic locations and subtype memberships of binding events using both ChIP-exo tag distribution patterns and DNA motifs. We demonstrate that ChExMix achieves accurate detection and classification of binding event subtypes using in silico mixed ChIP-exo data. We further demonstrate the unique analysis abilities of ChExMix using a collection of ChIP-exo experiments that profile the binding of key transcription factors in MCF-7 cells. In these data, ChExMix identifies possible recruitment mechanisms of FoxA1 and ERα, thus demonstrating that ChExMix can effectively stratify ChIP-exo binding events into biologically meaningful subtypes.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics