PROJECT SUMMARY
Apart from the double-helix B-DNA structure discovered by Watson and Crick, approximately 13% of the human genome comprises sequence motifs that can form non-canonical, or non-B, DNA conformations. This project focuses on G-quadruplexes, the type of non-B DNA for which we have the strongest evidence of genome-wide formation and functionality in human cells. There are more than 700,000 putative G-quadruplex loci in the human genome. They constitute ~1% of the genome, compared to ~1.5% occupied by protein-coding exons. Recent in vivo experiments showed that G-quadruplexes regulate key cellular processes (e.g., chromatin organization and transcription). Thus, we hypothesize that some groups of G-quadruplex loci evolve under purifying selection. Yet G-quadruplexes may represent a hurdle for DNA replication. Our published preliminary results, based on the analysis of long-read sequencing data, demonstrated decreased polymerization speed and increased polymerization errors at G-quadruplex loci genome-wide. We hypothesize that the same phenomena occur in human cells and lead to increased mutagenesis at G-quadruplex loci. Building upon our published and unpublished preliminary results, this project will examine the contribution of G-quadruplex motifs to genome evolution, which has been critically underexplored.
Aim 1 will elucidate the mechanistic basis of the increased mutation rate at G-quadruplex loci, using state-of-the-art high-fidelity duplex sequencing. With in vivo experiments, we will test the hypothesis that mutation rates are increased specifically at G-quadruplex structures forming in human cells and are associated with replication slowdown. With in vitro experiments, we will test the hypothesis that the two major eukaryotic replicative polymerases (polymerases epsilon and delta, responsible for leading and lagging strand synthesis, respectively) stall and have increased error frequencies at G-quadruplexes.
Aim 2 will assess the contribution of G-quadruplex loci to regional variation in mutation rates in the genome and will test the hypothesis that G-quadruplex loci facilitate structural variation in human populations and chromosomal rearrangements during evolution. This Aim will use advanced statistical techniques, including methods from the Functional Data Analysis domain.
Finally, Aim 3 will examine selection acting on G-quadruplex loci using classical and novel statistical tests. We will test the hypothesis that G-quadruplexes located in different functional compartments of the genome experience varying selective pressures, e.g., promoter motifs are expected to evolve under strong purifying selection. Moreover, we will investigate a potential association between the biophysical stability of G-quadruplex structures and the strength of selection acting on them. This Aim will also identify groups of physiologically relevant G-quadruplex loci that will drive future functional studies.
Overall, the project will substantially advance our understanding of the contribution of G-quadruplexes to genome evolution and diseases.
Effective start/end date: 1/1/21 → 12/31/21
- National Institute of General Medical Sciences: $567,853.00