TY - JOUR
T1 - In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features
AU - Ding, Yiliang
AU - Tang, Yin
AU - Kwok, Chun Kit
AU - Zhang, Yu
AU - Bevilacqua, Philip C.
AU - Assmann, Sarah M.
N1 - Funding Information:
Acknowledgements This research is supported by Human Frontier Science Program (HFSP) grant RGP0002/2009-C, the Penn State Eberly College of Science, and a Penn State Huck Institutes HITS grant to P.C.B. and S.M.A. We thank F. Pugh, Y. Li, A. Chan and K. Yen for help with Illumina sequencing; D. Mathews and A. Spasic for advice on RNA structure analysis; M. Axtell for reading of the manuscript; and P. Raghavan for access to the CyberSTAR server, funded by the National Science Foundation through grant OCI–0821527. We also thank L. Song, D. Chadalavada and S. Ghosh for discussions.
PY - 2014
Y1 - 2014
AB - RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5′ splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
UR - http://www.scopus.com/inward/record.url?scp=84893427735&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893427735&partnerID=8YFLogxK
U2 - 10.1038/nature12756
DO - 10.1038/nature12756
M3 - Article
C2 - 24270811
AN - SCOPUS:84893427735
SN - 0028-0836
VL - 505
SP - 696
EP - 700
JO - Nature
JF - Nature
IS - 7485
ER -