TY - JOUR
T1 - Genome-wide discriminatory information patterns of cytosine DNA methylation
AU - Sanchez, Robersy
AU - Mackenzie, Sally A.
N1 - Funding Information:
This work was supported by a grant from the Bill and Melinda Gates Foundation (OPP1088661) to Sally A. Mackenzie
Publisher Copyright:
© 2016 by the authors.
PY - 2016/6
Y1 - 2016/6
N2 - Cytosine DNA methylation (CDM) is a highly abundant, heritable but reversible chemical modification to the genome. Herein, a machine learning approach was applied to analyze the accumulation of epigenetic marks in methylomes of 152 ecotypes and 85 silencing mutants of Arabidopsis thaliana. In an information-thermodynamics framework, two measurements were used: (1) the amount of information gained/lost with the CDM changes IR and (2) the uncertainty of not observing a SNP LCR. We hypothesize that epigenetic marks are chromosomal footprints accounting for different ontogenetic and phylogenetic histories of individual populations. A machine learning approach is proposed to verify this hypothesis. Results support the hypothesis by the existence of discriminatory information (DI) patterns of CDM able to discriminate between individuals and between individual subpopulations. The statistical analyses revealed a strong association between the topologies of the structured population of Arabidopsis ecotypes based on IR and on LCR, respectively. A statistical-physical relationship between IR and LCR was also found. Results to date imply that the genome-wide distribution of CDM changes is not only part of the biological signal created by the methylation regulatory machinery, but ensures the stability of the DNA molecule, preserving the integrity of the genetic message under continuous stress from thermal fluctuations in the cell environment.
AB - Cytosine DNA methylation (CDM) is a highly abundant, heritable but reversible chemical modification to the genome. Herein, a machine learning approach was applied to analyze the accumulation of epigenetic marks in methylomes of 152 ecotypes and 85 silencing mutants of Arabidopsis thaliana. In an information-thermodynamics framework, two measurements were used: (1) the amount of information gained/lost with the CDM changes IR and (2) the uncertainty of not observing a SNP LCR. We hypothesize that epigenetic marks are chromosomal footprints accounting for different ontogenetic and phylogenetic histories of individual populations. A machine learning approach is proposed to verify this hypothesis. Results support the hypothesis by the existence of discriminatory information (DI) patterns of CDM able to discriminate between individuals and between individual subpopulations. The statistical analyses revealed a strong association between the topologies of the structured population of Arabidopsis ecotypes based on IR and on LCR, respectively. A statistical-physical relationship between IR and LCR was also found. Results to date imply that the genome-wide distribution of CDM changes is not only part of the biological signal created by the methylation regulatory machinery, but ensures the stability of the DNA molecule, preserving the integrity of the genetic message under continuous stress from thermal fluctuations in the cell environment.
UR - http://www.scopus.com/inward/record.url?scp=84975110717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84975110717&partnerID=8YFLogxK
U2 - 10.3390/ijms17060938
DO - 10.3390/ijms17060938
M3 - Article
C2 - 27322251
AN - SCOPUS:84975110717
SN - 1661-6596
VL - 17
JO - International Journal of Molecular Sciences
JF - International Journal of Molecular Sciences
IS - 6
M1 - 938
ER -