TY - JOUR
T1 - Analysis of DNA sequences using methods of statistical physics
AU - Buldyrev, S. V.
AU - Dokholyan, N. V.
AU - Goldberger, A. L.
AU - Havlin, S.
AU - Peng, C. K.
AU - Stanley, H. E.
AU - Viswanathan, G. M.
N1 - Funding Information: We are grateful to many individuals, including R. Mantegna, M.E. Matsa, S.M. Ossadnik, F. Sciortino and M. Simons for major contributions to those results reviewed here that represent collaborative research efforts. We also wish to thank C. Cantor, C. DeLisi, M. Frank-Kamenetskii, A.Yu. Grosberg, I. Labat, L. Liebovitch, G.S. Michaels, P. Munson, R. Nussinov, R.D. Rosenberg, E.I. Shakhnovich, M.F. Shlesinger and E.N. Trifonov for valuable discussions. Partial support was provided by the National Science Foundation, National Institutes of Health (Human Genome Project), the G. Harold and Leila Y. Mathers Charitable Foundation, the Israel–USA Binational Science Foundation, Israel Academy of Sciences, and (to C.-K.P.) by an NIH/NIMH First Award.
PY - 1998/1/2
Y1 - 1998/1/2
AB - We review the present status of the studies of DNA sequences using methods of statistical physics. We present evidence, based on systematic studies of the entire GenBank database, supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range, i.e., base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the DNA. We discuss the mechanisms of molecular evolution that may lead to the presence of long-range power-law correlations in noncoding DNA and their absence in coding DNA. One such mechanism is the simple repeat expansion, which recently has attracted the attention of the biological community in conjunction with genetic diseases. We also review new tools - e.g., detrended fluctuation analysis - that are useful for studies of complex hierarchical DNA structure.
UR - http://www.scopus.com/inward/record.url?scp=0031996849&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0031996849&partnerID=8YFLogxK
U2 - 10.1016/S0378-4371(97)00503-7
DO - 10.1016/S0378-4371(97)00503-7
M3 - Article
AN - SCOPUS:0031996849
SN - 0378-4371
VL - 249
SP - 430
EP - 438
JO - Physica A: Statistical Mechanics and its Applications
JF - Physica A: Statistical Mechanics and its Applications
IS - 1-4
ER -