Densities, length proportions, and other distributional features of repetitive sequences in the human genome estimated from 430 megabases of genomic sequence

Zhenglong Gu, Haidong Wang, Anton Nekrutenko, Wen Hsiung Li

Research output: Contribution to journal (Article)

62 Citations (Scopus)

Abstract

The densities of repetitive elements in the human genome were calculated in each GC content class using non-overlapping windows of 50 kb. The density of Alu is two to three times higher in GC-rich regions than in AT-rich regions, while the opposite is true for LINE1. In contrast, LINE2 and other elements, such as DNA transposons, are more uniformly distributed in the genome. The number of Alus in the human genome was estimated to be 1.4 million, higher than previous estimates. About 40% of the autosomes and ∼51% of the X and Y chromosomes are occupied by repetitive elements. In total, the human genome is estimated to contain more than 4 million repetitive elements. The GC contents (%) of repetitive elements and their flanking regions were also calculated. The GC contents of almost all kinds of repeats are positively correlated with the window GC contents, suggesting that a repetitive sequence is subject to the same mutation pressure as its surrounding regions, so it tends to have the same GC content as its surrounding regions. This observation supports the regional mutation hypothesis. The only two exceptions are AluYa and AluYb8, the two youngest Alu subfamilies. The GC content of AluYb8 is negatively correlated with that of its surrounding regions, while AluYa shows no correlation, suggesting different insertion patterns for these two young Alu subfamilies. This suggestion was supported by the fact that the average genetic distance between members of AluYb8 in each GC window class is positively correlated with the GC content of the window, but no correlation was found for AluYa. AluYa is more frequent on the Y chromosome than on other chromosomes; the same is true for LTR retroviruses. This pattern might be correlated with the evolutionary history of the Y chromosome.
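
As a rough illustration of the windowing step described in the abstract, the Python sketch below splits a sequence into non-overlapping windows, assigns each window to a GC content class, and reports the fraction of bases in each class that is covered by annotated repeats. The function names, the GC class boundaries, and the repeat-interval input format are assumptions made for illustration only; the abstract does not specify them, and the authors' actual procedure may differ.

```python
from collections import defaultdict


def gc_fraction(seq):
    """Return the G+C fraction among unambiguous bases, or None if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else None


def repeat_fraction_by_gc_class(seq, repeats, window=50_000,
                                edges=(0.37, 0.42, 0.47, 0.52)):
    """Fraction of bases covered by repeats in each GC content class.

    seq     -- DNA sequence as a string
    repeats -- list of (start, end) half-open intervals, assumed not to
               overlap one another (hypothetical input format)
    window  -- window size in bases (50 kb, as in the abstract)
    edges   -- hypothetical GC class boundaries, not taken from the paper
    """
    covered = defaultdict(int)   # repeat bases per GC class
    length = defaultdict(int)    # total window bases per GC class

    # Step through full, non-overlapping windows; a trailing partial
    # window is ignored in this sketch.
    for w_start in range(0, len(seq) - window + 1, window):
        w_end = w_start + window
        gc = gc_fraction(seq[w_start:w_end])
        if gc is None:
            continue  # window contains no unambiguous bases
        # Class index = number of boundaries at or below the window GC.
        cls = sum(gc >= e for e in edges)
        length[cls] += window
        for r_start, r_end in repeats:
            overlap = min(r_end, w_end) - max(r_start, w_start)
            if overlap > 0:
                covered[cls] += overlap

    return {cls: covered[cls] / length[cls] for cls in length}


if __name__ == "__main__":
    # Toy input: one AT-only window followed by one GC-only window.
    toy_seq = "AT" * 10 + "GC" * 10
    toy_repeats = [(5, 15), (18, 25)]
    print(repeat_fraction_by_gc_class(toy_seq, toy_repeats, window=20))
```

The inner loop over all repeat intervals is quadratic in the worst case, which is acceptable for a small example; a genome-scale analysis would more plausibly sort the intervals or use an interval index.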

Original language: English (US)
Pages (from-to): 81-88
Number of pages: 8
Journal: Gene
Volume: 259
Issue number: 1-2
DOI: 10.1016/S0378-1119(00)00434-0
State: Published - Dec 23 2000

Fingerprint

Nucleic Acid Repetitive Sequences
Base Composition
Human Genome
Y Chromosome
AT Rich Sequence
GC Rich Sequence
Mutation
DNA Transposable Elements
X Chromosome
Retroviridae
Chromosomes
History
Genome
Pressure

All Science Journal Classification (ASJC) codes

  • Genetics

Cite this

@article{0801003a3be04febbbb3d255c1749153,
title = "Densities, length proportions, and other distributional features of repetitive sequences in the human genome estimated from 430 megabases of genomic sequence",
abstract = "The densities of repetitive elements in the human genome were calculated in each GC content class using non-overlapping windows of 50kb. The density of Alu is two to three times higher in GC-rich regions than in AT-rich regions, while the opposite is true for LINE1. In contrast, LINE2 and other elements, such as DNA transposons, are more uniformly distributed in the genome. The number of Alus in the human genome was estimated to be 1.4 million, higher than previous estimates. About 40{\%} of the autosomes and ∼51{\%} of the X and Y chromosomes are occupied by repetitive elements. In total, the human genome is estimated to contain more than 4 million repetitive elements. The GC contents ({\%}) of repetitive elements and their flanking regions were also calculated. The GC contents of almost all kinds of repeats are positively correlated with the window GC contents, suggesting that a repetitive sequence is subject to the same mutation pressure as its surrounding regions, so it tends to have the same GC content as its surrounding regions. This observation supports the regional mutation hypothesis. The only two exceptions are AluYa and AluYb8, the two youngest Alu subfamilies. The GC content of AluYb8 is negatively correlated with that of its surrounding regions, while AluYa shows no correlation, suggesting different insertion patterns for these two young Alu subfamilies. This suggestion was supported by the fact that the average genetic distance between members of AluYb8 in each GC window class is positively correlated with the GC content of the window, but no correlation was found for AluYa. AluYa is more frequent in Y chromosome than in other chromosomes; the same is true for LTR retroviruses. This pattern might be correlated with the evolutionary history of Y chromosome.",
author = "Zhenglong Gu and Haidong Wang and Anton Nekrutenko and Li, {Wen Hsiung}",
year = "2000",
month = "12",
day = "23",
doi = "10.1016/S0378-1119(00)00434-0",
language = "English (US)",
volume = "259",
pages = "81--88",
journal = "Gene",
issn = "0378-1119",
publisher = "Elsevier",
number = "1-2",

}

Densities, length proportions, and other distributional features of repetitive sequences in the human genome estimated from 430 megabases of genomic sequence. / Gu, Zhenglong; Wang, Haidong; Nekrutenko, Anton; Li, Wen Hsiung.

In: Gene, Vol. 259, No. 1-2, 23.12.2000, p. 81-88.

TY - JOUR

T1 - Densities, length proportions, and other distributional features of repetitive sequences in the human genome estimated from 430 megabases of genomic sequence

AU - Gu, Zhenglong

AU - Wang, Haidong

AU - Nekrutenko, Anton

AU - Li, Wen Hsiung

PY - 2000/12/23

Y1 - 2000/12/23

N2 - The densities of repetitive elements in the human genome were calculated in each GC content class using non-overlapping windows of 50kb. The density of Alu is two to three times higher in GC-rich regions than in AT-rich regions, while the opposite is true for LINE1. In contrast, LINE2 and other elements, such as DNA transposons, are more uniformly distributed in the genome. The number of Alus in the human genome was estimated to be 1.4 million, higher than previous estimates. About 40% of the autosomes and ∼51% of the X and Y chromosomes are occupied by repetitive elements. In total, the human genome is estimated to contain more than 4 million repetitive elements. The GC contents (%) of repetitive elements and their flanking regions were also calculated. The GC contents of almost all kinds of repeats are positively correlated with the window GC contents, suggesting that a repetitive sequence is subject to the same mutation pressure as its surrounding regions, so it tends to have the same GC content as its surrounding regions. This observation supports the regional mutation hypothesis. The only two exceptions are AluYa and AluYb8, the two youngest Alu subfamilies. The GC content of AluYb8 is negatively correlated with that of its surrounding regions, while AluYa shows no correlation, suggesting different insertion patterns for these two young Alu subfamilies. This suggestion was supported by the fact that the average genetic distance between members of AluYb8 in each GC window class is positively correlated with the GC content of the window, but no correlation was found for AluYa. AluYa is more frequent in Y chromosome than in other chromosomes; the same is true for LTR retroviruses. This pattern might be correlated with the evolutionary history of Y chromosome.

UR - http://www.scopus.com/inward/record.url?scp=0034707206&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034707206&partnerID=8YFLogxK

U2 - 10.1016/S0378-1119(00)00434-0

DO - 10.1016/S0378-1119(00)00434-0

M3 - Article

C2 - 11163965

AN - SCOPUS:0034707206

VL - 259

SP - 81

EP - 88

JO - Gene

JF - Gene

SN - 0378-1119

IS - 1-2

ER -