Segmenting the human genome based on states of neutral genetic divergence

Prabhani Kuruppumullage Don, Guruprasad Ananda, Francesca Chiaromonte, Kateryna D. Makova

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

Many studies have demonstrated that divergence levels generated by different mutation types vary and covary across the human genome. To improve our still-incomplete understanding of the mechanistic basis of this phenomenon, we analyze several mutation types simultaneously, anchoring their variation to specific regions of the genome. Using hidden Markov models on insertion, deletion, nucleotide substitution, and microsatellite divergence estimates inferred from human-orangutan alignments of neutrally evolving genomic sequences, we segment the human genome into regions corresponding to different divergence states, each uniquely characterized by specific combinations of divergence levels. We then parse the mutagenic contributions of various biochemical processes, associating divergence states with a broad range of genomic landscape features. We find that high divergence states inhabit guanine- and cytosine (GC)-rich, highly recombining subtelomeric regions; low divergence states cover inner parts of autosomes; chromosome X forms its own state with lowest divergence; and a state of elevated microsatellite mutability is interspersed across the genome. These general trends are mirrored in human diversity data from the 1000 Genomes Project, and departures from them highlight the evolutionary history of primate chromosomes. We also find that genes and noncoding functional marks [annotations from the Encyclopedia of DNA Elements (ENCODE)] are concentrated in high divergence states. Our results provide a powerful tool for biomedical data analysis: segmentations can be used to screen personal genome variants, including those associated with cancer and other diseases, and to improve computational predictions of noncoding functional elements.
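The segmentation idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' model: the paper fits hidden Markov models to four divergence estimates jointly, whereas this example assumes a single divergence signal already discretized to 0 (low) or 1 (high) per genomic window, two hypothetical states, and made-up start, transition, and emission probabilities. It decodes the most likely state path with the Viterbi algorithm and collapses it into contiguous segments.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state path for a sequence of observations."""
    # best[t][s]: best log-probability of any path that ends in state s at window t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t][s]: best predecessor of state s at window t + 1
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def segments(path):
    """Collapse a per-window state path into (start, end, state) runs; end is exclusive."""
    runs, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            runs.append((start, i, path[start]))
            start = i
    return runs


def logs(probs):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in probs.items()}


# Hypothetical two-state model; all numbers are illustrative assumptions.
STATES = ("low", "high")
LOG_START = logs({"low": 0.5, "high": 0.5})
LOG_TRANS = {
    "low": logs({"low": 0.9, "high": 0.1}),
    "high": logs({"low": 0.1, "high": 0.9}),
}
LOG_EMIT = {
    "low": logs({0: 0.8, 1: 0.2}),
    "high": logs({0: 0.2, 1: 0.8}),
}

# Toy input: one discretized divergence level per consecutive genomic window.
windows = [0, 0, 0, 1, 1, 1, 1, 0, 0]
path = viterbi(windows, STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(segments(path))
```

In a realistic setting the emissions would be multivariate and the parameters estimated from data rather than fixed by hand; the decoding and run-collapsing steps would stay the same in spirit.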

Original language: English (US)
Pages (from-to): 14699-14704
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 110
Issue number: 36
DOI: 10.1073/pnas.1221792110
State: Published - Sep 3 2013

Fingerprint

Human Genome
Genome
Microsatellite Repeats
Biochemical Phenomena
Pongo
Encyclopedias
Mutation
Cytosine
Guanine
X Chromosome
Primates
Nucleotides
Chromosomes
History
DNA
Genes
Neoplasms

All Science Journal Classification (ASJC) codes

  • General

Cite this

@article{932b90a3d494453a93b8839452e7dbfb,
title = "Segmenting the human genome based on states of neutral genetic divergence",
abstract = "Many studies have demonstrated that divergence levels generated by different mutation types vary and covary across the human genome. To improve our still-incomplete understanding of the mechanistic basis of this phenomenon, we analyze several mutation types simultaneously, anchoring their variation to specific regions of the genome. Using hidden Markov models on insertion, deletion, nucleotide substitution, and microsatellite divergence estimates inferred from human-orangutan alignments of neutrally evolving genomic sequences, we segment the human genome into regions corresponding to different divergence states, each uniquely characterized by specific combinations of divergence levels. We then parse the mutagenic contributions of various biochemical processes, associating divergence states with a broad range of genomic landscape features. We find that high divergence states inhabit guanine- and cytosine (GC)-rich, highly recombining subtelomeric regions; low divergence states cover inner parts of autosomes; chromosome X forms its own state with lowest divergence; and a state of elevated microsatellite mutability is interspersed across the genome. These general trends are mirrored in human diversity data from the 1000 Genomes Project, and departures from them highlight the evolutionary history of primate chromosomes. We also find that genes and noncoding functional marks [annotations from the Encyclopedia of DNA Elements (ENCODE)] are concentrated in high divergence states. Our results provide a powerful tool for biomedical data analysis: segmentations can be used to screen personal genome variants, including those associated with cancer and other diseases, and to improve computational predictions of noncoding functional elements.",
author = "Don, {Prabhani Kuruppumullage} and Guruprasad Ananda and Francesca Chiaromonte and Makova, {Kateryna D.}",
year = "2013",
month = "9",
day = "3",
doi = "10.1073/pnas.1221792110",
language = "English (US)",
volume = "110",
pages = "14699--14704",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "36",

}

Segmenting the human genome based on states of neutral genetic divergence. / Don, Prabhani Kuruppumullage; Ananda, Guruprasad; Chiaromonte, Francesca; Makova, Kateryna D.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 110, No. 36, 03.09.2013, p. 14699-14704.

TY - JOUR

T1 - Segmenting the human genome based on states of neutral genetic divergence

AU - Don, Prabhani Kuruppumullage

AU - Ananda, Guruprasad

AU - Chiaromonte, Francesca

AU - Makova, Kateryna D.

PY - 2013/9/3

Y1 - 2013/9/3

N2 - Many studies have demonstrated that divergence levels generated by different mutation types vary and covary across the human genome. To improve our still-incomplete understanding of the mechanistic basis of this phenomenon, we analyze several mutation types simultaneously, anchoring their variation to specific regions of the genome. Using hidden Markov models on insertion, deletion, nucleotide substitution, and microsatellite divergence estimates inferred from human-orangutan alignments of neutrally evolving genomic sequences, we segment the human genome into regions corresponding to different divergence states, each uniquely characterized by specific combinations of divergence levels. We then parse the mutagenic contributions of various biochemical processes, associating divergence states with a broad range of genomic landscape features. We find that high divergence states inhabit guanine- and cytosine (GC)-rich, highly recombining subtelomeric regions; low divergence states cover inner parts of autosomes; chromosome X forms its own state with lowest divergence; and a state of elevated microsatellite mutability is interspersed across the genome. These general trends are mirrored in human diversity data from the 1000 Genomes Project, and departures from them highlight the evolutionary history of primate chromosomes. We also find that genes and noncoding functional marks [annotations from the Encyclopedia of DNA Elements (ENCODE)] are concentrated in high divergence states. Our results provide a powerful tool for biomedical data analysis: segmentations can be used to screen personal genome variants, including those associated with cancer and other diseases, and to improve computational predictions of noncoding functional elements.

UR - http://www.scopus.com/inward/record.url?scp=84883334253&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883334253&partnerID=8YFLogxK

U2 - 10.1073/pnas.1221792110

DO - 10.1073/pnas.1221792110

M3 - Article

C2 - 23959903

AN - SCOPUS:84883334253

VL - 110

SP - 14699

EP - 14704

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 36

ER -