Theoretical foundations for quantitative paleogenetics - PART III

The Molecular Divergence of Nucleic Acids and Proteins for the Case of Genetic Events of Unequal Probability

Richard Holmquist, Dennis Keith Pearl

Research output: Contribution to journal › Article

20 Citations (Scopus)

Abstract

REH theory is extended by deriving the theoretical equations that permit one to analyze the nonrandom molecular divergence of homologous genes and proteins. The nonrandomicities considered are amino acid and base composition, the frequencies with which each of the four nucleotides is replaced by one of the other three, unequal usage of degenerate codons, distribution of fixed base replacements at the three nucleotide positions within codons, and distributions of fixed base replacements among codons. The latter two distributions turn out to dominate the accuracy of genetic distance estimates. The negative binomial density is used to allow for the unequal mutability of different codon sites, and the implications of its two limiting forms, the Poisson and geometric distributions, are considered. It is shown that the fixation intensity - the average number of base replacements per variable codon - is expressible as the simple product of two factors, the first describing the asymmetry of the distribution of base replacements over the gene and the second defining the ratio of the average probability that a codon will fix a mutation to the probability that it will not. Tables are given relating these features to experimentally observable quantities in α hemoglobin, β hemoglobin, myoglobin, cytochrome c, and the parvalbumin group of proteins and to the structure of their corresponding genes or mRNAs. The principal results are (1) more accurate methods of estimating parameters of evolutionary interest from experimental gene and protein sequence data, and (2) the fact that change in gene and protein structure has been a much less efficient process than previously believed in the sense of requiring many more base replacements to effect a given structural change than earlier estimation procedures had indicated. This inefficiency is directly traceable to Darwinian selection for the nonrandom gene or protein structures necessary for biological function.
The application of these methods is illustrated by detailed consideration of the rabbit α- and β-hemoglobin mRNAs and the proteins for which they code. It is found that these two genes are separated by about 425 fixed base replacements, which is a factor of two greater than earlier estimates. The replacements are distributed over approximately 114 codon sites that were free to accept base mutations during the divergence of these two genes.
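
The two limiting forms of the negative binomial mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the authors' method: it assumes the standard textbook parametrization with a shape parameter `r` and a probability `p` (mean `r * p / (1 - p)`), which may differ from the parametrization used in the paper. The function names are hypothetical, and the mean used here is simply the abstract's 425 replacements divided by its 114 variable codon sites, not a value quoted from the paper.

```python
import math


def neg_binomial_pmf(k, r, p):
    """P(K = k) for a negative binomial with shape r > 0 and probability 0 < p < 1.

    Textbook form (an assumption, not taken from the paper):
        Gamma(k + r) / (k! * Gamma(r)) * (1 - p)**r * p**k,  mean = r * p / (1 - p)
    """
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log1p(-p) + k * math.log(p))


def geometric_pmf(k, p):
    """Geometric distribution on k = 0, 1, 2, ...: (1 - p) * p**k."""
    return (1 - p) * p ** k


def poisson_pmf(k, mean):
    """Poisson distribution with the given mean."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


# Illustrative mean: replacements per variable codon, from the abstract's figures.
mean = 425 / 114

# Limit 1: shape r = 1 reduces the negative binomial to the geometric distribution.
p1 = mean / (1 + mean)

# Limit 2: a very large shape with the mean held fixed approaches the Poisson.
r_big = 1e5
p_big = mean / (r_big + mean)

for k in range(6):
    print(
        k,
        round(neg_binomial_pmf(k, 1, p1), 6),
        round(geometric_pmf(k, p1), 6),
        round(neg_binomial_pmf(k, r_big, p_big), 6),
        round(poisson_pmf(k, mean), 6),
    )
```

In each printed row the first pair of values should agree (geometric limit) and the second pair should agree closely (Poisson limit). Intermediate values of `r` interpolate between the two, which is the sense in which the shape parameter describes how unevenly replacements are spread over codon sites.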

Original language: English (US)
Pages (from-to): 211-267
Number of pages: 57
Journal: Journal of Molecular Evolution
Volume: 16
Issue number: 3-4
DOI: 10.1007/BF01804977
State: Published - Sep 1 1980

Fingerprint

nucleic acid
Codon
Nucleic Acids
nucleic acids
codons
divergence
replacement
protein
gene
Proteins
hemoglobin
proteins
Hemoglobins
genes
protein structure
Genes
Nucleotides
mutation
Cytochrome c Group
Poisson Distribution

All Science Journal Classification (ASJC) codes

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics

Cite this

@article{eb979dec3d2d4ca68366a8b9c59160b4,
title = "Theoretical foundations for quantitative paleogenetics - PART III: The Molecular Divergence of Nucleic Acids and Proteins for the Case of Genetic Events of Unequal Probability",
abstract = "REH theory is extended by deriving the theoretical equations that permit one to analyze the nonrandom molecular divergence of homologous genes and proteins. The nonrandomicities considered are amino acid and base composition, the frequencies with which each of the four nucleotides is replaced by one of the other three, unequal usage of degenerate codons, distribution of fixed base replacements at the three nucleotide positions within codons, and distributions of fixed base replacements among codons. The latter two distributions turn out to dominate the accuracy of genetic distance estimates. The negative binomial density is used to allow for the unequal mutability of different codon sites, and the implications of its two limiting forms, the Poisson and geometric distributions, are considered. It is shown that the fixation intensity - the average number of base replacements per variable codon - is expressible as the simple product of two factors, the first describing the asymmetry of the distribution of base replacements over the gene and the second defining the ratio of the average probability that a codon will fix a mutation to the probability that it will not. Tables are given relating these features to experimentally observable quantities in α hemoglobin, β hemoglobin, myoglobin, cytochrome c, and the parvalbumin group of proteins and to the structure of their corresponding genes or mRNAs. The principal results are (1) more accurate methods of estimating parameters of evolutionary interest from experimental gene and protein sequence data, and (2) the fact that change in gene and protein structure has been a much less efficient process than previously believed in the sense of requiring many more base replacements to effect a given structural change than earlier estimation procedures had indicated. This inefficiency is directly traceable to Darwinian selection for the nonrandom gene or protein structures necessary for biological function.
The application of these methods is illustrated by detailed consideration of the rabbit α- and β-hemoglobin mRNAs and the proteins for which they code. It is found that these two genes are separated by about 425 fixed base replacements, which is a factor of two greater than earlier estimates. The replacements are distributed over approximately 114 codon sites that were free to accept base mutations during the divergence of these two genes.",
author = "Holmquist, Richard and Pearl, {Dennis Keith}",
year = "1980",
month = "9",
day = "1",
doi = "10.1007/BF01804977",
language = "English (US)",
volume = "16",
pages = "211--267",
journal = "Journal of Molecular Evolution",
issn = "0022-2844",
publisher = "Springer New York",
number = "3-4",

}

TY - JOUR

T1 - Theoretical foundations for quantitative paleogenetics - PART III

T2 - The Molecular Divergence of Nucleic Acids and Proteins for the Case of Genetic Events of Unequal Probability

AU - Holmquist, Richard

AU - Pearl, Dennis Keith

PY - 1980/9/1

Y1 - 1980/9/1

N2 - REH theory is extended by deriving the theoretical equations that permit one to analyze the nonrandom molecular divergence of homologous genes and proteins. The nonrandomicities considered are amino acid and base composition, the frequencies with which each of the four nucleotides is replaced by one of the other three, unequal usage of degenerate codons, distribution of fixed base replacements at the three nucleotide positions within codons, and distributions of fixed base replacements among codons. The latter two distributions turn out to dominate the accuracy of genetic distance estimates. The negative binomial density is used to allow for the unequal mutability of different codon sites, and the implications of its two limiting forms, the Poisson and geometric distributions, are considered. It is shown that the fixation intensity - the average number of base replacements per variable codon - is expressible as the simple product of two factors, the first describing the asymmetry of the distribution of base replacements over the gene and the second defining the ratio of the average probability that a codon will fix a mutation to the probability that it will not. Tables are given relating these features to experimentally observable quantities in α hemoglobin, β hemoglobin, myoglobin, cytochrome c, and the parvalbumin group of proteins and to the structure of their corresponding genes or mRNAs. The principal results are (1) more accurate methods of estimating parameters of evolutionary interest from experimental gene and protein sequence data, and (2) the fact that change in gene and protein structure has been a much less efficient process than previously believed in the sense of requiring many more base replacements to effect a given structural change than earlier estimation procedures had indicated. This inefficiency is directly traceable to Darwinian selection for the nonrandom gene or protein structures necessary for biological function.
The application of these methods is illustrated by detailed consideration of the rabbit α- and β-hemoglobin mRNAs and the proteins for which they code. It is found that these two genes are separated by about 425 fixed base replacements, which is a factor of two greater than earlier estimates. The replacements are distributed over approximately 114 codon sites that were free to accept base mutations during the divergence of these two genes.

UR - http://www.scopus.com/inward/record.url?scp=0019126419&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0019126419&partnerID=8YFLogxK

U2 - 10.1007/BF01804977

DO - 10.1007/BF01804977

M3 - Article

VL - 16

SP - 211

EP - 267

JO - Journal of Molecular Evolution

JF - Journal of Molecular Evolution

SN - 0022-2844

IS - 3-4

ER -