Topological entropy of DNA sequences

    Research output: Contribution to journal › Article

    29 Citations (Scopus)

    Abstract

    Motivation: Topological entropy has been one of the most difficult to implement of all the entropy-theoretic notions. This is primarily due to finite sample effects and high-dimensionality problems. In particular, topological entropy has been implemented in previous literature to conclude that the entropy of exons is higher than that of introns, thus implying that exons are more 'random' than introns.

    Results: We define a new approximation to topological entropy free from the aforementioned difficulties. We compute its expected value and apply this definition to the intron and exon regions of the human genome to observe that, as expected, the entropy of introns is significantly higher than that of exons. We also find that introns are less random than expected: their entropy is lower than the computed expected value. We also observe the perplexing phenomenon that introns on chromosome Y have atypically low and bimodal entropy, possibly corresponding to random sequences (high entropy) and sequences that possess hidden structure or function (low entropy).
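
    The abstract does not spell out the approximation itself. As a rough sketch only, assuming the finite-sequence definition usually associated with this paper (pick the largest n with 4^n + n - 1 <= |w|, truncate w to its first 4^n + n - 1 letters, count the distinct length-n subwords, and divide the base-4 logarithm of that count by n), the computation might look like the following. The function name `topological_entropy` and the `alphabet_size` parameter are illustrative and do not come from the source.

```python
import math


def topological_entropy(seq, alphabet_size=4):
    """Approximate topological entropy of a finite sequence.

    Assumed definition (not stated in the abstract): choose the largest
    subword length n with alphabet_size**n + n - 1 <= len(seq), keep only
    the first alphabet_size**n + n - 1 symbols, and return
    log_base_alphabet(number of distinct n-mers) / n.
    """
    length = len(seq)

    # Find the largest n whose required window still fits in the sequence.
    n = 0
    while alphabet_size ** (n + 1) + (n + 1) - 1 <= length:
        n += 1
    if n == 0:
        raise ValueError("sequence is too short for subword length 1")

    # Truncating makes sequences of different lengths comparable:
    # the window holds exactly alphabet_size**n overlapping n-mers.
    window = seq[: alphabet_size ** n + n - 1]
    distinct = {window[i:i + n] for i in range(len(window) - n + 1)}

    return math.log(len(distinct), alphabet_size) / n


if __name__ == "__main__":
    # A 17-letter string that contains each of the 16 possible 2-mers once.
    print(topological_entropy("AACAGATCCGCTGGTTA"))
    # A homopolymer has a single distinct 2-mer.
    print(topological_entropy("A" * 17))
```

    Under this assumed definition the value lies between 0 and 1: a window containing every possible n-mer scores close to 1, and a homopolymer scores 0. Fixing the window length from n is what sidesteps the finite sample effect mentioned in the Motivation, since every sequence is judged on the same number of subwords.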

    Original language: English (US)
    Article number: btr077
    Pages (from-to): 1061-1067
    Number of pages: 7
    Journal: Bioinformatics
    Volume: 27
    Issue number: 8
    DOI: 10.1093/bioinformatics/btr077
    State: Published - Apr 1 2011

    Fingerprint

    Topological Entropy
    DNA sequences
    Entropy
    DNA Sequence
    Introns
    Exons
    Expected Value
    Random Sequence
    Bimodal
    Chromosome
    Dimensionality
    Genome
    Y Chromosome
    Human Genome
    Chromosomes
    Genes
    Approximation

    All Science Journal Classification (ASJC) codes

    • Statistics and Probability
    • Biochemistry
    • Molecular Biology
    • Computer Science Applications
    • Computational Theory and Mathematics
    • Computational Mathematics

    Cite this

    Koslicki, David. / Topological entropy of DNA sequences. In: Bioinformatics. 2011 ; Vol. 27, No. 8. pp. 1061-1067.
    @article{d8c22b0d6e074f1796181dcb395b0044,
    title = "Topological entropy of DNA sequences",
    author = "David Koslicki",
    year = "2011",
    month = "4",
    day = "1",
    doi = "10.1093/bioinformatics/btr077",
    language = "English (US)",
    volume = "27",
    pages = "1061--1067",
    journal = "Bioinformatics",
    issn = "1367-4803",
    publisher = "Oxford University Press",
    number = "8",

    }

    TY - JOUR

    T1 - Topological entropy of DNA sequences

    AU - Koslicki, David

    PY - 2011/4/1

    Y1 - 2011/4/1

    UR - http://www.scopus.com/inward/record.url?scp=79954465218&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=79954465218&partnerID=8YFLogxK

    U2 - 10.1093/bioinformatics/btr077

    DO - 10.1093/bioinformatics/btr077

    M3 - Article

    C2 - 21317142

    AN - SCOPUS:79954465218

    VL - 27

    SP - 1061

    EP - 1067

    JO - Bioinformatics

    JF - Bioinformatics

    SN - 1367-4803

    IS - 8

    M1 - btr077

    ER -