CpG island identification with higher order and variable order Markov models

Zhenqiu Liu, Dechang Chen, Xue Wen Chen

Research output: Chapter in Book/Report/Conference proceeding > Chapter

1 Citation (Scopus)

Abstract

Identifying the location and function of human genes in a long genomic sequence is difficult because too little is known about the genes. Experimental evidence suggests a strong correlation between CpG islands and the genes immediately following them. Much research has been done to identify CpG islands in a DNA sequence using various models. In this chapter, we introduce two alternative models based on high order and variable order Markov chains. Compared with popular models such as the first order Markov chain, HMM, and HMT, these two models are much easier to compute and have higher identification accuracies. One unsolved problem with the Markov model is that there is no way to decide the exact boundary point between CpG-island and non-CpG-island regions. In this chapter, we provide a novel tool to decide the boundary points using the sequential probability test. Sequential data from GenBank are used for the experiments in this chapter.
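The abstract gives no implementation details, so the following is only an illustrative sketch of the two ideas it names: scoring a DNA window with a fixed higher order Markov chain, and stopping once the accumulated evidence is strong enough. It assumes add-one smoothing, a log-likelihood ratio between an "island" and a "background" model, and Wald's sequential probability ratio test with error rates `alpha` and `beta`; the chapter's own estimator, its variable order model, and its exact sequential test may differ. The function names and the toy training strings are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"  # sequences are assumed to contain only these four letters


def train_markov(sequences, order, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) with additive smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0
    model = {}
    for ctx in ("".join(p) for p in product(ALPHABET, repeat=order)):
        total = sum(counts[ctx][b] for b in ALPHABET) + pseudocount * len(ALPHABET)
        model[ctx] = {b: (counts[ctx][b] + pseudocount) / total for b in ALPHABET}
    return model


def log_odds(seq, island_model, background_model, order):
    """Summed log-likelihood ratio; a positive value favours the island model."""
    score = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        score += math.log(island_model[ctx][base] / background_model[ctx][base])
    return score


def sprt(seq, island_model, background_model, order, alpha=0.01, beta=0.01):
    """Wald's SPRT (an assumption here): stop at the first position where the
    running log-likelihood ratio crosses either decision threshold."""
    upper = math.log((1 - beta) / alpha)   # accept "island" at or above this
    lower = math.log(beta / (1 - alpha))   # accept "background" at or below this
    llr = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        llr += math.log(island_model[ctx][base] / background_model[ctx][base])
        if llr >= upper:
            return "island", i
        if llr <= lower:
            return "background", i
    return "undecided", len(seq) - 1


# Toy training data (hypothetical, not GenBank records).
ORDER = 2
island = train_markov(["CGCGCGCGCGCGCGCGCGCG"], ORDER)
background = train_markov(["ATATATATATATATATATAT"], ORDER)
```

With these toy models, `log_odds("CGCGCGCG", island, background, ORDER)` is positive and `sprt("ATATATATATAT", island, background, ORDER)` reports a background decision. The position returned by `sprt` is where the test stopped, which is one way a boundary estimate could be read off when scanning along a sequence.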

Original language: English (US)
Title of host publication: Springer Optimization and Its Applications
Publisher: Springer International Publishing
Pages: 47-56
Number of pages: 10
DOI: https://doi.org/10.1007/978-0-387-69319-4_4
State: Published - Jan 1 2007

Publication series

Name: Springer Optimization and Its Applications
Volume: 7
ISSN (Print): 1931-6828
ISSN (Electronic): 1931-6836

All Science Journal Classification (ASJC) codes

  • Control and Optimization

Cite this

Liu, Z., Chen, D., & Chen, X. W. (2007). CpG island identification with higher order and variable order Markov models. In Springer Optimization and Its Applications (pp. 47-56). (Springer Optimization and Its Applications; Vol. 7). Springer International Publishing. https://doi.org/10.1007/978-0-387-69319-4_4