A Lin-Kernighan heuristic for the DCJ median problem of genomes with unequal contents

Zhaoming Yin, Jijun Tang, Stephen Wade Schaeffer, David A. Bader

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we designed a distance metric, the DCJ-Indel-Exemplar distance, to estimate the dissimilarity between two genomes with unequal contents, that is, genomes that differ by gene insertions/deletions (Indels) and duplications. Based on this metric, we proposed the DCJ-Indel-Exemplar median problem: find a median genome that minimizes the total DCJ-Indel-Exemplar distance between it and three given genomes. We adapted the Lin-Kernighan (LK) heuristic to compute the median quickly, using adequate sub-graph decomposition and search space reduction techniques. Experimental results on simulated gene order data indicate that our distance estimator closely estimates the true number of rearrangement events, and that on equal-content genomes our median solver is very accurate compared with the exact solver. More importantly, our median solver can handle Indels and duplications and generates results very close to the synthetic cumulative number of evolutionary events.
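
As a rough illustration of the quantities the abstract refers to, the sketch below computes the classic DCJ distance for the simplest setting only: two genomes with equal gene content, each a single circular chromosome, where the distance is the number of genes minus the number of cycles in the adjacency graph. It is not the DCJ-Indel-Exemplar distance or the LK median solver from the paper. The function names `adjacencies`, `dcj_distance`, and `median_score` and the signed-integer genome encoding are assumptions made for this example.

```python
def adjacencies(genome):
    """Map every gene extremity to the extremity adjacent to it.

    `genome` is one circular chromosome given as a list of signed
    integers (assumed encoding): +g is read tail->head, -g head->tail.
    """
    def left(g):   # extremity reached first when reading the gene
        return (abs(g), "t" if g > 0 else "h")

    def right(g):  # extremity reached last when reading the gene
        return (abs(g), "h" if g > 0 else "t")

    adj = {}
    n = len(genome)
    for i, g in enumerate(genome):
        nxt = genome[(i + 1) % n]      # circular: last gene wraps to first
        a, b = right(g), left(nxt)
        adj[a] = b
        adj[b] = a
    return adj


def dcj_distance(a, b):
    """Classic DCJ distance for equal-content circular genomes: n - cycles."""
    if sorted(map(abs, a)) != sorted(map(abs, b)):
        raise ValueError("this sketch requires equal gene content")
    adj_a, adj_b = adjacencies(a), adjacencies(b)
    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk the cycle, alternating an adjacency of `a` and one of `b`.
        while x not in seen:
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]
    return len(a) - cycles


def median_score(candidate, genomes):
    """Objective of the median problem: total distance to the given genomes."""
    return sum(dcj_distance(candidate, g) for g in genomes)


if __name__ == "__main__":
    print(dcj_distance([1, 2, 3], [1, -2, 3]))  # one inversion -> 1
    print(median_score([1, 2, 3], [[1, 2, 3], [1, -2, 3], [1, 2, -3]]))
```

A median solver searches for the candidate that minimizes `median_score`. Handling Indels and duplications, as the paper does, needs a different distance than the one sketched here.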

Original language: English (US)
Title of host publication: Computing and Combinatorics - 20th International Conference, COCOON 2014, Proceedings
Publisher: Springer Verlag
Pages: 227-238
Number of pages: 12
ISBN (Print): 9783319087825
DOI: https://doi.org/10.1007/978-3-319-08783-2_20
State: Published - Jan 1 2014
Event: 20th International Computing and Combinatorics Conference, COCOON 2014 - Atlanta, GA, United States
Duration: Aug 4 2014 to Aug 6 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8591 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 20th International Computing and Combinatorics Conference, COCOON 2014
Country: United States
City: Atlanta, GA
Period: 8/4/14 to 8/6/14

Fingerprint

Unequal
Genome
Genes
Heuristics
Distance Metric
Duplication
Deletion
Insertion
Graph Decomposition
Gene
Graph Search
Dissimilarity
Rearrangement
Estimate
Search Space
Subgraph
Minimise
Estimator
Calculate
Experimental Results

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Yin, Z., Tang, J., Schaeffer, S. W., & Bader, D. A. (2014). A Lin-Kernighan heuristic for the DCJ median problem of genomes with unequal contents. In Computing and Combinatorics - 20th International Conference, COCOON 2014, Proceedings (pp. 227-238). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8591 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-08783-2_20
@inproceedings{50adbf8aac814012900aad81dcad4cc4,
title = "A {L}in-{K}ernighan heuristic for the {DCJ} median problem of genomes with unequal contents",
author = "Zhaoming Yin and Jijun Tang and Schaeffer, {Stephen Wade} and Bader, {David A.}",
year = "2014",
month = "1",
day = "1",
doi = "10.1007/978-3-319-08783-2_20",
language = "English (US)",
isbn = "9783319087825",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "227--238",
booktitle = "Computing and Combinatorics - 20th International Conference, COCOON 2014, Proceedings",
address = "Germany",

}
