Self-organizing maps of position weight matrices for motif discovery in biological sequences

Shaun A. Mahony, David Hendrix, Terry J. Smith, Aaron Golden

Research output: Contribution to journal (Article)

10 Citations (Scopus)

Abstract

The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated.
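The core idea in the abstract, a SOM whose node weight vectors are replaced by position weight matrices (PWMs), can be illustrated with a minimal sketch. The abstract does not give the scoring function, neighbourhood function, learning schedule, or update rule, so everything below is an assumption made for illustration: log-likelihood scoring, a Gaussian neighbourhood on a rectangular grid, linearly decaying rates, and an online update that nudges each PWM column toward the observed base. The names `train`, `best_node`, `log_score`, `random_pwm`, and `kmers` are hypothetical and are not taken from the paper.

```python
import math
import random

BASES = "ACGT"


def kmers(seq, w):
    """All overlapping windows of width w in a sequence."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]


def random_pwm(w, rng):
    """A width-w PWM: each row is a probability distribution over A, C, G, T."""
    pwm = []
    for _ in range(w):
        row = [rng.uniform(0.5, 1.5) for _ in BASES]
        total = sum(row)
        pwm.append([x / total for x in row])
    return pwm


def log_score(pwm, kmer):
    """Log-likelihood of a k-mer under a PWM (tiny offset guards against log(0))."""
    return sum(math.log(pwm[i][BASES.index(b)] + 1e-9) for i, b in enumerate(kmer))


def best_node(som, kmer):
    """Grid coordinate of the node whose PWM scores the k-mer highest."""
    return max(som, key=lambda pos: log_score(som[pos], kmer))


def train(seqs, w, rows=3, cols=3, epochs=20, seed=0):
    """Train a rows x cols grid of PWMs on all width-w windows of seqs.

    Assumed scheme (not from the paper): online updates with a Gaussian
    neighbourhood whose radius and learning rate shrink linearly over epochs.
    """
    rng = random.Random(seed)
    som = {(r, c): random_pwm(w, rng) for r in range(rows) for c in range(cols)}
    data = [k for s in seqs for k in kmers(s, w)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        alpha = 0.5 * frac + 0.01                 # learning rate, always below 1
        sigma = max(rows, cols) / 2.0 * frac + 0.3  # neighbourhood radius
        rng.shuffle(data)
        for kmer in data:
            br, bc = best_node(som, kmer)
            for (r, c), pwm in som.items():
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = alpha * math.exp(-d2 / (2 * sigma ** 2))
                for i, b in enumerate(kmer):
                    hit = BASES.index(b)
                    for j in range(4):
                        target = 1.0 if j == hit else 0.0
                        # Convex step toward the one-hot base keeps each row normalised.
                        pwm[i][j] += h * (target - pwm[i][j])
    return som
```

In this sketch, calling `train(sequences, w=6)` returns a dictionary mapping grid coordinates to PWMs. Nodes that win many windows end up with sharply peaked columns, which is one way such a map could flag candidate overrepresented motifs; a real analysis would still need a background model and a significance measure, neither of which is shown here.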

Original language: English (US)
Pages (from-to): 397-413
Number of pages: 17
Journal: Artificial Intelligence Review
Volume: 24
Issue number: 3-4
DOI: 10.1007/s10462-005-9011-9
State: Published - Nov 1 2005

Fingerprint

Self organizing maps
learning theory
Neurons
Learning systems
biology
present
learning
Self-organizing Map
Motifs

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Artificial Intelligence

Cite this

Mahony, Shaun A. ; Hendrix, David ; Smith, Terry J. ; Golden, Aaron. / Self-organizing maps of position weight matrices for motif discovery in biological sequences. In: Artificial Intelligence Review. 2005 ; Vol. 24, No. 3-4. pp. 397-413.
@article{e8d27efb80e149928b524a43f192d499,
title = "Self-organizing maps of position weight matrices for motif discovery in biological sequences",
abstract = "The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated.",
author = "Mahony, {Shaun A.} and David Hendrix and Smith, {Terry J.} and Aaron Golden",
year = "2005",
month = "11",
day = "1",
doi = "10.1007/s10462-005-9011-9",
language = "English (US)",
volume = "24",
pages = "397--413",
journal = "Artificial Intelligence Review",
issn = "0269-2821",
publisher = "Springer Netherlands",
number = "3-4",

}

TY - JOUR

T1 - Self-organizing maps of position weight matrices for motif discovery in biological sequences

AU - Mahony, Shaun A.

AU - Hendrix, David

AU - Smith, Terry J.

AU - Golden, Aaron

PY - 2005/11/1

Y1 - 2005/11/1

N2 - The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated.

UR - http://www.scopus.com/inward/record.url?scp=29144501231&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=29144501231&partnerID=8YFLogxK

U2 - 10.1007/s10462-005-9011-9

DO - 10.1007/s10462-005-9011-9

M3 - Article

AN - SCOPUS:29144501231

VL - 24

SP - 397

EP - 413

JO - Artificial Intelligence Review

JF - Artificial Intelligence Review

SN - 0269-2821

IS - 3-4

ER -