RNBL-MN: A recursive Naive Bayes learner for sequence classification

Dae Ki Kang, Adrian Silvescu, Vasant Honavar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

The Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real-world classification tasks. We describe RNBL-MN, which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on a multinomial event model (one for each class at each node in the tree). In our experiments on protein sequence and text classification tasks, we observe that RNBL-MN substantially outperforms the NB classifier. Furthermore, our experiments show that RNBL-MN outperforms the C4.5 decision tree learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies that are comparable to those of support vector machines (SVM) using similar information.
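The idea of a tree of multinomial NB classifiers can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how instances are routed to child nodes or when recursion stops, so the sketch assumes that each node partitions its training data by the node's own predicted class and stops at a fixed depth. The names `kgram_counts`, `MultinomialNB`, `RecursiveNB`, `max_depth`, and `k` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def kgram_counts(seq, k=1):
    """Count overlapping k-grams in a sequence (its composition statistics)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


class MultinomialNB:
    """Naive Bayes with a multinomial event model and Laplace smoothing."""

    def fit(self, seqs, labels, k=1):
        self.k = k
        self.classes = sorted(set(labels))
        self.vocab = set()
        self.counts = {c: Counter() for c in self.classes}
        self.class_counts = Counter(labels)
        self.n = len(labels)
        for seq, label in zip(seqs, labels):
            grams = kgram_counts(seq, k)
            self.counts[label].update(grams)
            self.vocab.update(grams)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, seq):
        grams = kgram_counts(seq, self.k)
        v = len(self.vocab)

        def score(c):
            # log prior + sum of count-weighted log likelihoods
            s = math.log(self.class_counts[c] / self.n)
            for gram, cnt in grams.items():
                if gram in self.vocab:  # k-grams unseen in training are skipped
                    p = (self.counts[c][gram] + 1) / (self.totals[c] + v)
                    s += cnt * math.log(p)
            return s

        return max(self.classes, key=score)


class RecursiveNB:
    """Tree of NB classifiers. ASSUMPTION: data is split by predicted class
    and recursion stops at max_depth; the abstract specifies neither."""

    def __init__(self, max_depth=2, k=1):
        self.max_depth = max_depth
        self.k = k

    def fit(self, seqs, labels, depth=0):
        self.model = MultinomialNB().fit(seqs, labels, self.k)
        self.children = {}
        if depth >= self.max_depth:
            return self
        parts = defaultdict(list)
        for seq, label in zip(seqs, labels):
            parts[self.model.predict(seq)].append((seq, label))
        for pred, items in parts.items():
            ys = [y for _, y in items]
            # Recurse only on impure partitions that are strictly smaller.
            if len(set(ys)) > 1 and len(items) < len(seqs):
                child = RecursiveNB(self.max_depth, self.k)
                child.fit([s for s, _ in items], ys, depth + 1)
                self.children[pred] = child
        return self

    def predict(self, seq):
        pred = self.model.predict(seq)
        child = self.children.get(pred)
        return child.predict(seq) if child else pred
```

With `max_depth=0` the sketch reduces to a single NB classifier, which is the single-generative-model-per-class baseline that the abstract contrasts RNBL-MN against.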

Original language: English (US)
Title of host publication: Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings
Pages: 45-54
Number of pages: 10
State: Published - Jul 14 2006
Event: 10th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2006 - Singapore, Singapore
Duration: Apr 9 2006 → Apr 12 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3918 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 10th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2006
Country: Singapore
City: Singapore
Period: 4/9/06 → 4/12/06

Fingerprint

Naive Bayes Classifier
Naive Bayes
Classifiers
Generative Models
Text Classification
Protein Sequence
Decision trees
Decision tree
Experiment
Support vector machines
Support Vector Machine
Experiments
Statistics
Proteins
Vertex of a graph
Chemical analysis
Class

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Kang, D. K., Silvescu, A., & Honavar, V. (2006). RNBL-MN: A recursive Naive Bayes learner for sequence classification. In Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings (pp. 45-54). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3918 LNAI).
Kang, Dae Ki ; Silvescu, Adrian ; Honavar, Vasant. / RNBL-MN : A recursive Naive Bayes learner for sequence classification. Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings. 2006. pp. 45-54 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{0df98c14e09e46cb9522795ad2dd45bc,
title = "RNBL-MN: A recursive Naive Bayes learner for sequence classification",
abstract = "The Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real-world classification tasks. We describe RNBL-MN, which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on a multinomial event model (one for each class at each node in the tree). In our experiments on protein sequence and text classification tasks, we observe that RNBL-MN substantially outperforms the NB classifier. Furthermore, our experiments show that RNBL-MN outperforms the C4.5 decision tree learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies that are comparable to those of support vector machines (SVM) using similar information.",
author = "Kang, {Dae Ki} and Adrian Silvescu and Vasant Honavar",
year = "2006",
month = "7",
day = "14",
language = "English (US)",
isbn = "3540332065",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "45--54",
booktitle = "Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings",

}

Kang, DK, Silvescu, A & Honavar, V 2006, RNBL-MN: A recursive Naive Bayes learner for sequence classification. in Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3918 LNAI, pp. 45-54, 10th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2006, Singapore, Singapore, 4/9/06.

TY - GEN
T1 - RNBL-MN
T2 - A recursive Naive Bayes learner for sequence classification
AU - Kang, Dae Ki
AU - Silvescu, Adrian
AU - Honavar, Vasant
PY - 2006/7/14
Y1 - 2006/7/14
N2 - The Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real-world classification tasks. We describe RNBL-MN, which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on a multinomial event model (one for each class at each node in the tree). In our experiments on protein sequence and text classification tasks, we observe that RNBL-MN substantially outperforms the NB classifier. Furthermore, our experiments show that RNBL-MN outperforms the C4.5 decision tree learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies that are comparable to those of support vector machines (SVM) using similar information.
UR - http://www.scopus.com/inward/record.url?scp=33745766242&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745766242&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33745766242
SN - 3540332065
SN - 9783540332060
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 45
EP - 54
BT - Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings
ER -

Kang DK, Silvescu A, Honavar V. RNBL-MN: A recursive Naive Bayes learner for sequence classification. In Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings. 2006. p. 45-54. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).