Unsupervised learning of probabilistic context-free grammar using iterative biclustering

Kewei Tu, Vasant Honavar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that this procedure uses a greedy approach to adding rules such that each set of rules that is added to the grammar results in the largest increase in the posterior of the grammar given the training corpus. Results of our experiments on several benchmark datasets show that PCFG-BCL is competitive with existing methods for unsupervised CFG learning.
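The abstract describes the procedure only at a high level, so the following is a minimal sketch of one iteration under stated assumptions, not the authors' implementation. It tabulates bigrams from a toy corpus, takes a hand-picked bicluster (a set of left symbols and a set of right symbols), and rewrites the corpus by replacing each covered bigram with a new nonterminal. The names `bigram_counts`, `reduce_corpus`, and `N1`, the toy corpus, and the use of raw coverage as a score are all illustrative assumptions; the paper selects biclusters by the increase in the posterior of the grammar, which is not reproduced here.

```python
from collections import Counter


def bigram_counts(corpus):
    """Count adjacent symbol pairs (left, right) over all sentences."""
    counts = Counter()
    for sentence in corpus:
        counts.update(zip(sentence, sentence[1:]))
    return counts


def reduce_corpus(corpus, rows, cols, symbol):
    """Replace each adjacent pair (r, c), r in rows and c in cols, by `symbol`."""
    reduced = []
    for sentence in corpus:
        out, i = [], 0
        while i < len(sentence):
            is_pair = (
                i + 1 < len(sentence)
                and sentence[i] in rows
                and sentence[i + 1] in cols
            )
            if is_pair:
                out.append(symbol)
                i += 2
            else:
                out.append(sentence[i])
                i += 1
        reduced.append(out)
    return reduced


# Hypothetical toy corpus of positive samples.
corpus = [
    "the cat sleeps".split(),
    "the dog sleeps".split(),
    "a cat sleeps".split(),
    "a dog sleeps".split(),
]

counts = bigram_counts(corpus)

# Hand-picked candidate bicluster: left symbols x right symbols.
rows, cols = {"the", "a"}, {"cat", "dog"}

# Stand-in score (an assumption): number of bigram occurrences covered.
covered = sum(counts[(r, c)] for r in rows for c in cols)

# Rewriting the corpus exposes new bigrams for the next iteration.
reduced = reduce_corpus(corpus, rows, cols, "N1")
next_counts = bigram_counts(reduced)
```

In this sketch the chosen bicluster would correspond to rules of the form N1 → A B with A → the | a and B → cat | dog, and the rewritten corpus contains only the bigram (N1, sleeps), which a further iteration could cluster in turn.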

Original language: English (US)
Title of host publication: Grammatical Inference
Subtitle of host publication: Algorithms and Applications - 9th International Colloquium, ICGI 2008, Proceedings
Pages: 224-237
Number of pages: 14
DOIs: https://doi.org/10.1007/978-3-540-88009-7_18
State: Published - Nov 28 2008
Event: 9th International Colloquium on Grammatical Inference, ICGI 2008 - Saint-Malo, France
Duration: Sep 22 2008 to Sep 24 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5278 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th International Colloquium on Grammatical Inference, ICGI 2008
Country: France
City: Saint-Malo
Period: 9/22/08 to 9/24/08

Fingerprint

Biclustering
Context free grammars
Unsupervised learning
Grammar
Benchmark
Unknown
Experiment
Corpus
Training

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Tu, K., & Honavar, V. (2008). Unsupervised learning of probabilistic context-free grammar using iterative biclustering. In Grammatical Inference: Algorithms and Applications - 9th International Colloquium, ICGI 2008, Proceedings (pp. 224-237). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5278 LNAI). https://doi.org/10.1007/978-3-540-88009-7_18
Tu, Kewei; Honavar, Vasant. / Unsupervised learning of probabilistic context-free grammar using iterative biclustering. Grammatical Inference: Algorithms and Applications - 9th International Colloquium, ICGI 2008, Proceedings. 2008. pp. 224-237 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{4b6ccde70cce469aa4de417373eca6fe,
title = "Unsupervised learning of probabilistic context-free grammar using iterative biclustering",
abstract = "This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that this procedure uses a greedy approach to adding rules such that each set of rules that is added to the grammar results in the largest increase in the posterior of the grammar given the training corpus. Results of our experiments on several benchmark datasets show that PCFG-BCL is competitive with existing methods for unsupervised CFG learning.",
author = "Kewei Tu and Vasant Honavar",
year = "2008",
month = "11",
day = "28",
doi = "10.1007/978-3-540-88009-7_18",
language = "English (US)",
isbn = "3540880089",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "224--237",
booktitle = "Grammatical Inference",

}

Tu, K & Honavar, V 2008, Unsupervised learning of probabilistic context-free grammar using iterative biclustering. in Grammatical Inference: Algorithms and Applications - 9th International Colloquium, ICGI 2008, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5278 LNAI, pp. 224-237, 9th International Colloquium on Grammatical Inference, ICGI 2008, Saint-Malo, France, 9/22/08. https://doi.org/10.1007/978-3-540-88009-7_18

TY  - GEN
T1  - Unsupervised learning of probabilistic context-free grammar using iterative biclustering
AU  - Tu, Kewei
AU  - Honavar, Vasant
PY  - 2008/11/28
Y1  - 2008/11/28
AB  - This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that this procedure uses a greedy approach to adding rules such that each set of rules that is added to the grammar results in the largest increase in the posterior of the grammar given the training corpus. Results of our experiments on several benchmark datasets show that PCFG-BCL is competitive with existing methods for unsupervised CFG learning.
UR  - http://www.scopus.com/inward/record.url?scp=56649091092&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=56649091092&partnerID=8YFLogxK
U2  - 10.1007/978-3-540-88009-7_18
DO  - 10.1007/978-3-540-88009-7_18
M3  - Conference contribution
SN  - 3540880089
SN  - 9783540880080
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 224
EP  - 237
BT  - Grammatical Inference
ER  -

Tu K, Honavar V. Unsupervised learning of probabilistic context-free grammar using iterative biclustering. In Grammatical Inference: Algorithms and Applications - 9th International Colloquium, ICGI 2008, Proceedings. 2008. p. 224-237. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-540-88009-7_18