Hybrid models for semantic classification of Chinese unknown words

Research output: Contribution to conference (Paper)

8 Citations (Scopus)

Abstract

This paper addresses the problem of classifying Chinese unknown words into fine-grained semantic categories defined in a Chinese thesaurus. We describe three novel knowledge-based models that capture the relationship between the semantic categories of an unknown word and those of its component characters in three different ways. We then combine two of the knowledge-based models with a corpus-based model which classifies unknown words using contextual information. Experiments show that the knowledge-based models outperform previous methods on the same task, but the use of contextual information does not further improve performance.

Original language: English (US)
Pages: 188-195
Number of pages: 8
State: Published - Dec 1 2007
Event: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2007 - Rochester, NY, United States
Duration: Apr 22 2007 - Apr 27 2007

Other

Other: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2007
Country: United States
City: Rochester, NY
Period: 4/22/07 - 4/27/07

Fingerprint

semantics
knowledge
thesaurus
Hybrid Model
experiment
performance
Contextual
Semantic Category

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language

Cite this

Lu, X. (2007). Hybrid models for semantic classification of Chinese unknown words. 188-195. Paper presented at Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2007, Rochester, NY, United States.