Construction of a Chinese opinion treebank

Lun Wei Ku, Ting Hao Huang, Hsin Hsi Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In this paper, we construct the Chinese Opinion Treebank for opinion analysis research, building on the syntactically annotated Chinese Treebank corpus. We introduce the tagging scheme, develop a tagging tool for constructing the corpus, and describe annotated samples. We define and annotate opinion information: whether an opinion is present (yes or no), its polarity (positive, neutral, or negative), and its type (expression, status, or action). In addition, five structure trios are introduced according to the linguistic relations between two Chinese words. The four that may be related to opinions are also annotated in the corpus to provide linguistic cues. We count the opinion sentences and break them down by polarity, opinion type, and trio type, and we compare and discuss these statistics. To assess the quality of the annotations, we calculate their kappa values. The substantial agreement between annotations supports the applicability and reliability of the constructed corpus.
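The abstract does not say which kappa statistic was used or how many annotators took part. As an illustration only, the sketch below computes Cohen's kappa for two annotators, one common choice for categorical labels such as polarity. The function name and the sample labels are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: exactly two annotators and nominal (unordered) labels.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical polarity labels for six sentences (not real corpus data).
annotator_a = ["positive", "positive", "negative", "neutral", "negative", "positive"]
annotator_b = ["positive", "negative", "negative", "neutral", "negative", "positive"]
kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 3))
```

On the widely used Landis and Koch scale, values from 0.61 to 0.80 are described as "substantial" agreement, which matches the wording of the abstract; whether the authors followed that scale is an assumption here.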

Original language: English (US)
Title of host publication: Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010
Editors: Daniel Tapias, Irene Russo, Olivier Hamon, Stelios Piperidis, Nicoletta Calzolari, Khalid Choukri, Joseph Mariani, Helene Mazo, Bente Maegaard, Jan Odijk, Mike Rosner
Publisher: European Language Resources Association (ELRA)
Pages: 1315-1319
Number of pages: 5
ISBN (Electronic): 2951740867, 9782951740860
State: Published - Jan 1 2010
Event: 7th International Conference on Language Resources and Evaluation, LREC 2010 - Valletta, Malta
Duration: May 17 2010 → May 23 2010

Publication series

Name: Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010

Other

Other: 7th International Conference on Language Resources and Evaluation, LREC 2010
Country: Malta
City: Valletta
Period: 5/17/10 → 5/23/10

Fingerprint

linguistics
statistics
Treebank
Annotation
Polarity
Trio
Tagging
Statistics
Syntax

All Science Journal Classification (ASJC) codes

  • Education
  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics

Cite this

Ku, L. W., Huang, T. H., & Chen, H. H. (2010). Construction of a Chinese opinion treebank. In D. Tapias, I. Russo, O. Hamon, S. Piperidis, N. Calzolari, K. Choukri, J. Mariani, H. Mazo, B. Maegaard, J. Odijk, ... M. Rosner (Eds.), Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010 (pp. 1315-1319). (Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010). European Language Resources Association (ELRA).
@inproceedings{cb07a2546e2e4382a9cb9bfb565b5389,
title = "Construction of a Chinese opinion treebank",
abstract = "In this paper, we construct the Chinese Opinion Treebank for opinion analysis research, building on the syntactically annotated Chinese Treebank corpus. We introduce the tagging scheme, develop a tagging tool for constructing the corpus, and describe annotated samples. We define and annotate opinion information: whether an opinion is present (yes or no), its polarity (positive, neutral, or negative), and its type (expression, status, or action). In addition, five structure trios are introduced according to the linguistic relations between two Chinese words. The four that may be related to opinions are also annotated in the corpus to provide linguistic cues. We count the opinion sentences and break them down by polarity, opinion type, and trio type, and we compare and discuss these statistics. To assess the quality of the annotations, we calculate their kappa values. The substantial agreement between annotations supports the applicability and reliability of the constructed corpus.",
author = "Ku, {Lun Wei} and Huang, {Ting Hao} and Chen, {Hsin Hsi}",
year = "2010",
month = "1",
day = "1",
language = "English (US)",
series = "Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010",
publisher = "European Language Resources Association (ELRA)",
pages = "1315--1319",
editor = "Daniel Tapias and Irene Russo and Olivier Hamon and Stelios Piperidis and Nicoletta Calzolari and Khalid Choukri and Joseph Mariani and Helene Mazo and Bente Maegaard and Jan Odijk and Mike Rosner",
booktitle = "Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010",

}

TY - GEN

T1 - Construction of a Chinese opinion treebank

AU - Ku, Lun Wei

AU - Huang, Ting Hao

AU - Chen, Hsin Hsi

PY - 2010/1/1

Y1 - 2010/1/1

AB - In this paper, we construct the Chinese Opinion Treebank for opinion analysis research, building on the syntactically annotated Chinese Treebank corpus. We introduce the tagging scheme, develop a tagging tool for constructing the corpus, and describe annotated samples. We define and annotate opinion information: whether an opinion is present (yes or no), its polarity (positive, neutral, or negative), and its type (expression, status, or action). In addition, five structure trios are introduced according to the linguistic relations between two Chinese words. The four that may be related to opinions are also annotated in the corpus to provide linguistic cues. We count the opinion sentences and break them down by polarity, opinion type, and trio type, and we compare and discuss these statistics. To assess the quality of the annotations, we calculate their kappa values. The substantial agreement between annotations supports the applicability and reliability of the constructed corpus.

UR - http://www.scopus.com/inward/record.url?scp=85037155974&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85037155974&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85037155974

T3 - Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010

SP - 1315

EP - 1319

BT - Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010

A2 - Tapias, Daniel

A2 - Russo, Irene

A2 - Hamon, Olivier

A2 - Piperidis, Stelios

A2 - Calzolari, Nicoletta

A2 - Choukri, Khalid

A2 - Mariani, Joseph

A2 - Mazo, Helene

A2 - Maegaard, Bente

A2 - Odijk, Jan

A2 - Rosner, Mike

PB - European Language Resources Association (ELRA)

ER -
