Empirical comparisons of MASC word sense annotations

Gerard De Melo, Collin F. Baker, Nancy Ide, Rebecca Jane Passonneau, Christiane Fellbaum

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

We analyze how different conceptions of lexical semantics affect sense annotations and how multiple sense inventories can be compared empirically, based on annotated text. Our study focuses on the MASC project, where data has been annotated using WordNet sense identifiers on the one hand, and FrameNet lexical units on the other. This allows us to compare the sense inventories of these lexical resources empirically rather than just theoretically, based on their glosses, leading to new insights. In particular, we compute contingency matrices and develop a novel measure, the Expected Jaccard Index, that quantifies the agreement between annotations of the same data based on two different resources even when they have different sets of categories.
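To make the two ingredients named in the abstract concrete, here is a minimal sketch of a contingency matrix over two annotation layers and of the standard Jaccard index between the instance sets of two categories. It is an illustration only: the exact definition of the Expected Jaccard Index is given in the paper and is not reproduced here, and the function names, labels such as `wn_1` and `fn_A`, and the toy data are all hypothetical.

```python
from collections import Counter


def contingency_matrix(labels_a, labels_b):
    """Count how often each (category_a, category_b) pair is assigned to the same instance."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotation layers must cover the same instances")
    return Counter(zip(labels_a, labels_b))


def jaccard(labels_a, labels_b, cat_a, cat_b):
    """Standard Jaccard index between the instances labeled cat_a and those labeled cat_b."""
    set_a = {i for i, label in enumerate(labels_a) if label == cat_a}
    set_b = {i for i, label in enumerate(labels_b) if label == cat_b}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical toy data: six occurrences of one word, annotated once with
# WordNet-style sense labels and once with FrameNet-style lexical units.
wordnet = ["wn_1", "wn_1", "wn_2", "wn_2", "wn_3", "wn_1"]
framenet = ["fn_A", "fn_A", "fn_B", "fn_B", "fn_B", "fn_A"]

matrix = contingency_matrix(wordnet, framenet)

# Report the overlap for every category pair that co-occurs at least once.
for (sense, unit), count in sorted(matrix.items()):
    overlap = jaccard(wordnet, framenet, sense, unit)
    print(f"{sense} / {unit}: count={count}, jaccard={overlap:.2f}")
```

Because the Jaccard index compares sets of annotated instances rather than category names, it can be computed even when the two inventories have different numbers of categories, which is the situation the abstract describes.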

Original language: English (US)
Title of host publication: Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012
Editors: Mehmet Ugur Dogan, Joseph Mariani, Asuncion Moreno, Sara Goggi, Khalid Choukri, Nicoletta Calzolari, Jan Odijk, Thierry Declerck, Bente Maegaard, Stelios Piperidis, Helene Mazo, Olivier Hamon
Publisher: European Language Resources Association (ELRA)
Pages: 3036-3043
Number of pages: 8
ISBN (Electronic): 9782951740877
State: Published - Jan 1 2012
Event: 8th International Conference on Language Resources and Evaluation, LREC 2012 - Istanbul, Turkey
Duration: May 21 2012 → May 27 2012

Publication series

Name: Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012

Other

Other: 8th International Conference on Language Resources and Evaluation, LREC 2012
Country: Turkey
City: Istanbul
Period: 5/21/12 → 5/27/12

Fingerprint

Gloss
Resources
Contingency
Semantics
Word Sense
Annotation
WordNet
Lexical Resources
Lexical Semantics
Lexical Unit
Conception

All Science Journal Classification (ASJC) codes

  • Linguistics and Language
  • Language and Linguistics
  • Education
  • Library and Information Sciences

Cite this

De Melo, G., Baker, C. F., Ide, N., Passonneau, R. J., & Fellbaum, C. (2012). Empirical comparisons of MASC word sense annotations. In M. U. Dogan, J. Mariani, A. Moreno, S. Goggi, K. Choukri, N. Calzolari, J. Odijk, T. Declerck, B. Maegaard, S. Piperidis, H. Mazo, ... O. Hamon (Eds.), Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012 (pp. 3036-3043). (Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012). European Language Resources Association (ELRA).
@inproceedings{d1e2cb90fd794ab0be02be4036c75a8e,
title = "Empirical comparisons of MASC word sense annotations",
abstract = "We analyze how different conceptions of lexical semantics affect sense annotations and how multiple sense inventories can be compared empirically, based on annotated text. Our study focuses on the MASC project, where data has been annotated using WordNet sense identifiers on the one hand, and FrameNet lexical units on the other. This allows us to compare the sense inventories of these lexical resources empirically rather than just theoretically, based on their glosses, leading to new insights. In particular, we compute contingency matrices and develop a novel measure, the Expected Jaccard Index, that quantifies the agreement between annotations of the same data based on two different resources even when they have different sets of categories.",
author = "{De Melo}, Gerard and Baker, {Collin F.} and Nancy Ide and Passonneau, {Rebecca Jane} and Christiane Fellbaum",
year = "2012",
month = "1",
day = "1",
language = "English (US)",
series = "Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012",
publisher = "European Language Resources Association (ELRA)",
pages = "3036--3043",
editor = "Dogan, {Mehmet Ugur} and Joseph Mariani and Asuncion Moreno and Sara Goggi and Khalid Choukri and Nicoletta Calzolari and Jan Odijk and Thierry Declerck and Bente Maegaard and Stelios Piperidis and Helene Mazo and Olivier Hamon",
booktitle = "Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012",

}

TY - GEN

T1 - Empirical comparisons of MASC word sense annotations

AU - De Melo, Gerard

AU - Baker, Collin F.

AU - Ide, Nancy

AU - Passonneau, Rebecca Jane

AU - Fellbaum, Christiane

PY - 2012/1/1

Y1 - 2012/1/1

N2 - We analyze how different conceptions of lexical semantics affect sense annotations and how multiple sense inventories can be compared empirically, based on annotated text. Our study focuses on the MASC project, where data has been annotated using WordNet sense identifiers on the one hand, and FrameNet lexical units on the other. This allows us to compare the sense inventories of these lexical resources empirically rather than just theoretically, based on their glosses, leading to new insights. In particular, we compute contingency matrices and develop a novel measure, the Expected Jaccard Index, that quantifies the agreement between annotations of the same data based on two different resources even when they have different sets of categories.

UR - http://www.scopus.com/inward/record.url?scp=84929349762&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84929349762&partnerID=8YFLogxK

M3 - Conference contribution

T3 - Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012

SP - 3036

EP - 3043

BT - Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012

A2 - Dogan, Mehmet Ugur

A2 - Mariani, Joseph

A2 - Moreno, Asuncion

A2 - Goggi, Sara

A2 - Choukri, Khalid

A2 - Calzolari, Nicoletta

A2 - Odijk, Jan

A2 - Declerck, Thierry

A2 - Maegaard, Bente

A2 - Piperidis, Stelios

A2 - Mazo, Helene

A2 - Hamon, Olivier

PB - European Language Resources Association (ELRA)

ER -