Measuring agreement on set-valued items (MASI) for semantic and pragmatic annotation

Research output: Contribution to conference › Paper

56 Citations (Scopus)

Abstract

Annotation projects dealing with complex semantic or pragmatic phenomena face the dilemma of creating annotation schemes that oversimplify the phenomena, or that capture distinctions conventional reliability metrics cannot measure adequately. The solution to the dilemma is to develop metrics that quantify the decisions that annotators are asked to make. This paper discusses MASI, a distance metric for comparing sets, and illustrates its use in quantifying the reliability of a specific dataset. Annotations of Summary Content Units (SCUs) generate models referred to as pyramids, which can be used to evaluate unseen human summaries or machine summaries. The paper presents reliability results for five pairs of pyramids created for document sets from the 2003 Document Understanding Conference (DUC). The annotators worked independently of each other. Differences between the application of MASI to pyramid annotation and its previous application to co-reference annotation are discussed. In addition, it is argued that a paradigmatic reliability study should relate measures of inter-annotator agreement to independent assessments, such as significance tests of the annotated variables with respect to other phenomena. In effect, what counts as sufficiently reliable inter-annotator agreement depends on the use to which the annotated data will be put.
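The abstract names MASI but does not reproduce the metric itself, so the following is a minimal Python sketch, assuming the standard published definition: the Jaccard coefficient of the two sets weighted by a monotonicity coefficient (1 for identical sets, 2/3 when one set subsumes the other, 1/3 for overlap without subsumption, 0 for disjoint sets), with distance taken as one minus the similarity. Function names here are illustrative, not taken from the paper.

    def masi_similarity(a, b):
        """MASI similarity: Jaccard coefficient times a monotonicity weight."""
        a, b = set(a), set(b)
        if not a and not b:
            return 1.0  # two empty sets are treated as identical
        jaccard = len(a & b) / len(a | b)
        if a == b:
            m = 1.0      # identical sets
        elif a <= b or b <= a:
            m = 2 / 3    # one set subsumes the other
        elif a & b:
            m = 1 / 3    # overlap, but neither subsumes the other
        else:
            m = 0.0      # disjoint sets
        return jaccard * m

    def masi_distance(a, b):
        """Distance form, usable in agreement coefficients such as Krippendorff's alpha."""
        return 1.0 - masi_similarity(a, b)

    # Example: partial overlap without subsumption.
    # Jaccard = 2/4, weight = 1/3, so distance = 1 - 1/6 ≈ 0.833
    print(masi_distance({1, 2, 3}, {2, 3, 4}))

For reference, NLTK ships a comparable implementation as masi_distance in nltk.metrics.distance.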

Original language: English (US)
Pages: 831-836
Number of pages: 6
State: Published - Jan 1 2006
Event: 5th International Conference on Language Resources and Evaluation, LREC 2006 - Genoa, Italy
Duration: May 22, 2006 – May 28, 2006

Other

Other: 5th International Conference on Language Resources and Evaluation, LREC 2006
Country: Italy
City: Genoa
Period: 5/22/06 – 5/28/06

All Science Journal Classification (ASJC) codes

  • Education
  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics

Cite this

Passonneau, R. J. (2006). Measuring agreement on set-valued items (MASI) for semantic and pragmatic annotation. Paper presented at the 5th International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, pp. 831-836.