TY - GEN
T1 - Anveshan
T2 - 4th Linguistic Annotation Workshop, LAW 2010
AU - Bhardwaj, Vikas
AU - Passonneau, Rebecca J.
AU - Salleb-Aouissi, Ansaf
AU - Ide, Nancy
PY - 2010
Y1 - 2010
N2 - Manual annotation of natural language to capture linguistic information is essential for NLP tasks involving supervised machine learning of semantic knowledge. Judgements of meaning can be more or less subjective, in which case instead of a single correct label, the labels assigned might vary among annotators based on the annotators' knowledge, age, gender, intuitions, background, and so on. We introduce a framework "Anveshan," where we investigate annotator behavior to find outliers, cluster annotators by behavior, and identify confusable labels. We also investigate the effectiveness of using trained annotators versus a larger number of untrained annotators on a word sense annotation task. The annotation data comes from a word sense disambiguation task for polysemous words, annotated by both trained annotators and untrained annotators from Amazon's Mechanical Turk. Our results show that Anveshan is effective in uncovering patterns in annotator behavior, and we also show that trained annotators are superior to a larger number of untrained annotators for this task.
UR - http://www.scopus.com/inward/record.url?scp=84880373129&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880373129&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84880373129
SN - 1932432728
SN - 9781932432725
T3 - ACL 2010 - LAW 2010: 4th Linguistic Annotation Workshop, Proceedings
SP - 47
EP - 55
BT - ACL 2010 - LAW 2010
Y2 - 15 July 2010 through 16 July 2010
ER -