MASC: The manually annotated Sub-Corpus of American English

Nancy Ide, Collin Baker, Christiane Fellbaum, Charles Fillmore, Rebecca Passonneau

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

23 Citations (Scopus)

Abstract

To answer the critical need for sharable, reusable annotated resources with rich linguistic annotations, we are developing a Manually Annotated Sub-Corpus (MASC) including texts from diverse genres and manual annotations or manually-validated annotations for multiple levels, including WordNet senses and FrameNet frames and frame elements, both of which have become significant resources in the international computational linguistics community. To derive maximal benefit from the semantic information provided by these resources, the MASC will also include manually-validated shallow parses and named entities, which will enable linking WordNet senses and FrameNet frames within the same sentences into more complex semantic structures and, because named entities will often be the role fillers of FrameNet frames, enrich the semantic and pragmatic information derivable from the sub-corpus. All MASC annotations will be published with detailed inter-annotator agreement measures. The MASC and its annotations will be freely downloadable from the ANC website, thus providing maximum accessibility for researchers from around the globe.

Original language: English (US)
Title of host publication: Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
Publisher: European Language Resources Association (ELRA)
Pages: 2455-2460
Number of pages: 6
ISBN (Electronic): 2951740840, 9782951740846
State: Published - Jan 1 2008
Event: 6th International Conference on Language Resources and Evaluation, LREC 2008 - Marrakech, Morocco
Duration: May 28 2008 to May 30 2008

Publication series

Name: Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008

Other

Other: 6th International Conference on Language Resources and Evaluation, LREC 2008
Country: Morocco
City: Marrakech
Period: 5/28/08 to 5/30/08

Fingerprint

semantics
resources
computational linguistics
ANC
website
genre
pragmatics
linguistics
community
Annotation
American English
Resources
Semantic Information
Entity
WordNet

All Science Journal Classification (ASJC) codes

  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics
  • Education

Cite this

Ide, N., Baker, C., Fellbaum, C., Fillmore, C., & Passonneau, R. (2008). MASC: The manually annotated Sub-Corpus of American English. In Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008 (pp. 2455-2460). (Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008). European Language Resources Association (ELRA).
Ide, Nancy; Baker, Collin; Fellbaum, Christiane; Fillmore, Charles; Passonneau, Rebecca. / MASC: The manually annotated Sub-Corpus of American English. Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008. European Language Resources Association (ELRA), 2008. pp. 2455-2460 (Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008).
@inproceedings{27cf7ec0a0de4751a4377bc4b9938f87,
title = "MASC: The manually annotated Sub-Corpus of American English",
author = "Nancy Ide and Collin Baker and Christiane Fellbaum and Charles Fillmore and Rebecca Passonneau",
year = "2008",
month = "1",
day = "1",
language = "English (US)",
series = "Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008",
publisher = "European Language Resources Association (ELRA)",
pages = "2455--2460",
booktitle = "Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008",

}

Ide, N, Baker, C, Fellbaum, C, Fillmore, C & Passonneau, R 2008, MASC: The manually annotated Sub-Corpus of American English. in Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008. Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008, European Language Resources Association (ELRA), pp. 2455-2460, 6th International Conference on Language Resources and Evaluation, LREC 2008, Marrakech, Morocco, 5/28/08.


TY - GEN

T1 - MASC

T2 - The manually annotated Sub-Corpus of American English

AU - Ide, Nancy

AU - Baker, Collin

AU - Fellbaum, Christiane

AU - Fillmore, Charles

AU - Passonneau, Rebecca

PY - 2008/1/1

Y1 - 2008/1/1

N2 - To answer the critical need for sharable, reusable annotated resources with rich linguistic annotations, we are developing a Manually Annotated Sub-Corpus (MASC) including texts from diverse genres and manual annotations or manually-validated annotations for multiple levels, including WordNet senses and FrameNet frames and frame elements, both of which have become significant resources in the international computational linguistics community. To derive maximal benefit from the semantic information provided by these resources, the MASC will also include manually-validated shallow parses and named entities, which will enable linking WordNet senses and FrameNet frames within the same sentences into more complex semantic structures and, because named entities will often be the role fillers of FrameNet frames, enrich the semantic and pragmatic information derivable from the sub-corpus. All MASC annotations will be published with detailed inter-annotator agreement measures. The MASC and its annotations will be freely downloadable from the ANC website, thus providing maximum accessibility for researchers from around the globe.

UR - http://www.scopus.com/inward/record.url?scp=85017453047&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85017453047&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85017453047

T3 - Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008

SP - 2455

EP - 2460

BT - Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008

PB - European Language Resources Association (ELRA)

ER -

Ide N, Baker C, Fellbaum C, Fillmore C, Passonneau R. MASC: The manually annotated Sub-Corpus of American English. In Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008. European Language Resources Association (ELRA). 2008. p. 2455-2460. (Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008).