The Manually Annotated Sub-Corpus: A community resource for and by the people

Nancy Ide, Collin Baker, Christiane Fellbaum, Rebecca Jane Passonneau

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

43 Citations (Scopus)

Abstract

The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a community-wide annotation effort of a subset of the American National Corpus. The MASC infrastructure enables the incorporation of contributed annotations into a single, usable format that can then be analyzed as it is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, community-based effort to create much-needed language resources for NLP. This paper describes the MASC project, its corpus and annotations, and serves as a call for contributions of data and annotations from the language processing community.

Original language: English (US)
Title of host publication: ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Pages: 68-73
Number of pages: 6
State: Published - Dec 1 2010
Event: 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010 - Uppsala, Sweden
Duration: Jul 11 2010 to Jul 16 2010

Publication series

Name: ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Other

Other: 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010
Country: Sweden
City: Uppsala
Period: 7/11/10 to 7/16/10

Fingerprint

resources
community
language
genre
infrastructure
Resources
Annotation

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language

Cite this

Ide, N., Baker, C., Fellbaum, C., & Passonneau, R. J. (2010). The Manually Annotated Sub-Corpus: A community resource for and by the people. In ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 68-73). (ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference).
@inproceedings{bd165b0c0645454f99d063e39585c3a3,
title = "The Manually Annotated Sub-Corpus: A community resource for and by the people",
abstract = "The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a communitywide annotation effort of a subset of the American National Corpus. The MASC infrastructure enables the incorporation of contributed annotations into a single, usable format that can then be analyzed as it is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, community-based effort to create much needed language resources for NLP. This paper describes the MASC project, its corpus and annotations, and serves as a call for contributions of data and annotations from the language processing community.",
author = "Nancy Ide and Collin Baker and Christiane Fellbaum and Passonneau, {Rebecca Jane}",
year = "2010",
month = "12",
day = "1",
language = "English (US)",
isbn = "9781617388088",
series = "ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference",
pages = "68--73",
booktitle = "ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference",

}


TY - GEN

T1 - The Manually Annotated Sub-Corpus

T2 - A community resource for and by the people

AU - Ide, Nancy

AU - Baker, Collin

AU - Fellbaum, Christiane

AU - Passonneau, Rebecca Jane

PY - 2010/12/1

Y1 - 2010/12/1

AB - The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a communitywide annotation effort of a subset of the American National Corpus. The MASC infrastructure enables the incorporation of contributed annotations into a single, usable format that can then be analyzed as it is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, community-based effort to create much needed language resources for NLP. This paper describes the MASC project, its corpus and annotations, and serves as a call for contributions of data and annotations from the language processing community.

UR - http://www.scopus.com/inward/record.url?scp=84859942549&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84859942549&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84859942549

SN - 9781617388088

T3 - ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

SP - 68

EP - 73

BT - ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

ER -
