Biber redux: Reconsidering dimensions of variation in American English

Rebecca J. Passonneau, Nancy Ide, Songqiao Su, Jesse Stuart

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Genre classification has been found to improve performance in many applications of statistical NLP, including language modeling for spoken language, domain adaptation of statistical parsers, and machine translation. It has also been found to benefit retrieval of spoken or written documents. At its base, however, classification assumes separability. This paper revisits an assumption that genre variation is continuous along multiple dimensions, and an early use of principal component analysis to find these dimensions. Results on a very heterogeneous corpus of post-1990s American English reveal four major dimensions, three of which echo those found in prior work and the fourth depending on features not used in the earlier study. The resulting model can provide a basis for more detailed analysis of sub-genres and the relation between genre and situations of language use, as well as a means to predict distributional properties of new genres.

Original language: English (US)
Title of host publication: COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014
Subtitle of host publication: Technical Papers
Publisher: Association for Computational Linguistics, ACL Anthology
Pages: 565-576
Number of pages: 12
ISBN (Electronic): 9781941643266
State: Published - Jan 1 2014
Event: 25th International Conference on Computational Linguistics, COLING 2014 - Dublin, Ireland
Duration: Aug 23 2014 to Aug 29 2014

Publication series

Name: COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014: Technical Papers

Other

Other: 25th International Conference on Computational Linguistics, COLING 2014
Country: Ireland
City: Dublin
Period: 8/23/14 to 8/29/14

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language
