Fightin' words: Lexical feature selection and evaluation for identifying the content of political conflict

Burt L. Monroe, Michael P. Colaresi, Kevin M. Quinn

Research output: Contribution to journal › Article

133 Citations (Scopus)

Abstract

Entries in the burgeoning "text-as-data" movement are often accompanied by lists or visualizations of how word (or other lexical feature) usage differs across some pair or set of documents. These are intended either to establish some target semantic concept (like the content of partisan frames), to estimate word-specific measures that feed forward into another analysis (like locating parties in ideological space), or both. We discuss a variety of techniques for selecting words that capture partisan, or other, differences in political speech and for evaluating the relative importance of those words. We introduce and emphasize several new approaches based on Bayesian shrinkage and regularization. We illustrate the relative utility of these approaches with analyses of partisan, gender, and distributive speech in the U.S. Senate.

Original language: English (US)
Pages (from-to): 372-403
Number of pages: 32
Journal: Political Analysis
Volume: 16
Issue number: 4 SPEC. ISS.
DOI: 10.1093/pan/mpn018
State: Published - Dec 1 2008

Fingerprint

political speech
political conflict
senate
visualization
semantics
gender
evaluation

All Science Journal Classification (ASJC) codes

  • Sociology and Political Science
  • Political Science and International Relations

Cite this

Monroe, Burt L.; Colaresi, Michael P.; Quinn, Kevin M. / Fightin' words: Lexical feature selection and evaluation for identifying the content of political conflict. In: Political Analysis. 2008; Vol. 16, No. 4 SPEC. ISS., pp. 372-403.
@article{887f9d5882a749c49700509ecaf0ec5a,
title = "Fightin' words: Lexical feature selection and evaluation for identifying the content of political conflict",
abstract = "Entries in the burgeoning {"}text-as-data{"} movement are often accompanied by lists or visualizations of how word (or other lexical feature) usage differs across some pair or set of documents. These are intended either to establish some target semantic concept (like the content of partisan frames), to estimate word-specific measures that feed forward into another analysis (like locating parties in ideological space), or both. We discuss a variety of techniques for selecting words that capture partisan, or other, differences in political speech and for evaluating the relative importance of those words. We introduce and emphasize several new approaches based on Bayesian shrinkage and regularization. We illustrate the relative utility of these approaches with analyses of partisan, gender, and distributive speech in the U.S. Senate.",
author = "Monroe, {Burt L.} and Colaresi, {Michael P.} and Quinn, {Kevin M.}",
year = "2008",
month = "12",
day = "1",
doi = "10.1093/pan/mpn018",
language = "English (US)",
volume = "16",
pages = "372--403",
journal = "Political Analysis",
issn = "1047-1987",
publisher = "Oxford University Press",
number = "4 SPEC. ISS.",

}

TY - JOUR

T1 - Fightin' words

T2 - Lexical feature selection and evaluation for identifying the content of political conflict

AU - Monroe, Burt L.

AU - Colaresi, Michael P.

AU - Quinn, Kevin M.

PY - 2008/12/1

Y1 - 2008/12/1

N2 - Entries in the burgeoning "text-as-data" movement are often accompanied by lists or visualizations of how word (or other lexical feature) usage differs across some pair or set of documents. These are intended either to establish some target semantic concept (like the content of partisan frames), to estimate word-specific measures that feed forward into another analysis (like locating parties in ideological space), or both. We discuss a variety of techniques for selecting words that capture partisan, or other, differences in political speech and for evaluating the relative importance of those words. We introduce and emphasize several new approaches based on Bayesian shrinkage and regularization. We illustrate the relative utility of these approaches with analyses of partisan, gender, and distributive speech in the U.S. Senate.

UR - http://www.scopus.com/inward/record.url?scp=62249190300&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=62249190300&partnerID=8YFLogxK

U2 - 10.1093/pan/mpn018

DO - 10.1093/pan/mpn018

M3 - Article

AN - SCOPUS:62249190300

VL - 16

SP - 372

EP - 403

JO - Political Analysis

JF - Political Analysis

SN - 1047-1987

IS - 4 SPEC. ISS.

ER -