Corpus-based dictionaries for sentiment analysis of specialized vocabularies

Douglas R. Rice, Christopher Jon Zorn

Research output: Contribution to journal › Article

Abstract

Contemporary dictionary-based approaches to sentiment analysis exhibit serious validity problems when applied to specialized vocabularies, but human-coded dictionaries for such applications are often labor-intensive and inefficient to develop. We demonstrate the validity of "minimally-supervised" approaches for the creation of a sentiment dictionary from a corpus of text drawn from a specialized vocabulary. We demonstrate the validity of this approach in estimating sentiment from texts in a large-scale benchmarking dataset recently introduced in computational linguistics, and demonstrate the improvements in accuracy of our approach over well-known standard (nonspecialized) sentiment dictionaries. Finally, we show the usefulness of our approach in an application to the specialized language used in US federal appellate court decisions.

Original language: English (US)
Journal: Political Science Research and Methods
DOI: 10.1017/psrm.2019.10
State: Published - Jan 1 2019

Fingerprint

dictionary
vocabulary
computational linguistics
appellate court
benchmarking
court decision
labor
language

All Science Journal Classification (ASJC) codes

  • Sociology and Political Science
  • Political Science and International Relations

Cite this

@article{daaf1edcacfc477da616ef1c6cdc20f8,
title = "Corpus-based dictionaries for sentiment analysis of specialized vocabularies",
abstract = "Contemporary dictionary-based approaches to sentiment analysis exhibit serious validity problems when applied to specialized vocabularies, but human-coded dictionaries for such applications are often labor-intensive and inefficient to develop. We demonstrate the validity of {"}minimally-supervised{"} approaches for the creation of a sentiment dictionary from a corpus of text drawn from a specialized vocabulary. We demonstrate the validity of this approach in estimating sentiment from texts in a large-scale benchmarking dataset recently introduced in computational linguistics, and demonstrate the improvements in accuracy of our approach over well-known standard (nonspecialized) sentiment dictionaries. Finally, we show the usefulness of our approach in an application to the specialized language used in US federal appellate court decisions.",
author = "Rice, {Douglas R.} and Zorn, {Christopher Jon}",
year = "2019",
month = "1",
day = "1",
doi = "10.1017/psrm.2019.10",
language = "English (US)",
journal = "Political Science Research and Methods",
issn = "2049-8470",
publisher = "Cambridge University Press",

}

TY - JOUR

T1 - Corpus-based dictionaries for sentiment analysis of specialized vocabularies

AU - Rice, Douglas R.

AU - Zorn, Christopher Jon

PY - 2019/1/1

Y1 - 2019/1/1

AB - Contemporary dictionary-based approaches to sentiment analysis exhibit serious validity problems when applied to specialized vocabularies, but human-coded dictionaries for such applications are often labor-intensive and inefficient to develop. We demonstrate the validity of "minimally-supervised" approaches for the creation of a sentiment dictionary from a corpus of text drawn from a specialized vocabulary. We demonstrate the validity of this approach in estimating sentiment from texts in a large-scale benchmarking dataset recently introduced in computational linguistics, and demonstrate the improvements in accuracy of our approach over well-known standard (nonspecialized) sentiment dictionaries. Finally, we show the usefulness of our approach in an application to the specialized language used in US federal appellate court decisions.

UR - http://www.scopus.com/inward/record.url?scp=85063767414&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063767414&partnerID=8YFLogxK

U2 - 10.1017/psrm.2019.10

DO - 10.1017/psrm.2019.10

M3 - Article

JO - Political Science Research and Methods

JF - Political Science Research and Methods

SN - 2049-8470

ER -