Corpus-based dictionaries for sentiment analysis of specialized vocabularies

Douglas R. Rice, Christopher Zorn

Research output: Contribution to journal (Article)

Abstract

Contemporary dictionary-based approaches to sentiment analysis exhibit serious validity problems when applied to specialized vocabularies, but human-coded dictionaries for such applications are often labor-intensive and inefficient to develop. We demonstrate the validity of "minimally-supervised" approaches for the creation of a sentiment dictionary from a corpus of text drawn from a specialized vocabulary. We demonstrate the validity of this approach in estimating sentiment from texts in a large-scale benchmarking dataset recently introduced in computational linguistics, and demonstrate the improvements in accuracy of our approach over well-known standard (nonspecialized) sentiment dictionaries. Finally, we show the usefulness of our approach in an application to the specialized language used in US federal appellate court decisions.

Original language: English (US)
Journal: Political Science Research and Methods
DOI: 10.1017/psrm.2019.10
State: Published - Jan 1 2019

Fingerprint

dictionary
vocabulary
computational linguistics
appellate court
benchmarking
court decision
labor
language

All Science Journal Classification (ASJC) codes

  • Sociology and Political Science
  • Political Science and International Relations

Cite this

@article{daaf1edcacfc477da616ef1c6cdc20f8,
title = "Corpus-based dictionaries for sentiment analysis of specialized vocabularies",
abstract = "Contemporary dictionary-based approaches to sentiment analysis exhibit serious validity problems when applied to specialized vocabularies, but human-coded dictionaries for such applications are often labor-intensive and inefficient to develop. We demonstrate the validity of {"}minimally-supervised{"} approaches for the creation of a sentiment dictionary from a corpus of text drawn from a specialized vocabulary. We demonstrate the validity of this approach in estimating sentiment from texts in a large-scale benchmarking dataset recently introduced in computational linguistics, and demonstrate the improvements in accuracy of our approach over well-known standard (nonspecialized) sentiment dictionaries. Finally, we show the usefulness of our approach in an application to the specialized language used in US federal appellate court decisions.",
author = "Rice, {Douglas R.} and Zorn, Christopher",
year = "2019",
month = "1",
day = "1",
doi = "10.1017/psrm.2019.10",
language = "English (US)",
journal = "Political Science Research and Methods",
issn = "2049-8470",
publisher = "Cambridge University Press",

}

Corpus-based dictionaries for sentiment analysis of specialized vocabularies. / Rice, Douglas R.; Zorn, Christopher.

In: Political Science Research and Methods, 01.01.2019.


TY - JOUR

T1 - Corpus-based dictionaries for sentiment analysis of specialized vocabularies

AU - Rice, Douglas R.

AU - Zorn, Christopher

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Contemporary dictionary-based approaches to sentiment analysis exhibit serious validity problems when applied to specialized vocabularies, but human-coded dictionaries for such applications are often labor-intensive and inefficient to develop. We demonstrate the validity of "minimally-supervised" approaches for the creation of a sentiment dictionary from a corpus of text drawn from a specialized vocabulary. We demonstrate the validity of this approach in estimating sentiment from texts in a large-scale benchmarking dataset recently introduced in computational linguistics, and demonstrate the improvements in accuracy of our approach over well-known standard (nonspecialized) sentiment dictionaries. Finally, we show the usefulness of our approach in an application to the specialized language used in US federal appellate court decisions.

AB - Contemporary dictionary-based approaches to sentiment analysis exhibit serious validity problems when applied to specialized vocabularies, but human-coded dictionaries for such applications are often labor-intensive and inefficient to develop. We demonstrate the validity of "minimally-supervised" approaches for the creation of a sentiment dictionary from a corpus of text drawn from a specialized vocabulary. We demonstrate the validity of this approach in estimating sentiment from texts in a large-scale benchmarking dataset recently introduced in computational linguistics, and demonstrate the improvements in accuracy of our approach over well-known standard (nonspecialized) sentiment dictionaries. Finally, we show the usefulness of our approach in an application to the specialized language used in US federal appellate court decisions.

UR - http://www.scopus.com/inward/record.url?scp=85063767414&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063767414&partnerID=8YFLogxK

U2 - 10.1017/psrm.2019.10

DO - 10.1017/psrm.2019.10

M3 - Article

AN - SCOPUS:85063767414

JO - Political Science Research and Methods

JF - Political Science Research and Methods

SN - 2049-8470

ER -