Detection and analysis of self-disclosure in online news commentaries

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Online users engage in self-disclosure - revealing personal information to others - in pursuit of social rewards. However, there are associated costs of disclosure to users' privacy. User profiling techniques support the use of contributed content for a number of purposes, e.g., micro-targeting advertisements. In this paper, we study self-disclosure as it occurs in newspaper comment forums. We explore a longitudinal dataset of about 60,000 comments on 2202 news articles from four major English news websites. We start with detection of language indicative of various types of self-disclosure, leveraging both syntactic and semantic information present in texts. Specifically, we use dependency parsing for subject, verb, and object extraction from sentences, in conjunction with named entity recognition to extract linguistic indicators of self-disclosure. We then use these indicators to examine the effects of anonymity and topic of discussion on self-disclosure. We find that anonymous users are more likely to self-disclose than identifiable users, and that self-disclosure varies across topics of discussion. Finally, we discuss the implications of our findings for user privacy.
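
As a rough illustration of the extraction step described in the abstract, the sketch below pulls subject, verb, and object triples out of a dependency parse and attaches a named-entity label to the object. It is not the authors' implementation: the `Token` record, the `FIRST_PERSON` pronoun set, the helper names, and the hand-annotated parse are all assumptions made for this example, and a real pipeline would presumably obtain the parse and entity labels from an NLP library.

```python
from dataclasses import dataclass

FIRST_PERSON = {"i", "we"}  # assumed subject pronouns marking a self-reference


@dataclass
class Token:
    """Hypothetical pre-parsed token; not tied to any specific parser's API."""
    idx: int       # position in the sentence
    text: str
    dep: str       # dependency label, e.g. "nsubj", "dobj", "prep", "pobj"
    head: int      # index of the syntactic head (the root points to itself)
    ent: str = ""  # named-entity label, "" when the token is not an entity


def objects_of(verb, tokens):
    """Collect direct objects of a verb, plus objects reached through one preposition."""
    found = []
    for tok in tokens:
        parent = tokens[tok.head]
        if tok.dep in {"dobj", "attr"} and parent.idx == verb.idx:
            found.append(tok)
        elif tok.dep == "pobj" and parent.dep == "prep" and parent.head == verb.idx:
            found.append(tok)
    return found


def disclosure_indicators(tokens):
    """Return (subject, verb, object, entity label) tuples for first-person subjects."""
    hits = []
    for tok in tokens:
        if tok.dep == "nsubj" and tok.text.lower() in FIRST_PERSON:
            verb = tokens[tok.head]
            for obj in objects_of(verb, tokens):
                hits.append((tok.text, verb.text, obj.text, obj.ent))
    return hits


# Hand-annotated parse of "I live in Boston" (labels are illustrative).
sentence = [
    Token(0, "I", "nsubj", 1),
    Token(1, "live", "ROOT", 1),
    Token(2, "in", "prep", 1),
    Token(3, "Boston", "pobj", 2, "GPE"),
]
print(disclosure_indicators(sentence))
```

Here the first-person subject combined with a location entity would count as one candidate indicator of a location disclosure; which indicator types the paper actually uses is not stated in this record.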

Original language: English (US)
Title of host publication: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
Publisher: Association for Computing Machinery, Inc
Pages: 3272-3278
Number of pages: 7
ISBN (Electronic): 9781450366748
DOIs: https://doi.org/10.1145/3308558.3313669
State: Published - May 13 2019
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: May 13 2019 to May 17 2019

Publication series

Name: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

Conference

Conference: 2019 World Wide Web Conference, WWW 2019
Country: United States
City: San Francisco
Period: 5/13/19 to 5/17/19

Fingerprint

Syntactics
Linguistics
Websites
Semantics
Costs

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software

Cite this

Umar, P., Squicciarini, A., & Rajtmajer, S. (2019). Detection and analysis of self-disclosure in online news commentaries. In The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019 (pp. 3272-3278). (The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308558.3313669
Umar, Prasanna ; Squicciarini, Anna ; Rajtmajer, Sarah. / Detection and analysis of self-disclosure in online news commentaries. The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019. Association for Computing Machinery, Inc, 2019. pp. 3272-3278 (The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019).
@inproceedings{e26caa343e014b7090fc6d876306970b,
title = "Detection and analysis of self-disclosure in online news commentaries",
abstract = "Online users engage in self-disclosure - revealing personal information to others - in pursuit of social rewards. However, there are associated costs of disclosure to users' privacy. User profiling techniques support the use of contributed content for a number of purposes, e.g., micro-targeting advertisements. In this paper, we study self-disclosure as it occurs in newspaper comment forums. We explore a longitudinal dataset of about 60,000 comments on 2202 news articles from four major English news websites. We start with detection of language indicative of various types of self-disclosure, leveraging both syntactic and semantic information present in texts. Specifically, we use dependency parsing for subject, verb, and object extraction from sentences, in conjunction with named entity recognition to extract linguistic indicators of self-disclosure. We then use these indicators to examine the effects of anonymity and topic of discussion on self-disclosure. We find that anonymous users are more likely to self-disclose than identifiable users, and that self-disclosure varies across topics of discussion. Finally, we discuss the implications of our findings for user privacy.",
author = "Prasanna Umar and Anna Squicciarini and Sarah Rajtmajer",
year = "2019",
month = "5",
day = "13",
doi = "10.1145/3308558.3313669",
language = "English (US)",
series = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",
publisher = "Association for Computing Machinery, Inc",
pages = "3272--3278",
booktitle = "The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019",

}

Umar, P, Squicciarini, A & Rajtmajer, S 2019, Detection and analysis of self-disclosure in online news commentaries. in The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019. The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019, Association for Computing Machinery, Inc, pp. 3272-3278, 2019 World Wide Web Conference, WWW 2019, San Francisco, United States, 5/13/19. https://doi.org/10.1145/3308558.3313669


TY - GEN

T1 - Detection and analysis of self-disclosure in online news commentaries

AU - Umar, Prasanna

AU - Squicciarini, Anna

AU - Rajtmajer, Sarah

PY - 2019/5/13

Y1 - 2019/5/13

AB - Online users engage in self-disclosure - revealing personal information to others - in pursuit of social rewards. However, there are associated costs of disclosure to users' privacy. User profiling techniques support the use of contributed content for a number of purposes, e.g., micro-targeting advertisements. In this paper, we study self-disclosure as it occurs in newspaper comment forums. We explore a longitudinal dataset of about 60,000 comments on 2202 news articles from four major English news websites. We start with detection of language indicative of various types of self-disclosure, leveraging both syntactic and semantic information present in texts. Specifically, we use dependency parsing for subject, verb, and object extraction from sentences, in conjunction with named entity recognition to extract linguistic indicators of self-disclosure. We then use these indicators to examine the effects of anonymity and topic of discussion on self-disclosure. We find that anonymous users are more likely to self-disclose than identifiable users, and that self-disclosure varies across topics of discussion. Finally, we discuss the implications of our findings for user privacy.

UR - http://www.scopus.com/inward/record.url?scp=85066898075&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066898075&partnerID=8YFLogxK

U2 - 10.1145/3308558.3313669

DO - 10.1145/3308558.3313669

M3 - Conference contribution

AN - SCOPUS:85066898075

T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

SP - 3272

EP - 3278

BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

PB - Association for Computing Machinery, Inc

ER -

Umar P, Squicciarini A, Rajtmajer S. Detection and analysis of self-disclosure in online news commentaries. In The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019. Association for Computing Machinery, Inc. 2019. p. 3272-3278. (The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019). https://doi.org/10.1145/3308558.3313669