5 sources of clickbaits you should know! Using synthetic clickbaits to improve prediction and distinguish between bot-generated and human-written headlines

Thai Le, Kai Shu, Maria D. Molina, Dongwon Lee, S. Shyam Sundar, Huan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Clickbait is an attractive yet misleading headline that lures readers to commit click-conversion. The development of robust clickbait detection models has, however, been hampered by a shortage of high-quality labeled training samples. To overcome this challenge, we investigate how to exploit human-written and machine-generated synthetic clickbaits. First, we ask crowdworkers and journalism students to generate clickbaity news headlines. Second, we use deep generative models to generate clickbaity headlines. Through empirical evaluations, we demonstrate that synthetic clickbaits produced by humans and by deep generative models are consistently useful in improving the accuracy of various prediction models, by as much as 14.5% in AUC, across two real datasets and different types of algorithms. In particular, we observe an improvement in accuracy of up to 8.5% in AUC even for top-ranked clickbait detectors from the Clickbait Challenge 2017. Our study proposes a novel direction for addressing the shortage of labeled training data, one of the fundamental bottlenecks in supervised learning, by means of synthetic training data with reinforced domain knowledge. It also provides a solution for distinguishing between bot-generated and human-written clickbaits, thus aiding the work of moderators and better alerting news consumers.
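As a rough illustration of the idea in the abstract, the sketch below shows how synthetic clickbait headlines could be appended to a labeled training set as positive examples, and how AUC can be computed to compare a detector before and after augmentation. It is not the authors' pipeline: the names `augment` and `auc`, the `(headline, label)` tuple layout, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical layout: (headline, label), where 1 = clickbait and 0 = not clickbait.
Example = Tuple[str, int]


def augment(train: List[Example], synthetic_clickbaits: List[str]) -> List[Example]:
    """Return a new training set with synthetic headlines added as positives."""
    return train + [(headline, 1) for headline in synthetic_clickbaits]


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive is scored above a
    random negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data only; these headlines and scores are made up for the example.
train = [("City council approves budget", 0), ("You won't believe this trick", 1)]
bigger_train = augment(train, ["10 secrets doctors hide from you"])
toy_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

In practice, one would train the same detector once on `train` and once on `bigger_train`, score a held-out set of real headlines with each, and compare the two AUC values.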

Original language: English (US)
Title of host publication: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019
Editors: Francesca Spezzano, Wei Chen, Xiaokui Xiao
Publisher: Association for Computing Machinery, Inc
Pages: 33-40
Number of pages: 8
ISBN (Electronic): 9781450368681
DOIs: https://doi.org/10.1145/3341161.3342875
State: Published - Aug 27 2019
Event: 11th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019 - Vancouver, Canada
Duration: Aug 27 2019 to Aug 30 2019

Publication series

Name: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019

Conference

Conference: 11th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019
Country: Canada
City: Vancouver
Period: 8/27/19 to 8/30/19

All Science Journal Classification (ASJC) codes

  • Communication
  • Computer Networks and Communications
  • Information Systems and Management
  • Sociology and Political Science

Cite this

Le, T., Shu, K., Molina, M. D., Lee, D., Sundar, S. S., & Liu, H. (2019). 5 sources of clickbaits you should know! Using synthetic clickbaits to improve prediction and distinguish between bot-generated and human-written headlines. In F. Spezzano, W. Chen, & X. Xiao (Eds.), Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019 (pp. 33-40). Association for Computing Machinery, Inc. https://doi.org/10.1145/3341161.3342875