Characterization and Early Detection of Evergreen News Articles

Yiming Liao, Shuguang Wang, Eui Hong Han, Jongwuk Lee, Dongwon Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Although the majority of news articles are viewed for only days or weeks, a small fraction are read across years and are thus called evergreen news articles. Because evergreen articles maintain a timeless quality and are consistently of interest to the public, understanding their characteristics better has huge implications for news outlets and platforms, yet few studies have explicitly investigated evergreen articles. Addressing this gap, we first propose a flexible parameterized definition of evergreen articles to capture their long-term high-traffic patterns. Then, using a real dataset from the Washington Post, we unearth several distinctive characteristics of evergreen articles and build an early prediction model with encouraging results. Although less than 1% of news articles were identified as evergreen, our model achieves 0.961 in ROC AUC and 0.172 in PR AUC in 10-fold cross-validation.
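The gap between the two reported metrics is typical when positives are rare. The following is a minimal sketch, not the authors' model or data: it scores purely synthetic examples with roughly 1% positives and computes ROC AUC alongside average precision, which is one common estimator of PR AUC. The function names, class sizes, and score distributions are all illustrative assumptions.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) statistic; ties get average ranks."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive appears.

    Assumes at least one positive; tied scores are ordered arbitrarily.
    """
    order = sorted(range(len(labels)), key=lambda k: -scores[k])
    hits = 0
    total = 0.0
    for rank, k in enumerate(order, start=1):
        if labels[k] == 1:
            hits += 1
            total += hits / rank
    return total / hits


def simulate(seed=0, n_pos=100, n_neg=9900):
    """Synthetic data only (assumption): 1% positives, scores from shifted Gaussians."""
    rng = random.Random(seed)
    labels = [1] * n_pos + [0] * n_neg
    scores = [rng.gauss(2.0, 1.0) for _ in range(n_pos)]
    scores += [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    return roc_auc(labels, scores), average_precision(labels, scores)


if __name__ == "__main__":
    roc, ap = simulate()
    print(f"ROC AUC = {roc:.3f}, average precision = {ap:.3f}")
```

On data this imbalanced, a ranking that places most positives above most negatives still surrounds each positive with many high-scoring negatives, so precision stays low even when ROC AUC is high.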

Original language: English (US)
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2019, Proceedings
Editors: Ulf Brefeld, Elisa Fromont, Andreas Hotho, Arno Knobbe, Marloes Maathuis, Céline Robardet
Publisher: Springer
Pages: 552-568
Number of pages: 17
ISBN (Print): 9783030461324
DOIs: https://doi.org/10.1007/978-3-030-46133-1_33
State: Published - Jan 1 2020
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019 - Wurzburg, Germany
Duration: Sep 16 2019 to Sep 20 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11908 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
Country: Germany
City: Wurzburg
Period: 9/16/19 to 9/20/19

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Liao, Y., Wang, S., Han, E. H., Lee, J., & Lee, D. (2020). Characterization and Early Detection of Evergreen News Articles. In U. Brefeld, E. Fromont, A. Hotho, A. Knobbe, M. Maathuis, & C. Robardet (Eds.), Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2019, Proceedings (pp. 552-568). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11908 LNAI). Springer. https://doi.org/10.1007/978-3-030-46133-1_33