WILSON: A divide and conquer approach for fast and effective news timeline summarization

Yiming Liao, Shuguang Wang, Dongwon Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Major news media frequently use news timeline summarization to summarize important daily news about major events along a timeline. While various sophisticated methods have been proposed to generate news timelines that are both concise and complete, in practice, generating timelines from a large number of news articles faces not only quality issues but also the challenge of generation speed, which existing methods have neglected. To mitigate these issues, we propose to speed up timeline generation by adopting the “divide and conquer” philosophy and dividing the whole summarization task into two sub-tasks: (1) date selection and (2) text summarization. Furthermore, since existing methods in news timeline summarization pay less attention to date selection than to text summarization, we re-examine the role of date selection and demonstrate that accurate date selection “alone” can contribute significantly to the task of news timeline summarization. Leveraging explicit date selection, we then propose a simple yet fast and effective news timeline summarization method named WILSON (neWs tImeLine SummarizatiON). In an empirical evaluation on two widely used timeline summarization benchmark datasets, timeline17 and crisis, WILSON outperforms state-of-the-art approaches in both speed and ROUGE scores, significantly improving ROUGE-2 F1 scores by 9.5% to 17.7% and reducing generation time by two orders of magnitude. A further user study with professional journalists also validates the superiority of WILSON. Finally, we build a real-time news timeline summarization system and achieve encouraging results on an industrial-level corpus.
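The two-stage split described in the abstract can be illustrated with a toy sketch. This is not WILSON itself: the abstract does not specify how dates are scored or how each day is summarized, so the sketch assumes the simplest possible stand-ins (dates ranked by how many sentences they have, and one extractive sentence per day chosen by average word frequency). All function names here (`select_dates`, `summarize_day`, `build_timeline`) are hypothetical.

```python
import re
from collections import Counter, defaultdict


def select_dates(sentences, k):
    """Stage 1 (date selection): keep the k dates with the most sentences.

    Assumption: sentence count is a stand-in for date importance; the
    paper's actual date-selection method is not given in the abstract.
    """
    counts = Counter(date for date, _ in sentences)
    # Rank by count (descending); break ties by date for determinism.
    top = sorted(counts, key=lambda d: (-counts[d], d))[:k]
    return sorted(top)  # return in chronological order


def summarize_day(texts):
    """Stage 2 (text summarization): pick one representative sentence.

    Assumption: a sentence is scored by the average frequency of its
    words among that day's sentences (a toy extractive heuristic).
    """
    tokenized = [re.findall(r"[a-z]+", t.lower()) for t in texts]
    freq = Counter(w for toks in tokenized for w in toks)

    def score(toks):
        return sum(freq[w] for w in toks) / len(toks) if toks else 0.0

    best = max(range(len(texts)), key=lambda i: score(tokenized[i]))
    return texts[best]


def build_timeline(sentences, k):
    """Divide and conquer: select dates first, then summarize each date."""
    by_date = defaultdict(list)
    for date, text in sentences:
        by_date[date].append(text)
    return [(d, summarize_day(by_date[d])) for d in select_dates(sentences, k)]
```

Because each selected date is summarized independently of the others, the second stage only ever sees one day's sentences at a time, which is the kind of decomposition the abstract credits for the speed-up.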

Original language: English (US)
Title of host publication: Advances in Database Technology - EDBT 2021
Subtitle of host publication: 24th International Conference on Extending Database Technology, Proceedings
Editors: Yannis Velegrakis, Demetris Zeinalipour, Panos K. Chrysanthis, Francesco Guerra
Publisher: OpenProceedings.org
Pages: 635-645
Number of pages: 11
ISBN (Electronic): 9783893180844
State: Published - 2021
Event: Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021 - Virtual, Nicosia, Cyprus
Duration: Mar 23 2021 to Mar 26 2021

Publication series

Name: Advances in Database Technology - EDBT
Volume: 2021-March
ISSN (Electronic): 2367-2005

Conference

Conference: Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021
Country/Territory: Cyprus
City: Virtual, Nicosia
Period: 3/23/21 to 3/26/21

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Software
  • Computer Science Applications
