TY - JOUR
T1 - Summarizing Situational Tweets in Crisis Scenarios
T2 - An Extractive-Abstractive Approach
AU - Rudra, Koustav
AU - Goyal, Pawan
AU - Ganguly, Niloy
AU - Imran, Muhammad
AU - Mitra, Prasenjit
N1 - Funding Information:
Manuscript received August 22, 2018; revised March 22, 2019 and July 7, 2019; accepted August 11, 2019. Date of publication September 16, 2019; date of current version October 7, 2019. This work was supported in part by the European Union’s Horizon 2020 Research and Innovation Programme under Grant 832921. (Corresponding author: Koustav Rudra.) K. Rudra was with IIT Kharagpur, Kharagpur 721302, India. He is now with the L3S Research Center, Leibniz University Hannover, 30167 Hannover, Germany (e-mail: rudra@l3s.de).
Publisher Copyright:
© 2014 IEEE.
PY - 2019/10
Y1 - 2019/10
AB - Microblogging platforms such as Twitter are widely used by eyewitnesses and affected people to post situational updates during mass convergence events such as natural and man-made disasters. These crisis-related messages are spread across multiple classes/categories such as infrastructure damage, shelter needs, and information about missing, injured, and dead people. Moreover, we observe that people sometimes post information about their missing relatives and friends with personal details such as names and last seen location. The information requirements of different stakeholders (government, NGOs, and rescue workers) also vary widely. This brings twofold challenges: 1) extracting important high-level situational updates from these messages, assigning them appropriate categories, and finally summarizing the large trove of information in each category and 2) extracting small-scale, time-critical, sparse updates related to missing or trapped people. In this article, we propose a classification-summarization framework which first assigns tweets to different situational classes and then summarizes those tweets. In the summarization phase, we propose a two-step extractive-abstractive summarization framework. In the first step, it extracts a set of important tweets from the whole set of information, develops a bigram-based word-graph from those tweets, and generates paths by traversing the word-graph. Next, it uses an optimization technique based on integer linear programming (ILP) to select the most important tweets and paths based on different optimization parameters such as informativeness and coverage of content words. Apart from general classwise summarization, we also show the customization of our summarization model to address time-critical sparse information needs (e.g., missing relatives). Our proposed method is time- and memory-efficient and shows better performance than state-of-the-art methods in terms of both quantitative and qualitative judgment.
UR - http://www.scopus.com/inward/record.url?scp=85072522966&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072522966&partnerID=8YFLogxK
U2 - 10.1109/TCSS.2019.2937899
DO - 10.1109/TCSS.2019.2937899
M3 - Article
AN - SCOPUS:85072522966
VL - 6
SP - 981
EP - 993
JO - IEEE Transactions on Computational Social Systems
JF - IEEE Transactions on Computational Social Systems
SN - 2329-924X
IS - 5
M1 - 8839735
ER -