Identifying valuable information from Twitter during natural disasters

Brandon Truong, Cornelia Caragea, Anna Squicciarini, Andrea H. Tapia

Research output: Contribution to journal › Article › peer-review

20 Scopus citations


Social media is a vital source of information during any major event, especially natural disasters. However, as the volume of social media data grows exponentially, so does the volume of conversational data that provides no valuable information, especially in the context of disaster events. This diminishes people's ability to find the information they need to organize relief efforts, find help, and potentially save lives. This project focuses on the development of a Bayesian approach to the classification of tweets (posts on Twitter) during Hurricane Sandy in order to distinguish "informational" from "conversational" tweets. We designed an effective set of features and used them as input to Naïve Bayes classifiers. In comparison to a "bag of words" approach, the new feature set provides similar results in the classification of tweets. However, the designed feature set contains only 9 features, compared with more than 3000 features for "bag of words." When the feature set is combined with "bag of words," accuracy reaches 85.2914%. If integrated into disaster-related systems, our approach can serve as a boon to any person or organization seeking to extract useful information in the midst of a natural disaster.
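
As a rough illustration of the "bag of words" baseline mentioned in the abstract, the following minimal sketch trains a multinomial Naïve Bayes classifier with Laplace smoothing on word counts. It is not the authors' implementation: the abstract does not list the nine designed features or the exact Naïve Bayes variant, so both are left out or assumed here, and the example tweets, the `tokenize` helper, and the `NaiveBayes` class are hypothetical names and data invented for this sketch.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes over word counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data invented for this sketch; not from the paper's dataset.
train = [
    ("shelter open at high school on main street", "informational"),
    ("power outage reported downtown crews on the way", "informational"),
    ("road closed due to flooding avoid the bridge", "informational"),
    ("lol this storm is so boring", "conversational"),
    ("omg i can not sleep with this wind", "conversational"),
    ("thinking of you all stay safe", "conversational"),
]
texts, labels = zip(*train)
model = NaiveBayes().fit(texts, labels)

print(model.predict("flooding reported near the bridge"))
print(model.predict("lol so boring"))
```

In a sketch like this, the vocabulary size plays the role of the "more than 3000 features" in the abstract; a compact designed feature set would replace the per-word counts with a handful of per-tweet values.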

Original language: English (US)
Journal: Proceedings of the ASIST Annual Meeting
Issue number: 1
State: Published - 2014

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Library and Information Sciences

