Modelling cascades over time in microblogs

Wei Xie, Feida Zhu, Siyuan Liu, Ke Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

One of the most important features of microblogging services such as Twitter is how easily a piece of information can be re-shared across the network through user connections, forming what we call a «cascade». Business applications such as viral marketing have driven a tremendous amount of research on predicting whether a given cascade will go viral. Yet the rarity of viral cascades in real data poses a challenge to all existing prediction methods. One solution is to simulate cascades that closely fit the real viral ones, which requires the ability to tell how a given cascade grows over time. In this paper, we build a general time-aware cascade model for each particular cascade, in which the chance of a user's re-sharing behaviour over time is modelled as a hazard function of time. Based on two key observations on user retweeting behaviour, we design an appropriate hazard function specifically for the Twitter network. We evaluate our model on a large real Twitter dataset with over two million retweeting cascades. Our experimental results show that the proposed model outperforms baseline models in terms of model fitting. Further, we use our model to simulate viral cascades, which are otherwise few and far between, to alleviate the imbalance issue in cascade data, offering a 20% boost in viral cascade discovery.
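
As a rough illustration of the hazard-function idea in the abstract (not the hazard function designed in the paper, which is not given in this record), the Python sketch below assumes a hypothetical exponentially decaying hazard h(t) = a * exp(-b * t) for a single user's re-share. The parameter names a and b, the function names, and the decay form are all assumptions made for this example. It relies on the standard identity S(t) = exp(-H(t)), where H is the cumulative hazard, and draws re-share times by inverse-transform sampling.

```python
import math
import random


def cumulative_hazard(t, a, b):
    """H(t) for the ASSUMED hazard h(t) = a * exp(-b * t), with a, b > 0."""
    return (a / b) * (1.0 - math.exp(-b * t))


def survival(t, a, b):
    """Probability that the user has not re-shared by time t: S(t) = exp(-H(t))."""
    return math.exp(-cumulative_hazard(t, a, b))


def sample_reshare_time(a, b, rng):
    """Draw one re-share time by inverting F(t) = 1 - S(t).

    Returns None when the user never re-shares, which happens because
    H(t) is bounded above by a / b under this assumed hazard.
    """
    u = rng.random()
    p_ever = 1.0 - math.exp(-a / b)  # probability of ever re-sharing
    if u >= p_ever:
        return None
    arg = 1.0 + (b / a) * math.log(1.0 - u)
    if arg <= 0.0:  # guard against floating-point rounding at the boundary
        return None
    return -math.log(arg) / b


if __name__ == "__main__":
    rng = random.Random(42)  # hypothetical seed, for repeatability only
    times = [sample_reshare_time(0.5, 1.0, rng) for _ in range(10)]
    print(times)
```

Because H(t) tends to a / b as t grows, a user under this assumed hazard never re-shares with probability exp(-a / b), which is why the sampler can return None.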

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015
Editors: Feng Luo, Kemafor Ogan, Mohammed J. Zaki, Laura Haas, Beng Chin Ooi, Vipin Kumar, Sudarsan Rachuri, Saumyadipta Pyne, Howard Ho, Xiaohua Hu, Shipeng Yu, Morris Hui-I Hsiao, Jian Li
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 677-686
Number of pages: 10
ISBN (Electronic): 9781479999255
DOI: https://doi.org/10.1109/BigData.2015.7363812
State: Published - Dec 22 2015
Event: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015 - Santa Clara, United States
Duration: Oct 29 2015 to Nov 1 2015

Publication series

Name: Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015

Other

Other: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015
Country: United States
City: Santa Clara
Period: 10/29/15 to 11/1/15

Fingerprint

Hazards
Cascades (fluid mechanics)
Marketing
Industry
Experiments

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Software

Cite this

Xie, W., Zhu, F., Liu, S., & Wang, K. (2015). Modelling cascades over time in microblogs. In F. Luo, K. Ogan, M. J. Zaki, L. Haas, B. C. Ooi, V. Kumar, S. Rachuri, S. Pyne, H. Ho, X. Hu, S. Yu, M. H-I. Hsiao, ... J. Li (Eds.), Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015 (pp. 677-686). [7363812] (Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2015.7363812
Xie, Wei ; Zhu, Feida ; Liu, Siyuan ; Wang, Ke. / Modelling cascades over time in microblogs. Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015. editor / Feng Luo ; Kemafor Ogan ; Mohammed J. Zaki ; Laura Haas ; Beng Chin Ooi ; Vipin Kumar ; Sudarsan Rachuri ; Saumyadipta Pyne ; Howard Ho ; Xiaohua Hu ; Shipeng Yu ; Morris Hui-I Hsiao ; Jian Li. Institute of Electrical and Electronics Engineers Inc., 2015. pp. 677-686 (Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015).
@inproceedings{2bcacb2c5e85477596c8e3f0414fad1d,
title = "Modelling cascades over time in microblogs",
abstract = "One of the most important features of microblogging services such as Twitter is how easily a piece of information can be re-shared across the network through user connections, forming what we call a «cascade». Business applications such as viral marketing have driven a tremendous amount of research on predicting whether a given cascade will go viral. Yet the rarity of viral cascades in real data poses a challenge to all existing prediction methods. One solution is to simulate cascades that closely fit the real viral ones, which requires the ability to tell how a given cascade grows over time. In this paper, we build a general time-aware cascade model for each particular cascade, in which the chance of a user's re-sharing behaviour over time is modelled as a hazard function of time. Based on two key observations on user retweeting behaviour, we design an appropriate hazard function specifically for the Twitter network. We evaluate our model on a large real Twitter dataset with over two million retweeting cascades. Our experimental results show that the proposed model outperforms baseline models in terms of model fitting. Further, we use our model to simulate viral cascades, which are otherwise few and far between, to alleviate the imbalance issue in cascade data, offering a 20{\%} boost in viral cascade discovery.",
author = "Wei Xie and Feida Zhu and Siyuan Liu and Ke Wang",
year = "2015",
month = "12",
day = "22",
doi = "10.1109/BigData.2015.7363812",
language = "English (US)",
series = "Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "677--686",
editor = "Feng Luo and Kemafor Ogan and Zaki, {Mohammed J.} and Laura Haas and Ooi, {Beng Chin} and Vipin Kumar and Sudarsan Rachuri and Saumyadipta Pyne and Howard Ho and Xiaohua Hu and Shipeng Yu and Hsiao, {Morris Hui-I} and Jian Li",
booktitle = "Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015",
address = "United States",
}

Xie, W, Zhu, F, Liu, S & Wang, K 2015, Modelling cascades over time in microblogs. in F Luo, K Ogan, MJ Zaki, L Haas, BC Ooi, V Kumar, S Rachuri, S Pyne, H Ho, X Hu, S Yu, MH-I Hsiao & J Li (eds), Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015., 7363812, Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015, Institute of Electrical and Electronics Engineers Inc., pp. 677-686, 3rd IEEE International Conference on Big Data, IEEE Big Data 2015, Santa Clara, United States, 10/29/15. https://doi.org/10.1109/BigData.2015.7363812

TY - GEN

T1 - Modelling cascades over time in microblogs

AU - Xie, Wei

AU - Zhu, Feida

AU - Liu, Siyuan

AU - Wang, Ke

PY - 2015/12/22

Y1 - 2015/12/22

N2 - One of the most important features of microblogging services such as Twitter is how easily a piece of information can be re-shared across the network through user connections, forming what we call a «cascade». Business applications such as viral marketing have driven a tremendous amount of research on predicting whether a given cascade will go viral. Yet the rarity of viral cascades in real data poses a challenge to all existing prediction methods. One solution is to simulate cascades that closely fit the real viral ones, which requires the ability to tell how a given cascade grows over time. In this paper, we build a general time-aware cascade model for each particular cascade, in which the chance of a user's re-sharing behaviour over time is modelled as a hazard function of time. Based on two key observations on user retweeting behaviour, we design an appropriate hazard function specifically for the Twitter network. We evaluate our model on a large real Twitter dataset with over two million retweeting cascades. Our experimental results show that the proposed model outperforms baseline models in terms of model fitting. Further, we use our model to simulate viral cascades, which are otherwise few and far between, to alleviate the imbalance issue in cascade data, offering a 20% boost in viral cascade discovery.

UR - http://www.scopus.com/inward/record.url?scp=84963729708&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84963729708&partnerID=8YFLogxK

U2 - 10.1109/BigData.2015.7363812

DO - 10.1109/BigData.2015.7363812

M3 - Conference contribution

AN - SCOPUS:84963729708

T3 - Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015

SP - 677

EP - 686

BT - Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015

A2 - Luo, Feng

A2 - Ogan, Kemafor

A2 - Zaki, Mohammed J.

A2 - Haas, Laura

A2 - Ooi, Beng Chin

A2 - Kumar, Vipin

A2 - Rachuri, Sudarsan

A2 - Pyne, Saumyadipta

A2 - Ho, Howard

A2 - Hu, Xiaohua

A2 - Yu, Shipeng

A2 - Hsiao, Morris Hui-I

A2 - Li, Jian

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Xie W, Zhu F, Liu S, Wang K. Modelling cascades over time in microblogs. In Luo F, Ogan K, Zaki MJ, Haas L, Ooi BC, Kumar V, Rachuri S, Pyne S, Ho H, Hu X, Yu S, Hsiao MH-I, Li J, editors, Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015. Institute of Electrical and Electronics Engineers Inc. 2015. p. 677-686. 7363812. (Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015). https://doi.org/10.1109/BigData.2015.7363812