A method for determining information diffusion cascades on social networks

Authors

  • Nguyen Viet Anh, Institute of Information Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Hanoi, Vietnam, 100000. https://orcid.org/0000-0001-7736-2470
  • Duong Ngoc Son, Ministry of Public Security, 44 Yet Kieu, Hanoi, Vietnam, 100000. https://orcid.org/0000-0002-9126-5199
  • Nguyen Thi Thu Ha, Vietnam Electric Power University, 235 Hoang Quoc Viet, Hanoi, Vietnam, 100000. https://orcid.org/0000-0001-9766-6826
  • Sergey Kuznetsov, Ivannikov Institute for System Programming of the RAS, Alexander Solzhenitsyn str., 25, Moscow, Russia, 109004; Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow, Russia, 119991; Moscow Institute of Physics and Technology, Institutskiy lane, Dolgoprudny, Moscow Region, 141701, Russia; Higher School of Economics, Myasnitskaya str., 20, Moscow, Russia, 101000. https://orcid.org/0000-0002-8257-028X
  • Nguyen Tran Quoc Vinh, The University of Da Nang – University of Science and Education, 459 Ton Duc Thang, Lien Chieu, Da Nang, Vietnam, 550000. https://orcid.org/0000-0003-2281-0429

DOI:

https://doi.org/10.15587/1729-4061.2018.150295

Keywords:

information diffusion, social network, independent cascade model, diffusion probability

Abstract

Information diffusion on social networks has many potential real-world applications, such as online marketing, e-government campaigns, and the prediction of large social events. Modeling information diffusion is therefore crucial both for understanding the diffusion mechanism and for controlling it better. Our research aims to identify the factors that influence whether people adopt a piece of information shared on a social network. In this study, the traditional independent cascade model for information diffusion is extended with discrete time steps. The proposed model incorporates three sources of diffusion influence: user-user influence, user-content preference, and external influence. Each source is quantified as a real-valued diffusion probability. To calculate user-user influence, we adopt and extend the disease transmission model according to the role of the user who diffuses the content. User-content preference, which measures the correlation between a user's preferences and the content adopted, is calculated with a topic-based model. External influence is detected in one diffusion time step, then quantified and incorporated into the model for the next time step by applying and solving a logistic function. Moreover, the process of information diffusion is characterized by constructing a tree of information adoption, and the diffusion scale is quantified by predicting the number of infected nodes. It is found that these sources of influence, especially external influence, play a significant role in information diffusion and ultimately affect the shape and size of the diffusion cascade. The model is validated on both synthetic and real-world datasets. Experimental results confirm the advantage of the proposed method, which significantly improves on previous models in terms of prediction accuracy.
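The mechanism described in the abstract can be illustrated with a minimal Python sketch of a discrete-time independent cascade that tracks the adoption tree and the number of infected nodes per step. The abstract does not state how the three influence sources are combined, so the noisy-OR combination below is an assumption, as are the function names (`logistic`, `simulate_cascade`), the input layout, and the logistic parameters `k` and `x0`. This is not the authors' implementation.

```python
import math
import random


def logistic(x, k=1.0, x0=0.0):
    """Standard logistic curve; a hypothetical stand-in for external influence."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))


def simulate_cascade(graph, seeds, p_user, p_content, p_external,
                     max_steps=10, rng=None):
    """Discrete-time independent cascade with three influence sources.

    graph      -- dict: node -> list of out-neighbours
    p_user     -- dict: (u, v) -> user-user influence in [0, 1]
    p_content  -- dict: v -> user-content preference in [0, 1]
    p_external -- function: step t -> external influence in [0, 1]
    Returns (parent, sizes): parent maps each infected node to the node
    that infected it (the adoption tree); sizes[t] is the cumulative
    number of infected nodes after step t.
    """
    rng = rng or random.Random()
    parent = {s: None for s in seeds}  # seeds are roots of the adoption tree
    frontier = list(seeds)             # nodes infected in the previous step
    sizes = [len(parent)]
    for t in range(1, max_steps + 1):
        if not frontier:
            break
        new_frontier = []
        for u in frontier:
            # Each newly infected node gets one chance per inactive neighbour.
            for v in graph.get(u, []):
                if v in parent:
                    continue
                # ASSUMPTION: internal influence is the product of user-user
                # influence and content preference, combined with external
                # influence by noisy-OR. The paper may combine them differently.
                internal = p_user.get((u, v), 0.0) * p_content.get(v, 0.0)
                p = 1.0 - (1.0 - internal) * (1.0 - p_external(t))
                if rng.random() < p:
                    parent[v] = u
                    new_frontier.append(v)
        frontier = new_frontier
        sizes.append(len(parent))
    return parent, sizes


# Illustrative usage on a toy graph (all values are made up).
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
toy_p_user = {("a", "b"): 0.6, ("a", "c"): 0.3, ("b", "d"): 0.5, ("c", "d"): 0.5}
toy_p_content = {"b": 0.9, "c": 0.8, "d": 0.7}
tree, sizes = simulate_cascade(
    toy_graph, ["a"], toy_p_user, toy_p_content,
    p_external=lambda t: 0.1 * logistic(t, k=1.0, x0=2.0),
    rng=random.Random(0),
)
print(tree, sizes)
```

The `parent` mapping corresponds to the tree of information adoption, and the last entry of `sizes` is the predicted cascade size for one simulated run. Averaging over many runs would give an estimate of the expected diffusion scale.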

Author Biographies

Nguyen Viet Anh, Institute of Information Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, Cau Giay, Hanoi, Vietnam, 100000

PhD, Senior researcher

Department of Data Science and Application

Duong Ngoc Son, Ministry of Public Security, 44 Yet Kieu, Hanoi, Vietnam, 100000

Master of Science

Technical Department of Security

Nguyen Thi Thu Ha, Vietnam Electric Power University, 235 Hoang Quoc Viet, Hanoi, Vietnam, 100000

PhD, Lecturer

Department of Commerce

Sergey Kuznetsov, Ivannikov Institute for System Programming of the RAS, Alexander Solzhenitsyn str., 25, Moscow, Russia, 109004; Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow, Russia, 119991; Moscow Institute of Physics and Technology, Institutskiy lane, Dolgoprudny, Moscow Region, 141701, Russia; Higher School of Economics, Myasnitskaya str., 20, Moscow, Russia, 101000

Doctor of Science, Professor, Chief Scientist

Department of Information Systems

Professor

Departments of System Programming

Nguyen Tran Quoc Vinh, The University of Da Nang – University of Science and Education, 459 Ton Duc Thang, Lien Chieu, Da Nang, Vietnam, 550000

PhD, Dean

Faculty of Information Technology

Published

2018-12-10

How to Cite

Anh, N. V., Son, D. N., Ha, N. T. T., Kuznetsov, S., & Vinh, N. T. Q. (2018). A method for determining information diffusion cascades on social networks. Eastern-European Journal of Enterprise Technologies, 6(2 (96)), 61–69. https://doi.org/10.15587/1729-4061.2018.150295