An algebraic method for calculating PageRank

Authors

  • Vladislav Vlasyuk Vertamedia LLC branch Admiralskyi ave., 34а, Odessa, Ukraine, 65009, Ukraine
  • Oleg Galchonkov Institute of Computer Systems Odessa National Polytechnic University Shevchenko ave., 1, Odessa, Ukraine, 65044, Ukraine https://orcid.org/0000-0001-5468-7299
  • Alexander Nevrev Institute of Computer Systems Odessa National Polytechnic University Shevchenko ave., 1, Odessa, Ukraine, 65044, Ukraine https://orcid.org/0000-0001-7673-5466

DOI:

https://doi.org/10.15587/1729-4061.2018.131275

Keywords:

site graph, page ranks, transition matrix, damping coefficient, teleportation matrix

Abstract

An algebraic method is proposed for finding PageRank estimates for pages of websites. The amount of calculation in the proposed method does not depend on the value of the damping coefficient, which allows obtaining more accurate estimates of the rankings of PageRank in comparison with analogues. A distinctive feature of the proposed method is a step-by-step performance of calculations simultaneously with the work of the graph traversal algorithm. The comparative analysis of algorithms for traversing graphs has shown that, in contrast to the depth search algorithm, the breadth search algorithm gives a more orderly arranged matrix of transitions, which has the blockwise Hessenberg form. The use of this circumstance makes it possible to reduce significantly the amount of calculations by the proposed method. The resulting equations describing the proposed method have a block structure that allows efficient distribution of the entire volume of operations to parallel computational threads . Proceeding from the fact that the bulk of the calculations can be performed while the graph traversal algorithm is running, the study has determined the conditions under which the proposed method makes it possible to obtain PageRank values faster than the known iterative algorithms. The applicability area of the developed method is, first of all, its use in direct verification of the reliability of posting advertising materials on a relevant web resource; therefore, it is limited to specific Internet sites or segments with no more than 104 or 105 pages.

Author Biographies

Vladislav Vlasyuk, Vertamedia LLC branch Admiralskyi ave., 34а, Odessa, Ukraine, 65009

Manager

Oleg Galchonkov, Institute of Computer Systems Odessa National Polytechnic University Shevchenko ave., 1, Odessa, Ukraine, 65044

PhD, Associate Professor

Department of Information Systems

Alexander Nevrev, Institute of Computer Systems Odessa National Polytechnic University Shevchenko ave., 1, Odessa, Ukraine, 65044

PhD, Associate Professor

Department of Information Systems

References

  1. Marciel, M., Cuevas, R., Banchs, A., González, R., Traverso, S., Ahmed, M., Azcorra, A. (2016). Understanding the Detection of View Fraud in Video Content Portals. Proceedings of the 25th International Conference on World Wide Web – WWW ’16. doi: 10.1145/2872427.2882980
  2. Alternative Ad Networks to Open Up New Channels of Growth in 2018. Available at: https://www.singlegrain.com/blog-posts/pay-per-click/44-ad-networks-will-help-open-new-channels-growth/
  3. Agency Trading Desks. Available at: https://adexchanger.com/Agency_Trading_White_Paper.pdf
  4. What’s the Difference Between Using a DSP and an Agency Trading Desk? Available at: http://weevermedia.com/uncategorised/whats-difference-using-dsp-agency-trading-desk/
  5. Thompson, J. What Is a DSP, SSP and Ad exchange, and how do they fit together? Available at: https://www.bannerconnect.net/what-is-a-dsp-ssp-and-ad-exchange/
  6. Scott, S. The $8.2 Billion Adtech Fraud Problem That Everyone Is Ignoring. TechCrunch. Available at: https://techcrunch.com/2016/01/06/the-8-2-billion-adtech-fraud-problem-that-everyone-is-ignoring/
  7. Reichow, J. (2017). TUNE Fraud Series: Types of advertising fraud. TUNE. Available at: https://www.tune.com/blog/types-of-advertising-fraud/
  8. Ricci, M. (2017). Why it’s Time to Take a Stand Against Ad Fraud. MartechAdvisor. Available at: https://www.martechadvisor.com/articles/ads/why-its-time-to-take-a-stand-against-ad-fraud/
  9. Leskovec, J., Rajaraman, A., Ullman, J. D. (2014). Mining of Massive Datasets. Cambridge University Press. doi: 10.1017/cbo9781139924801
  10. Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C. (2009). Introduction to Algorithms. MIT press and McGraw-Hill, 1313.
  11. Brin, S., Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30 (1-7), 107–117. doi: 10.1016/s0169-7552(98)00110-x
  12. Polyak, B. T., Tremba, A. A. (2012). Regularization-based solution of the PageRank problem for large matrices. Automation and Remote Control, 73 (11), 1877–1894. doi: 10.1134/s0005117912110094
  13. Del Corso, G. M., Gullí, A., Romani, F. (2005). Fast PageRank Computation via a Sparse Linear System. Internet Mathematics, 2 (3), 251–273. doi: 10.1080/15427951.2005.10129108
  14. PageRank. Wikipedia. Available at: https://en.wikipedia.org/wiki/PageRank
  15. Harris, L., Papanikolaou, I., Shi, J., Strott, D., Tan, Y., Zhong, C., Changhui, T. (2015). Eigensystems for Large Matrices. AMSC460.
  16. Sargolzaei, P., Soleymani, F. (2010). PageRank Problem, Survey And Future Research Directions. International Mathematical Forum, 5 (19), 937–956.
  17. Wu, G., Wei, Y. (2007). A Power–Arnoldi algorithm for computing PageRank. Numerical Linear Algebra with Applications, 14 (7), 521–546. doi: 10.1002/nla.531
  18. Kamvar, S. D., Haveliwala, T. H., Golub, G. H. (2003). Extrapolation methods for accelerating PageRank computations. Technique Report SCCM 03-02.2003. Stanford University.
  19. Wu, G., Wei, Y. (2010). An Arnoldi-Extrapolation algorithm for computing PageRank. Journal of Computational and Applied Mathematics, 234 (11), 3196–3212. doi: 10.1016/j.cam.2010.02.009
  20. Kamvar, S., Haveliwala, T., Golub, G. (2004). Adaptive methods for the computation of PageRank. Linear Algebra and Its Applications, 386, 51–65. doi: 10.1016/j.laa.2003.12.008
  21. Yin, J.-F., Yin, G.-J., Ng, M. (2011). On adaptively accelerated Arnoldi method for computing PageRank. Numerical Linear Algebra with Applications, 19 (1), 73–85. doi: 10.1002/nla.789
  22. Kim, M.-S. (2012). Towards Exploiting GPUs for Fast PageRank Computation of Large-Scale Networks. Proceeding of the third International Conference on Emerging Databases.
  23. Rungsawang, A., Manaskasemsak, B. (2012). Fast PageRank Computation on a GPU Cluster. 2012 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing. doi: 10.1109/pdp.2012.78
  24. Liu, C., Li, Y. (2015). A Parallel PageRank Algorithm with Power Iteration Acceleration. International Journal of Grid and Distributed Computing, 8 (2), 273–284. doi: 10.14257/ijgdc.2015.8.2.24
  25. Srivastava, A. K., Srivastava, M., Garg, R., Mishra, P. K. (2017). Parallel PageRank Algorithms: A Survey. International Journal on Recent and Innovation Trends in Computing and Communication, 5 (5), 470–473.
  26. Choudhari, P., Baikampadi, E., Patil, P., Gadekar, S. (2015). Parallel and Improved PageRank Algorithm for GPU-CPU Collaborative Environment. International Journal of Computer Science and Information Technologies, 6 (3), 2003–2005.
  27. Yang, P., Zhou, L. (2016). Research on PageRank Algorithm parallel computing Based on Hadoop. Proceedings of the 2016 4th International Conference on Mechanical Materials and Manufacturing Engineering. doi: 10.2991/mmme-16.2016.40
  28. Gleich, D., Zhukov, L., Berkhin, P. Fast Parallel PageRank: A Linear System Approach. Available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.592.64&rep=rep1&type=pdf
  29. Gantmakher, F. (2004). Theory of matrices. Moscow: Nauka, 581.
  30. Wright, G. Probability, linear algebra, and numerical analysis: the mathematics behind Google'sTM PageRankTM. Available at: http://math.boisestate.edu/~wright/courses/m297/google_talk.pdf

Downloads

Published

2018-05-16

How to Cite

Vlasyuk, V., Galchonkov, O., & Nevrev, A. (2018). An algebraic method for calculating PageRank. Eastern-European Journal of Enterprise Technologies, 3(2 (93), 6–12. https://doi.org/10.15587/1729-4061.2018.131275