Method of planning data processing tasks in distributed systems with limited information about available resources

Authors

DOI:

https://doi.org/10.30837/ITSSI.2023.25.027

Keywords:

software engineering; distributed systems; data processing; database; task scheduling; algebra of finite predicates.

Abstract

In today's digital landscape, distributed data processing systems (DDPS) are becoming increasingly critical for efficiently processing, analyzing, and managing large volumes of data. These systems are widely used in commercial, scientific, and social domains to process complex data in real time or in batch mode. One of their key components is task scheduling, which is an extremely complex process, particularly when information about resource requirements is incomplete or inaccurate. The subject of the research is the algorithms, methods, and approaches used for scheduling tasks between nodes in distributed systems. The purpose of the study is to create an optimized method of task planning in DDPS when information about available resources is limited. The research tasks are: to analyze the limitations of modern methods for scheduling tasks in DDPS; to optimize the metadata-based method of scheduling tasks between DDPS nodes, building on nearest-neighbor search with locality-sensitive hashing and the algebra of finite predicates; to develop the architecture of a software solution and implement it on the basis of the optimized method; and to test the algorithm on a video decoding task. The following methods were used: statistical algorithms and techniques, such as classification and cluster analysis, to predict resource requirements, and visualization techniques to support the analysis and interpretation of results.
The results of the work are as follows: the limitations of modern methods for distributing tasks in distributed data processing systems (DDPS) were analyzed; an optimized metadata-based method of task planning in DDPS was created, building on nearest-neighbor search with locality-sensitive hashing and the algebra of finite predicates; the processes in the modified nearest-neighbor search algorithm were detailed; the architecture of a software solution was developed that integrates the optimized metadata-based method of task planning with resource allocation; and the software solution was validated in a practical scenario, namely planning the task of decoding video information with the created algorithm. The conclusions of the study confirm that the proposed method, based on locality-sensitive hashing and the algebra of finite predicates, is effective even when information about resource needs is insufficient or limited. This highlights the possibility of using dynamic scheduling strategies to adapt to changing load conditions and resource availability.
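The core idea of the abstract, estimating the resource needs of a new task from previously executed tasks with similar metadata, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes random-hyperplane hashing as the locality-sensitive hash family, cosine similarity for ranking candidates, and a simple average over the nearest neighbors as the prediction. The class `LshIndex`, the function `predict_resources`, the feature layout, and the resource keys `cpu_cores` and `memory_mb` are hypothetical names chosen for illustration, and the finite predicate algebra part of the method is not modeled here.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed):
    """Draw n_bits random hyperplanes (normal vectors) in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: the side of the plane on which the vector lies."""
    return tuple(int(sum(p * x for p, x in zip(plane, vec)) >= 0.0) for plane in planes)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class LshIndex:
    """Hypothetical index of past tasks: metadata vector -> observed resource usage."""

    def __init__(self, dim, n_bits=4, n_tables=8, seed=0):
        self.records = []  # list of (metadata vector, resources dict)
        # Several independent hash tables raise the chance that similar tasks collide.
        self.tables = [
            (make_hyperplanes(dim, n_bits, seed + t), defaultdict(list))
            for t in range(n_tables)
        ]

    def add(self, vec, resources):
        record_id = len(self.records)
        self.records.append((list(vec), dict(resources)))
        for planes, buckets in self.tables:
            buckets[signature(vec, planes)].append(record_id)

    def candidates(self, vec):
        """Return every stored record sharing a bucket with vec in at least one table."""
        ids = set()
        for planes, buckets in self.tables:
            ids.update(buckets.get(signature(vec, planes), ()))
        return [self.records[i] for i in sorted(ids)]


def predict_resources(index, vec, k=3):
    """Average the resources of the k most similar candidates; None if there are none."""
    cands = index.candidates(vec)
    if not cands:
        return None
    nearest = sorted(cands, key=lambda rec: cosine(vec, rec[0]), reverse=True)[:k]
    # Assumption: all records report the same resource keys.
    keys = nearest[0][1].keys()
    return {key: sum(res[key] for _, res in nearest) / len(nearest) for key in keys}


if __name__ == "__main__":
    # Assumed metadata layout for a video decoding task, scaled to comparable ranges:
    # [frame width, frame height, frame rate, bitrate].
    index = LshIndex(dim=4)
    index.add([1.0, 0.56, 0.5, 0.8], {"cpu_cores": 4, "memory_mb": 2048})
    index.add([0.33, 0.19, 0.5, 0.2], {"cpu_cores": 1, "memory_mb": 512})
    print(predict_resources(index, [0.9, 0.5, 0.5, 0.7], k=1))
```

Because only the buckets that the query falls into are scanned, the scheduler avoids comparing a new task against the full execution history. The price is that the lookup is approximate: a query may share no bucket with any stored task, in which case this sketch returns `None` and a fallback policy would be needed.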

Author Biographies

Andrii Kozyriev, Kharkiv National University of Radio Electronics

Postgraduate

Ihor Shubin, Kharkiv National University of Radio Electronics

PhD (Engineering Sciences), Associate Professor, Professor at the Software Department

Published

2023-09-30

How to Cite

Kozyriev, A., & Shubin, I. (2023). Method of planning data processing tasks in distributed systems with limited information about available resources. INNOVATIVE TECHNOLOGIES AND SCIENTIFIC SOLUTIONS FOR INDUSTRIES, (3(25)), 27–39. https://doi.org/10.30837/ITSSI.2023.25.027