Method of planning data processing tasks in distributed systems with limited information about available resources
DOI:
https://doi.org/10.30837/ITSSI.2023.25.027
Keywords:
software engineering; distributed systems; data processing; database; task scheduling; algebra of finite predicates.
Abstract
In today's digital landscape, distributed data processing systems (DDPS) are becoming increasingly critical for efficiently processing, analyzing, and managing large volumes of data. These systems are widely used in commercial, scientific, and social domains to process complex data in real-time or batch mode. One of their key components is task scheduling, which is an extremely complex process, particularly when information about resource requirements is incomplete or inaccurate. The subject of the research is the algorithms, methods, and approaches used for scheduling tasks between nodes in distributed systems. The purpose of the study is to create an optimized method of task scheduling in DDPS when information about available resources is limited. The research tasks are: to analyze the limitations of modern methods for scheduling tasks in DDPS; to optimize the metadata-based method of scheduling tasks between DDPS nodes, building on nearest-neighbor search with locality-sensitive hashing and the algebra of finite predicates; to develop the architecture of the software solution and implement it on the basis of the optimized method; and to test the algorithm on a video decoding task. The following methods were used: statistical algorithms and techniques such as classification and cluster analysis to predict resource requirements, and visualization techniques to support the analysis and interpretation of results.
As a result of the work: the limitations of modern methods for distributing tasks in DDPS were analyzed; an optimized metadata-based method of task scheduling in DDPS was created, building on nearest-neighbor search with locality-sensitive hashing and the algebra of finite predicates; the processes in the modified nearest-neighbor search algorithm were detailed; the architecture of a software solution that integrates the optimized metadata-based scheduling method with resource allocation was developed; and the software solution was validated in a practical scenario, namely using the created algorithm to schedule the decoding of video information. The conclusions of the study confirm that the proposed method, based on locality-sensitive hashing and the algebra of finite predicates, is effective even when information about resource needs is insufficient or limited. This highlights the possibility of using dynamic scheduling strategies to adapt to changing load conditions and resource availability.
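The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it names: finding past tasks with similar metadata through a hashing-based nearest-neighbor search and reusing their observed resource usage as an estimate for a new task. Everything concrete here is an assumption made for illustration, not the authors' method: the random-hyperplane hashing scheme, the class `HyperplaneLSH`, the helper `estimate_resources`, the three-element metadata vector (frame size in megapixels, frame rate divided by 10, bitrate in Mbit/s), and the resource keys `cpu_cores` and `mem_mb`. The algebra of finite predicates is not modeled.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Hypothetical random-hyperplane hash index over task-metadata vectors."""

    def __init__(self, dim, n_bits=6, n_tables=4, seed=0):
        rng = random.Random(seed)
        # Each table gets its own set of n_bits random hyperplanes.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    @staticmethod
    def _key(planes, vec):
        # One bit per hyperplane: the side of the plane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, vec, payload):
        """Store the observed resource usage (payload) of a finished task."""
        entry = (tuple(vec), payload)
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, entry[0])].append(entry)

    def query(self, vec, k=3):
        """Return payloads of up to k candidates most similar to vec."""
        seen = {}
        for planes, table in zip(self.planes, self.tables):
            for entry in table.get(self._key(planes, vec), []):
                seen[id(entry)] = entry  # de-duplicate across tables
        ranked = sorted(seen.values(), key=lambda e: -cosine(e[0], vec))
        return [payload for _, payload in ranked[:k]]


def estimate_resources(index, metadata_vec, default, k=3):
    """Average the usage of similar past tasks; fall back to a default."""
    neighbours = index.query(metadata_vec, k)
    if not neighbours:
        return dict(default)
    return {
        key: sum(n[key] for n in neighbours) / len(neighbours)
        for key in default
    }


# Illustrative history of two decoded videos (made-up numbers).
index = HyperplaneLSH(dim=3, seed=1)
index.add([2.0, 3.0, 8.0], {"cpu_cores": 4.0, "mem_mb": 2048.0})
index.add([0.3, 2.5, 1.0], {"cpu_cores": 1.0, "mem_mb": 256.0})

fallback = {"cpu_cores": 2.0, "mem_mb": 1024.0}
print(estimate_resources(index, [2.0, 3.0, 8.0], fallback, k=1))
```

In this sketch a scheduler would pass the estimate to whatever node-selection logic it uses. When no bucket matches, the fallback values stand in for the missing information, which mirrors the limited-information setting the abstract describes.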
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.