Performance evaluation of LU matrix decomposition using the SYCL standard

Authors

DOI:

https://doi.org/10.15587/2706-5448.2023.284518

Keywords:

SYCL standard tools, parallel computing, LU decomposition, SYCL performance, numerical methods

Abstract

The object of this study is the performance of the SYCL standard tools when solving the LU matrix decomposition problem. SYCL is a fairly new technology for parallel computing in heterogeneous systems, so evaluating its performance on specific parallel computing tasks is a relevant topic. In the study, a parallelized LU decomposition algorithm for a square matrix was implemented both with the SYCL standard and in standard C++, and an experiment was conducted to test the implementations in a heterogeneous system with several types of processors. During testing, the program received square matrices of various dimensions as input, and the output was the execution time of the LU decomposition on the selected processor. The obtained results, presented as tables and graphs, show that the SYCL implementation is more than 2 times faster than ordinary C++ when run on a graphics processor. It was also shown experimentally that the SYCL implementation is nearly as fast as the ordinary C++ implementation when executed on a central processor. These results are attributed both to the high degree of parallelism available in the LU decomposition algorithm itself and to the optimization work done by the developers of the standard.

The obtained results indicate that SYCL can speed up LU matrix decomposition and similar algorithms on heterogeneous systems with processors optimized for data parallelism. The results of the study can be used to justify the choice of technology for LU matrix decomposition problems or for problems with a similar parallelization scheme.
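
The kind of measurement described in the abstract can be illustrated with a minimal sketch in standard C++. This is not the authors' code, which is not reproduced on this page. It assumes a dense row-major matrix, in-place Doolittle elimination without pivoting, and a diagonally dominant test matrix so that no pivot is zero. The names `lu_decompose_in_place`, `time_lu_ms`, and `make_test_matrix` are hypothetical. The sketch is serial; the comment marks the row loop whose iterations are independent and which a data-parallel implementation (for example a SYCL kernel) would typically distribute across work-items.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Dense square matrix in row-major order (a storage choice assumed for this sketch).
using Matrix = std::vector<double>;

// In-place Doolittle LU decomposition without pivoting.
// On success, the strict lower triangle of `a` holds L (its unit diagonal is
// implied) and the upper triangle, including the diagonal, holds U.
// Returns false if a zero pivot is encountered.
bool lu_decompose_in_place(Matrix& a, std::size_t n) {
    assert(a.size() == n * n);
    for (std::size_t k = 0; k < n; ++k) {
        const double pivot = a[k * n + k];
        if (pivot == 0.0) {
            return false;
        }
        // For a fixed k, the updates of rows k+1..n-1 do not depend on each
        // other; this is the loop a parallel version would distribute.
        for (std::size_t i = k + 1; i < n; ++i) {
            const double factor = a[i * n + k] / pivot;
            a[i * n + k] = factor;  // entry of L
            for (std::size_t j = k + 1; j < n; ++j) {
                a[i * n + j] -= factor * a[k * n + j];  // entry of U or trailing block
            }
        }
    }
    return true;
}

// Builds a strictly diagonally dominant n x n matrix, so elimination
// without pivoting never meets a zero pivot.
Matrix make_test_matrix(std::size_t n) {
    Matrix a(n * n, 1.0);
    for (std::size_t i = 0; i < n; ++i) {
        a[i * n + i] = static_cast<double>(n) + 1.0;
    }
    return a;
}

// Times one decomposition and returns the elapsed wall-clock time in
// milliseconds, or -1.0 if the decomposition failed. The matrix is taken by
// value so the caller's data is left unchanged.
double time_lu_ms(Matrix a, std::size_t n) {
    const auto start = std::chrono::steady_clock::now();
    const bool ok = lu_decompose_in_place(a, n);
    const auto stop = std::chrono::steady_clock::now();
    if (!ok) {
        return -1.0;
    }
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_lu_ms(make_test_matrix(n), n)` for a range of `n` yields execution times per matrix dimension, which is the shape of data the study reports. Absolute values depend on hardware and compiler settings and say nothing about the figures in the paper.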

Author Biographies

Dmytro Nasikan, National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute»

Department of System Design

Vadym Yaremenko, National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute»

Postgraduate Student, Assistant

Department of System Design


Published

2023-06-30

How to Cite

Nasikan, D., & Yaremenko, V. (2023). Performance evaluation of LU matrix decomposition using the SYCL standard. Technology Audit and Production Reserves, 3(2(71)), 6–9. https://doi.org/10.15587/2706-5448.2023.284518

Issue

Section

Information Technologies