Optimization of software code for high-level synthesis during hardware implementation of the computationally-loaded algorithms

Authors

DOI:

https://doi.org/10.30837/2522-9818.2025.3.189

Keywords:

embedded systems; high-level synthesis; C code optimization; System-on-Chip.

Abstract

The subject matter of the work is the impact of code optimization methods for computationally intensive algorithms used in digital signal processing on hardware costs and performance when implemented on different platforms. The goal of the work is to conduct a comparative analysis of three C-code optimization approaches (loop unrolling, switching to fixed-point arithmetic, and their combination) with respect to performance and hardware costs when implementing matrix multiplication, fast Fourier transform, and wavelet transform algorithms using high-level synthesis (HLS) tools on system-on-chip (SoC) platforms, personal computers (PCs), and single-board computers. The following tasks were solved in the article: implementing computationally intensive algorithms on the selected hardware platforms using HLS; comparing the execution time of the algorithms with and without different optimization methods; comparing the hardware costs of the algorithms' implementations with and without different optimization variants; and formulating conclusions about the impact of different C-code optimization methods on performance and hardware costs on different target platforms. The following methods were used: C/C++ code optimization methods, diagnostic experiments using high-level synthesis tools to implement digital signal processing algorithms on the selected hardware platform, and statistical data collection using Python. The following results were obtained: for algorithms based on arithmetic operations, code optimization reduced execution time by up to 30% on ARM platforms. For algorithms based on the Fourier transform, combined optimization reduced execution time by up to 90% on processor-based devices. For programmable logic (FPGA), none of the optimization methods provided a significant speedup. However, the transition to fixed-point arithmetic reduced hardware costs by 40–80% regardless of the algorithm type. Conclusions: the choice of a C-code optimization strategy significantly affects the efficiency of algorithm implementation on processor architectures. In contrast, for FPGAs, optimizing the data types used plays the key role.
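
The two source-level techniques named in the abstract can be illustrated with a minimal C sketch. It is not the authors' code: the Q1.15 format, the unroll factor of 4, and all identifiers (`q15_t`, `to_q15`, `q15_to_float`, `dot_float`, `dot_q15_unrolled4`) are assumptions chosen for illustration. The kernel is the row-by-column dot product at the core of matrix multiplication, shown once as a floating-point baseline and once with fixed-point data and a manually unrolled loop. HLS tools typically expose unrolling as a directive instead; it is written out by hand here so that the sketch compiles with any C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Q1.15 fixed point: 1 sign bit, 15 fractional bits (format assumed for illustration). */
#define Q15_FRAC_BITS 15
typedef int16_t q15_t;

/* Convert a float in [-1, 1) to Q15, rounding to nearest. */
q15_t to_q15(float x) {
    float scaled = x * (float)(1 << Q15_FRAC_BITS);
    return (q15_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Convert a Q15 value held in a 32-bit integer back to float. */
float q15_to_float(int32_t v) {
    return (float)v / (float)(1 << Q15_FRAC_BITS);
}

/* Baseline: floating-point dot product with a plain rolled loop. */
float dot_float(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Optimized variant: Q15 operands, loop unrolled by 4.
 * Each Q15 x Q15 product is a Q30 value that fits in 32 bits; the four
 * independent 64-bit accumulators remove the loop-carried dependency
 * between consecutive iterations and leave headroom against overflow.
 * The result is returned in Q15, held in 32 bits so that sums outside
 * [-1, 1) are still representable. */
int32_t dot_q15_unrolled4(const q15_t *a, const q15_t *b, size_t n) {
    assert(n % 4 == 0); /* this sketch handles only lengths divisible by 4 */
    int64_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    for (size_t i = 0; i < n; i += 4) {
        acc0 += (int32_t)a[i]     * (int32_t)b[i];
        acc1 += (int32_t)a[i + 1] * (int32_t)b[i + 1];
        acc2 += (int32_t)a[i + 2] * (int32_t)b[i + 2];
        acc3 += (int32_t)a[i + 3] * (int32_t)b[i + 3];
    }
    int64_t sum_q30 = acc0 + acc1 + acc2 + acc3;
    /* Rescale Q30 to Q15; division truncates toward zero and is portable
     * for negative sums, unlike a right shift of a signed value. */
    return (int32_t)(sum_q30 / (1 << Q15_FRAC_BITS));
}
```

Calling `dot_q15_unrolled4` on inputs converted with `to_q15` and passing the result through `q15_to_float` gives a value that can be compared against `dot_float` to judge the quantization error. On an FPGA target, the narrower integer multipliers of the fixed-point variant are the kind of change the abstract associates with reduced hardware costs.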

Author Biographies

Oleksandr Shkil, Kharkiv National University of Radio Electronics

PhD, associate professor, associate professor of the design automation department

Oleh Filippenko, Kharkiv National University of Radio Electronics

PhD, associate professor, associate professor of the infocommunication engineering department named after V.V. Popovsky

Dariia Rakhlis, Kharkiv National University of Radio Electronics

PhD, associate professor, associate professor of the design automation department

Inna Filippenko, Kharkiv National University of Radio Electronics

PhD, associate professor, associate professor of the design automation department

Valentyn Korniienko, Kharkiv National University of Radio Electronics

PhD student of the design automation department

Published

2025-09-25

How to Cite

Shkil, O., Filippenko, O., Rakhlis, D., Filippenko, I., & Korniienko, V. (2025). Optimization of software code for high-level synthesis during hardware implementation of the computationally-loaded algorithms. INNOVATIVE TECHNOLOGIES AND SCIENTIFIC SOLUTIONS FOR INDUSTRIES, 3(33), 189–202. https://doi.org/10.30837/2522-9818.2025.3.189