INCREASING THE SPEED OF THE PERMUTATION MULTIPLICATION OPERATION DUE TO USE OF SIMD INSTRUCTIONS

DOI:

https://doi.org/10.24025/2306-4412.3.2021.245347

Keywords:

processor, algorithm, vector, register, SSSE3, AVX2

Abstract

This article develops and investigates algorithms for permutation multiplication that use the SIMD (Single Instruction, Multiple Data) instructions of modern processors. The aim is to increase performance and reduce processor power consumption when executing typical permutation operations by implementing them with SIMD instructions. The scientific novelty lies in the proposed approach to the hardware implementation of permutation multiplication, which uses SIMD instructions to reduce the execution time of this operation and, consequently, to increase performance and reduce CPU power consumption. The practical value lies in the developed permutation multiplication algorithms, whose effectiveness is confirmed by comparison with classical multiplication procedures. The article analyzes the SIMD instructions that can be used to perform permutation operations. The developed algorithms rely on advanced processor instructions that operate on data in vector format. The advantages of SIMD instructions for speeding up permutation operations are identified and investigated in practice, and the speed of permutation operations with and without SIMD instructions is analyzed and compared. The described implementations can be used to accelerate permutation operations, in particular in the three-pass cryptographic protocol based on permutations. SIMD instructions speed up permutation multiplication by a factor of up to 2.6, depending on the hardware.
The developed algorithms can be used to implement methods that rely on a large number of permutation multiplications, significantly increasing their execution speed.
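The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed permutation length of 16, the byte encoding of elements (values 0 to 15), and the composition convention r[i] = p[q[i]] are all assumptions made for illustration. Under those assumptions, a single SSSE3 byte shuffle (`_mm_shuffle_epi8`) computes the whole product, while the classical procedure loops over the elements. The SIMD path is compiled only when the compiler defines `__SSSE3__` (for example with `-mssse3` on GCC or Clang); otherwise the sketch falls back to the scalar loop.

```c
#include <stdint.h>
#if defined(__SSSE3__)
#include <tmmintrin.h> /* SSSE3 intrinsics, including _mm_shuffle_epi8 */
#endif

#define PERM_N 16 /* assumed permutation length: one 128-bit register of bytes */

/* Classical (scalar) product under the assumed convention r[i] = p[q[i]].
   The output buffer r must not overlap p or q. */
static void perm_mul_scalar(const uint8_t *p, const uint8_t *q, uint8_t *r)
{
    for (int i = 0; i < PERM_N; ++i) {
        r[i] = p[q[i]];
    }
}

/* Same product using one byte shuffle when SSSE3 is available.
   _mm_shuffle_epi8(vp, vq) selects byte vp[vq[i] & 0x0F] for each lane i
   (lanes whose index byte has the high bit set become zero, which cannot
   happen here because valid elements are in the range 0..15). */
static void perm_mul_simd(const uint8_t *p, const uint8_t *q, uint8_t *r)
{
#if defined(__SSSE3__)
    __m128i vp = _mm_loadu_si128((const __m128i *)p); /* unaligned load of p */
    __m128i vq = _mm_loadu_si128((const __m128i *)q); /* unaligned load of q */
    _mm_storeu_si128((__m128i *)r, _mm_shuffle_epi8(vp, vq));
#else
    perm_mul_scalar(p, q, r); /* portable fallback when SSSE3 is not enabled */
#endif
}
```

Both functions are expected to produce identical results for valid permutations of 0..15. Longer permutations, or the AVX2 variant mentioned in the keywords, need extra care because `_mm256_shuffle_epi8` shuffles within each 128-bit lane separately; that case is outside the scope of this sketch.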

Author Biographies

A. O. Lavdanskyi, Cherkasy State Technological University

Candidate of Technical Sciences, Associate Professor

E.V. Faure, Cherkasy State Technological University

Doctor of Technical Sciences, Professor

V. O. Shcherba, Cherkasy State Technological University

Senior Lecturer

Published

2021-10-22

How to Cite

Lavdanskyi, A. O., Faure, E., & Shcherba, V. O. (2021). INCREASING THE SPEED OF THE PERMUTATION MULTIPLICATION OPERATION DUE TO USE OF SIMD INSTRUCTIONS. Bulletin of Cherkasy State Technological University, (3), 36–43. https://doi.org/10.24025/2306-4412.3.2021.245347