ANALYSIS OF THE QUADRATIC SIEVE METHOD BASED ON THE ONLINE MATRIX SOLVING

Introduction
Integer factorization is one of the oldest problems in mathematics. However, major breakthroughs have occurred over the past 30 years, especially after the introduction of public-key cryptography and, in particular, the RSA cryptosystem. If factorization could be solved efficiently, the RSA cryptosystem would become extremely vulnerable. That is why the RSA Security company funded a factorization contest called the "RSA challenge". Notably, recent advances in factorization algorithms are closely related to the RSA challenge.
The authors have studied methods of cryptanalysis of the RSA algorithm [1,2]. However, [3] shows that the known attacks on the RSA algorithm work only for specific implementations and are usually no more efficient than solving the factorization problem. In 1994, the RSA-129 number was factorized by means of the quadratic sieve (QS) algorithm [4].
This came as a great surprise, since the RSA-129 number was believed to be very difficult to factorize.
The quadratic sieve method is inferior to the general number field sieve (GNFS) method. However, for numbers up to 100 decimal digits, it is still the best [5,6].
Modifying the quadratic sieve algorithm can reduce its running time and raise the limit on the size of the factorized number below which the quadratic sieve method is the best.
Therefore, the study of new ways to reduce its computing complexity is relevant.

Literature review and problem statement
At the moment, there are several factorization methods and their modifications, the main ones of which have been considered in [7]. These methods are characterized by exponential and subexponential computing complexity.
In the quadratic sieve method, for the number N to be factorized, integers x are sought such that the value $y(x)$ in (1), which has the form $t^2 - N$ for an integer $t$ close to $\sqrt{N}$, can be decomposed into small prime factors, the factor base elements, i.e., the number p=2 and the other smallest primes p for which N is a quadratic residue modulo p. Such values of y are called B-smooth [8].
In the basic version of the QS method, the recommended [2,4,6] number $L_a$ of factor base elements is

$$L_a = \left(e^{\sqrt{\ln N \ln\ln N}}\right)^{\sqrt{2}/4}, \quad (2)$$

where the maximum prime number B in the factor base is called the smoothness boundary. The purpose of the algorithm is to find a set of B-smooth numbers on the basis of which [9] it is possible to obtain a value X such that

$$X^2 \equiv Y^2(X) \pmod{N}, \quad (3)$$

where $Y(X)$ is formed from the product of several values y(x) determined as in (1).
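The choice of the factor base can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the size rule is assumed to be $L_a = (e^{\sqrt{\ln N \ln\ln N}})^{\sqrt{2}/4}$ rounded to the nearest integer (it reproduces the sizes 6 and 29 quoted in Examples 1 and 2), and the function names are hypothetical. N is assumed to be odd.

```python
import math


def factor_base_size(n: int) -> int:
    """Assumed size rule: (exp(sqrt(ln n * ln ln n))) ** (sqrt(2) / 4), rounded."""
    ln_n = math.log(n)
    return round(math.exp(math.sqrt(ln_n * math.log(ln_n)) * math.sqrt(2) / 4))


def is_prime(p: int) -> bool:
    """Trial-division primality test, adequate for small factor-base primes."""
    if p < 2:
        return False
    return all(p % d for d in range(2, math.isqrt(p) + 1))


def build_factor_base(n: int, size: int) -> list[int]:
    """Return p = 2 followed by the smallest odd primes p for which
    n is a quadratic residue modulo p (checked with Euler's criterion)."""
    base = [2]
    p = 3
    while len(base) < size:
        if is_prime(p) and pow(n, (p - 1) // 2, p) == 1:
            base.append(p)
        p += 2
    return base


if __name__ == "__main__":
    n = 41303  # the number from Example 1
    print(factor_base_size(n), build_factor_base(n, factor_base_size(n)))
```

For N = 41303 this yields six elements: 2, 11, 19, 23, 29 and 37.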

ACCELERATION ANALYSIS OF THE QUADRATIC SIEVE METHOD BASED ON THE ONLINE MATRIX SOLVING. S. Vynnychuk
The algorithm of the QS method works in two stages: the stage of forming a set of at least $L_a + 2$ B-smooth numbers, on the basis of which one can obtain equal squares modulo N, and the data processing stage, where all collected information is placed in a matrix, the processing of which yields a solution [10-13].
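As a small worked illustration of how equal squares modulo N yield a factor, consider the following. The number N = 1649 and the two relations are chosen here only for illustration and do not come from the experiments in this paper.

```latex
\begin{align*}
41^2 &\equiv 32 = 2^5 \pmod{1649}, \qquad 43^2 \equiv 200 = 2^3 \cdot 5^2 \pmod{1649},\\
(41 \cdot 43)^2 &\equiv 2^8 \cdot 5^2 = 80^2 \pmod{1649}, \qquad 41 \cdot 43 \equiv 114 \pmod{1649},\\
\gcd(114 - 80,\ 1649) &= \gcd(34,\ 1649) = 17, \qquad 1649 = 17 \cdot 97 .
\end{align*}
```

Neither 32 nor 200 is a square, but the sum of their exponent vectors is even in every position, which is exactly the zero vector modulo 2 sought at the matrix stage.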
At the first stage, the sieving interval is selected, the factor base is constructed and the sieving procedure is carried out. The most time-consuming part of the quadratic sieve algorithm is the sieving process, in which B-smooth numbers are sought by trying values of x in (1). In the general case (according to [14]), the size of the sieving interval is obtained by formula (4).
The factor base size is one of the key parameters that determine the efficiency of the sieving algorithm. Too large a factor base requires finding a large number of B-smooth numbers, which increases the total execution time of the algorithm [1,5,13]. When the size is smaller than necessary, it will not be possible to find enough B-smooth numbers. The ratio (2) gives the recommended number of factor base elements, obtained on the basis of numerical experiments. The algorithm of the QS method then searches for at least $L_a + 2$ B-smooth numbers.
If not enough B-smooth numbers are found on the sieving interval, both the size of the factor base and the sieving interval can be increased, which leads to a significant increase in the algorithm execution time.
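The first stage can be sketched as follows. This is a simplified illustration under stated assumptions: candidates are taken as $y = x^2 - N$ for consecutive x starting just above $\sqrt{N}$, and smoothness is tested by plain trial division rather than by a true sieve. The function names are hypothetical.

```python
import math


def exponent_vector(y: int, base: list[int]):
    """Return the exponents of y over the factor base, or None if y is not B-smooth."""
    exps = []
    for p in base:
        e = 0
        while y % p == 0:
            y //= p
            e += 1
        exps.append(e)
    return exps if y == 1 else None


def collect_smooth(n: int, base: list[int], interval: int, needed: int):
    """Scan `interval` consecutive x values and keep at most `needed` pairs
    (x, exponent vector) for which x*x - n is B-smooth."""
    x0 = math.isqrt(n) + 1  # smallest x with x*x > n when n is not a perfect square
    found = []
    for x in range(x0, x0 + interval):
        vec = exponent_vector(x * x - n, base)
        if vec is not None:
            found.append((x, vec))
            if len(found) >= needed:
                break
    return found


if __name__ == "__main__":
    base = [2, 11, 19, 23, 29, 37]
    for x, vec in collect_smooth(41303, base, 203, len(base) + 2):
        print(x, x * x - 41303, vec)
```

For N = 41303 with the six-element base 2, 11, 19, 23, 29, 37, the first relation found this way is x = 205, since 205² − 41303 = 722 = 2·19².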
For the quadratic sieve algorithm, a number of modifications related to the acceleration of the sieving process and solution of the matrix have been proposed.
To increase the number of usable B-smooth numbers, [12] proposes to store values $y(x) = y_1(x)\,y_2(x)$ such that $y_1(x)$ is a B-smooth number and $B < y_2(x) < B^2$. When two such numbers y with the same $y_2$ are available, their product is B-smooth up to the square factor $y_2^2$ and can be used as a B-smooth number. In [15], it is suggested to check whether $y_2(x)$ is a perfect square. Such numbers do not need a pair as in the previous modification; they are called conditionally B-smooth and are added to the set of B-smooth numbers.
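The pairing of numbers with a common cofactor can be sketched as follows. This is an illustrative sketch, not the procedure of [12]: the function name and the representation of a relation as a tuple of x values together with the product of their y values are assumptions made here. The shared cofactor enters the product squared and therefore does not change the parity of any exponent.

```python
def combine_partials(candidates, base):
    """candidates: iterable of (x, y) pairs.
    Returns a list of (xs, product) where `product` is B-smooth,
    or B-smooth up to the square of a shared cofactor y2 with B < y2 < B*B."""
    b = base[-1]   # smoothness boundary B
    waiting = {}   # cofactor y2 -> first (x, y) seen with that cofactor
    usable = []
    for x, y in candidates:
        cofactor = y
        for p in base:
            while cofactor % p == 0:
                cofactor //= p
        if cofactor == 1:
            usable.append(((x,), y))  # fully B-smooth
        elif b < cofactor < b * b:
            if cofactor in waiting:
                x_prev, y_prev = waiting[cofactor]
                usable.append(((x_prev, x), y_prev * y))
            else:
                waiting[cofactor] = (x, y)
    return usable


if __name__ == "__main__":
    # toy data: 14 = 2*7 and 21 = 3*7 share the cofactor 7, with 5 < 7 < 25
    print(combine_partials([(1, 14), (2, 21), (3, 8), (4, 58)], [2, 3, 5]))
```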
In [5], a method of parallelizing the sieving process, known as MPQS (multiple polynomial quadratic sieve), has been proposed. It has been noted that the matrix solution step cannot be parallelized, so steps have been taken to accelerate it [13].
In [16], it has been noted that the number of ones in the power matrix is much smaller than the number of zeros. For large numbers of $10^{100}$ or more, the ratio of the number of zeros to the number of ones only grows. Most of the memory allocated for storing the matrix is used to store zeros. Therefore, instead of storing a two-dimensional matrix, it is proposed to store only the positions of the ones.
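One possible way to store only the positions of the ones is sketched below. It is an illustration of the idea rather than the data structure of [16]: each row is kept as a set of positions with odd exponents, and adding two rows modulo 2 becomes a symmetric difference of sets. The function names are hypothetical.

```python
def odd_positions(exponents):
    """Sparse row: the set of factor-base positions whose exponent is odd."""
    return frozenset(i for i, e in enumerate(exponents) if e % 2)


def add_mod2(row_a, row_b):
    """Addition of two sparse rows modulo 2 (symmetric difference of the sets)."""
    return row_a ^ row_b


if __name__ == "__main__":
    a = odd_positions([1, 2, 1, 0, 3])  # positions 0, 2, 4
    b = odd_positions([0, 0, 1, 0, 1])  # positions 2, 4
    print(sorted(add_mod2(a, b)))       # only position 0 remains odd
```

An empty set then plays the role of the zero vector.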
In all the publications above, it was assumed that the matrix solving stage requires that at least $L_a + 2$ B-smooth numbers be found beforehand.
In [17], a modification of the quadratic sieve algorithm has been proposed in which, based on the ongoing analysis of B-smooth numbers, the highest sequence number p(i) of a factor base element whose exponent in the decomposition of the i-th B-smooth number is odd is determined for each i-th B-smooth number. If, while the set of B-smooth numbers is being collected, $L_{\max} + 2$ B-smooth numbers with $p(i) \le L_{\max}$ are found, then a matrix with $L_{\max} \le L_a$ factor base elements can be formed. That is, both the size of the sieving interval and the size of the matrix can be reduced.
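The stopping rule of this modification can be sketched as follows. This is a hedged reading of the description above, not the code of [17]: positions are counted from 1, a vector with no odd exponent is assigned p(i) = 0, and the function names are hypothetical.

```python
def highest_odd_index(exponents) -> int:
    """p(i): 1-based position of the last factor-base element with an odd exponent
    (0 if every exponent is even)."""
    top = 0
    for i, e in enumerate(exponents, start=1):
        if e % 2:
            top = i
    return top


def reduced_base_size(vectors, full_size: int):
    """Smallest L_max <= full_size such that at least L_max + 2 of the collected
    vectors satisfy p(i) <= L_max; None if no such L_max exists yet."""
    tops = [highest_odd_index(v) for v in vectors]
    for l_max in range(1, full_size + 1):
        if sum(1 for t in tops if t <= l_max) >= l_max + 2:
            return l_max
    return None


if __name__ == "__main__":
    vectors = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 0, 0], [2, 3, 0, 0], [0, 0, 0, 1]]
    print(reduced_base_size(vectors, 4))  # four vectors use only the first two elements
```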
In the modified algorithm presented in [17], the size of the sieving interval and of the matrix can be reduced only when, in the set of $L_{\max} + 2$ B-smooth numbers, all odd powers of factors belong to factor base elements whose sequence numbers do not exceed $L_{\max} \le L_a$. However, there are cases when the factorization problem can be solved with a much smaller number of B-smooth numbers, where the factor base elements used in them may be located arbitrarily, not only among the smallest values. One such case is a value y(x) that is a perfect square; the factorization problem is then solved by Fermat's method [1,6,7]. For other cases, no methods of identifying such a subset of the factor base elements were found in the scientific literature, nor any means of early identification of a set of B-smooth numbers for which the vectors formed from the odd exponents of the factor base elements constitute a linearly dependent system. In this study, for the early identification of such a set of B-smooth numbers, it is proposed to use online matrix diagonalization, in which diagonalization continues with each occurrence of a new B-smooth number and ends when a zero vector is obtained.

The aim and objectives of the study
The study was aimed at evaluating the efficiency of the modified quadratic sieve algorithm, which simultaneously performs the sieving that finds B-smooth numbers and the search for a zero vector during diagonalization of the power vector matrix.
To achieve this aim, the following objectives were accomplished:
- to construct an algorithm for the online matrix solving in the quadratic sieve method;
- to analyze the influence of the online matrix solving on the speed and the result of factorization;
- to conduct a comparative estimation of the complexity and execution time of the modified quadratic sieve method against the basic quadratic sieve and general number field sieve methods.

Method of the online matrix solving of B-smooth numbers
In this study, the problem of choosing the sizes of the sieving interval and the factor base is considered solved, with the factor base containing $L_a$ elements.
The proposed algorithm that implements the online matrix solving uses an additional vector Vs with $L_a + 1$ elements.
The search for a zero vector of the power matrix consists of the following steps:
1. When a new B-smooth number occurs, the corresponding power vector Vnew is entered into the matrix.
2. The vector Vnew is analyzed:
a. The position k0 of the first non-zero value of the vector Vnew and the position kv of the vector itself in the matrix are determined.
b. If the vector Vnew has no non-zero value, then a zero vector has been found. Go to step 4.
3. The element of the vector Vs whose index equals k0, the position of the first non-zero value of the added vector Vnew, is checked.
a. If the element is empty, the position kv of the added vector Vnew is stored in it: Vs[k0]=kv. Go to step 1.
b. If the element is not empty, the value stored in it identifies the row that must be added to the vector Vnew. After the addition, go to step 2.
4. It is checked whether the root value obtained is equal to N. If not, the factorization problem is solved; otherwise go to step 5.
5. The information about the B-smooth number that produced zero values of the transformed matrix elements is removed. Go to step 1.
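The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: parity vectors are packed into integer bitmasks, a dictionary plays the role of Vs, candidates are $y = x^2 - N$ tested by trial division, and all names are hypothetical. A vector that reduces to zero is never stored, so after a trivial result the search simply continues with the next B-smooth number.

```python
import math


class OnlineSolver:
    """Incremental elimination over GF(2); `vs` mirrors the vector Vs of the text."""

    def __init__(self):
        self.rows = []    # reduced parity vectors (bitmasks) kept in the matrix
        self.combos = []  # for each kept row: bitmask of original relations XOR-ed into it
        self.vs = {}      # position k0 of first non-zero value -> row position kv
        self.seen = 0     # number of B-smooth numbers received so far

    def add(self, parity: int):
        """Insert a new parity vector. Returns the indices of the relations whose
        product is a square once a zero vector appears, otherwise None."""
        combo = 1 << self.seen
        self.seen += 1
        vec = parity
        while vec:
            k0 = (vec & -vec).bit_length() - 1  # first non-zero position
            kv = self.vs.get(k0)
            if kv is None:                      # step 3a: slot empty, keep the row
                self.vs[k0] = len(self.rows)
                self.rows.append(vec)
                self.combos.append(combo)
                return None
            vec ^= self.rows[kv]                # step 3b: add the stored row
            combo ^= self.combos[kv]
        return [i for i in range(self.seen) if combo >> i & 1]


def exponent_vector(y, base):
    """Exponents of y over the factor base, or None if y is not B-smooth."""
    exps = []
    for p in base:
        e = 0
        while y % p == 0:
            y //= p
            e += 1
        exps.append(e)
    return exps if y == 1 else None


def factor_online(n, base, interval):
    """Scan x values, feeding each B-smooth x*x - n to the online solver."""
    solver, relations = OnlineSolver(), []
    x0 = math.isqrt(n) + 1
    for x in range(x0, x0 + interval):
        exps = exponent_vector(x * x - n, base)
        if exps is None:
            continue
        relations.append((x, exps))
        subset = solver.add(sum((e & 1) << i for i, e in enumerate(exps)))
        if subset is None:
            continue
        big_x, total = 1, [0] * len(base)
        for i in subset:
            xi, ei = relations[i]
            big_x = big_x * xi % n
            total = [a + b for a, b in zip(total, ei)]
        big_y = 1
        for p, e in zip(base, total):
            big_y = big_y * pow(p, e // 2, n) % n
        g = math.gcd(big_x - big_y, n)
        if 1 < g < n:           # non-trivial divisor: done
            return g, n // g
    return None                 # interval exhausted without a non-trivial divisor


if __name__ == "__main__":
    print(factor_online(1649, [2, 5], 10))
```

For the toy number N = 1649 with the base 2, 5 (both chosen here for illustration), the zero vector appears after only two B-smooth numbers, at x = 41 and x = 43, and the call returns the factors 17 and 97.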

Examples of application of the online matrix solving algorithm
Consider the efficiency of the proposed modification using examples.
Example 1. We choose p=401 and q=103. These prime numbers form the number to be factorized, N = p·q = 41303. By formula (2), the size of the factor base is $L_a = 6$. Using formula (4), we obtain the sieving interval M = 203. The initial value is $X_0 = \lceil \sqrt{N} \rceil = 204$.
After sieving the y(x) candidates through the factor base, we obtain the B-smooth numbers shown in Table 1.
Table 1. Results of sieving of y(x) options
For the basic quadratic sieve algorithm, these numbers are not enough to form a matrix and obtain a solution.
Note that the vectors in bold in Table 1 sum to a zero vector.
The method of the online matrix solving managed to find a zero vector with the current number of B-smooth numbers and to factorize N.
Example 2. Let p=7624217, q=98269, N=749224180373, with the factor base size $L_a = 29$ and the corresponding sieving interval. The basic quadratic sieve method needed to check 5535 values of x to get 30 B-smooth numbers and factorize the number.
As can be seen from Table 2, the modified method checked 194 values of x and obtained only 3 B-smooth numbers.
This turned out to be enough to obtain a zero vector. The modified algorithm obtained a solution about thirty times faster than the basic algorithm (5535/194 ≈ 28.5 in terms of the x values checked). Table 3 gives examples of the obtained acceleration factors of the modified algorithm with an approximate step of 20. Acceleration is calculated as the ratio of the number of X sieved by the basic method to the number of X sieved by the modified method.
In some cases, the method of the online matrix solving makes it possible to form a zero vector from the vectors already obtained for B-smooth numbers even if the basic method failed to obtain enough B-smooth numbers to form a matrix. For example, in one of the numerical experiments, 200 prime numbers in the interval from 23663 to 152065567 were selected with a variable step, and ten thousand numbers N for factorization were generated. The basic algorithm failed to factorize in 686 cases out of 10000. The modified algorithm reduced this figure to 503 cases. The total factorization time of all successful cases also decreased from 3941 seconds to 3301 seconds, i.e., by about 16 percent.
Factorization of one number of about $10^{22}$ in size by the basic quadratic sieve method may take up to 2 minutes, so studies on a large set of much larger numbers were not conducted, because this requires substantial computing resources. For the efficiency analysis, control ranges for N, defined through $\log_{10} N$ with a parameter k=0, 1,.., 5, were chosen.
Table 2. Results of sieving of x options
Table 3. Comparative analysis of the modified quadratic sieve method with the basic quadratic sieve method
The efficiency analysis was conducted on several grounds:
1. The number of sieved X for the basic and modified methods. The relative reduction of the total number of sieved X (in percent) is given in Table 4. Since sieving the test values of x is the most time-consuming procedure, the number of sieved x is an important characteristic of the effectiveness of the algorithm, and this estimate is exact. At the same time, it cannot take into account the time for re-solving the matrix in the case of wrong solutions. Therefore, the estimate by the number of sieved x is not complete.
2. The total execution time of the factorization task. The calculated percentage of reduction of the total execution time of the factorization program is given in Table 5. It should be noted that the estimate of the calculation time is not exact. One and the same task can take different (though close) times because of the operating system task scheduler. Therefore, the data presented in Table 5, obtained from a single calculation for each method, are approximate. The error of these data was not estimated, since each calculation required much time, which made it impossible to conduct a statistically significant number of experiments.
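According to the results from Tables 4 and 5, approximating functions were obtained for the reduction in the number of sieved X and in the execution time, respectively. Fig. 1 shows the graphs of these functions.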

Fig. 1. Acceleration graph of the online matrix solving method
For further estimation, we will use formula (5), since the comparison of execution time is subject to errors, although it takes into account the time to find p and q. By formula (5), for numbers of about $10^{100}$ in size, the modified quadratic sieve algorithm based on the online matrix solving gives an acceleration of about 5.76 percent, and for numbers of about $10^{130}$ in size, 5.45 percent.

Comparative estimation of complexity of the modified method of quadratic sieve with general number field sieve
To compare the relative efficiency of the QS and GNFS methods, we assume a relation for the relative complexity $T_K(N)$ in which the coefficient K is determined from the condition that the QS method is better than GNFS for numbers of about $10^{110}$, GNFS is better for $N > 10^{129}$, and for some $10^{110} < N < 10^{129}$ the complexities are equal; the value of N at which this happens characterizes the coefficient K [18,19]. From Table 6, it follows that $T_K(N)$ grows with the number of decimal digits of N. Interestingly, when the number of decimal digits of N increases by one, $T_K(N)$ increases 1.045 times for lg N=113 and reaches the value of 1.0459 for lg N=160, growing smoothly and monotonically. That is, regardless of the boundary value lg N at which the QS and GNFS methods have the same computing complexity, any improved version of the QS method that raises the boundary value lg N by one must reduce its computing complexity at least 1.045 times.
For the performed calculations, $o(1) = 1$ was chosen. It can be asserted with confidence that the dynamics for $o(1) \ne 1$ will be similar to that given above.
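The growth per decimal digit can be checked numerically with the sketch below. It rests on an assumption made here: the standard heuristic estimates $\exp(\sqrt{\ln N \ln\ln N})$ for QS and $\exp((64/9)^{1/3} (\ln N)^{1/3} (\ln\ln N)^{2/3})$ for GNFS, with the o(1) corrections neglected. The exact expression used by the authors for $T_K(N)$ may differ, and the function names are hypothetical. The constant K cancels in the ratio of consecutive values.

```python
import math

GNFS_C = (64 / 9) ** (1 / 3)


def qs_exponent(digits: float) -> float:
    """ln of the assumed QS complexity for a number with `digits` decimal digits."""
    ln_n = digits * math.log(10)
    return math.sqrt(ln_n * math.log(ln_n))


def gnfs_exponent(digits: float) -> float:
    """ln of the assumed GNFS complexity for a number with `digits` decimal digits."""
    ln_n = digits * math.log(10)
    return GNFS_C * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)


def growth_per_digit(digits: float) -> float:
    """Factor by which the QS/GNFS complexity ratio grows when N gains one decimal digit."""
    now = qs_exponent(digits) - gnfs_exponent(digits)
    nxt = qs_exponent(digits + 1) - gnfs_exponent(digits + 1)
    return math.exp(nxt - now)


if __name__ == "__main__":
    for d in (113, 130, 160):
        print(d, round(growth_per_digit(d), 4))
```

Under these assumptions the factor is close to 1.045 at 113 digits and close to 1.046 at 160 digits, in line with the values quoted above.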

Discussion of the results of the study of the efficiency of the modified quadratic sieve method
The speed of the quadratic sieve method depends on such heuristic values as the size of the factor base and the sieving interval.
It is shown that for the selected 10,000 numbers of about $10^{13}$ in size, the modified algorithm reduced the number of failed factorizations from 686 cases to 503 relative to the basic quadratic sieve algorithm. This became possible because the modified algorithm does not need to obtain $L_a + 2$ B-smooth numbers prior to diagonalization of the matrix, as the basic method does. In a number of cases, a zero vector can be obtained much earlier, as illustrated in the examples given.
Among other important characteristics of this method, it should be noted that it performs the same operations as the basic quadratic sieve method; only their order is changed. That is, in the worst cases, when the required number of B-smooth numbers is $L_a + 2$, the computing complexity of the modified method will be the same as that of the basic one.
But it can be significantly reduced if a set of B-smooth numbers for which the power matrix vectors form a linearly dependent system is found quickly.
A peculiarity of the proposed modification is that searching for B-smooth numbers and diagonalizing the matrix simultaneously may cause problems with the required amount of computer memory. It is known that in the factorization of the RSA-129 number a supercomputer was used for solving the matrix, although it was not required for obtaining the B-smooth numbers.
In the analysis of the relationship between the computing complexity of the quadratic sieve and general number field sieve (GNFS) methods, it was found that an increase in the factorized number N by one decimal digit decreases the computing complexity of GNFS relative to QS 1.045 times (by 4.5 %) for numbers from $10^{125}$ to $10^{130}$, and this value varies quite slowly. Therefore, we can assume that any modification of the quadratic sieve method that, as N grows, reduces its computing complexity by a factor asymptotically close to a constant will not be competitive with GNFS for sufficiently large N.
The estimated acceleration of the modified method for numbers up to $10^{130}$, approximately 5.45 percent, shows that the proposed modified method reduced the computing complexity of the basic QS method, but the size of the factorized numbers N for which the method would be the best increased only by one decimal digit.
Further improvements to the quadratic sieve method that would provide a much more significant reduction in its computing complexity should be related to approaches aimed at reducing the sieving interval and the size of the factor base, with a relative reduction that is the greater, the larger N is.

Conclusions
1. An algorithm for the online matrix solving, which accelerates the basic quadratic sieve method, was developed. In some cases, acceleration by a factor of 10, 100 or more is possible. According to the estimates obtained, the average reduction of the computing complexity of the modified method for numbers up to $10^{130}$ is 5.45 percent. This effect is associated with the possibility of obtaining a zero vector, in some cases, much earlier than the $L_a + 2$ B-smooth numbers required by the algorithm of the basic method are found, as illustrated in the examples given.
2. On the basis of numerical experiments, it is shown that the online matrix solving allows the factorization of the number in some cases where the basic quadratic sieve algorithm (with the standard sieving interval and size of the factor base) failed to form a matrix for obtaining a solution. Namely, for the selected 10,000 numbers of about $10^{13}$ in size, the modified algorithm reduced the number of failed factorizations from 686 cases to 503 relative to the basic quadratic sieve algorithm.