Studying the Properties of a Robust Algorithm for Identifying Linear Objects, Which Minimizes a Combined Functional

This paper addresses the task of identifying the parameters of a linear object in the presence of non-Gaussian interference. The identification algorithm is a gradient procedure for minimizing a combined functional. The combined functional, in turn, consists of a fourth-degree functional and a modular functional, whose weights are set using a mixing parameter. Such a combination of functionals makes it possible to obtain estimates that demonstrate robust properties. We have determined the conditions for the convergence of the applied procedure in the mean and in the mean square in the presence of non-Gaussian interference. In addition, expressions have been obtained to determine the optimal values of the algorithm's parameters, which ensure its maximum convergence rate. Based on the estimates obtained, the asymptotic and non-asymptotic values of the parameter estimation errors and identification errors have been determined. Because the resulting expressions contain a number of unknown parameters (the values of signal and interference variances), their practical application requires that estimates of these parameters be used.

We have investigated the stability of the steady-state identification process and determined the conditions for this stability. It has been shown that determining these conditions requires solving third-degree equations whose coefficients depend on the specifics of the problem to be solved. The resulting relations are rather cumbersome, but their simplification allows for a qualitative analysis of stability issues. It should be noted that all the estimates reported in this work depend on the choice of the mixing parameter, the task of determining which remains to be explored.

The estimates obtained in this paper allow the researcher to pre-evaluate the capabilities of the identification algorithm and the effectiveness of its use in solving practical problems.


Introduction
Underlying many information-processing tasks (processing and filtering of complex signals, identification and control of objects, prediction of time sequences, classification, etc.) is the task of building a model of the following form:

$y(k) = \theta^{*T} x(k) + \xi(k)$, (1)

where $y(k)$ is the observed output signal; $x(k) = (x_1(k), x_2(k), \ldots, x_N(k))^T$ is the N×1 vector of input signals; $\theta^* = (\theta_1^*, \theta_2^*, \ldots, \theta_N^*)^T$ is the N×1 vector of the desired parameters; ξ(k) is the interference. Building the model implies minimizing some predefined quality functional (identification criterion). A quadratic functional, the most widely used in practice, leads to various identification algorithms that make it possible to obtain estimates of the desired vector $\theta^*$ under normally distributed interference, that is, $\xi(k) \sim N(0, \sigma_\xi^2)$. Under this assumption, the LSM solution is asymptotically optimal, with minimal variance in the class of unbiased estimates. However, this assumption does not generally hold in real-world conditions: a priori information about the distributions is typically unavailable, or the interference is contaminated with non-Gaussian noise. As a result, some measurements lie far from the bulk of the data, thereby forming so-called "tails". The instability of the LSM estimate in the presence of such interference motivated the development of alternative, robust estimation in statistics, aimed at eliminating the effect of such interference [1][2][3][4][5][6][7][8][9].
If one has information that the interference ξ belongs to a certain class of distributions, the task is simplified. In this case, it is possible to obtain the maximum likelihood estimate (M-estimate) by minimizing the optimal criterion, which is the negative logarithm of the interference distribution density. If such information is not available, a non-quadratic criterion must be applied to estimate the vector of parameters $\theta^*$. This ensures that the estimate obtained is robust. One of these criteria is the modular one, whose minimization leads to a sign algorithm.
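As a brief illustration of how the interference distribution determines the criterion, consider two textbook densities (the scale symbols $\sigma$ and $s$ are notation assumed here, not taken from the paper). Up to an additive constant and a positive factor, the negative log-density gives the quadratic criterion for Gaussian interference and the modular criterion for Laplace interference:

```latex
p(\xi) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{\xi^{2}}{2\sigma^{2}}\right)
\;\Rightarrow\;
-\ln p(\xi) = \frac{\xi^{2}}{2\sigma^{2}} + \mathrm{const},
\qquad
p(\xi) = \frac{1}{2s}\exp\!\left(-\frac{|\xi|}{s}\right)
\;\Rightarrow\;
-\ln p(\xi) = \frac{|\xi|}{s} + \mathrm{const}.
```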

Literature review and problem statement
Theoretical research into the properties of the sign estimation algorithm was reported for the first time in [1]. The practical application of this criterion to the task of identifying an object in the presence of pulse interference was considered in [2][3][4][5][6]. In particular, the effectiveness of the affine projection sign algorithm was studied in [2,3]; the affine projection sign algorithm with a variable gain factor was used in [4]. It should be noted that sign algorithms, while ensuring the robustness of the resulting estimate, have a low convergence rate. Therefore, in order to speed up the estimation process, a normalized sign identification algorithm was proposed and examined in [5]. A simple-to-implement algorithm that uses the root mean square error and the estimated interference power to correct the step length is studied in [6].
The positive properties of the modular criterion are exploited in the so-called combined criteria, the most common of which are the combined functionals proposed in [7,8]. They include a quadratic functional, which provides optimal estimates for the Gaussian distribution, and a modular one, which yields an estimate that is more robust to distributions with heavy "tails" (outliers). It should be noted, however, that the effectiveness of the robust estimates obtained depends significantly on the numerous parameters used in these criteria. The cited works provide some recommendations for choosing these parameters. In most cases, however, they are selected based on the experience of the researcher [9]. The task of robust neural network training based on the functionals [6,7] by Huber and Hampel is considered in [10][11][12], where some practical recommendations on the choice of the functionals' parameters are given. A more general issue of robust estimation in the presence of interference with asymmetric distributions was investigated in [13]. However, the task of choosing the functionals' parameters remains to be resolved.
Such a criterion was proposed for the first time in [14]. In [14][15][16][17][18], this criterion was used to solve the problem of identification in the presence of pulse interference. The stability of the normalized algorithm was studied in [15]; the applied identification problem was solved in [16]. An adaptive combination of normalized filters was proposed in [17]; the convergence of the identification algorithm was studied in [18], where the task of selecting the optimal values of the algorithm's parameters was addressed.
The least fourth-degree criterion was proposed in [19], and its properties were studied in [19][20][21][22][23][24]. Thus, the stability of the normalized algorithm in the presence of non-Gaussian input signals was considered in [20]; the normalization of the algorithm was described in [21]; papers [22,23] considered the global stability of the corresponding algorithms; the problem of stochastic analysis of the stability of the adaptive algorithm was tackled in [24]. The task of increasing the convergence rate of this algorithm by using the optimal step-size parameter was studied in [25,26]. Paper [27], in order to ensure the robustness and stability of the algorithm, proposed using a variable step parameter that takes into consideration the energy of the error (in terms of the least squares). Study [28] proposed a modification of the least fourth-degree algorithm based on a quasi-Newton procedure. Finally, work [29] addressed the implementation of this algorithm using quantum computations.
A combined assessment criterion to speed up the identification process, which uses the combination of the quadratic criterion and the fourth-degree criterion, is proposed in [30]. In [31], a given approach was used to speed up the identification process in the presence of pulse interference. The properties of the adaptive algorithm to minimize this combined criterion were studied in [32].
A combined criterion consisting of the fourth-degree and modular criteria was proposed in [33], where the specifics of its operation were also considered.
As revealed by an analysis of the above studies into the issue of the robust identification of control objects, the application of the combined criterion is quite effective. In addition, such an approach is much easier than when using traditional criteria. However, available papers do not include the results of studying the features of the robust algorithms for evaluating a model's parameters built by using the combined criterion.
All this allows us to argue that it is appropriate to study the properties of the robust identification algorithm that minimizes a combined functional and thereby combines the benefits of the fourth-degree and modular criteria.

The aim and objectives of the study
The aim of this work is to investigate issues related to the convergence and stability of the gradient algorithms that identify the parameters of a linear object in the presence of non-Gaussian noise.
To accomplish the aim, the following tasks have been set:
- to investigate the convergence of the robust identification procedure (to obtain analytical estimates of the convergence in the mean and in the mean square of the gradient algorithm for minimizing the combined functional);
- to determine the limiting attainable (asymptotic) values of the parameter estimation errors and identification errors in the conditions under consideration;
- to define conditions for the stability of the steady-state identification process;
- to simulate the process of identification of a stationary linear object in the presence of non-Gaussian noise.

Studying the convergence of the robust identification procedure
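The following combined functional is effective enough to ensure the robust properties of the resulting estimates:

$F[e(k)] = \lambda e^4(k) + (1-\lambda)\,|e(k)|$, (2)

where $e(k)$ is the identification error and $\lambda \in [0, 1]$ is a mixing parameter.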
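The gradient of such a functional can be sketched as follows. This assumes the functional is the weighted sum of the fourth-degree and modular terms described in the abstract and that the identification error is $e(k) = y(k) - \theta^T x(k)$; the time indexing of $\theta$ is an assumption and may differ from the paper's convention.

```latex
F(e) = \lambda e^{4} + (1-\lambda)\,|e|, \qquad e(k) = y(k) - \theta^{T} x(k),
\\[4pt]
\nabla_{\theta} F
  = \frac{dF}{de}\,\frac{\partial e}{\partial \theta}
  = -\left[\,4\lambda e^{3}(k) + (1-\lambda)\,\mathrm{sign}\,e(k)\right] x(k),
  \qquad e(k) \neq 0 .
```

A gradient step of size $\gamma$ therefore corrects $\theta$ along $\left[4\lambda e^{3}(k) + (1-\lambda)\,\mathrm{sign}\,e(k)\right]x(k)$: the cubic term dominates for large errors, the sign term for small ones.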
When using criterion (2), the gradient minimization procedure takes form (3), where γ(k) is a parameter that affects the convergence rate of the algorithm. This procedure combines the properties of the fourth-degree algorithm with those of the LMM: at λ=1, (3) reduces to the algorithm that minimizes the fourth-degree criterion, and at λ=0, to the LMM algorithm, which makes it possible to suppress non-Gaussian interference. By varying the λ parameter, one can change the properties of the algorithm.
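A minimal sketch of one step of such a gradient procedure is given below. It assumes the update $\theta \leftarrow \theta + \gamma\,[4\lambda e^3 + (1-\lambda)\,\mathrm{sign}\,e]\,x$ obtained by differentiating $\lambda e^4 + (1-\lambda)|e|$; the function name, argument names, and the constant factor 4 are illustrative assumptions and need not match the paper's expression (3).

```python
import numpy as np

def combined_gradient_step(theta, x, y, gamma, lam):
    """One gradient step for F(e) = lam*e**4 + (1 - lam)*|e| (assumed form).

    theta : current parameter estimate, shape (N,)
    x     : input vector, shape (N,)
    y     : measured output (scalar)
    gamma : step-size parameter
    lam   : mixing parameter in [0, 1]
    Returns the updated estimate and the error used for the update.
    """
    e = y - theta @ x                                   # identification error
    g = 4.0 * lam * e**3 + (1.0 - lam) * np.sign(e)     # dF/de
    return theta + gamma * g * x, e

# Usage with hypothetical numbers
theta_true = np.array([1.0, -2.0, 0.5])
x = np.array([0.5, 1.0, -1.0])
y = theta_true @ x                                      # noise-free output
theta1, e0 = combined_gradient_step(np.zeros(3), x, y, gamma=0.01, lam=0.5)
print(e0, theta1)
```

Setting `lam=0.0` leaves only the sign term, and `lam=1.0` leaves only the cubic term.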
Introduce the estimation error $\tilde{\theta}(k)$, that is, the deviation of the current estimate from $\theta^*$. This makes it possible to write the identification error e(k) as the sum of the interference ξ(k) and the term $e_a(k)$, where $e_a(k)$ is the a priori identification error.
Considering (5), procedure (3) can be rewritten in terms of the estimation error, which gives expression (7). Consider first the convergence of procedure (3) in the absence of interference, that is, ξ(k)=0. In this case, we shall use the approach applied in [25,26].
Introduce the Lyapunov function $V(k) = \|\tilde{\theta}(k)\|^2$ and consider its increment. After multiplying (7) on the left by $\tilde{\theta}^T(k)$ and considering that ξ(k)=0 in the case in question, after simple transformations we obtain the condition of convergence of the algorithm. Thus, the convergence condition of algorithm (3) is met if the γ parameter satisfies inequality (11). An expression for the optimal value of the γ parameter, which provides the maximum convergence rate, is determined from the equation obtained by differentiating (8) with respect to γ and equating the derivative to zero; the result is given by (12).

Let us now examine the statistical properties of estimation procedure (3) in the presence of measurement interference, that is, at ξ(k)≠0. Suppose the interference is not correlated with the usable signals. Writing (3) in terms of the estimation errors gives (7).
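The noise-free argument can be illustrated numerically. The sketch below is a derivation under stated assumptions, not a reproduction of the paper's expressions: it takes $\tilde{\theta} = \theta^* - \theta$, $e = \tilde{\theta}^T x$ (since ξ=0), and the update $\tilde{\theta} \leftarrow \tilde{\theta} - \gamma g x$ with $g = 4\lambda e^3 + (1-\lambda)\,\mathrm{sign}\,e$. Then $\Delta V = -2\gamma g e + \gamma^2 g^2 \|x\|^2$, which is negative for $0 < \gamma < 2e/(g\|x\|^2)$ and smallest at half that bound. All function names are hypothetical.

```python
import numpy as np

def _g(e, lam):
    # Derivative of lam*e**4 + (1 - lam)*|e| with respect to e (e != 0)
    return 4.0 * lam * e**3 + (1.0 - lam) * np.sign(e)

def noise_free_step_bounds(e, x, lam):
    """Upper bound and minimizing value of gamma for one noise-free step."""
    gamma_max = 2.0 * e / (_g(e, lam) * float(x @ x))   # Delta V = 0 here
    return gamma_max, 0.5 * gamma_max                   # Delta V minimal at half

def lyapunov_increment(theta_tilde, x, gamma, lam):
    """V(k+1) - V(k) for V = ||theta_tilde||^2 with xi(k) = 0."""
    e = theta_tilde @ x
    new = theta_tilde - gamma * _g(e, lam) * x
    return float(new @ new - theta_tilde @ theta_tilde)

# Usage with hypothetical numbers
theta_tilde = np.array([1.0, -2.0, 0.5])
x = np.array([0.5, 1.0, -1.0])
e = theta_tilde @ x
gamma_max, gamma_opt = noise_free_step_bounds(e, x, lam=0.5)
print(gamma_max, gamma_opt, lyapunov_increment(theta_tilde, x, gamma_opt, 0.5))
```

At `gamma_opt` the increment equals $-e^2/\|x\|^2$ regardless of λ, which shows that the per-step optimum depends on the current error and the input norm.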
Consider the mathematical expectation of the estimation error. Given (5), averaging both sides of (7) yields a recurrence for this expectation whose terms are expressed through $\sigma_{e_a}^2$, the mean square value of the error $e_a(k)$, and $R_{xx}$, the correlation matrix of the input signal.
Let us take a closer look at the term that contains sign e(k). For the case when the signal is Gaussian, expression (15) is derived from Price's theorem, whereby for two jointly Gaussian random quantities x and y with zero mathematical expectations, the following relation holds: $M\{x\,\mathrm{sign}\,y\} = \sqrt{2/\pi}\; M\{xy\}/\sigma_y$, where $\sigma_y$ is the root mean square value of y.
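The relation that follows from Price's theorem, $M\{x\,\mathrm{sign}\,y\} = \sqrt{2/\pi}\,M\{xy\}/\sigma_y$ for zero-mean jointly Gaussian x and y, can be checked by a quick Monte Carlo experiment. The correlation coefficient, standard deviations, and sample size below are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho, sigma_x, sigma_y = 0.6, 2.0, 3.0        # hypothetical values

# Covariance matrix of the zero-mean Gaussian pair (x, y)
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

lhs = np.mean(x * np.sign(y))                            # sample M{x sign y}
rhs = np.sqrt(2.0 / np.pi) * (rho * sigma_x * sigma_y) / sigma_y
print(lhs, rhs)                                          # the two should be close
```

The agreement degrades if y is replaced by a heavy-tailed variable, which is why the relation is tied to the Gaussian assumption.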
Taking into consideration the properties of the interference and the expressions obtained above, it follows that procedure (3) will converge in the mean if the γ parameter satisfies inequality (18).

To study the convergence of the algorithm in the mean square, consider the Lyapunov function $V(k) = M\{\|\tilde{\theta}(k)\|^2\}$. Multiplying both sides of (7) on the left by $\tilde{\theta}^T(k)$ and considering (5), we obtain expression (20). The formulae for the mathematical expectations included in (20) are easy to obtain; the remaining ones are derived similarly. Substituting these expressions in (20) and taking into consideration the statistical properties of the signals and interference, we arrive at relation (22).
It follows from (22) that procedure (3) will converge in the mean square (the increment of the Lyapunov function will be negative) if the γ parameter satisfies inequality (23). The optimal value of this parameter, $\gamma_{opt}$, which ensures the maximum convergence rate of the algorithm, is obtained by solving the corresponding equation and is given by (24). As can be seen from (24), $\gamma_{opt}$ depends on the dimensionality N of the object under study, the statistical properties of the signals and interference, the value of $\sigma_{e_a}^2$, and the mixing parameter λ. Usually, the dimensionality of the object is known, and the parameter λ can be chosen by the researcher. The statistical characteristics of the signals and interference (especially the fourth-order moments) are often unknown. Therefore, this formula serves mainly to show how the individual parameters affect the properties of the algorithm.

Determining the asymptotic values of estimation and identification errors
Relation (23) can be used to obtain an expression for the asymptotic estimation error. Obtaining a vanishing error requires that the parameter γ be chosen as variable and tend to zero as k grows, that is, that it meet the Dvoretzky conditions [35].
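For reference, step-size conditions of this kind are commonly written in the stochastic-approximation literature as shown below. The example sequence $\gamma(k) = \gamma_0/k$ with a hypothetical constant $\gamma_0 > 0$ is only an illustration and is not a choice made in the paper.

```latex
\gamma(k) > 0, \qquad
\sum_{k=1}^{\infty} \gamma(k) = \infty, \qquad
\sum_{k=1}^{\infty} \gamma^{2}(k) < \infty,
\qquad \text{e.g. } \gamma(k) = \frac{\gamma_0}{k}.
```

The first sum keeps the steps large enough to reach $\theta^*$ from any initial estimate; the second makes the accumulated effect of the interference finite.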
Substituting (24) into (6) gives the asymptotic identification error. As follows from (26), the magnitude of the asymptotic identification error depends on the dimensionality N of the examined object, the statistical properties of the signals and interference, the value of the γ parameter, and the mixing parameter λ. Although the dimensionality of the object is known and the parameters γ and λ can be chosen by the researcher, the statistical characteristics of the signals and noise (especially the fourth- and sixth-order moments) are often unknown. Therefore, this formula is mainly of theoretical interest, as it characterizes the limits of the algorithm.

Determining the stability of the steady evaluation process
When studying the stability of an evaluation process, we shall take the approach proposed in [24].
Write expression (22) in the form of recurrence (27) for y(k), with coefficients a, b, and c. The nonlinear difference equation (27) describes the dynamics of algorithm (3). Obviously, the convergence depends on the magnitude of the initial error. To study the conditions of stability, one needs to find the equilibrium points.
Consider the steady state of the estimation process. Assuming $y(k+1) = y(k) = y(\infty)$, write (27) in the form (28). Since c>0, (28) can be rewritten as the cubic equation (29). This equation has three roots that define the equilibrium points; they are given by the known formulae [36]. Depending on the values of q and r, three cases are possible: $q^3 + r^2 < 0$; $q^3 + r^2 = 0$; $q^3 + r^2 > 0$. In the first case, as shown in paper [24], equation (29) has either three negative real roots or one negative and two positive real roots. Negative roots are of no interest, as y(k) is a squared quantity, that is, it must be non-negative. To study the positive roots, work [24] investigated the behavior of curve (27), y(k+1) as a function of y(k), and defined the condition for a stable equilibrium point.
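The case analysis for a cubic can be sketched numerically. The code below uses the standard handbook quantities for a monic cubic $z^3 + a_2 z^2 + a_1 z + a_0 = 0$, namely $q = a_1/3 - a_2^2/9$ and $r = (a_1 a_2 - 3a_0)/6 - a_2^3/27$; the coefficient names and the example polynomials are hypothetical and are not the paper's a, b, c.

```python
import numpy as np

def classify_cubic(a2, a1, a0, tol=1e-9):
    """Discriminant q^3 + r^2 and non-negative real roots of a monic cubic.

    q^3 + r^2 < 0 : three distinct real roots
    q^3 + r^2 = 0 : all roots real, at least two equal
    q^3 + r^2 > 0 : one real root and a complex-conjugate pair
    """
    q = a1 / 3.0 - a2**2 / 9.0
    r = (a1 * a2 - 3.0 * a0) / 6.0 - a2**3 / 27.0
    disc = q**3 + r**2
    roots = np.roots([1.0, a2, a1, a0])
    # Only real, non-negative roots can be equilibrium values of a squared quantity
    candidates = sorted(float(z.real) for z in roots
                        if abs(z.imag) < tol and z.real >= 0.0)
    return disc, candidates

# (z - 1)(z - 2)(z + 3) = z^3 - 7z + 6: one negative and two positive real roots
print(classify_cubic(0.0, -7.0, 6.0))
# z^3 + z = z(z^2 + 1): one real root and two complex roots
print(classify_cubic(0.0, 1.0, 0.0))
```

Filtering by sign before any stability check mirrors the observation that negative equilibrium points have no meaning for a squared error.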
In particular, the convergence will be monotonic if a certain condition on the equilibrium point holds; since c>0, this condition is met when (34) holds. To find out for which parameters of the algorithm (34) holds, we substitute the expressions for a, b, and c into this inequality. After simple transformations, we obtain inequality (35). It shows that meeting condition (34) depends on the dimensionality of the problem N, the values of γ and λ, as well as the statistical properties of the signals and interference. Since N is defined by the problem being solved and the degree of robustness of the solution is determined by λ, the only freely chosen parameter is γ. Inequality (35) can be used to obtain the conditions that this parameter must satisfy to ensure the stability of the estimation process.
To this end, rewrite inequality (35) in the form (37), expressed through the coefficients P, R, and Q. Substituting the expressions for P, R, and Q in (37), one can obtain a rather cumbersome analytical expression for choosing γ. Even for the simple case λ=1, studied in [24], the analysis is very complex and requires significant simplifications. A qualitative analysis of inequality (37) shows that the value of the parameter γ depends on all the quantities included in (36).
The lowest positive root $\gamma_0$ (39) corresponds to the stability limit. At the same time, as noted in [24], the presence of interference, even when the parameters $\theta(0)$ are initialized with values close to $\theta^*$, leads to the instability of the estimation process.
Finally, in the third case, when (32) is met, equation (29) may have three negative real roots, which is of no interest, or one negative real root and two complex ones. This case corresponds to the lack of stability of the estimation process.

Modeling the identification process
We have considered the problem of identifying a stationary linear object described by equation (1). The input signal x(k) was a sequence of normally distributed quantities, x(k)~N(0; 1). When testing the robustness of the algorithms, independent noise was added to the output signal of the object: noise with the Laplace distribution (α=1.0) and contaminating Gaussian noise with σ=48. The histogram of such interference is shown in Fig. 1.
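A simulation of this kind can be sketched as follows. Several details are assumptions made only for illustration and are not stated in the text: α is treated as the Laplace scale, the contaminating Gaussian samples replace the Laplace samples with a hypothetical probability `p_out`, and the object parameters, step size, and number of steps are arbitrary. The update rule is the gradient step for $\lambda e^4 + (1-\lambda)|e|$.

```python
import numpy as np

def mixed_noise(n, rng, alpha=1.0, sigma_out=48.0, p_out=0.05):
    """Laplace noise contaminated by rare wide Gaussian samples (assumed model)."""
    base = rng.laplace(0.0, alpha, size=n)
    outliers = rng.normal(0.0, sigma_out, size=n)
    return np.where(rng.random(n) < p_out, outliers, base)

def identify(theta_true, n_steps, gamma, lam, rng):
    """Run the gradient procedure on a simulated linear object."""
    theta = np.zeros(theta_true.size)
    xi = mixed_noise(n_steps, rng)
    for k in range(n_steps):
        x = rng.normal(0.0, 1.0, size=theta_true.size)   # x(k) ~ N(0; 1)
        y = theta_true @ x + xi[k]                        # object output
        e = y - theta @ x                                 # identification error
        theta = theta + gamma * (4.0 * lam * e**3
                                 + (1.0 - lam) * np.sign(e)) * x
    return theta

rng = np.random.default_rng(1)
theta_true = np.array([1.0, -2.0, 0.5])                   # hypothetical object
theta_hat = identify(theta_true, n_steps=5000, gamma=0.01, lam=0.0, rng=rng)
print(np.linalg.norm(theta_hat - theta_true))
```

The run above uses λ=0 (modular criterion only). Because the cubic term grows rapidly with |e|, a single large outlier can produce a very large correction when λ>0, so such runs would need a much smaller γ than the hypothetical value used here.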
The results of the simulation at different values of the λ parameter are shown in Fig. 2, 3. Fig. 2, a-c shows the plots of tuning the model parameters at λ=1, λ=0.5, and λ=0, respectively; Fig. 3, a-c shows the identification errors.

Fig. 1. Interference histogram
As the modeling results show, when only the fourth-degree criterion is used (Fig. 3, a), it becomes impossible to estimate the parameters of the model in the presence of mixed interference. When only the modular criterion is used, estimation is possible, but the convergence of the algorithm is slow (Fig. 3, c). The mixed criterion (Fig. 3, b) gives the best result of the three.

Fig. 3. Identification error: a - at λ=1; b - at λ=0.5; c - at λ=0

Discussion of results of studying the convergence of a gradient algorithm for the identification of a linear object
The convergence of the gradient algorithms that identify the parameters of a linear object in the presence of non-Gaussian noise has been investigated. We have determined the conditions for the convergence of the algorithms in the absence of interference (11) and, when noise is present, in the mean (18) and in the mean square (23).
Expressions have been derived to determine the optimal values of the algorithm's parameters that ensure its maximum convergence rate in the absence and presence of interference: formulae (12) and (24), respectively.
Based on the estimates obtained, the limiting attainable (asymptotic) values of identification errors (25) and parameter estimation errors (26) under the conditions considered have been determined. Even though the derived expressions contain a number of unknown parameters (the signal variance $\sigma_x^2$ and the interference variance $\sigma_\xi^2$), they allow for qualitative analysis. These formulae are more of theoretical interest, as they characterize the limits of the algorithm.
The nonlinear difference equation (27), describing the dynamics of the algorithm, was used to study the stability of the steady-state identification process. It has been shown that meeting stability condition (34) depends on the dimensionality of the problem N, the quantities γ and λ, as well as the statistical properties of the signals and interference. Since N is defined by the problem being solved and λ characterizes the degree of robustness of the solution derived, the only freely chosen parameter is γ. The condition for the choice of γ (37), which is rather cumbersome, has been obtained. Its simplification, however, allows for a qualitative analysis of stability issues.
In modeling the identification process, non-Gaussian noise was added to the object's output signal (noise with the Laplace distribution plus contaminating Gaussian noise). The results of the simulation show that with only the fourth-degree criterion, studied in [19][20][21][22][23][24][25][26][27][28], it becomes impossible to estimate the parameters of the model in the presence of mixed interference. Using only the modular criterion makes it possible to estimate the parameters, but the identification process is slow. In this case, the best option is to apply the combined criterion considered in the present work.

Fig. 2. Plots of tuning the model's parameters: a - at λ=1; b - at λ=0.5; c - at λ=0
The limitations of this study include the need for information on the statistical properties of the usable signals and interference. In addition, the effectiveness of solving the identification problem depends significantly on the choice of the mixing parameter λ, which determines the robustness of the estimate. At present, there are no general recommendations for the choice of λ. Since the algorithms considered are designed for real-time identification, it seems appropriate to further develop effective procedures for estimating the statistical characteristics of signals and interference, as well as rules for selecting the weighting parameter. This is all the more important because the approach can be applied to the identification of dynamic objects represented, for example, by a pseudolinear regression model. It is therefore appropriate to conduct research on recommendations for the choice of the λ parameter for different types of distributions and their possible combinations.
The estimates obtained in this paper allow the researcher to pre-evaluate the capabilities of a given algorithm and the effectiveness of its application when solving practical problems.
In conclusion, the study reported in this paper continues and advances earlier research, the results of which are described in [18,33].

Conclusions
1. The convergence of the identification algorithm that minimizes a combined functional, consisting of the fourth-degree and modular functionals and ensuring the robustness of estimates, has been investigated. The conditions for the convergence of the algorithm have been determined in the absence of interference, as well as in the mean and in the mean square when interference is present. Expressions have been obtained to determine the optimal values of the algorithm's parameters, which ensure its maximum convergence rate in the absence and presence of interference.
2. Based on the estimates obtained, the limiting attainable (asymptotic) values of identification errors and parameter estimation errors under the conditions considered have been determined. Although the expressions derived contain a number of unknown parameters (the signal variance $\sigma_x^2$ and the interference variance $\sigma_\xi^2$), they allow for qualitative analysis. The resulting formulae are more of theoretical interest, as they characterize the limits of the algorithm.
3. The stability of the steady-state identification process has been investigated using a nonlinear difference equation describing the dynamics of the algorithm. It has been shown that meeting stability condition (34) depends on the dimensionality of the problem N, the quantities γ and λ, as well as the statistical properties of the signals and interference. Since N is defined by the problem being solved and λ characterizes the degree of robustness of the solution, the only freely chosen parameter is γ. The resulting condition for choosing γ is rather cumbersome; its simplification, however, makes it possible to perform a qualitative analysis of stability issues.
4. The identification process was simulated with non-Gaussian noise in the object's output signal (noise with the Laplace distribution plus contaminating Gaussian noise). The results of the simulation have confirmed the effectiveness of the developed approach. However, this approach requires information on the statistical properties of the usable signals and interference. In addition, the effectiveness of solving an identification problem depends significantly on the choice of the mixing parameter λ, which determines the robustness of the estimate.
The estimates obtained in this work allow the researcher to pre-evaluate the limits of the algorithm and the effectiveness of its application when solving practical tasks.