IDENTIFICATION OF THE STATE OF AN OBJECT UNDER CONDITIONS OF FUZZY INPUT DATA

Methods for identifying the state of objects were modernized for the case of fuzzy input data described by membership functions. The chosen direction of improvement follows from the fundamental features of this problem under the realistic condition of a small sample of source data. Under this condition, it is advisable to describe the source data with the apparatus of fuzzy mathematics, which is less demanding in terms of information. This transition required the development of new formal methods for specific tasks. For multidimensional discriminant analysis, a procedure for solving a fuzzy system of linear algebraic equations was developed. For the clustering problem, a special procedure was proposed for comparing fuzzy distances between the objects being clustered and the grouping centers. For regression analysis, the classical least squares method cannot be used when all variables are described fuzzily, which led to the construction of a special two-step procedure. It minimizes a linear combination of the distance of the sought solution from the modal one and the measures of compactness of the membership function of the explained variable. Fuzzy regression analysis was implemented for the practically important case in which the fuzzy source data are described by general membership functions of the (L-R) type, and an analytic solution in the form of calculation formulas was obtained.
The discussion showed that modernizing the classical methods of state identification to account for the fuzzy representation of source data makes it possible to identify objects under the realistic condition of a small sample of fuzzy source data.


Introduction
Let us state the general principles for solving the problem of identifying the state of an object. The information base is formed from the results of measuring the values of a set of controllable parameters (features) of the object. Identification technologies link these values to the state of the object. A number of special mathematical methods are traditionally used to solve this problem: multidimensional discriminant analysis, clustering, and regression analysis. However, the application of these techniques is significantly complicated when the source data for identification are defined fuzzily [1,2]. The absence of a mathematical apparatus that makes it possible to solve the problem under these conditions determines the relevance of the research.

Literature review and problem statement
The technology of multidimensional discriminant analysis is as follows [3]. Let an observed object be in one of two states, $H_1$ or $H_2$. The state of the object is determined by the values of $p$ indicators $x_1, x_2, \dots, x_p$. The numeric values of the controlled indicators are assumed to be normally distributed random variables. Their mathematical expectations are given by the vector $M_1=(m_{11}, m_{12}, \dots, m_{1p})$ if the object is in state $H_1$, and by $M_2=(m_{21}, m_{22}, \dots, m_{2p})$ if the object is in state $H_2$. It is also assumed that the elements of the matrix of correlation coefficients between the indicators, $K=(k_{ij})$, $i=1,2,\dots,p$, $j=1,2,\dots,p$, do not depend on the state of the object. To assess the state of the object from the measured indicators, a discriminant function with coefficients $a_i$, $i=1,2,\dots,p$, is used; the coefficients are found from the system of linear equations (1), and the threshold $C$ is expressed through the values $\xi_1$ and $\xi_2$. The decision rule is then stated: the object is in state $H_1$ if, for a specific set of measured values $(x_1, x_2, \dots, x_p)$, the corresponding value of the discriminant function satisfies the threshold inequality defined by $C$. As shown in [4], this choice of the coefficients $a_i$, $i=1,2,\dots,p$, and of the values $\xi_1$, $\xi_2$ and $C$ ensures a minimum of the total probability of confusing the states, equal to $p(H_1/H_2)+p(H_2/H_1)$.
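For orientation, the two-state decision rule can be sketched in code. The sketch below uses the textbook linear discriminant for two normal populations with a shared covariance matrix, which is an assumption: the exact formulas of the cited method may differ. The function names (`solve`, `fit_discriminant`, `classify`) and the sample numbers are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.
    Assumes A is square and non-singular (no singularity check)."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_discriminant(m1, m2, cov):
    """Textbook two-class linear discriminant (assumed form).
    Returns the coefficients a and the threshold c."""
    diff = [u - v for u, v in zip(m1, m2)]
    a = solve(cov, diff)                        # coefficients a_i
    xi1 = sum(ai * mi for ai, mi in zip(a, m1))  # function value at mean of H1
    xi2 = sum(ai * mi for ai, mi in zip(a, m2))  # function value at mean of H2
    return a, 0.5 * (xi1 + xi2)                  # threshold halfway between


def classify(x, a, c):
    """Assign state H1 if the discriminant value reaches the threshold."""
    z = sum(ai * xi for ai, xi in zip(a, x))
    return "H1" if z >= c else "H2"


# Hypothetical mean vectors and shared covariance matrix for p = 2 indicators.
a, c = fit_discriminant([2.0, 1.0], [0.0, 0.0], [[1.0, 0.2], [0.2, 1.0]])
print(classify([1.8, 0.9], a, c), classify([0.1, -0.2], a, c))
```

The threshold is placed halfway between the discriminant values at the two mean vectors, which is the usual choice when both kinds of confusion are penalized equally.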
We note the general shortcomings of the traditional method of multidimensional discriminant analysis. First, the method implements only two-alternative diagnosis, which is not enough for many practical problems [5]. Second, in the traditional method the discriminant surface is a hyperplane. The coefficients of its equation are found from the statistical characteristics of two points, each representing a subspace of the phase space of observations. In this case, the error of confusing the diagnoses can be very large [6]. The real accuracy of state estimates obtained with specific multifactor discriminant models is unpredictable and depends significantly on the nature and characteristics of the sample data, their volume and uniformity, and the meaning of the controlled indicators [7]. In addition, a number of works, for example [8,9], assume a Gaussian character of the random observed values, which considerably limits their areas of application. Another method of recognizing a set of objects, cluster analysis, is more reliable, and we consider it next.
Let the results of measuring the $p$ indicators of each object make up a set of points of a $p$-dimensional phase space. Cluster analysis makes it possible to split the source set into $m$ subsets, by the number of possible object types $(H_1, H_2, \dots, H_m)$. The points belonging to one subset, a cluster, are in some selected (specified) sense "close" to each other and "far" from the points of the other clusters [10]. Many clustering methods are known. Most of them implement, in different variants, the following simple procedure, described, for example, in [11]. The number of clusters is known a priori, and a grouping center (that is, a set of coordinates of a typical point for the corresponding state of the object) is assigned to each of them. An iterative procedure is then performed: at each step, the distances from the next point to be distributed to the cluster grouping centers are found, and the point is joined to the cluster with the shortest of these distances. The most important element of the clustering technology is the procedure for comparing distances. There are also other ways of implementing this procedure [12,13]. In all cases, it is assumed that the coordinates of the points and grouping centers are measured precisely (or that the estimation error is normally distributed). This limits the applicability of these methods to fuzzy source data.
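The simple procedure described above, with precisely measured coordinates and a known number of clusters, can be sketched as follows. The Euclidean metric, the function name `assign_to_clusters` and the sample data are assumptions made for illustration only.

```python
import math


def assign_to_clusters(points, centers):
    """Join each point to the cluster whose grouping center is nearest.
    Returns a list of cluster indices, one per point."""
    labels = []
    for p in points:
        # Distances from the current point to every grouping center.
        dists = [math.dist(p, c) for c in centers]
        # The shortest distance determines the cluster.
        labels.append(dists.index(min(dists)))
    return labels


# Hypothetical crisp data: four points, two a priori grouping centers.
points = [(0.0, 0.0), (0.5, 0.2), (5.0, 5.0), (4.8, 5.1)]
centers = [(0.0, 0.0), (5.0, 5.0)]
print(assign_to_clusters(points, centers))
```

With crisp coordinates the comparison of distances is an ordinary comparison of real numbers; it is exactly this step that has no direct counterpart when the coordinates are fuzzy.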
The merit of the clustering method is its ease of implementation and the unambiguous interpretation of its results. Its drawback is low informative value. The fact that a point belongs to a cluster says nothing about the location of this point within the cluster, that is, whether it lies at the center of the cluster or near the boundary with a neighboring cluster. Consider a more informative identification method that is based on regression analysis.
Regression analysis is a powerful and effective method that describes the relationship between selected indicators of a controlled object and its directly measured characteristics, indicators, and parameters. A general drawback of this method is the lack of a well-grounded choice of the controlled indicators and of the procedure for determining the coefficients of the model (1).
Consider the well-posed statement of the regression analysis problem. The controlled indicators $(x_1, x_2, \dots, x_p)$, presumed to affect the resulting indicator $y$ of the quality of functioning of the object, are selected by some well-grounded method. The relationship between the explanatory variables $(x_1, x_2, \dots, x_p)$ and the resulting variable $y$ is described by the Kolmogorov-Gabor polynomial, which in its simplest form is
$$y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_p x_p + \varepsilon.$$
To find the unknown coefficients of the model $a_0, a_1, \dots, a_p$, a series of $n$ experiments is carried out. Each experiment $(x_{j1}, x_{j2}, \dots, x_{jp})$ is put in correspondence with its result $y_j$, $j=1,2,\dots,n$, that is,
$$y_j = a_0 + a_1 x_{j1} + a_2 x_{j2} + \dots + a_p x_{jp} + \varepsilon_j, \quad j=1,2,\dots,n.$$
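A linear model of this kind is classically fitted by ordinary least squares. The sketch below is a minimal pure-Python illustration through the normal equations, assuming crisp (non-fuzzy) data; the function names and the sample experiments are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.
    Assumes A is square and non-singular (no singularity check)."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_least_squares(X, y):
    """Estimate (a_0, a_1, ..., a_p) from n experiments by solving
    the normal equations (X^T X) A = X^T Y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # column of ones for a_0
    m = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * yj for r, yj in zip(rows, y)) for i in range(m)]
    return solve(xtx, xty)


# Hypothetical noise-free experiments generated by y = 1 + 2*x1 - x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 0, 2, 4]
print(fit_least_squares(X, y))  # approximately [1, 2, -1]
```

Solving the normal equations directly is adequate for a small illustration; for ill-conditioned data a QR-based or SVD-based solver is usually preferred.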
In the classic theory of regression analysis (the Gauss-Markov model), it is assumed that the random measurement errors of $y_j$ in each experiment are uncorrelated and normally distributed with zero mathematical expectation and a known constant variance. In this case, estimates of the unknown coefficients $a_0, a_1, \dots, a_p$ are obtained by the least squares method, by minimizing the criterion
$$J = \sum_{j=1}^{n}\Bigl(y_j - a_0 - \sum_{i=1}^{p} a_i x_{ji}\Bigr)^2 = (Y - XA)^{T}(Y - XA),$$
where $Y=(y_1, \dots, y_n)^{T}$, $A=(a_0, a_1, \dots, a_p)^{T}$, and $X$ is the $n\times(p+1)$ matrix whose $j$-th row is $(1, x_{j1}, \dots, x_{jp})$. The vector $A$ minimizing this criterion is determined from the relation
$$A = (X^{T}X)^{-1}X^{T}Y.$$
We will briefly analyze the described methods for identification of states. The need to improve the ideas, methods and technologies of identification that has emerged in recent years is linked to the growing understanding that theoretical-probabilistic models of uncertainty are inadequate for most real problems of evaluating the state of an object. The main reasons these models are unsatisfactory are the small source data sample and changes in the operating conditions of the controlled object. Thus, with a small sample of a priori source data [14], the hypothesis of normality of the observed data can be neither properly justified nor rejected, which calls into question the legitimacy of using the central limit theorem. As noted in [15], for the same reason the errors of the statistical estimates of the mathematical expectations and variances of the controlled indicators can be unpredictably large. This will inevitably lead to correspondingly large errors in solving the system of equations (1) and in the estimates of the coefficients of the discriminant functions, and consequently to a large identification error. In this situation, it is natural to abandon a priori assumptions about the normality of the observed parameter values in favor of the models of fuzzy mathematics [16].
This mathematical apparatus is much less sensitive to the sample size and makes it possible to determine reliably the key structural element of the theory, the membership function of the numeric values of the observed indicators, even with a small source data sample. In this case, the simplest way to diagnose the state of an object with multidimensional discriminant analysis is to calculate and use theoretical-probabilistic analogues of the statistical characteristics of the observed quantities.
As regards regression analysis, the transition to describing source data in terms of fuzzy mathematics initiated the development of new technologies. In [17,18], the membership function of the resulting variable is determined and compared with the experimental membership function. The fundamental drawback of this approach is that, in practice, the accuracy with which the values of the independent and dependent variables are estimated differs considerably. That is why the solution of this problem will not necessarily provide a minimum of the total fuzziness of the described variable, which raises doubts about the correctness of the description of the relationship between the explanatory and explained variables. In [19], the proximity of the result to the modal value of the fuzzy explained variable, obtained by statistical treatment of the source data, is not analyzed, which decreases the effectiveness of the proposed method. In [20], the non-compact membership function of the fuzzy value of the explained variable increases the error of the proposed solution.

The aim and objectives of the study
The aim of this study is to modernize traditional identification methods taking into consideration the fuzziness of source data.
To achieve this aim, it is necessary to solve the following tasks:
- to develop a fuzzy method for discriminant analysis;
- to develop a method of fuzzy clustering;
- to develop an effective method for fuzzy regression analysis.

Modernization of methods for identification of the state of objects under conditions of fuzzy source data
Let us assume that, from the results of previous studies, a set of values of each indicator $x_i$ of the object has been determined for the case when the object is in state $H_1$, and another set for the case when the object is in state $H_2$. From these data we obtain a description of the membership function of parameter $x_i$ in each state, for example a triangular one. We then calculate the values of the main theoretical-probabilistic characteristics of the fuzzy quantities $x_i$, $i=1,2,\dots,p$. To this end, we introduce functions, built from the membership functions, that are non-negative and whose integral is equal to unity. They can therefore be interpreted as probability densities of random variables and used to calculate their mathematical expectations. The estimates of the elements of the correlation matrix and the theoretical-probabilistic analogue of the variance of indicator $x_i$ over the whole set of its observations are determined in the same way. The derived estimates of the mathematical expectations and correlation coefficients are subsequently used in the standard scheme for calculating the set $a_i$, $i=1,2,\dots,p$, by solving the system of linear equations (1), and in the subsequent steps of diagnosing the state of the object.
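The idea of treating a membership function as a probability density can be illustrated numerically. The sketch below assumes that the density is obtained by dividing a triangular membership function by its area (one natural way to make the integral equal to unity; the original formulas are not reproduced here) and computes the resulting expectation and variance by the midpoint rule. The parameter names `a`, `m`, `b` (left end, mode, right end) and the sample values are hypothetical.

```python
def triangular_membership(x, a, m, b):
    """Triangular membership function with support (a, b) and mode m.
    Assumes a < m < b."""
    if a < x <= m:
        return (x - a) / (m - a)
    if m < x < b:
        return (b - x) / (b - m)
    return 0.0


def probabilistic_analogues(a, m, b, steps=20000):
    """Normalize the membership function to unit area and return the
    mean and variance of the resulting density (midpoint rule)."""
    h = (b - a) / steps
    xs = [a + (k + 0.5) * h for k in range(steps)]
    mu = [triangular_membership(x, a, m, b) for x in xs]
    area = sum(mu) * h                      # integral of the membership function
    dens = [v / area for v in mu]           # non-negative, integrates to one
    mean = sum(x * f for x, f in zip(xs, dens)) * h
    var = sum((x - mean) ** 2 * f for x, f in zip(xs, dens)) * h
    return mean, var


mean, var = probabilistic_analogues(1.0, 2.0, 4.0)
# Closed forms for a triangular density, for comparison.
print(mean, (1.0 + 2.0 + 4.0) / 3)
print(var, (1 + 4 + 16 - 2 - 4 - 8) / 18)
```

For a triangular shape the closed forms make numerical integration unnecessary; the numerical route is shown because it carries over unchanged to other membership functions.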
Let us proceed to fuzzy clustering. Let the coordinates of the points (the results of measuring the controlled indicators), as well as the cluster grouping centers, be given in a fuzzy way by their membership functions. Then the membership function of the fuzzy distance for any pair (point, grouping center) can be obtained by the known rules for operations over fuzzy numbers [21].
For example, let the membership function of the $i$-th coordinate of the $k$-th grouping center and the membership function of the same coordinate of the $j$-th point be of the (L-R) type, where $x_{ik}$ is the modal value of the $i$-th coordinate of the $k$-th grouping center, $x_{ij}$ is the modal value of the $i$-th coordinate of the $j$-th point, $\alpha_{ik}$, $\alpha_{ij}$ are the left fuzziness coefficients, and $\beta_{ik}$, $\beta_{ij}$ are the right fuzziness coefficients. To calculate the membership function of the fuzzy distance between the $k$-th grouping center and the $j$-th point along the $i$-th coordinate, we use the rules for operations over fuzzy numbers of the (L-R) type [22]. These rules give the parameters of the fuzzy distance along the $i$-th coordinate between the $j$-th point and the $k$-th grouping center and of the square of this distance, and, from them, the parameters of the membership function of the fuzzy square of the distance between the $k$-th grouping center and the $j$-th point.
Assigning a point to a cluster requires comparing such fuzzy distances. A strict approach to comparing fuzzy numbers is proposed in [22] and is implemented as follows. Let fuzzy numbers $x$ and $y$ be given by their membership functions $\mu(x)$ and $\mu(y)$. The degree of preference of $x$ over $y$, $\eta(x, y)$, and the degree of preference of $y$ over $x$, $\eta(y, x)$, are calculated. Then $x$ is "larger" than $y$ if $\eta(x, y) > \eta(y, x)$, and $x$ is "smaller" than $y$ otherwise. The practical implementation of this procedure is complicated. That is why various heuristic approaches are used in practice to compare fuzzy numbers [23,24]. One of them is implemented as follows. The sections of $x$ and $y$ are found on a set of levels $\nu_1, \nu_2, \dots, \nu_s$; let $[x^{(1)}_{\nu_r}, x^{(2)}_{\nu_r}]$ and $[y^{(1)}_{\nu_r}, y^{(2)}_{\nu_r}]$ denote the resulting intervals. The fuzzy number $x$ is considered "larger" than $y$ if for all $\nu_r$, $r=1,2,\dots,s$, the inequality $x^{(1)}_{\nu_r}+x^{(2)}_{\nu_r} \ge y^{(1)}_{\nu_r}+y^{(2)}_{\nu_r}$ is satisfied and at least one of these inequalities is strict.
If this condition is not satisfied, neither of numbers x and y has any advantage over the other.
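The level-based heuristic just described can be sketched for triangular fuzzy numbers. The representation of a number as a triple (left end, mode, right end), the reading of the compared quantities as the sums of the endpoints of each level section, and the function names are assumptions made for illustration.

```python
def level_section(t, level):
    """Interval obtained by cutting the triangular number t = (a, m, b)
    at the given membership level (0 <= level <= 1)."""
    a, m, b = t
    return a + level * (m - a), b - level * (b - m)


def compare_by_levels(x, y, levels):
    """Return 'larger' if x dominates y on every level with at least one
    strict inequality, 'smaller' in the mirrored case, and 'undecided'
    when neither number has an advantage."""
    sums = [(sum(level_section(x, v)), sum(level_section(y, v))) for v in levels]
    if all(sx >= sy for sx, sy in sums) and any(sx > sy for sx, sy in sums):
        return "larger"
    if all(sy >= sx for sx, sy in sums) and any(sy > sx for sx, sy in sums):
        return "smaller"
    return "undecided"


levels = [0.0, 0.5, 1.0]
print(compare_by_levels((2, 3, 4), (1, 2, 3), levels))  # clear advantage of x
print(compare_by_levels((0, 3, 4), (1, 2, 5), levels))  # no advantage either way
```

The second call illustrates the weakness of the heuristic: when the membership functions overlap, the inequalities can point in different directions on different levels and no decision is reached.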
We note the shortcomings of the above approach. First, it is not clear at how many levels the $\nu$-sections must be taken. Second, the approach can be difficult to implement. Third, it gives a definite result only when one number has an obvious advantage over the other, for example, when the membership functions of the compared numbers do not intersect.
For this reason, a simpler and more reliable approach, whose result is interpreted unambiguously, is proposed.
Let $x$ and $y$ be fuzzy triangular numbers given by their membership functions. Their comparison is reduced to comparing the fuzzy result of subtracting $y$ from $x$ with zero, which gives the following rules:
1) if the support (carrier) of the fuzzy result of subtraction is positive, then the minuend is larger than the subtrahend;
2) if the support of the result is negative, then the minuend is smaller than the subtrahend;
3) if the support covers zero and its negative section is larger than the positive one, then the minuend is smaller than the subtrahend;
4) if the support covers zero and its negative section is smaller than the positive one, then the minuend is larger than the subtrahend.
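The four rules can be sketched as follows. The sketch assumes that triangular numbers are stored as triples (left end, mode, right end), that subtraction follows the usual interval rule for fuzzy numbers, and that the "sections" of the support are compared by their lengths. The tie case, which the rules do not cover, is reported as zero. All names are hypothetical.

```python
def subtract(x, y):
    """Fuzzy difference x - y of two triangular numbers given as
    (left end, mode, right end)."""
    (a1, m1, b1), (a2, m2, b2) = x, y
    return (a1 - b2, m1 - m2, b1 - a2)


def compare_with_zero(x, y):
    """Return 1 if x is larger than y, -1 if smaller, 0 if the rules
    give no preference (equal negative and positive sections)."""
    lo, _, hi = subtract(x, y)
    if lo > 0:          # rule 1: support entirely positive
        return 1
    if hi < 0:          # rule 2: support entirely negative
        return -1
    negative, positive = -lo, hi   # lengths of the two sections of the support
    if negative > positive:        # rule 3
        return -1
    if positive > negative:        # rule 4
        return 1
    return 0


print(compare_with_zero((4, 5, 6), (1, 2, 3)))   # support of difference is (1, 5)
print(compare_with_zero((1, 2, 3), (2, 4, 6)))   # support of difference is (-5, 1)
```

Unlike the level-based heuristic, this comparison always returns an answer except in the exact tie, which is what makes it convenient inside the clustering loop.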
Let us assume that the coordinates of the grouping centers of objects are given. Then, in the clustering problem, the cluster that the next point should join is the one at the shortest distance, selected by comparing the fuzzy distances from this point to the cluster centers according to the specified rules. The results of clustering a training set of objects are used to refine the coordinates of the cluster grouping centers.
Let us proceed to the third problem. An effective way of improving the quality of the solution of the regression analysis problem is proposed in [21]. It ensures a solution that satisfies two natural requirements:
1) the result should be close to the modal value of the fuzzy explained variable obtained by statistical treatment of the source data;
2) the membership function of the fuzzy value of the explained variable should be as compact as possible.
Let us choose a Gaussian membership function to describe the fuzzy source data. The problem is solved in two stages [25]. At the first stage, a system of linear algebraic equations is composed with respect to the unknown values of the coefficients. The membership function of the fuzzy value $z_j$ is then determined, and, according to criterion (4), we estimate the set of coefficients of the regression equation $a_i$, $i=1,2,\dots,p$, that ensures minimum blurring (maximum compactness) of the membership function of the result and minimum deviation from the modal set $a_i^{(0)}$, $i=1,2,\dots,p$. A significant drawback of this approach is the complexity of solving the resulting system of equations, which even in the simplest case, when the fuzziness of the source data is described in Gaussian form, can be solved only numerically. The difficulties increase further if a general expression in the form of a function of the (L-R) type is used to describe $x_{ji}$.
We will simplify the expression for criterion (4), bearing in mind (5). The compactness of the fuzzy number $x_{ji}$ can be estimated by the sum of its left and right fuzziness coefficients, $c_{ji}=\alpha_{ji}+\beta_{ji}$, $i=1,2,\dots,p$, $j=1,2,\dots,n$, and the measure of compactness of the number $z_j$ is expressed through these sums. The problem is solved by the method of undetermined Lagrange multipliers: we introduce the Lagrangian function, find $\lambda_2$ from (7), and then, considering (8) and (9), obtain the calculation formulas. Thus, in the practically important case when the fuzzy source data are described by general functions of the (L-R) type, an analytic solution in the form of calculation formulas was obtained.

Discussion of results obtained in the modernization of methods for the identification of object states
Classic methods have a number of drawbacks: the lack of a theoretical substantiation for the choice of the identification method; the rigidity of the mechanism that converts the source data into the final identification result; and the low informative value of the results, which permits ambiguity in their interpretation.
Canonical technologies are based on a theoretical-probabilistic description of the results of direct measurement of the controlled indicators of an object and of the resulting indicators that assess the effectiveness of its functioning. The growing conviction that the real uncertainty of the source data must be described more adequately led to the use of the models and methods of fuzzy set theory for this purpose and to the solution of the corresponding problems.
The classic methods for solving the problem of state identification were modernized to take into account the fuzzy nature of the representation of the source data. We obtained analytical relations describing the procedure for reaching the final results in the practically important particular case when the fuzzy source data are described by functions of the (L-R) type.
The advantage of the proposed identification methods under conditions of uncertainty in comparison with the classic methods is explained by the possibility of solving this problem under actual conditions of a small sample of fuzzy source data. In this case, the proposed methods under uncertainty conditions can be adapted to solving the identification problems for any types of membership functions.
The identification methods proposed in this research make it possible to reduce the identification error for a small sample of fuzzy source data. These solutions can be applied in technologies using multidimensional discriminant analysis, clustering, or regression analysis. The major limitations of the suggested methods are as follows:
- the developed procedure does not ensure an adequate solution of the identification problem when the source data are qualitative;
- the procedure is focused on the description of fuzzy source data by membership functions of the (L-R) type; with membership functions of another type, the procedure for solving the problem becomes more complicated.
Further research into the identification of the state of objects from fuzzy input data can be performed in the following areas: 1) improvement of the method for solving fuzzy systems of linear algebraic equations; 2) development of methods for fuzzy optimization [28]; 3) study of the results of applying the developed method to membership functions of different types.

Conclusions
1. A fuzzy method for discriminant analysis, which develops the corresponding classical method, was proposed. It was established that this method makes it possible to identify objects under the realistic condition of a small sample of fuzzy source data. Identification becomes possible because uncertain input data are described more adequately when the discriminating surface is constructed. Abandoning the theoretical-probabilistic description of the source data is a matter of principle in this approach.
2. It was established that, with the refined grouping procedure that separates the grouping center, clustering becomes possible even with a small sample of fuzzy source data. This refined procedure relies on the developed method of fuzzy clustering. Unlike similar methods, the original problem of comparing fuzzy triangular numbers is reduced to the simpler problem of comparing a fuzzy number with zero.
3. It was established that, with the improved procedure for solving regression equations, regression coefficients can be estimated analytically if the source data are represented by general membership functions of the (L-R) type. This improved procedure relies on the developed method of fuzzy regression analysis. Under conditions of a small sample of fuzzy data, the adequacy of regression models improves because the differences in the description of exogenous and endogenous variables are taken into account.