EVALUATION TO DETERMINE THE EFFICIENCY FOR THE DIAGNOSIS SEARCH FORMATION METHOD OF FAILURES IN AUTOMATED SYSTEMS



Introduction
In recent years, the way people control various kinds of technology has changed radically. Automated systems, including SCADA (Supervisory Control and Data Acquisition) systems, play the leading role in this process. Modern SCADA systems are used all over the world to control technological processes in different areas such as
It is known that most failures in system operation are caused by errors of operational and maintenance personnel who have insufficient qualifications (and/or limited time under an increased information flow) for supervisory control, quality assistance, and restoration of system operability. Therefore, the pressing problem is to develop highly reliable and fault-tolerant SCADA systems by improving the methods of their automatic self-diagnostics in real time, with the possibility of automatically recovering their operability after reversible failures.

Literature review and problem statement
The modern SCADA system is a distributed multilevel and multitasking hardware-software complex (HSC) operating in real time [1,2]. It is a complex, dynamic and difficult-to-formalize diagnostic system whose structure and functionality change over its life cycle. The effectiveness of SCADA operation and the reliability of the data at all levels of its hierarchy depend on the operability of backbone nodes, data transmission channels and peripheral equipment, and on the consistency of the software system in general [3].
Problems in SCADA maintenance due to low-level diagnostics and the limited ability to restore SCADA operability after reversible failures have a negative impact on the quality of control for the Technological Control Object (TCO).
The methodology used to determine the efficiency of technological diagnostic systems is considered in work [4]. It is based on calculating indicators of the diagnostic object as well as of the technical diagnostic tools. The author also uses an indicator of the effectiveness of the diagnosis method that reflects the accuracy of diagnostic operations and includes the duration of the diagnosis.
The method of SCADA diagnostics is presented in work [5]. It is based on a specific, unique alarm (index analysis) generated by the system in the event of a malfunction. The main idea is that each alarm relates to a particular fault in the system: the alarm is always generated when that fault occurs and is never generated for any other fault. The alarm is activated when a fault is detected in the system and is generated according to predetermined relevance indexes for a changing set of faults in different contexts. However, this method does not always make it possible to identify the required fault in a changing set. In this case, the fault is considered indistinguishable and cannot be reliably diagnosed.
The application of SCADA to monitor a power station in Amareleja (Spain) is described in work [6]. The system detects current faults and also allows human operators to make long-term forecasts of equipment status and of abnormal trends in the operation of the facility. However, the system is aimed at diagnosing the operation of the TCO. It does not cover all levels of the system hierarchy from the TCO to SCADA, which reduces the reliability of diagnostics for the entire complex "object-system".
The method of fault diagnostics for oil transformers is given in work [7]. It is based on the operational control method performed by SCADA, which combines information from a set of data sources and uses neural networks with back propagation of error. This diagnostic method achieves high diagnostic accuracy (up to 93 %) because it uses flexible strategies applied by expert systems. However, the method requires large data sets to be processed, which significantly increases the time needed to establish the diagnosis.
Methods of fault diagnostics for wind turbines controlled by SCADA are examined in work [8]. The authors review a number of diagnostic methods, including methods based on artificial neural networks, fuzzy-logic methods, and combined methods such as adaptive neuro-fuzzy inference. The selection of an optimal set of monitored parameters and the quality of the data (a noiseless signal from sensors) are very important for establishing the diagnosis. The frequency of data interrogation by SCADA is critical too. If the interrogation interval is measured in minutes, information that is important for diagnostics can be lost. Therefore, it is necessary to preprocess the raw data from SCADA to establish an accurate diagnosis, especially when time limits apply.
The method of fault detection for industrial automation processes is described in work [9]. The method assumes that the automated system operates with insignificant noise in the measured data and that a complete set of monitored parameters can be monitored constantly. The authors use neural networks and regression analysis methods to approximate the functional dependence of continuous variables of the technological processes on time. They also show the application of the HMM (Hidden Markov Model) to monitor continuous process variables as a sequence of hidden states of discrete variables under different system modes. The detection rate for current faults reaches 87 %, but the failure prediction error rates for various operating modes of the system remain rather high.
The concept of automated system diagnostics based on Latent Variable (LV) models is considered in work [10]. The authors identify the process parameters that exceed the monitoring statistics on the basis of the archived data array collected by SCADA. These parameters are related to the operation of the equipment and the behavior of the process, or to a malfunction in the system. The approach relies on the ability of the operational personnel of the enterprise to give an expert assessment of the current technological process and the operation of the equipment. Knowledge-based decision support systems can be applied to automate the formation of an expert judgment. In this case, LV models must be unique and identifiable. The approach also requires processing large arrays with missing data and monitoring their integrity within the LV area set by the training sample. The use of LV models built on the archive data of the system is restricted to the space of latent variables determined by these archival data. LV models also cannot be used to extrapolate TCO operation modes for which monitoring statistics are not available.
Having analyzed the literature in the field of automated system diagnostics, we can conclude that the diagnostic methods applied today do not adequately resolve the following problems:
- the intensity of the diagnostic information flows causes significant difficulties when the corresponding operational services of the enterprise have to process them promptly;
- a large amount of the data is low-level diagnostic information, which requires the highly professional knowledge of technologists, IT specialists, and system integrators in order to be processed;
- for most automated control systems, the Expert System (ES) methodology is used only to diagnose the state of the Technological Control Object (TCO) and not the states of the system hierarchy levels from TCO to SCADA.
Thus, the unsolved part of the problem today is the development of methods for complex automatic high-level diagnostics of operability at all levels of the system hierarchy in the automated system, from TCO to SCADA, with the diagnosis time minimized to real time.

The aim and objectives of the study
The aim of the research is to increase the efficiency of the automated control system of an industrial enterprise by providing automatic real-time self-diagnostics for its components, namely SCADA software and hardware.
To achieve this aim, it is necessary to solve the following tasks:
- form a mathematical model to diagnose the operability of an integrated complex TCO-SCADA using the methodology of Expert Diagnostic Systems;
- study the patterns in the formation of the diagnostic search space with reference to the knowledge base of the Expert System for a conflicting set of input data, that is, the diagnostic codes generated by the SCADA system in the event of failures;
- develop a method to diagnose failures in an integrated complex TCO-SCADA that minimizes the time needed to form the diagnosis search space;
- analyze the effectiveness of the diagnostic method developed.

Mathematical model to diagnose the operability of automated systems based on the methodology of expert systems
We propose a scheme of interaction between SCADA and the Expert System (see Fig. 1) to solve the problem of automatic SCADA self-diagnostics using an Expert Diagnostic System as the diagnostic subsystem [11,12].
The following designations are used in Fig. 1: DRTES - Diagnostic Real Time Expert System; DB - Database (basic components of the subject area, their properties and the relationships between them); KB - Knowledge Base (rules to construct structure variants from database components); IM - Inference Machine; SDC - System Diagnostic Codes; HMI - Human Machine Interface (the interface between processes and human operators).
The SCADA subsystem based on the Expert System derives a conclusion about the technical state of SCADA, its functional components and processes. The conclusion is based on the analysis of the input data of the Expert System, that is, the diagnostic codes generated by SCADA in real time.
The diagnostic model of TCO and SCADA is shown in Fig. 2.
The following designations are used in Fig. 2:
X - a set of input parameters (the controllable parameters of TCO and SCADA);
Y - a set of output parameters (a diagnosis of the TCO and SCADA state; localization and identification of failures in the hardware and software of TCO and SCADA; recommendations for the recovery or auto-recovery of their operability; recommendations for failure prevention, repair, hardware upgrades and software updates, recovery and auto-recovery included);
Z - a set of factors that can impair the system's reliability (for example, a power failure, a hardware failure, a DoS attack, external electromagnetic effects on system equipment, etc.);
A - the operator of the model, represented by a set of algorithms and/or functions;
P - processes. A process is an executable program module that runs on a backbone node. The set of processes forms the SCADA software;
K - Control Process Points (CPP). A Control Process Point is a virtual location with a given instruction in the executable process code. When a failure occurs, the diagnostic code shows whether the process passed or did not pass its control point. In other words, it is a diagnostic code about a process execution error. The set of diagnostic codes allows us to determine failures in the system;
C - System Diagnostic Codes (SDC). These include software error codes, exceptions of classes and functions, process return codes, and states of uncertainty (i.e. the partial or complete absence of information after a timeout has expired, etc.). Diagnostic codes are generated during SCADA operation;
F - Types of SCADA Failures (TSF);
H - the relationship between a Control Process Point and a Process;
Q - the relationship between System Diagnostic Codes and Control Process Points, expressed as a disjunction over the set of possible SDC values for a given CPP;
G - the relationship between a given set of SDC determined at the CPP and the relevant TSF.
Thus, the relationship G determines the TSF in SCADA from the set of SDC arriving at the input of the ES.
$G_1 \subset G$ - the relationship between a set of SDC and a TSF that is completely determined on the set of SDC.
Suppose the SCADA subsystem based on the ES receives a set of input data $X(\Delta t)$ ordered by the time at which the SDC were generated in SCADA:

$X(\Delta t) = \{(C_x, K_y, P_z)(t_1), (C_x, K_y, P_z)(t_2), \dots, (C_x, K_y, P_z)(t_i), \dots, (C_x, K_y, P_z)(t_k)\}$,

where $X(\Delta t)$ is a subset of SDC $C_x$ generated by SCADA in the event of off-nominal passing or non-passing of a subset of CPP $K_y$ that belongs to a subset of processes $P_z$; $\Delta t = t_k - t_1$ is the time interval over which the SDC in the given input data were generated. Suppose the Knowledge Base (KB) of the ES for diagnosing a failure f∈F contains an alternative set of rules $G_f \subset G$ for the logical derivation of the diagnosis, determined on a set of SDC.
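The time-ordered input set described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `DiagnosticEvent`, the field names, the sample codes and the use of seconds as the time unit are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticEvent:
    """One input item: an SDC raised at a CPP of a process at time t (hypothetical layout)."""
    code: str        # system diagnostic code (SDC), an element of C
    checkpoint: str  # control process point (CPP), an element of K
    process: str     # process, an element of P
    t: float         # generation time; seconds are an assumed unit


def order_input(events):
    """Return the events sorted by generation time and the interval delta_t = t_k - t_1."""
    ordered = sorted(events, key=lambda e: e.t)
    delta_t = ordered[-1].t - ordered[0].t if ordered else 0.0
    return ordered, delta_t


# Hypothetical sample data, listed out of order on purpose.
events = [
    DiagnosticEvent("E_TIMEOUT", "k2", "p_archiver", 12.5),
    DiagnosticEvent("E_CONN", "k1", "p_io_server", 10.0),
    DiagnosticEvent("E_RETCODE", "k3", "p_archiver", 14.0),
]
x_dt, delta_t = order_input(events)
```

Sorting by timestamp before any further processing keeps the first and last elements aligned with $t_1$ and $t_k$, so the interval is a single subtraction.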
The deductive derivation rule in the Expert System Knowledge Base is written in Skolem standard form [13,14] as formula (5), which is interpreted as follows: "it is required to prove the truth of deducibility for the conclusion from a set of true premises".
The following designations are used in formula (5): $x_i$ - a propositional variable corresponding to a certain SDC in the Expert System Database determined at the CPP; $g_j$ - a condition (premise) corresponding to the accepted rules of failure determination; w - the conclusion; n - the number of propositional variables; m - the number of conditions.
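The reading "the conclusion is deducible from a set of true premises" can be illustrated with a minimal sketch. It assumes, for illustration only, that each rule is a conjunction of propositional variables leading to one conclusion; the function name `derive` and the sample rules are hypothetical, and the sketch does not reproduce formula (5).

```python
def derive(facts, rules):
    """Return the conclusions w of all rules whose premises are all true.

    facts: set of propositional variables x_i that are true (SDC observed).
    rules: list of (premises, conclusion) pairs; premises is a frozenset of x_i.
    """
    return [w for premises, w in rules if premises <= facts]


# Hypothetical knowledge base: two rules over variables x1..x4.
rules = [
    (frozenset({"x1", "x2"}), "f1"),
    (frozenset({"x2", "x3", "x4"}), "f2"),
]
facts = {"x1", "x2", "x3"}
conclusions = derive(facts, rules)  # only the first rule has all premises true
```

The subset test `premises <= facts` plays the role of checking that every premise is true before the conclusion is accepted.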

Formation of the diagnosis search space for failures in the automated system at the conflicting set of input data
We can now define the diagnosis search space. Consider an example of its formation [13].
Given: X - a set of SDC determined at CPP in the Expert System Database; $G_1$ - a system of rules in the Expert System Knowledge Base to determine a failure in the system; $X(\Delta t)$ - a set of input data in the Expert System for a time interval ∆t, which in this example contains four SDC.
Required: to find the diagnosis search space for the set of input data $X(\Delta t)$. To do this, it is necessary to form the subsets $Y_m^n$ of the Boolean (power set) $2^{X(\Delta t)}$. The elements of each subset are m-tuples, where n=|X(∆t)| is the number of elements in the set of input data X(∆t) and m is the length of the tuple.
We consolidate the results of forming $Y_m^n$ from the set of input data $X(\Delta t)$ into Table 1.
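Forming the subsets of m-tuples from the Boolean of the input data can be sketched with the standard library. The element names `x1`..`x4` are placeholders, and the grouping by tuple length is an assumption about how such a table might be laid out, not a reproduction of Table 1.

```python
from itertools import combinations


def m_tuples(x, m):
    """All m-tuples over the ordered set x, in lexicographic order of its elements."""
    return list(combinations(sorted(x), m))


def boolean(x):
    """The Boolean (power set) of x, grouped by tuple length m = 0..n."""
    n = len(x)
    return {m: m_tuples(x, m) for m in range(n + 1)}


# Hypothetical input data with n = 4 elements.
x_dt = ["x1", "x2", "x3", "x4"]
table = boolean(x_dt)
for m, tuples in table.items():
    print(m, tuples)
```

For n elements the groups together hold $2^n$ tuples, which is why sequential scanning of the Boolean becomes expensive as n grows.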
The Expert System Knowledge Base contains sets of rules $G_{m,j}$ formed on the set of input data $X(\Delta t)$. These input data form the diagnosis search space $F_{m,j}$, in which $X^-_{f_i}$ denotes the set of SDC that are needed to derive the diagnosis $f_i$ but whose values are not determined by the input data $X(\Delta t)$.

Table 1
Representation of m-tuples on the basis of input data $2^{X(\Delta t)}$

If $X^-_{f_i} = \emptyset$, then the set of rules $G_{f_i}$ is completely determined on the set of input parameters $X(\Delta t)$ for the current state of the Expert System Knowledge Base. Otherwise, it is partly determined.
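The distinction between completely and partly determined rule sets amounts to an empty or non-empty set difference. A minimal sketch follows, assuming each diagnosis is described simply by the set of SDC its rules require; the function name `classify_rules` and the sample data are hypothetical.

```python
def classify_rules(rule_premises, x_dt):
    """Split diagnoses into completely and partly determined ones.

    rule_premises: dict mapping a diagnosis f_i to the set of SDC its rules need.
    x_dt: the SDC actually present in the input data.
    Returns (complete, partial), where partial maps f_i to its missing SDC.
    """
    observed = set(x_dt)
    complete, partial = [], {}
    for failure, premises in rule_premises.items():
        missing = set(premises) - observed  # SDC not determined by the input data
        if missing:
            partial[failure] = missing
        else:
            complete.append(failure)
    return complete, partial


# Hypothetical rules and input data.
rules = {"f1": {"x1", "x2"}, "f2": {"x2", "x5"}, "f3": {"x6"}}
complete, partial = classify_rules(rules, {"x1", "x2", "x3", "x4"})
```

Keeping the missing SDC for each partly determined diagnosis makes it easy to see which additional codes would be needed to confirm it.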
We consolidate the results of the alternative search space formation into Table 2.

Table 2
Alternative spaces for diagnosis search

The diagnosis search space is defined according to Table 2.
Rules with completely determined parameters that correspond to the elements of the Boolean $2^{X(\Delta t)}$ with maximum cardinality have priority in establishing the diagnosis.
When the rules from the Expert System Knowledge Base have only partially defined parameters, the diagnosis search space is formed by finding the complement of an element of the Boolean $2^{X(\Delta t)}$ to a subset of an ordered basis set of arbitrary cardinality.

Application of data structure m-tuples in the method for failure diagnostics of an automated system
Consider in detail the properties of the input data set in the Expert System and the methods for working with it [15]. The set is based on the m-tuples of $Y_m^n$, which are part of the Boolean $2^{X(\Delta t)}$.
Suppose we have a finite ordered increasing basis set $X = \{x_1, x_2, \dots, x_n\}$, $x_1 < x_2 < \dots < x_n$, where i is the ordinal number of the element $x_i$ in the set X; n=|X| is the number of elements, or cardinality, of the set X; "<" is the comparison operator for the elements of the set X, defined individually depending on the type of the elements; $2^X$ is the Boolean of the set X; $Y_m^n$ is a subset of the Boolean whose elements are m-tuples consisting of elements of the set X, ordered by a right-hand search of the indexes of X in the m-tuples from the lower boundary to the upper; $k_m^n$ is the cardinality of the subset $Y_m^n$; $i_j^{n,m}$ is the tuple of indexes of the basic elements in the tuple $y_j^{n,m}$.
We can specify the following dependencies between the ordered sets given above:
- indexing elements of the ordered basis set X;
- searching for elements of the basis set X using their indexes;
- projecting ordered sets of m-tuples $Y_m^n$ onto elements of the basis set X with cardinality n;
- forming ordered sets of m-tuples $I_m^n$ from the elements of the ordered basis set I with cardinality n;
- projecting ordered sets of m-tuples $I_m^n$ onto elements of the basis set I with cardinality n;
- searching for elements of the set $Y_m^n$ using their indexes.
The following methods are proposed to carry out the search and projection of elements in ordered sets:
- determining the values in the tuple of indexes $i_j^{n,m}$ from the ordinal number j of the element of the ordered set $Y_m^n$ [15];
- determining the complement of the element $y_j^{n,m}$ to the basis set X.
Consider the formal rules for determining the dependency $j \leftarrow i_j^{n,m}$ based on method $A_2$.
Solution:
1. We verify that the input data X and m are correct.
2. We form Table 3 (Areas of index definitions) with the dimension (m+1)×4 to calculate the parameters.
3. We verify that the input data satisfy the condition of correctness.
4. We form Table 4, whose elements determine the number of elements in the set $Y_m^n$. These elements are obtained by performing an ordered right-hand search of indexes beginning with $i_h$ in the given area.
5. We form Table 5. Its elements are calculated using formulas (39), (…).
6. We calculate $j_{min}$ using formula (40).
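The idea of direct access to an m-tuple by its ordinal number, and the reverse mapping from a tuple of indexes to its ordinal number, can be illustrated with the textbook ranking and unranking of combinations in lexicographic order. This is a sketch of that standard technique, not a reproduction of method $A_2$ or of Tables 3-5; the function names `unrank`, `rank` and `complement` are hypothetical, and indexes and ordinal numbers are assumed to be 1-based.

```python
from math import comb


def unrank(n, m, j):
    """Index tuple (1-based, increasing) of the j-th m-tuple, j = 1..C(n, m)."""
    if not 1 <= j <= comb(n, m):
        raise ValueError("j is out of range")
    indexes, r, i = [], j - 1, 1  # r is the remaining 0-based rank
    for pos in range(m):
        # Skip whole blocks of tuples that have a smaller index at this position.
        while r >= comb(n - i, m - pos - 1):
            r -= comb(n - i, m - pos - 1)
            i += 1
        indexes.append(i)
        i += 1
    return tuple(indexes)


def rank(n, indexes):
    """Ordinal number j (1-based) of the m-tuple with the given increasing indexes."""
    m, j, prev = len(indexes), 0, 0
    for pos, i in enumerate(indexes):
        # Count the tuples that share the prefix but have a smaller index here.
        for skipped in range(prev + 1, i):
            j += comb(n - skipped, m - pos - 1)
        prev = i
    return j + 1


def complement(n, indexes):
    """Indexes of the basis set {1..n} that are not in the given tuple."""
    chosen = set(indexes)
    return tuple(i for i in range(1, n + 1) if i not in chosen)


print(unrank(4, 2, 4), rank(4, (2, 3)), complement(4, (2, 3)))
```

Neither function enumerates the tuples before the requested one: each needs only a polynomial number of binomial coefficient evaluations, which is the property that turns a sequentially accessed list into a directly accessed one.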

Comparative evaluation of the performance of algorithms with sequential access to the Boolean of input data and usage of method $A_2$
To evaluate the performance of the methods, we use the asymptotic estimate O(f(n)) of the growth rate of the number of operations f(n) with increasing n. O(f(n)) is calculated for the worst case, in which input data of length n require the maximum execution time of the algorithm. Thus, O(f(n)) is a "truncated" estimate of the execution time of the algorithm that shows the asymptotic behavior of f(n) as n→∞.
The data structure "m-tuples based on ordinary sets of arbitrary cardinality n" is a list with sequential access containing $2^n$ elements. Therefore, when searching for an element at the end of the list (the worst case), the number of operations is $f(n) = 2^n$. Accordingly, the estimated execution time of the algorithm "Searching for an element in a list with sequential access" is $O(f(n)) = O(2^n)$.
Thus, the given algorithm belongs to the algorithms with exponential execution time.
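The worst case for sequential access can be checked on a small example by counting how many elements of the Boolean are inspected before the last one is reached. The enumeration order used here (by tuple length, then lexicographically) and the function names are assumptions made for the sketch.

```python
from itertools import chain, combinations


def boolean_sequential(x):
    """Lazily enumerate the Boolean of x, one tuple length after another."""
    items = list(x)
    return chain.from_iterable(
        combinations(items, m) for m in range(len(items) + 1)
    )


def sequential_search(x, target):
    """Return the number of elements inspected until target is found."""
    for steps, element in enumerate(boolean_sequential(x), start=1):
        if element == target:
            return steps
    raise LookupError("target is not an element of the Boolean")


x = ("x1", "x2", "x3", "x4", "x5")
worst_case = sequential_search(x, x)  # the full set is enumerated last
print(worst_case)
```

With n = 5 the search inspects every one of the $2^5$ elements, in line with the exponential estimate.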
Method $A_2$ allows us to transform a list with sequential access into a list with direct access when using the data structure "m-tuples based on ordinary sets of arbitrary cardinality n". We now estimate the execution time of method $A_2$.
The function for the number of operations of method $A_2$ is $f_2(n) = f_{t3}(n) + f_{t4}(n) + f_{t5}(n)$, where $f_{t3}(n)$, $f_{t4}(n)$, $f_{t5}(n)$ are the functions for the number of operations when filling in Tables 3, 4 and 5, respectively. For Table 3, the worst case is m=n. To calculate $f_{t4}(n)$, we first calculate the number of operations $f_C(n)$ for one element of Table 4. The general formula for such an element takes different forms for m<n/2 and m>n/2, so the worst case is m=n/2; the number of operations for filling in the whole of Table 4 is estimated for the same worst case.
To calculate $f_{t5}(n)$, we calculate the number of operations $f_J(n)$ for one element of Table 5. In general, an element of Table 5 is calculated using formula (41) and consists of at most n-m summands, so the worst case is m=1; the estimated time to calculate one element follows from (59).
The number of operations for filling in the whole of Table 5 is estimated for the worst case m=n/2.
Therefore, the estimated execution time of method $A_2$ is $O(f_2(n)) = O(n^3)$. Thus, $A_2$ is an algorithm with cubic execution time.

Results of evaluating the performance of algorithms with sequential access to the Boolean of input data and using method $A_2$
We form a comparative table of the change in the number of operations as n increases for the cubic (method $A_2$) and exponential (sequential access) execution times (Table 6). Graphs of the growth in the number of operations f(n) and $f_2(n)$ for sequential access to the elements of the given data structure and for algorithm $A_2$ are shown in Fig. 3. We perform a comparative analysis of the estimated execution time for the given methods (Table 7).
According to Table 7, when n<10, working with the data structure "m-tuples based on ordinary sets of arbitrary cardinality n" as a list with sequential access gives quicker access to its elements than method $A_2$. If n=10, the execution times of the sequential-access methods and of method $A_2$ are approximately the same.
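The crossover near n=10 can be reproduced with the bare asymptotic forms. The sketch below assumes operation counts of exactly $n^3$ and $2^n$ with all constant factors ignored, so its numbers illustrate the trend only and are not the entries of Tables 6 and 7.

```python
def cubic(n):
    """Assumed operation count for direct access (constants ignored)."""
    return n ** 3


def exponential(n):
    """Assumed operation count for sequential access (constants ignored)."""
    return 2 ** n


for n in (5, 9, 10, 20, 40):
    c, e = cubic(n), exponential(n)
    print(f"n={n:>2}  n^3={c:>6}  2^n={e:>14}  2^n / n^3 = {e / c:.3g}")
```

Under these assumptions the two counts nearly coincide at n=10 (1000 against 1024), and the exponential count grows far faster beyond that point.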
If n≥40, method $A_2$, which gives direct access to the elements of the data structure, is several orders of magnitude faster than the methods of sequential access. Algorithms that implement sequential access have exponential execution time, which is typical of exhaustive search for problems of the NP-hard complexity class. Applying method $A_2$ moves the problem to the complexity class P (polynomial), that is, the algorithm implementing $A_2$ runs in polynomial time. According to Cobham's thesis, problems in the complexity class P are regarded as efficiently solvable.
1. We developed a diagnostic mathematical model using the methodology of Expert Systems to study the processes of functional diagnostics of automated system operability in order to minimize the time needed to establish the diagnosis. The distinctive features of this model are as follows:
- complete diagnostics of the entire integrated complex TCO-SCADA. This is achieved by processing a complete set of input data in the diagnostic model, namely the diagnostic codes generated by SCADA at all levels of the hierarchy (hardware platform, operating system, software development and execution environment, SCADA configuration software, run-time SCADA software, software to control and monitor the TCO);
- minimization of the time needed to form the diagnosis search space when working with a conflicting set of input data (system diagnostic codes) in the model developed. The diagnostic time is minimized because we apply methods that work with the data structure "m-tuples based on ordinary sets of arbitrary cardinality n" instead of methods with sequential access to the elements of the Boolean of input data.
2. Investigation of the patterns in forming the diagnosis search space on the Boolean of input data showed that the access time to the elements of the Boolean grows with the number of input data n:
- exponentially, as $2^n$, for sequential access to the elements of the data structure;
- cubically, as $n^3$, for direct access to the elements of the data structure.
3. A distinctive feature of the method developed to form the diagnosis search space for the integrated complex TCO-SCADA is the minimization of the access time to the elements of the Boolean of input data $2^{X(\Delta t)}$ as well as of the time needed to obtain the complement of an element of the Boolean $2^{X(\Delta t)}$ to a subset of the ordered basis set. This allows us to quickly determine conflicting groups of input data.
4. The effectiveness analysis of the diagnostic method developed, using the diagnosis search duration as the criterion, showed that the algorithm implementing this method allows us to move from tasks of the NP-hard complexity class in general to tasks of the complexity class P, unlike the diagnostic method with sequential access to the elements. This approach allows us to reduce the time needed to establish the diagnosis to real time. The effect is significant when processing a set of input data of size n≥40 (Table 7).

Conclusions
1. The method developed for forming the diagnosis search space changes the dependence of the estimated execution time of the algorithm on the number of its input data n from exponential, $O(2^n)$, to cubic, $O(n^3)$.
2. The effectiveness of the method developed is confirmed by the fact that the formation time of the diagnosis search space is minimized to real-time, which is especially noticeable for the number of input data n≥40.
3. The method developed for automatic self-diagnostics for the integrated complex TCO -SCADA allows creating methods and algorithms for SCADA automatic self-recovery after reversible failures in real time.
4. The method developed to diagnose the operability of an integrated complex TCO -SCADA is a prerequisite for creating a methodology and a mathematical basis for the design and implementation of maintenance-free hardware-software complexes.

Fig. 1. Scheme of interaction between SCADA and the Expert System

$y_j^{n,m}$ - an m-tuple, which is an element of the set $Y_m^n$ and consists of elements of the basis set X; m - the length of the tuple; j - the index (sequence number) of a tuple in the set of tuples $Y_m^n$.
Suppose I is an ordered increasing set of indexes of the elements of the set X; J is an ordered increasing set of indexes of the elements of the set $Y_m^n$; $I_m^n$ is a set of tuples that contains the indexes of the basic elements for the tuples of the set $Y_m^n$.
Further dependencies between these sets: forming ordered sets of m-tuples $Y_m^n$ from the elements of the ordered basis set X with cardinality n; relating the elements of the ordered sets of m-tuples $Y_m^n$ to the elements of the ordered sets of m-tuples $I_m^n$.

Fig. 3. Graphs of the growth in the number of operations for methods with sequential and direct access to the given data structure

Table 6
The number of operations for methods with cubic and exponential time of execution

Table 7
Estimation of the execution time for the methods developed to work with the given data structure