A MODEL DEVELOPED FOR TEACHING AN ADAPTIVE SYSTEM OF RECOGNISING CYBERATTACKS AMONG NON-UNIFORM QUERIES IN INFORMATION SYSTEMS



Introduction
The active expansion of information and communication systems (ICS) and mission-critical information systems (MCIS) in many countries is accompanied by the emergence of new threats to cybersecurity (CS), as evidenced by the growing number of information protection incidents and of vulnerabilities identified in MCIS.
The global development of corporate information systems (CIS) and MCIS, particularly in segments such as e-business (EB) in production industries, transport and communications, requires constant tracking of cyber threats and of vulnerabilities in technical components, software (SW), and database management systems. One of the priorities of cyberdefence, which contributes to the timely detection of cyberattacks and helps prevent their consequences for CIS and MCIS, is to develop systems of intellectual recognition of cyberattacks (SIRCA). For such systems, it is important to maximize the applicability of the models and algorithms for detecting cyberattacks so that they take into account not only the presence and length of query queues in CIS or MCIS but also additional information about the structure of the input streams or any change made by attackers to the query intensity, attack ...

Literature review and problem statement
The improvement of models for recognising complex cyberattacks by cyberdefence systems in CIS and MCIS has been the subject of many studies. In [1, 2], models are suggested for cyberattack detection systems (CADS) that take into account the presence and length of query queues in CIS [3], but the authors do not consider the possibility of changing the inflow rate of queries to the server.
There are studies on models and algorithms for the detection of cyberattacks that take into account queries in modules of "client-bank" systems, electronic invoices and communication systems [4, 5], the flow rate of requirements [6, 7], the interval between the requirements [8, 9], and the types of queries [10-12]. However, these studies do not consider information systems of a variable structure, including those equipped with multiple servers and cyberdefence components, that are able to deal with a complex behaviour of query queues in terms of their heterogeneity (conflict). Thus, most studies devoted to the intellectual recognition of cyberattacks directed against CIS or MCIS concern only the basic features of cyberattacks. These publications do not take into account the changing modes of CIS or MCIS when queries are lost because heterogeneous flows are blocked by the relevant protection systems as a result of complex cyber interventions such as targeted attacks, or when queries are lost due to the queue overflow in CIS and MCIS servers.
A large number of publications are devoted to the problems of designing systems of recognising cyberattacks (SRCA). Models of detecting cyberattacks that are based on finite automata (FA) are described in detail in [13, 14]. Methods of computational intelligence in SRCA are explored in [15-17]. Such systems are still under development. In [18, 19], a Bayesian network model is suggested for SRCA. However, an analysis of these studies reveals that in most cases such SRCA make decisions on the basis of a statistical analysis of the presence of anomalies, threats and cyberattacks, without taking into account the possibility of complex targeted cyberattacks. A widespread use of such SRCA is prevented by the significant complexity of the operational set-up of the repository of object recognition templates.
Many studies are devoted to models and methods of detection based on Markov chains [20-23]. The typical disadvantage of most SRCA proposed in these studies is the lack of an opportunity to quickly replenish the repository of cyberattack patterns, as they almost always use only one methodology of recognition.
In the above studies, which are of interest in solving the problems of identifying cyberattacks, the models used are based only on information about query inflows and saturation streams [6, 8, 15, 16, 20]. Recent cyberattacks have become extremely complex. Narrowly focused, systematic and shared attacks, which are known as persistent sophisticated threats, are able to hide from anti-viruses and are not detected by firewalls and intrusion detection systems [9, 17, 22]. These targeted threats either have no signatures or are well disguised [4, 23].
Thus, further research should be aimed at developing the methodological and theoretical bases for creating systems of intellectual recognition of cyberattacks that would use additional information about the structure of the incoming flow, a possible change made by attackers to the query intensity, the speed and duration of the impulse, and other parameters of cyberattacks.

The aim and tasks of the research
The aim of the research is to develop a model for training the adaptive system of intelligent recognition of cyberattacks so that it takes into account, and stores in the repository, the patterns of sophisticated cyberattacks that have a variable intensity of the incoming flow of queries in CIS or MCIS.
To achieve this aim, the following tasks were set:
- to develop a model of intelligent recognition of complex targeted cyberattacks with variable parameters of query streams in CIS or MCIS;
- to carry out simulation tests on cyberattacks for heterogeneous query flows in information systems.

A model of an intelligent recognition module for cyberattacks of heterogeneous flows of queries in information systems
The mathematical description of the SIRCA module for heterogeneous flows of queries is presented in terms of the following components: IS is a set of input signals that determine the state of cybersecurity in a CIS or MCIS; T is a set of time points at which data on the state of information security (IS) of the object of protection are taken; SS is a signature space for recognising a certain class of cyberattacks; $S_{\Omega}$ is the space of the functional states of IS; KB is a knowledge base for identifying cyberattacks; MX is an instructional matrix (standard) that is stored in the repository of SIRCA; MB is an instructional binary matrix; $o_1$ and $o_2$ are operators that form the instructional input matrix and the binary matrix of SIRCA, respectively.
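As an illustration of how an instructional matrix might be turned into an instructional binary matrix, the following Python sketch marks a feature value with 1 when it lies within a control tolerance of its column mean. This is only a sketch under stated assumptions: the tolerance rule, the function name `binarize`, and the parameter `delta` are hypothetical and are not taken from the source, which does not specify how the operator $o_2$ is implemented.

```python
def binarize(mx, delta):
    """Form a binary matrix from an instructional matrix.

    mx    -- list of rows; each row holds the feature values of one sample
    delta -- assumed control tolerance around the column mean (hypothetical)
    """
    columns = list(zip(*mx))
    means = [sum(col) / len(col) for col in columns]
    # 1 if the feature value lies inside the tolerance band, otherwise 0
    return [
        [1 if abs(value - mean) <= delta else 0 for value, mean in zip(row, means)]
        for row in mx
    ]
```

A narrower tolerance makes the binary pattern stricter, which is one way to read the role of the control deviations mentioned in the text.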
The SIRCA structure is shown in Fig. 1. The operator $o_{\Theta}: MB \rightarrow MR$ is used to divide the space of cyberattack features into two classes of recognition. The parameter of the features (PF) is used to test the statistical hypothesis that the object of recognition belongs to a simulated class of cyberattacks. After the statistical hypotheses are evaluated by using the operator $o_y$, a set $AR_Q$ is formed to characterise the accuracy of recognising a cyberattack in SIRCA. It is assumed that q is the number of the statistical hypotheses and $g = q^2$ is the quantity of SIRCA characteristics. The operator $o_{\mu}$ generates an exploit kit (EK) set, which allows evaluating the effectiveness of attack recognition within the class. The operator $o_b$ is used to optimize the system of control deviations from the patterns of cyberattacks. The set SW is consistently closed by the operator [...].
The query streams are considered to be heterogeneous under the following conditions: (1) the incoming flows of queries cannot be summed up so as to reduce the problem of recognising suspicious queries in SIRCA to a one-dimensional case; (2) applications from heterogeneous streams are processed in intervals that do not overlap; (3) the system contains the so-called "intervals of inaccessibility" during which the streams are unattended, for example when queries are analysed by an intrusion detection system in an MCIS.
The recognition system a priori contains the most intensive input streams of queries (streams that are primarily important in terms of the servicing speed) and streams of low intensity. The functional diagram of such cyberattacks is shown in Fig. 2.
Let us assume that the incoming flows of queries $k_1$, $k_2$, and $k_3$ are formed in some random environment (RE). The state of the RE may determine the probabilistic structure of the query flows. The variants can be as follows: (1) if the RE is in the state $c^{(0)}$, the incoming demand streams are regular query streams, which are the typical mode of a CIS or MCIS; (2) if the RE acquires the state $c^{(1)}$, the incoming streams are streams of packets (a query flow is a sequence of "packets" [21, 23, 24]).
It is assumed that $k_1$ is a priority stream of queries that come with low intensity, $k_2$ is a stream of queries of a normal priority and low intensity, and $k_3$ is a priority stream of queries coming with the highest intensity.
The informational flow $k_1$ means that the dynamics of the system reflect the availability of applications in the storage $NO_1$ and the queries incoming in this stream. The priority of a flow is the prerequisite for the prompt servicing of the queries that have arrived in the CIS or MCIS. For example, for the flow $k_3$, the priority means that a gap or absence of queries in the stream $k_1$ facilitates continuity in servicing the queries of the stream $k_3$.
According to the topology of the CIS or MCIS and the assumptions about the state of the RE, the work of the maintenance equipment (ME) is organized, for example, for the servers of an MCIS and the elements of SIRCA. According to the graph, let us denote the states of the system as $S^{(r)}$, $r = \overline{1,7}$. The states of the system form the set $S = \{S^{(1)}, S^{(2)}, \ldots, S^{(7)}\}$. The system remains in the state $S^{(r)}$ for the time $\tau_r$, $r = \overline{1,7}$.
The ME analyses and services the requirements, controls the input streams, and forms queues in the storages $NO_i$. Queries are selected from the queues according to their priority and by using the service strategies designated as $\alpha_{01}$, $\alpha_{02}$, and $\alpha_{03}$. The state $S^{(2j-1)}$ for j = 1, 2, and 3 entails that the ME services the requirements of the stream $k_j$. In the state $S^{(2j)}$ for j = 1, 2, and 3, the queries of all the incoming streams are left unattended.
Let us consider a situation where hackers that attack a system can create a queue. Accordingly, the output streams of the system at the maximum load and with the ME functioning continuously are transformed into saturation flows denoted as $k'_1$, $k'_2$, and $k'_3$, unlike the real query flows $k_1$, $k_2$, and $k_3$ in the system.
We considered the following options of the intelligent recognition of cyberattack threats: (1) packets are sent at a zero rate within the time scale of the query round trip (to the addressee and back); (2) during a cyberattack, the attacker can vary the impulse duration; (3) there are minimal random values within the time scale of the query round trip (to the addressee and back).
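To make option (2) more concrete, the following Python sketch builds a pulsed rate profile in which the attacker sends queries only during an impulse of adjustable duration within each period. It is a minimal illustration and not the model of the study: the square-wave shape and the names `pulse_train`, `period`, `impulse_duration`, and `peak_rate` are assumptions introduced for this example.

```python
def pulse_train(total_slots, period, impulse_duration, peak_rate):
    """Return the query rate in each discrete time slot of a pulsed flow.

    Within every period, the first `impulse_duration` slots carry
    `peak_rate` queries and the remaining slots carry none.
    """
    if period <= 0 or not 0 <= impulse_duration <= period:
        raise ValueError("require period > 0 and 0 <= impulse_duration <= period")
    return [
        peak_rate if (slot % period) < impulse_duration else 0
        for slot in range(total_slots)
    ]


def average_rate(profile):
    """Average number of queries per slot over the whole profile."""
    return sum(profile) / len(profile)
```

Shortening the impulse lowers the average rate while the peak rate stays the same, which is why such a flow may look like a low-intensity stream to a detector that tracks only the average load.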
All random objects that are analysed further, used to construct the cyberattack model, and related to the process of servicing queries are considered in the probability space $(\Omega, A, P(*))$ of elementary random events $\omega \in \Omega$, where $P(A)$ is the probability of the query penetration into the system. The incoming flows of queries are described in a nonlocal way. Any query stream $k_j$ in the system is described as a random sequence of vectors $\{(\tau_i, \nu_i, \eta_{j,i});\ i \ge 0\}$, where $\eta_{j,i}$ is the number of applications patterned like $\nu_i$ that are received during the time interval $[\tau_i, \tau_{i+1})$ within this flow. In SIRCA, the application sample is determined by the marker $\nu_i$ in the form of a binary matrix of signs [25, 26] stored in the repository as well as by the state of the RE. To simplify the model, the behaviour of the random environment is described by a homogeneous Markov sequence $\{\nu_i;\ i \ge 0\}$ with two states, $c^{(0)}$ (a flow of queries with low intensity) and $c^{(1)}$ (a high flow of applications), and with the transition probabilities a and b, $0 \le a < b \ll 1$.

Fig. 2. A functional diagram of cyberattacks with mixed flows of queries: $S_1$ is the entry into a CIS or MCIS, $S_2$ is the scan of the available resources (AR) in a CIS or MCIS, $S_3$ is the waiting for the response about the presence of AR, $S_4$ is the connection to the AR, $S_5$ is the data transmission in the CIS or MCIS, $S_6$ is the data transmission to the available resources (automated workstations (AWSs) or personal computers (PCs)), and $S_7$ is the loading of the query dispatch to the servers of the CIS or MCIS

In accordance with the accepted restrictions, changes in the intensity of the flow are infrequent; therefore, the normal operation of an MCIS with a low-intensity stream of applications is more typical than operation with a stream of a large number of queries. Thus, according to the research findings, the intensity of queries remains unchanged within the time $\tau_r$ while the ME is in the state $S^{(r)}$. The random elements $\{\nu_i;\ i \ge 0\}$ take the values $c^{(0)}$ and $c^{(1)}$ and are correlated, and $\{\omega_i;\ i \ge 0\}$ is a sequence of independent random variables with a known distribution. For the model, the distribution is assumed to be uniform in the interval (0, 1). At any time $\tau > 0$, the maintenance equipment is in one of the states $S^{(r)}$ of the set S. The control of the incoming flows of queries and the transition between the states of the ME, according to the graph and taking into account the previous comments, are described by recurrent expressions in which $\psi_{j,i} = f(w)$ is the length of the queue in $NO_j$ for the stream $k_j$, i = 0, 1, ..., k.
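The two-state random environment described above can be sketched in Python as a homogeneous Markov chain driven by independent variables that are uniform on (0, 1). This is an illustrative sketch rather than the authors' implementation: assigning a to the transition from $c^{(0)}$ to $c^{(1)}$ and b to the reverse transition is an assumption (the source does not say which is which), and the names `simulate_environment` and `stationary_high` are hypothetical.

```python
import random


def simulate_environment(steps, a, b, seed=None):
    """Simulate the random environment as a two-state Markov chain.

    State 0 stands for c^(0) (low-intensity flow), state 1 for c^(1)
    (high-intensity flow). Assumption: a = P(0 -> 1), b = P(1 -> 0).
    """
    rng = random.Random(seed)
    state = 0
    path = [state]
    for _ in range(steps):
        omega = rng.random()  # independent, uniform on [0, 1)
        if state == 0:
            state = 1 if omega < a else 0
        else:
            state = 0 if omega < b else 1
        path.append(state)
    return path


def stationary_high(a, b):
    """Long-run share of time spent in state 1 under the assumption above."""
    return a / (a + b)
```

Under this assumption, the condition a < b keeps the long-run share of the high-intensity state below one half, which agrees with the statement that the low-intensity mode is the more typical one.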
Given the decision rules $DR(p_{axi})$ [25], which determine the system states in case of threats to information security, we have obtained recursive dependencies for the intelligent recognition of sophisticated cyberattacks in a situation created by the attacker. In these dependencies, $p_{axi}$ stands for signs of unlawful activities (a cyberattack) in the segment of the network; $l_{1,s}$, $l_{3,s}$, and $l'_{3,s}$ are the integer parts of $\mu_{1,s}T_1$, $\mu_{3,s}T_5$, and $\mu_{3,s}T_7$; and $\mu_{j,s}$ is the service intensity for the stream $k_j$ if the system is in the state $c^{(s)}$ or $c^{(h)}$. A separate case is distinguished by comparing $w_3$ with 1, and further dependencies define the probabilities $Q(S, c, x, y)$.
Given the recurrent expressions (2) through (6) and using the MATLAB 7 and Simulink simulation environment, we have developed a simulation model to analyse the impact of cyberattacks on the functionality of a segment of a CIS or MCIS when the attacker uses heterogeneous flows of queries in the system.
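The role of the integer parts of the products of service intensity and state duration can be illustrated with a generic queue-balance step in Python. The sketch below does not reproduce expressions (2) through (6); it assumes a standard update in which a serviced stream loses at most the saturation capacity per state, and the names `saturation_capacity` and `next_queue_length` are hypothetical.

```python
import math


def saturation_capacity(mu, duration):
    """Largest whole number of queries that can be serviced during a state,
    taken as the integer part of (service intensity * state duration)."""
    return math.floor(mu * duration)


def next_queue_length(queue, arrivals, mu, duration, served):
    """Queue length at the end of a state (assumed balance equation).

    If the stream is serviced in this state, up to the saturation capacity
    is removed; otherwise the arrivals simply accumulate.
    """
    capacity = saturation_capacity(mu, duration) if served else 0
    return max(0, queue + arrivals - capacity)
```

For example, with a service intensity of 2.5 queries per time unit and a state lasting 3 time units, at most 7 queries leave the queue during that state.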

A simulation model of cyberattacks in heterogeneous flows of queries in information systems
The simulation model (for a segment of an MCIS) consists of one data line and three stations (automated workstations, AWSs) that send regular requests for data transfer over the line (Fig. 3, a). The query settings are formed according to the data in Fig. 3, b: AWS1 reflects the low-intensity priority stream $k_1$, AWS2 represents the low-intensity flow $k_2$, and AWS3 stands for the priority stream of the greatest intensity $k_3$. We also assume that the time is discrete and varies from 0 to some value T. The AWSs are independent of one another, and at any moment each station has a certain probability of sending a data request or of leaving the line empty. In the unit of traffic analysis, the unit of threats recognition [25] and the predetermined decision rules $DR(p_{axi})$ make it possible to block any related attacks and unauthorized network activities. The yellow colour in the diagram marks the components that are used to visualise the traffic or to separate heterogeneous flows of queries to the server of a CIS. The green colour marks the components that allow changing the parameters of inhomogeneous flows of queries: the presence and length of the query queues in a CIS or MCIS, the structure of the input streams $k_1$, $k_2$, and $k_3$, the change in the query intensity made by the attacker, the attack speed, and the impulse duration.
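The set-up described above (discrete time, three independent stations, one shared line) can be approximated by a small Python simulation. This is a simplified stand-in for the MATLAB and Simulink model and not a reproduction of it: the one-query-per-slot service, the finite buffer, the priority order passed by the caller, and the name `simulate_line` are all assumptions made for the example.

```python
import random


def simulate_line(slots, send_prob, priority, buffer_size, seed=0):
    """Simulate stations sharing one data line in discrete time.

    send_prob   -- dict: stream name -> probability of a request per slot
    priority    -- list of stream names, highest priority first (assumed)
    buffer_size -- maximum queue length per stream; overflow is counted as loss
    Returns four dicts: sent, served, lost, and remaining queue per stream.
    """
    rng = random.Random(seed)
    queues = {name: 0 for name in send_prob}
    sent = {name: 0 for name in send_prob}
    served = {name: 0 for name in send_prob}
    lost = {name: 0 for name in send_prob}
    for _ in range(slots):
        # Each station independently decides whether to send a request.
        for name, prob in send_prob.items():
            if rng.random() < prob:
                sent[name] += 1
                if queues[name] < buffer_size:
                    queues[name] += 1
                else:
                    lost[name] += 1
        # The line services one query per slot, by priority.
        for name in priority:
            if queues[name] > 0:
                queues[name] -= 1
                served[name] += 1
                break
    return sent, served, lost, queues
```

Dividing the lost count by the sent count for a stream gives an empirical estimate of its query loss probability, the kind of quantity that the study compares with the calculated values.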
To implement the intellectual recognition of individual classes of threats, cyberattacks and anomalies in the simulation model, the rules for the recognition system shown in Fig. 4 were drawn up by means of the Fuzzy Logic Toolbox extension package.

Fig. 3. A scheme of the cyberattack simulation modelling for heterogeneous flows of queries in a segment of a CIS or MCIS: a is a segment of a CIS or MCIS (library components of MATLAB were used); b is a unit for simulating a cyberattack for heterogeneous flows of queries

Fig. 5 shows a subsystem for blocking queries from AWSs as part of a CIS or MCIS when an abnormal queue of queries coming from a terminal is detected.
To study the possibility of detecting cyberattacks with mixed flows of queries, a simulation experiment was conducted in a segment of the computer network of a CIS. The network was working normally, and then it was subjected to an attack. To visualize the signals, we designed a special unit, "Signal Visualization" (Fig. 6), which allowed analysing the basic parameters of the segment of the MCIS at the level of transmitted data packets, including a change in the number of queries R during the time interval t. The simulation modelling was used to study the modes of a CIS or MCIS in which queries are blocked whenever they deviate from the "normal" mode. The results of the simulation modelling are presented in the next section.

The results of a simulation modelling of cyberattacks for heterogeneous flows of queries in information systems
The simulation model (SM) was used to check the validity of the results of implementing cyberattacks, such as "denial of service" and "buffer overflow", in the ICS of a CIS. The source data were the results of measuring the parameters of the incoming streams received in the SM. The simulation and the analytical calculation [3, 4, 21-23, 25, 26] of the bandwidth used in the CIS or MCIS were conducted for different sets of heterogeneous implementations of the streams $k_1$ and $k_3$. Table 1 shows the data obtained during the simulation experiment (the delay time and the probability of the query loss) as well as the comparative parameters of the expected features. Accordingly, $T_{cf.pr.}$ and $P_{pr.}$ are the comparative expected delay time and loss probability in processing the queries, $T_{cf.cl.}$ and $P_{cl.}$ are the network parameters calculated by the classic method, and $T_{cf.sug.}$ and $P_{sug.}$ are the network parameters calculated by the suggested models. The analysis and comparison of the results confirm the adequacy of calculating the characteristics of SIRCA elements in the network segment of a CIS or MCIS.
Table 2 shows the results of the simulation modelling of a "denial of service" cyberattack on the server and the workstation within the CIS model.
The average error of the calculated probability of the query loss $V_q$ as a result of such cyberattacks does not exceed the standard deviation of the frequency of losses $F_q$ in the series of experiments.

Table 2. The results of the simulation modelling of a CIS segment under a "denial of service" cyberattack (first column: the number of the modelled sessions)

The analysis of the obtained results shows that the likelihood of penetrating into the system can be significantly increased if the attacker assigns a high-priority status to a low-intensity flow and if the cyberintrusion is sufficiently prolonged. In this case, the attacker does not necessarily change the parameters of the flow $k_3$, which has the highest intensity and the top priority in the system (Fig. 8).
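The comparison made above between the calculated loss probability and the observed loss frequencies can be expressed as a short check in Python. This is a sketch under assumptions: the average error is taken here as the mean absolute difference between the calculated value and the frequencies observed in the series, the spread is the sample standard deviation, and the function name `loss_estimate_is_adequate` is hypothetical. No data from the study are reproduced.

```python
from statistics import mean, stdev


def loss_estimate_is_adequate(calculated, observed_frequencies):
    """Compare a calculated loss probability with a series of experiments.

    Returns (average_error, spread, verdict), where the verdict is True
    when the average error does not exceed the sample standard deviation
    of the observed loss frequencies.
    """
    average_error = mean(abs(calculated - f) for f in observed_frequencies)
    spread = stdev(observed_frequencies)  # needs at least two experiments
    return average_error, spread, average_error <= spread
```

The function expects at least two experiments in the series, since the sample standard deviation is undefined for a single value.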
Fig. 8. The distribution of the total (P0 and P1) and average (Pc0 and Pc1) flows of queries when attackers create heterogeneous queries

Fig. 9 shows the dependence of the theoretical estimate of the query loss probability in a CIS or MCIS on the number of steps in the suggested iterative procedure (2) through (6). The graph indicates the number of iterations required for a given accuracy of the simulation model.
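The idea of reading the required number of iterations from the convergence of the estimate can be sketched in Python as follows. The sketch is generic: the recurrent expressions (2) through (6) are not reproduced, so the update rule is passed in as a hypothetical function `step`, and the stopping rule (successive estimates closer than `eps`) is an assumption.

```python
def iterations_for_accuracy(step, p0, eps, max_steps=10_000):
    """Iterate p <- step(p) until two successive estimates differ by
    less than eps. Returns (number_of_steps, final_estimate)."""
    p = p0
    for n in range(1, max_steps + 1):
        p_next = step(p)
        if abs(p_next - p) < eps:
            return n, p_next
        p = p_next
    raise RuntimeError("the estimate did not converge within max_steps")
```

As a usage example, the purely illustrative update `lambda p: 0.5 * p + 0.1` started from 0 reaches an accuracy of 0.01 in five steps.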
During the study, we found that the probability of correctly recognising sophisticated cyberattacks in heterogeneous flows of queries, as well as network types of cyberattacks, was 85-98 %, depending on the type of the attack.
The analysis of the results of the simulation experiment leads to the conclusion that the suggested model of recognising sophisticated cyberattacks in non-uniform flows of queries is 5-7 % more accurate than the other existing models.
Fig. 9. The dependence of the theoretical estimate of the probability (P) of losing a query (application) on the number of steps (n) in the iterative procedures

Thus, a successful cyberattack on the information resources of a CIS or MCIS, especially of the "denial of service" type, does not necessarily create a large number of queries to the server or reduce the traffic bandwidth. There is a fairly high probability of success in exploiting the system vulnerability by creating a low-intensity priority flow and changing its parameters, such as the packet speed (low-speed attacks) or the impulse duration.
According to preliminary estimates, the developed simulation models make it possible to reduce by 25-30 % the time for setting up SIRCA projects for a CIS or MCIS.

Discussion of the model testing results and prospects for further research
The described models of implementing cyberattacks with mixed flows of queries in CIS or MCIS are not only of independent practical interest but also serve as an example of a possible formalisation of other complex scenarios of cyberattacks.
It has been determined that Markov models of processes are widely used in the analysis and synthesis of CIS and MCIS. Their properties impose certain limitations on the real signals used, but the models are quite sufficient for developing meaningful methods of analysis and synthesis of complex cyberdefence systems. As each state of the system can be characterised by a set of values of quantised digital signals that are typical of $S_i$, the quantity of gradations (the quantisation levels) in the signs of cyberattacks in SIRCA acts as a universal set whose capacity is equal to the maximum quantisation level characteristic of a particular model.
The downside of the model is its cumbersome calculations, which complicate the practical use of the system of Markov chains in modelling the considered processes. However, the exponential approximation simplifies the estimation of the cyberattack probability.
The presented approach allows a quantitative estimation of the probability of network threats and attacks in the computer networks of CIS or MCIS with regard to the time factor and thereby increases the validity of measures to protect information.
In 2014 and 2015, the scientific and practical results of the research, in the form of hardware and software applications and educational materials, were introduced at the state enterprise "Design and engineering office for automating control systems at the Ukrainian railways" of the Ministry of Infrastructure of Ukraine, as well as in the information security service of the computing centre of the Near-Dnipro Railways and at the State University of Telecommunications as part of the research project "Safety-05P".
The results that were previously presented in [25, 26] and the results of the tests of the individual modules of SIRCA have facilitated the development of a decision-making support system and an expert system, and the repository of cyberattack patterns has been expanded.

Conclusion
1. The study was focused on developing a model of intelligent recognition of sophisticated cyberattacks, which, unlike the existing ones, takes into account the change in the intensity of the incoming flows of queries in information systems. It helps assess the quality of the functioning of a CIS with regard to the possibility that attackers will change the parameters of the cyberattack.
2. The tests and the justification of the suggested model were carried out by using simulation modelling in the MATLAB and Simulink environment. It has been found that the suggested model of recognising sophisticated cyberattacks is 5-7 % more accurate than the other existing models if attackers use non-uniform flows of queries. The developed simulation models enable a 25-30 % decrease in the setup time for projects of cyberdefence systems, including SIRCA for CIS or MCIS.


In the state $S^{(7)}$, the servicing is performed for the stream $k_3$. According to the graph, for each r = 1, 2, 3, or 4, the state $S^{(r)}$ becomes the state $S^{(r+1)}$.

Fig. 7 and 8 show the main results of modelling the heterogeneous query flows $k_1$, $k_2$, and $k_3$ in an MCIS. They indicate that, when heterogeneous priority flows of queries are created in an MCIS, the data processing time increases 1.5-3.5 times.

Fig. 7. The distribution of the total (P0 and P1) and average (Pc0 and Pc1) flows of queries during normal operation of a segment of a CIS or MCIS