ASSESSMENT OF PERFORMANCE OF A DISTRIBUTED INFORMATION SYSTEM BASED ON TIME PROFILE

The problem of determining performance parameters for distributed information systems formed by heterogeneous hardware is examined. An information system designed to perform distributed computing based on the interaction between elements is presented as a combination of operation processes of software objects (agents) interacting with operators. To enable algorithmic and quantitative analysis of the system's operation process, the authors used time diagrams, which can be obtained from the time profiles of serial and distributed actors. Constructing the time profile of an actor requires knowledge of the sequence of performed actions and the average time to perform each action. Based on a typical time profile, estimates of different kinds of performance were obtained, and a ratio linking the main forms of performance to various quantitative characteristics of the system was derived. Throughput is proposed as the key indicator of performance; for a serial actor, it is the ratio of the actor's load to the total time of actions per task. It was shown that within a single distributed actor, total throughput remains constant regardless of how the tasks are redistributed between serial actors. Response time of the system, defined as the average time an actor takes to fulfil a task, is proposed as another performance indicator. The relationship between response time and throughput of the system was established analytically. Modeling showed that as throughput increases, the response time of the system decreases, reaching 0 at a certain ratio of throughput and the number of fulfilled tasks. In addition to the key indicators, these ratios also make it possible to determine derivative parameters of the system, such as the minimum computation time, the average time of waiting for requests, and the amount of memory required to fulfil tasks.
It was determined that the minimum computation time depends on the capacity of an actor and the number of actions performed, as well as on the ratio of computation time to exchange time. The average time of waiting for a request is the difference between the total operation time and the time the actors spend directly fulfilling the task. The amount of required memory is determined from the amount of memory involved in particular processes and atomic operations. The presented ratios make it possible to quantitatively evaluate the parameters of distributed information systems and to synthesize systems with assigned parameters of throughput and response time.


Introduction
Most modern enterprise resource planning information systems (ERP systems) are based on spatially distributed software/hardware complexes intended for the coordinated solution of various tasks [1]. When creating such information systems (IS), as with any project, there is always the risk of unsubstantiated resource consumption. A priori
Regulatory documents define requirements for preliminary calculations of the technical and economic feasibility of IS creation at the design stage. At the same time, the current practice of IS performance assessment is only indicative. Under such conditions, it is difficult, and sometimes impossible, to decide whether funding a project of IS creation or improvement is expedient. The most rational approach to making an adequate funding decision is to obtain accurate estimates of the performance of a would-be IS by applying prognostic models. Given a prediction of IS performance, it is also possible to determine accurately the economic indicators of a would-be IS.
These provisions become even more relevant for systems with distributed architecture, that is, distributed information systems (DIS). The constant growth in the computational capabilities of hardware platforms and the non-uniform upgrading of computer equipment at an enterprise lead to a situation in which the material basis for a DIS today consists of hardware with non-homogeneous capacity. The operation of these tools should be coordinated taking into account their productivity and the individual characteristics of the software structure. When mobile hardware platforms are employed, substantial weight and dimension limitations necessitate designing software with the computational capabilities of the equipment in mind.
In addition, the features of application of the specified DIS include:
- a multi-level hierarchical architecture;
- lack of general centralized control over the computing process;
- lack of coordinated time in the system;
- existence of conflicts and shared resources.
Given this, it is a relevant problem to substantiate theoretically the processes of DIS operation and to estimate performance efficiency taking into consideration the technical parameters of the constituent components. Another aspect is the need to consider the possibilities of implementing a DIS on different hardware under conditions of limited hardware capabilities.

Literature review and problem statement
The vast majority of theories of construction of distributed information systems do not cover all the above features comprehensively. Existing theories of distributed computations were created primarily to construct a mathematical apparatus for specifying the behavior of DIS elements and studying their equivalent transformations. In addition, the problem of determining DIS performance involves constructing mathematical models that make it possible to vary the feature space while constructing prognostic factors. One such approach is based on the use of a calibration method [2]. However, the proposed model is difficult to scale to the description of large systems that include hundreds or thousands of nodes.
Traditional models for parallel systems imply a stable composition of the computational space and are not applied to the description of distributed systems, which is why the performance of distributed applications is described, as a rule, at a qualitative level. Technologies of distributed computations [3], which are used for solving long-term problems involving the heterogeneous computing resources of an enterprise, have appeared and developed recently. A specific feature of such DIS is a heterogeneous and dynamically variable structure of resources: computational nodes can join the system and leave it at any time during its operation. At the same time, although the model in [3] takes into consideration some features of mobile DIS, it does not make it possible to estimate quantitatively the overall performance of such a system.
Paper [4] proposed an approach to assessing the productivity of distributed computing systems that include nodes of various capacities. In this case, the nodes participate in calculations in accordance with a specific schedule, a function that takes the value 1 at the moments when a node is allocated for calculations and 0 otherwise. The limitation of the model in [4] is that the system's schedule must be known a priori. In paper [5], for such systems with a "schedule", the authors propose a procedure for performance analysis based on comparing the system's operation to that of a hypothetical reference system. Varying the reference system allows a quantitative assessment of various characteristics, which was implemented in practice in the BNB-Grid system [6]. At the same time, the parameters of the reference system are determined at the discretion of the researchers. In addition, such an approach does not make it possible to calculate the quantitative characteristics of the system a priori, based on analytical calculations.
In article [7], it is proposed to estimate the performance of a DIS with a service-oriented architecture by taking into account the dependence of the throughput and buffer volume of the base telecommunications network on the probability of loss of requests in the system. The parameters are determined by modeling with hierarchical colored Petri nets, although the proposed apparatus does not make it possible to perform an algorithmic analysis of the network operation.
Research in the theory of distributed computations and parallel processes is the closest to a theoretical solution of the problem of constructing a model of the distributed process of DIS functioning. Thus, in paper [8], assessment algorithms are based on estimating the time required for solving the problem, which depends on a number of factors, such as the architecture of the system, the environment of parallel computations, and the properties of the software used. At the same time, the authors point out the difficulties of considering these factors comprehensively. In article [9], the emphasis is on assessing the properties of flows and their impact on the overall performance of a system. However, difficulties arise in accounting for the synchronization of flows, especially with joint use of objects and variables, data distribution between flows, and coordination of computational load. Article [10] gives a comparative analysis of the performance assessment technologies used in the modern paradigms of distributed computing (Cloud Computing, Jungle Computing and Fog Computing). It is noted that, although each of these paradigms has its own tools of performance assessment, they are all a posteriori and require sufficient statistics of the system's operation. In this case, a priori performance assessment remains a challenge.
In addition, these studies do not take into consideration the individual hardware features of the elements of a system, their behavior in the operation process, or their internal communication capabilities.
Thus, the problem of comprehensive estimation of DIS productivity remains unsolved when the previously indicated features are taken into account: a multilevel architecture, lack of centralization and of coordinated time, and the existence of conflicts and shared resources.

The aim and objectives of the study
The aim of the present research is to develop a procedure for assessing the performance of elements of a distributed information system deployed on heterogeneous hardware, which would enable both algorithmic and quantitative analysis of operation processes.
To accomplish the set goal, it is necessary:
- to determine the structure and develop a description of the operation process of an information system designed for distributed execution of tasks;
- to develop an approach to evaluating the main kinds of DIS performance;
- to identify key performance indicators and the order of their computation;
- to carry out DIS modeling according to the key performance indicators.

Structure and operation process of a distributed information system
The structure of a human-machine system designed to perform distributed computations (distributed fulfillment of tasks) based on information interaction between the elements of the system is shown in Fig. 1 [11].
The operation process of such a DIS is determined by the interaction of the operation processes of program objects (agents) $A_{i,j}$, which operate under the guidance of containers $K_{i,j}$, with operators $O_{i,j}$ and the logical environment of the platform of existence $\Pi_{i,j}$ in a certain physical environment of the distributed information system; here and below, $i = 1 \ldots m$, $j = 1 \ldots n_i$. The totality of $\Pi_{i,j}$, $O_{i,j}$, $A_{i,j}$, $K_{i,j}$ forms a serial actor $B_{i,j}$. In turn, the totality of serial actors forms a distributed actor.
Schematically, this interaction can be described as follows. Each operator $O_{i,j}$ performs a certain sequence of actions determined by its operation schedule. In doing so, it operates in the software environment installed on its platform $\Pi_{i,j}$ (software of the ERP system, distributed applications, etc.).
The sequence of an actor's operations is made up of separate steps $s_{i,j,k}$, $k = 1 \ldots K_{i,j}$, where $K_{i,j}$ is the total number of steps of the j-th actor of the i-th level that are registered by the logical environment of the platform and are visible to its agent $A_{i,j}$.
Based on the actions of the operator, the agent that belongs to it determines the logical sequence of its own actions. Some of these actions involve contacting and transmitting information to other agents, and others involve the operator. Addressing agents and the operator is accompanied by the transmission of messages of an appropriate type (Fig. 2).
Such an approach makes it possible to visualize the operation of a system in the form of a network graph, each step of which can subsequently be assessed in terms of its duration. The totality of time segments that correspond to specific steps, together with the downtime of elements, determines the total duration of the system's operation. The ratio of the operation duration to the total application time is the essence of the concept of system performance. In this case, the arrows on the graph (Fig. 2) can be interpreted as the process of control transfer in the network.

Fig. 1. Structure of a distributed information system (distributed actor)

Key performance parameters of a distributed information system
To enable evaluation of the algorithmic and quantitative aspects of DIS operation, we use the approach of [12, 13], in which the basic performance types are obtained from the time diagram of operation of a computer system. The goal is therefore to obtain the time diagram of the system's operation. This diagram can be described using the time profile.
The time profile of operation of a distributed system CS is the vector function

$$G(t) = \bigl(g_1(t), g_2(t), \ldots, g_n(t)\bigr),$$

where $g_i(t)$, $i = 1 \ldots n$, is the time profile of serial actor $SE_i \in CS$. The time profile of serial actor $SE_i$ is a function $g_i(t)$ defined on $\mathbb{R}$ whose range is a set of tuples of the form $\langle p, s, q, a_i \rangle$, where $p \in P$ is the process, $s \in S(p)$ is the step, $q \in cx(s)$ is the action, and $a_i \in w(q_i)$ is the atomic process of serial actor $SE_i$. The task of constructing $G(t)$ for system CS is divided into two stages:
1. For each $SE_i$, determine the sequence of performed actions.
2. For each action, determine the time required for the given $SE_i$ to perform it.
An example of the time profile of a serial actor while working with the geoinformation application of DIS is shown in Fig. 3.
Here, we consider a certain process $p$: <navigation>, which includes the steps of operation of DIS elements: $S(p)$ = <motion, analysis, action, waiting>.
Each step implies a certain set of operations in the system, $cx(s)$ = <object formation, activation, computation, …>, that determines the functionality of agents in the system. In turn, each action consists of a set of atomic processes performed by the machine while the action is carried out: $w(q_i)$ = <start, request, …, stop, pause>.
Each atomic process has a certain execution duration; the sum of the durations of all atomic processes determines the total time of program execution while solving a given problem.
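As an illustration, the time profile of a single serial actor can be sketched as a list of time segments, each carrying one tuple <p, s, q, a>. This is only a minimal sketch: the names `Segment`, `profile_value` and `busy_time`, the piecewise-constant representation, and all durations are assumptions introduced here and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Tuple <p, s, q, a>: process, step, action, atomic process.
State = Tuple[str, str, str, str]

@dataclass(frozen=True)
class Segment:
    start: float   # moment at which the atomic process begins
    end: float     # moment at which the atomic process ends
    state: State   # value of g_i(t) on [start, end)

def profile_value(profile: List[Segment], t: float) -> Optional[State]:
    """Return g_i(t), or None when the actor is idle at moment t."""
    for seg in profile:
        if seg.start <= t < seg.end:
            return seg.state
    return None

def busy_time(profile: List[Segment]) -> float:
    """Sum of the durations of all atomic processes in the profile."""
    return sum(seg.end - seg.start for seg in profile)

# Hypothetical profile for the <navigation> process (durations are made up).
profile = [
    Segment(0.0, 0.5, ("navigation", "motion", "object formation", "start")),
    Segment(0.5, 2.0, ("navigation", "motion", "object formation", "request")),
    Segment(3.0, 4.5, ("navigation", "analysis", "computation", "start")),
    Segment(4.5, 5.0, ("navigation", "analysis", "computation", "stop")),
]

print(profile_value(profile, 1.0))  # tuple active at t = 1.0
print(profile_value(profile, 2.5))  # idle gap between the two steps
print(busy_time(profile))           # total time spent in atomic processes
```

Storing segments rather than sampled values keeps the two construction stages separate: the order of the segments encodes the sequence of actions, and the segment lengths encode the measured durations.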
Knowing the time profile of behavior of the system, it is possible to obtain evaluations of different performance types, as well as to derive the ratios, linking the main performance forms to various quantitative characteristics of the system's operation.
Let us have a distributed system CS that consists of Q serial actors. We introduce variables that characterize the operation of CS: T is the duration of the observation period of CS in astronomical time; J is the number of tasks fulfilled by CS within the observation period.
A task can be understood as the performance of an operation, a function, a step, an order, etc.
We will designate:
- $X = J/T$ is the throughput of system CS;
- $U_i$ is the amount of astronomical time that the i-th serial actor of CS spent fulfilling tasks from J;
- $W_i = U_i/T$ is the load of the i-th actor;
- $N_i$ is the total number of actions (an action is a component of a task) performed by the i-th serial actor within the period T;
- $Pw_i = U_i/N_i$ is the average operation time of the i-th serial actor (the actor's capacity, which depends on the performance of its hardware and on the process organization);
- $En_i = N_i/J$ is the average number of actions of the i-th actor per task from J.
In these designations, the following statement is true.
The throughput of the i-th serial actor, $i = 1 \ldots Q$, can be determined from the ratio

$$X = \frac{W_i}{Pw_i \cdot En_i}.$$

To verify this expression, it suffices to substitute the definitions:

$$Pw_i \cdot En_i = \frac{U_i}{N_i} \cdot \frac{N_i}{J} = \frac{U_i}{J}.$$

As a result,

$$\frac{W_i}{Pw_i \cdot En_i} = \frac{U_i}{T} \cdot \frac{J}{U_i} = \frac{J}{T} = X.$$

This ratio is similar to a conservation equation, which is often used in models of physical systems.
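The throughput relation can be checked numerically. The sketch below assumes the readings W_i = U_i/T, Pw_i = U_i/N_i and En_i = N_i/J suggested by the verbal definitions; the function names and the sample measurements are hypothetical.

```python
import math

def actor_throughput(U_i: float, N_i: int, J: int, T: float) -> float:
    """Throughput computed from one serial actor's load, capacity and actions per task."""
    W_i = U_i / T      # load: share of the period during which the actor was busy
    Pw_i = U_i / N_i   # capacity: average time per action
    En_i = N_i / J     # average number of actions per task
    return W_i / (Pw_i * En_i)

def throughput_at_full_load(Pw_i: float, En_i: float) -> float:
    """Special case W_i = 1 (the setting of Fig. 4): X = 1 / (Pw_i * En_i)."""
    return 1.0 / (Pw_i * En_i)

# Hypothetical measurements for one actor (illustrative values only).
T, J = 100.0, 20
U_i, N_i = 50.0, 200

X_direct = J / T
X_actor = actor_throughput(U_i, N_i, J, T)
print(X_direct, X_actor, math.isclose(X_direct, X_actor))

# Throughput falls as either the time per action or the actions per task grows.
for En_i in (5, 10, 20):
    row = [round(throughput_at_full_load(Pw, En_i), 3) for Pw in (0.1, 0.2, 0.4)]
    print(En_i, row)
```

Because U_i and N_i cancel, the same X is obtained for every actor of the system, which is what makes the relation read like a conservation law.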
An example of the dependence of throughput $X$ on capacity $Pw_i$ and on the average number of actions $En_i$ at $W_i = 1$ is shown in Fig. 4.
As we can see, an increase in the number of actions per task $En_i$ or a decrease in capacity (an increase in time $Pw_i$) leads to a decrease in the throughput of an actor. Let us also determine the ratio between the capacity $Pw_i$ of a serial actor, the number $N_i$ of actions it performs while solving a problem, and the ratio of computation time $U_i$ to exchange time $To_i$, where $T \ge To_i + U_i$. Here the service is the solution of one task. We will determine the minimum computation time that must be allocated for one exchange at a given throughput.
To do this, we represent $To_i = m \cdot Tt_i$, that is, we split the exchange time into $m$ parts. According to the introduced designations, it is necessary to determine the quantity $g$. Expressing $g$ from (2) gives the minimum computation time as a quantity that depends on the capacity of the actor and the number of actions performed, as well as on the ratio of computation time to exchange time.
Now the problem of splitting the tasks on distributed system CS can be stated as follows:
1. Given: $N$ is the total number of actions to be performed within time $T$.
2. Required: to distribute these actions between the $Q$ actors so that the required condition on the system's operation is met.
Knowing the time profile $G(t)$, it is possible to obtain the set of operational variables $W$, $N_i$, $J$ and $T$. Let us take the fulfillment of a process in the system as a task, and an atomic process of a serial actor as an action. These variables are then expressed through the function $\delta_i(t)$, which takes the value 1 at each moment when the value of $g_i(t)$ changes and equals 0 at all other moments of time.
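How the operational variables might be read off a time profile can be sketched as follows. This is an assumption-laden illustration rather than the paper's formulas: the profile is given as a list of change points (the moments where δ_i(t) = 1), an action is counted at every change into a busy state, and a task is counted as fulfilled each time the atomic process "stop" is reached. The name `operational_variables` and the event data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (moment of change of g_i(t), atomic process entered; None means the actor is idle)
Event = Tuple[float, Optional[str]]

def operational_variables(events: List[Event], T: float) -> Dict[str, float]:
    """Derive N_i, U_i, W_i, J and X for one serial actor from its change points."""
    N_i = 0      # number of actions: changes of g_i(t) into a busy state
    U_i = 0.0    # total busy time of the actor
    J = 0        # fulfilled tasks: entries into the "stop" atomic process
    for idx, (t, atom) in enumerate(events):
        # Each state lasts until the next change point, or until the end of the period.
        t_next = events[idx + 1][0] if idx + 1 < len(events) else T
        if atom is not None:
            N_i += 1
            U_i += t_next - t
            if atom == "stop":
                J += 1
    return {"N_i": N_i, "U_i": U_i, "W_i": U_i / T, "J": J, "X": J / T}

# Hypothetical change points over an observation period T = 16 time units.
events = [
    (0.0, "start"), (1.0, "request"), (3.0, "stop"), (4.0, None),
    (6.0, "start"), (7.0, "request"), (9.0, "stop"), (10.0, None),
]
v = operational_variables(events, T=16.0)
print(v)
```

Counting change points rather than integrating a sampled signal keeps the result independent of any sampling step.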
Another important kind of performance is the system's response time. Let N agents that receive requests be located on CS. As a task, we take the fulfillment of a request by the j-th agent ($j = 1 \ldots N$). Let us assume that N does not change within the observation period. We introduce the following designations: J is the number of requests fulfilled within the observation period T; J′ is the number of tasks that entered the system within the observation time; R is the average time of task fulfillment (the response time); Z is the average waiting time,

$$Z = \frac{1}{J} \sum_{k=1}^{N} \bigl(T - r(k)\bigr), \qquad R = \frac{1}{J} \sum_{k=1}^{N} r(k),$$

where $r(k)$ is the time of useful work of the k-th agent and $z(k) = T - r(k)$ is its downtime.
The relationship between the response time and the throughput of a system is given by the ratio

$$R = \frac{N}{X} - Z,$$

which can be proved as follows:

$$\sum_{k=1}^{N} z(k) + \sum_{k=1}^{N} r(k) = NT,$$

and dividing both sides by J gives $Z + R = NT/J = N/X$.
In the case when CS is a serial actor, the same relation can be written using the previously introduced designations, including the definition of throughput. An example of the dependence of response time R on X and on the ratio J′/J for N = 5 and Z = 2 is shown in Fig. 5. As we can see, as the throughput X of the system increases, the response time of the system decreases, reaching 0 at a certain ratio of throughput and the number of fulfilled tasks.

Fig. 5. Dependence of response time of the system on throughput and relative number of failed requests

Based on the two examined kinds of performance, we will determine the characteristics that describe the usage of resources (memory) in the system. Let the function $Stg(p, t)$ determine the number of units of memory engaged in process p at moment t. We will designate:

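$$m(p) = \frac{1}{J} \int_{0}^{T} Stg(p, t)\, dt,$$

where the number of agents N is taken as that present at the moment t = 0, that is, the dynamics of agents' functioning is not taken into account, and sums are taken over all p in the system. Then the amount of memory used in the system at moment t is equal to

$$m(t) = \sum_{p} Stg(p, t).$$

In these designations, the average amount of memory used in the system can be expressed as

$$M = \frac{1}{T} \int_{0}^{T} m(t)\, dt,$$

or, using the expression for $m(t)$,

$$M = \frac{1}{T} \sum_{p} \int_{0}^{T} Stg(p, t)\, dt = \frac{J}{T} \sum_{p} \frac{1}{J} \int_{0}^{T} Stg(p, t)\, dt = \frac{J}{T} \sum_{p} m(p).$$

Hence, we obtain

$$M = X \sum_{p} m(p).$$

Using the formula of response time and substituting the expression for X from (3) into it, we obtain a relation that links the average amount of memory used with the response time of the system.
The indicated ratios make it possible to give a quantitative assessment of the parameters of distributed information systems and to synthesize systems with assigned parameters of throughput and response time.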
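The dependence of response time on throughput can be illustrated with a short sketch. It assumes that the relation takes the form R = N/X - Z, the interactive response time law of operational analysis, which is consistent with defining R and Z as per-request averages of useful time and downtime; the dependence on the ratio J′/J shown in Fig. 5 is not modeled here. The function names are hypothetical, and N = 5, Z = 2 are the values quoted for Fig. 5.

```python
def response_time(N: int, X: float, Z: float) -> float:
    """Response time under the assumed relation R = N / X - Z."""
    if X <= 0:
        raise ValueError("throughput must be positive")
    return N / X - Z

def zero_response_throughput(N: int, Z: float) -> float:
    """Throughput at which R reaches 0 under the assumed relation: X = N / Z."""
    return N / Z

N, Z = 5, 2.0
for X in (0.5, 1.0, 2.0, 2.5):
    print(f"X = {X:.1f}  R = {response_time(N, X, Z):.2f}")

print("R reaches 0 at X =", zero_response_throughput(N, Z))
```

Under this assumed form, throughput values above N/Z would give a negative R, so N/Z acts as the upper bound on attainable throughput for a fixed waiting time Z.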

Discussion of results of application of the procedure for performance assessment of elements of the distributed information system
Applying time profiles to the performance analysis of a distributed information system makes it possible to assess a priori the parameters of distributed systems designed for deployment on heterogeneous hardware platforms. As is evident from the conducted research, the initial data for the computations are the times taken to perform elementary operations while the system fulfils its tasks. These indicators depend directly on the technical characteristics of the actors' hardware platforms and directly determine the performance parameters of the system as a whole. The accuracy with which the system's performance is determined depends on the accuracy with which the time for performing individual atomic operations is determined.
Knowing the time profile of the system, it is possible to obtain the key performance characteristics: throughput and response time. The obtained ratios allow us to estimate the minimum computation time required for one information exchange, subject to the limit on the total time for solving a problem. In addition, the analytical relationship between the technical performance of an actor, the number of actions it performs during program operation, and the times of exchange and computation was established. The proposed procedure is particularly useful at the early stages of DIS development, with the aim of optimal design of the system's software.
The created model is a theory that describes the dynamics of interaction between the processes of application programs and the operation of the physical environment of the execution platform. It reflects both the hierarchical structure of a computing system and the distributed nature of the logical and physical environment. In this case, the behavior of the process of program execution by a distributed actor does not depend on the operation time of serial actors, which are capable of functioning independently of one another.
The general problem of performance analysis is represented in mathematical form as the problem of constructing the time profile, the vector function that describes the time diagram of operation of a distributed computing system. In addition to evaluating the main types of performance (throughput and response time), the proposed approach also allows estimation of other effectiveness indicators: the minimum computation time per exchange at an assigned total computation time, the ratio of exchange time to computation time, etc.
The benefits of the present research include the simplicity and accessibility of the proposed approach to determining the performance of distributed systems based on the construction of time profiles. Constructing time diagrams requires only knowledge of the time to fulfil individual elementary operations; simple arithmetic operations then give the total operation time of an actor, which is related to the total operation time of the system. The presented ratios are particularly useful at the early design stages, as well as during verification of models of computing systems.
Calculating DIS hardware parameters from the known time taken by serial actors to fulfil elementary operations makes it possible to avoid major drawbacks inherent in the classic approaches to the construction and assessment of DIS performance.
Thus, the proposed model does not require the determination of prognostic factors, which is needed to implement the approach based on the calibration method [2]. In this case, the proposed model can be scaled to a large number of actors, limited only by the availability of primary information about operation parameters.
In addition, the proposed model agrees well with the paradigm of distributed computation of long-term problems with non-regular actors [3]. In this case, the variable parameter is the number of serial actors and their hierarchical subordination to the distributed actor.
Unlike the models of performance assessment in systems that operate by schedule [4, 5], the proposed model does not require a priori knowledge of the operation schedule or the determination of reference parameters of the process. However, the model still requires knowledge of the duration of the elementary operations that make up the task being fulfilled.
Comparison of the proposed approach with modeling by hierarchical colored Petri nets [7] indicates that there is no need for rather complex a priori calculations of the probability of loss of requests in the system. And, compared with modern distributed computing technologies (Cloud, Jungle and Fog Computing) [10], there is no need to collect a sufficient amount of statistics on the system's operation for its a posteriori assessment.
In addition, the proposed approach takes into account the individual hardware features of the elements of a system, their behavior in the operation process, and the internal communication capabilities of individual elements.
One of the challenging issues in applying the model of the operation process based on time profiles is the need for clear knowledge, for each of the actors, of:
- the sequence of actions performed;
- the time to perform the elementary operations that make up each action.
In addition, it is also necessary to know how tasks are redistributed (control transfer) between the serial actors that form a certain distributed actor.
With constant expansion of the range of tasks and increasing complexity of the procedures for coordinating them between actors, computational complexity grows exponentially, which will serve as a limiting factor for subsequent application of the model. A way out of this situation may be the generalization (aggregation) of separate standard units of actors and of the tasks they fulfil.
A direction for subsequent research within the proposed approach is the development of models and methods of operation of distributed systems and parallel computations that take into account the algorithmic and structural features of the physical and logical DIS environment. A priori knowledge of operation parameters will allow not only assessment of DIS performance but also the creation of conditions for developing protocols of joint operation of DIS elements, redistribution of tasks, optimization of control, etc.
One of the challenging issues that requires separate study is the influence of the technical parameters of a computing system on its performance parameters. In fact, architecture and hardware parameters (CPU, memory) can significantly alter the time profile of fulfillment of the same tasks. Therefore, these aspects require separate consideration.

Conclusions
1. We determined the structure and developed a description of the operation process of an information system designed for distributed fulfillment of tasks. It was shown that it is appropriate to present the operation process of a distributed information system as the interaction of the operation processes of software objects (agents) with operators and with the logical environment of the platform of existence in a certain physical environment of the system. The totality of platforms, operators, agents and the corresponding containers forms a serial actor. In turn, the totality of serial actors forms a distributed actor. The sequence of actions of an actor consists of separate steps, which are registered by the logical environment of the platform; this allows the time profile of its operation to be constructed.
2. Constructing time profiles for actors makes it possible to assess the main kinds of performance of a distributed information system. The task of constructing the time profile is fulfilled in two stages: the sequence of performed actions is determined for each serial actor, and the time required by the given actor to complete each action is determined. The time profile is constructed based on knowledge of the time taken to perform atomic operations.
3. It is proposed to use the throughput of a serial actor and the response time of the system as the key indicators by which the performance of a distributed information system is assessed. Based on these indicators, it is possible to determine the minimum computation time, the average task fulfillment time, and the average time of waiting for a request. Thus, a system of performance indicators of the information system is formed, which makes it possible to carry out both algorithmic and quantitative analysis of its operation processes.
4. Modeling of the distributed information system according to the key performance indicators supports the validity of the proposed approach and allows a clear assessment of the relationship between the key performance indicators. Thus, the study of the dependence of throughput on the capacity of an actor and the average number of actions per task indicates that an increase in the number of actions per task, or a decrease in capacity, leads to a decrease in the throughput of an actor. In turn, an increase in throughput leads to a decrease in the response time of the system.
A direction for further research in this area is the wide range of problems of theoretical substantiation of the parameters of operation processes of the elements of distributed information systems.

Fig. 4. Dependence of throughput of the system on capacity of an actor and average number of actions per task

In the formulas above, $r(k)$ is the time of useful work of the k-th agent, through which the average time of task fulfillment is expressed; $z(k) = T - r(k)$ is the downtime of the k-th agent; $t_k$ is the moment of process fulfillment, for example, addressing Stop in profile $g_i$; $\chi_p(t)$ is the characteristic function of the step of process p, i.e. $\chi_p(t) = 1$ if $g_i(t) = \langle p, s, q, a \rangle$, and $\chi_p(t) = 0$ otherwise.