DEVELOPMENT OF THE METHOD OF DISTANCES FOR PROCESSING EXPERT ESTIMATES IN INFORMATION SYSTEMS



Introduction
Many businesses and organizations today constantly face a competitive environment, which encourages them to look for new methods and tools to improve performance. This requires mechanisms for identifying drawbacks and insufficiently effective processes at enterprises.
The most common means include monitoring the operational quality of all divisions at an enterprise, as well as reporting [1, 2]. However, these time-tested tools cannot always reveal such hidden problems as customer dissatisfaction with the work of a certain division, suppliers' irritation with the speed of interaction, etc. This damages the image of the enterprise and, as a consequence, reduces its competitiveness.
In order to identify those problems that cannot be defined through the internal monitoring of structures' work at organizations and enterprises, other tools and methods are required, which necessitates new research and developments in this area.
One such approach is to conduct external and internal assessments of various objects in an enterprise's activity against different indicators. Such estimates require processing and analysis of the results to derive an ultimate estimate of sufficient accuracy. Organization and estimation techniques are of different types; therefore, the methods for processing their results differ as well and thus require separate study.
In most studies, the assessment of performance indicators is performed by a small number of experts, whose selection is a difficult and time-consuming task. In addition, many assessments are performed by comparing the estimated objects. However, at present, many enterprises require, for the self-estimation of indicators of their activities, a rapid and independent assessment not only by experts but also by customers, suppliers, employees, and other stakeholders.
One of the important tasks of today is the assessment of various performance indicators of enterprises using modern, rapid, and effective tools and methods for processing the results. One such tool is information technology, which makes it possible to perform an independent assessment. However, such an assessment at present mostly provides statistical reports with quantitative indicators, while reliability is not checked; there are almost no methods to check it. Therefore, a relevant task now is to develop and investigate techniques for processing expert estimations that would make it possible to evaluate the reliability of the results and bring them as close as possible to the true value at a predefined accuracy.

Literature review and problem statement
At present, analysis of processes or resources at enterprises often requires assessment in terms of different indicators: quality, efficiency, degree of utilization, and others. In most cases, such an assessment implies a point-based score that makes it possible to perform the mathematical processing of assessment results. An estimate can also be verbal (with a verbal description of the object's quality), in this case, however, each description can be matched against a certain score, thereby forming a point-based estimate again.
In general, assessment of indicators can be performed using the following tools:
- thorough examination and estimation of each indicator according to the results of monitoring or by employing selected experts; however, this technique requires considerable resources and time for monitoring or for expert selection and assessment [3];
- on-line polling; such an assessment is quite fast and cost-efficient, but its reliability remains an open issue.
Of special interest is the self-assessment of estimated objects by the people who use information systems, which makes it possible to conduct an independent, fast, and economical assessment.
As early as 2001, paper [4] reported research based on two estimation methods: self-assessment by employees and expert estimation of the influence of chemical substances on humans. The result was a shift in self-assessment, compared to expert estimation, of less than 17 per cent in 9 out of 10 studied groups composed of 369 employees and experts. The two methods considered and compared in [5] imply determining the proximity of assessment results in terms of reliability and the specified scores. This testifies to the expedience of performing self-assessment, especially when modern information technology is involved.
Another factor in the appropriateness of performing self-assessment is the complexity of establishing the reliability and objectivity of the estimates of experts, who must be found and engaged in the estimation process. Many studies [6, 7] address methods for selecting and analyzing experts and estimates, which is in most cases a challenging task. Papers [8-10] examined methods for determining the weight coefficients of experts based on their experience, which implies determining special coefficients for calibration and for an informational component, as well as methods for obtaining reliable estimates from experts. This task is important for problems that require considerable reliability and accuracy for predicting and making important decisions involving risk assessment.
However, many tasks do not need considerable accuracy of estimation but do require efficiency and objectivity. An example is the assessment of pupils' training quality at an educational establishment, which can be performed at any time based on selected indicators: modernity, imagery and accessibility of perception, combination with practice, and others. An educational establishment can carry out such a self-assessment and identify problematic issues related to training quality by specific teachers, courses, areas of study, and the educational institution in general. This will help improve operational decision making to enhance quality in the future and, after some time, to perform a repeated self-assessment to verify the appropriateness of the decisions made. Such tasks can be implemented using on-line estimates, which are independent and, therefore, quite objective. At the same time, such an assessment is very rapid, as it does not require a complex and long-term process of expert selection.
There are interesting studies on generating systems of "on-line" test questionnaires with external quality estimation, which makes it possible to constantly update them on demand [11]. At the same time, such an assessment requires statistical and other methods for processing its results, which have been used and perfected for many years [12] and which are important for determining the reliability of the results obtained.
A pairwise comparison of estimated objects is another assessment technique, used to determine the ranks of related objects based on a given indicator. The processing of results from such estimation methods has been addressed in many scientific works [13, 14]. However, these methods may face problems in resolving transitivity conflicts [15, 16]. In addition, such a technique is not suitable when each object is assessed individually, without comparison.
Given the above, it can be argued that many practical problems at present require an economically efficient, quick, and fairly accurate estimation of performance indicators for enterprises and institutions, both external and internal. For most of these tasks, maximum accuracy of estimation is not essential; more important are the speed and objectivity of the assessment. To implement these tasks, new tools and estimation methods need to be developed [17] to meet these requirements.
The new methods today include on-line assessment, which has been addressed in many studies on the construction of a system of an estimation scale [18], an analysis of the results obtained by using such an estimation [19], and others.
However, the unresolved issue is the maximum reliability of on-line estimation. A comprehensive approach is needed to come closer to solving this task. This necessitates developing, first, methods to identify experts that are unaware of an estimated subject, "random" and malicious, and, second, methods for the maximally reliable processing of estimation results at a predefined accuracy of results.
Both tasks are complex and insufficiently studied for practical application today. Thus, many works have addressed the selection of experts [7], the use of estimates for constructing models [6], and the monitoring of achievements based on documentary indicators and stages of performance [2], which is not relevant for on-line assessment. Therefore, exploring new methods for on-line assessment, and especially for processing its results, will make it possible to solve the problem of reliability; this is relevant at present for the practical application and implementation of new systems of control and management.

The aim and objectives of the study
The aim of this study is to develop a method of distances for processing the results of expert estimation of indicators, which will be particularly effective for the class of problems that permit a deviation in the accuracy of the result of up to 17 % or more and that can use automated information systems for estimation.
To accomplish the aim, the following tasks have been set:
- to theoretically substantiate the method of distances for processing the results of indicator assessment;
- to investigate the convergence rate of the iterative process in the method, based on the number of iterations;
- to perform a comparative analysis of the results of the method of distances, the method of square deviations, and the mean value, and to determine the mean efficiency of the method of distances.

Theoretical substantiation of the method of distances for processing results from the estimation of indicators
The development of information technologies increasingly produces new methods of on-line assessment, in which the reliability of experts cannot be estimated by examining their documents or the history of their achievements; only their answers and the estimates they provide are available. Therefore, an important task is the statistical processing and analysis of estimates to determine the reliability of experts' answers.
One of the common estimation techniques is to apply a point-based scale, the use of which in most cases is dominated by a cluster (density) of estimates over a small range. This circumstance has been exploited when developing the method of distances for a more accurate determination of the group estimate.
While developing a method of distances, we have chosen that the result from estimating a single object is the value that is the "closest" to the "cluster" of estimates. In addition, this method was supplemented with a condition for defining the competence of each expert engaged in estimation, on the basis of which we adjusted the result of estimation towards the "weightier" (competent) experts.
The concept of maximum closeness to the cluster of estimates is defined as the minimum distance of all estimates in totality to the average weighted estimate. Weight of each estimate is defined by the competence of the expert who produced it. Competence of experts can be determined by different techniques: by additional survey and/or the proximity of their estimate to the weighted average estimate.
The method of distances is based on the iterative calculation of the weighted average estimate, where weights are the calculated coefficients for experts' competence (CC). CC are based on the relative proximity (distance) of each estimate to the preliminary average weighted estimate.
Assessment can be performed on different scales, but for calculation the scale must be normalized to the range from 0 to 1. Assume that the assessment scale is in the range from A to B with step d. Then, for the new range from 0 to 1, the new step is d_1 = d/(B - A). The magnitude of each point for calculation then equals x_k = k·d_1, where k is the number of the point.
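As a minimal illustration, the normalization above can be sketched in code (the function name and structure are ours, not part of the method's original presentation):

```python
def normalize_scale(A, B, d):
    """Map a point-based scale [A, B] with step d onto [0, 1].

    The new step is d_1 = d / (B - A); point number k maps to x_k = k * d_1.
    Assumes (B - A) is an integer multiple of d.
    """
    d_1 = d / (B - A)
    n_steps = round((B - A) / d)
    return [k * d_1 for k in range(n_steps + 1)]

# Example: a 10-point scale from 1 to 10 with step 1 maps to {0, 1/9, 2/9, ..., 1}.
scale = normalize_scale(1, 10, 1)
```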
At the beginning of the calculation, the weights of all estimates are equal and serve as the competence coefficients of the experts who produced these estimates. Thus, if the number of estimates equals m, then all weights and CC at the beginning (iteration k = 0) are equal to:

q_0ji = 1/m. (1)

The calculation of the weighted average estimate is performed in several iterative steps. At every step, the weighted average estimate is computed first. If one denotes each estimate as x_ji, where j is the number of the object (j = 1, …, n) and i is the number of the expert (i = 1, …, m), then the weighted average estimate at the k-th iteration equals:

S_kj = Σ_{i=1..m} q_kji·x_ji. (2)

For the next step, one needs to calculate new CC (q_{k+1,ji}), which are determined by the proximity (distance) of each estimate to the calculated weighted average estimate and whose sum is equal to 1. Specifically, the relative distance of each estimate,

d_kji = |x_ji - S_kj| / Σ_{i=1..m} |x_ji - S_kj|,

is obtained by dividing its distance by the sum of all distances of the estimates of object j. The sum of all these values is equal to 1; the larger the value, the further the estimate is from the weighted average.
However, to determine q_{k+1,ji}, one needs an inverse value that is greater when the distance of an estimate to the weighted average is smaller. Such an inverse relative distance is calculated as 1 - d_kji. The sum of all these values (i = 1, …, m) for object j is equal to m - 1; however, for normalization, the sum of the experts' competence coefficients should be equal to 1. To this end, each derived value is divided by m - 1:

q_{k+1,ji} = (1 - d_kji)/(m - 1). (3)
The newly derived CC (q_{k+1,ji}) are employed in a new calculation of the weighted average estimate, after which the calculation of new CC is repeated at the next step.
The performed experimental calculations show a steady reduction in the difference between the new and preceding values of the weighted average estimate, S_{k+1} - S_k, which rapidly approaches 0. This is used to determine the final step: the iteration at which this difference becomes less than the specified error for the group estimate. The magnitude of the error is chosen depending on the assessment scale and the estimation pattern.
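The iterative procedure described above can be sketched as follows (a minimal Python sketch for a single object; the function name, stopping rule, and iteration cap are our assumptions):

```python
def method_of_distances(estimates, eps=1e-9, max_iter=100):
    """Iteratively compute the weighted average estimate for one object.

    estimates: normalized expert estimates in [0, 1], at least two of them.
    Competence coefficients (CC) start equal (1/m) and are recomputed at each
    iteration from the inverse relative distance of each estimate to the
    current weighted average; iteration stops when the weighted average
    changes by less than eps.
    """
    m = len(estimates)
    q = [1.0 / m] * m                                   # CC at iteration k = 0
    S = sum(qi * xi for qi, xi in zip(q, estimates))    # weighted average
    for _ in range(max_iter):
        total = sum(abs(xi - S) for xi in estimates)
        if total == 0:                                  # all estimates coincide
            break
        # relative distance of each estimate to the weighted average (sums to 1)
        d = [abs(xi - S) / total for xi in estimates]
        # inverse relative distances (1 - d) sum to m - 1; normalize to sum 1
        q = [(1.0 - di) / (m - 1) for di in d]
        S_new = sum(qi * xi for qi, xi in zip(q, estimates))
        if abs(S_new - S) < eps:                        # assigned error reached
            S = S_new
            break
        S = S_new
    return S

# Five estimates clustered near 0.7 with one outlier at 0.1: the result is
# pulled toward the cluster relative to the plain mean of 0.6.
result = method_of_distances([0.7, 0.7, 0.8, 0.7, 0.1])
```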

Results of experimental examination of calculations using the method of distances
The first test of the method is based on a limited number of experts with various scores, processed according to formulae (1) to (3).
Take, for 4 objects p_j (j = 1, …, n; n = 4), the estimates by 7 experts x_i (i = 1, …, m; m = 7) for each object. The number of estimates for each object is the same and equals m = 7. Thus, each expert's competence coefficient at the initial stage (iteration k = 0) is the same and equals q_0ji = 1/7 (i = 1, …, 7; j = 1, …, n). At the first step, we calculate the weighted average estimate S_0, which is the plain average estimate for each object. At each subsequent step, the following are calculated:
- the distances from the estimates to the weighted average for each object (d_ji) and their total value (sum_j);
- the new CC for each estimate and each object, q_kji (i = 1, …, 7; j = 1, …, 4);
- the new weighted average estimates (S_kj, where k is the number of the iteration);
- the deviation of the new weighted average estimate (S_kj) from the preceding one (S_{k-1,j}).
Fig. 2 shows the results of the second step: the calculation of the first iteration and of the first error (S_1 - S_0) in the weighted average estimate S_1.

Fig. 2. Results of iteration 1
At step 3 (Fig. 3), the deviation of the weighted average, |S_2 - S_1|, is reduced; this continues at each of iterations 3 to 5 for each object. The value at iteration 5 (Fig. 4) shows an error of the weighted average estimate (S_k) relative to the preceding one (S_{k-1}) of less than 10^-9, which means rapid convergence of the result. The completion of the process depends on the assigned calculation error: once the error is less than the assigned one, the iteration is considered the last.
The second variant to test the method is based on the "on-line" assessment, in which number m is very large and constantly increasing, so direct application of formulae (1) to (3) requires much time and resources for computation, which is not effective.
If one analyzes the "on-line" assessment, it can be noted that the number of possible estimate values is small and limited, so the previous method can be adjusted to take advantage of this.
Suppose that l is the number of possible estimate values (points); x_i (i = 1, …, l) are all the possible estimate values; a_ji is the number of estimates with value x_i for indicator j; j is the number of the indicator being estimated (j = 1, …, n); and k is the iteration level.
For example, if indicators are estimated on a 10-point scale from 0.1 to 1 with increment 0.1, then the number of possible estimates is l = 10, the estimate values are x_i = {0.1; 0.2; 0.3; …; 1}, and the count of each estimate value is defined by the survey itself and could equal, for example, a_1i = {0; 2; 30; 150; 1080; 560; 210; 50; 3; 1}, which gives the number of estimates of indicator 1 with the corresponding values.
In this case, formula (1) changes as follows: q_0ji = 1/M, where the actual number of estimates is M = Σ_{i=1..l} a_ji.
The weighted average estimate will change, accordingly, to:

S_kj = Σ_{i=1..l} a_ji·q_kji·x_i.

This expression shows that estimates with the same value are merged as a_ji·x_i. The next calculation of the competence coefficients q_{k+1,ji} is performed as in formula (3), with the relative distances computed using the counts a_ji:

d_kji = |x_i - S_kj| / Σ_{i=1..l} a_ji·|x_i - S_kj|,  q_{k+1,ji} = (1 - d_kji)/(M - 1).

We shall present experimental calculations for the second variant of application of the method of distances. To this end, we again selected 4 indicators, for which we performed calculations on a 10-point scale with estimates from 0.1 to 1, but the number of experts for each indicator exceeded a thousand (Fig. 5). As early as the first iteration, we obtain an error in the weighted average estimate, S_1 - S_0, of the order of 10^-5 to 10^-6, and at iteration 5 the error becomes almost zero (Fig. 7).
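A sketch of this second variant, where estimates arrive as counts per scale value so that the per-iteration cost depends on l rather than on the number of experts (the function name and stopping rule are our assumptions):

```python
def method_of_distances_counts(values, counts, eps=1e-9, max_iter=100):
    """Variant for 'on-line' assessment with estimates given as counts.

    values: the l possible scale points x_i; counts: how many experts chose
    each value (a_ji for one indicator j). Estimates with the same value are
    merged, so each iteration costs O(l) regardless of the number of experts.
    """
    M = sum(counts)                                   # total number of estimates
    S = sum(a * x for a, x in zip(counts, values)) / M
    for _ in range(max_iter):
        total = sum(a * abs(x - S) for a, x in zip(counts, values))
        if total == 0:
            break
        d = [abs(x - S) / total for x in values]      # relative distance per value
        q = [(1.0 - di) / (M - 1) for di in d]        # per-estimate CC
        S_new = sum(a * qi * x for a, qi, x in zip(counts, q, values))
        if abs(S_new - S) < eps:
            return S_new
        S = S_new
    return S

# The survey example above: a 10-point scale with over two thousand estimates.
values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
counts = [0, 2, 30, 150, 1080, 560, 210, 50, 3, 1]
result = method_of_distances_counts(values, counts)
```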
The third variant of testing the method is based on interrelated indicators, so that their estimation by a single expert is a series of weight coefficients of these indicators for some model, whose sum is equal to 1. For example, a first expert estimates all 4 indicators, based on their influence on a certain process, as, respectively, {0.1; 0.45; 0.25; 0.2}, so that 0.1+0.45+0.25+0.2=1. Such estimates can be produced, first, by a limited number of experts selected for their competence; second, they can be acquired from an "on-line" survey. In this case, each expert's competence coefficient can be estimated for each indicator separately and then averaged over all indicators for that expert. We present an algorithm for such a calculation for the interrelated indicators.
If experts' estimates are represented in array X, where X(j, i) ∈ (0..1) is the estimate by expert i = 1, …, m for indicator j = 1, …, n, then the variables and vectors used in the algorithm are denoted as follows:
- dif: the calculation accuracy, specified by the developer of the assessment system;
- sr(j): the average (at subsequent stages, the weighted average) of indicator j;
- sd(j): the difference between the new and the preceding weighted averages of indicator j;
- nSr(j): the new weighted average of indicator j;
- SdMax: the maximum value of sd(j) over all indicators;
- Sum(j): the total distance of all estimates to the weighted average of indicator j;
- qSr(i): the competence coefficient of expert i.
The calculation of the resulting weighted average estimate can be implemented by the following algorithm (Fig. 8).

Fig. 8. Calculation algorithm of the weighted average value for interrelated indicators
We present experimental calculations for the third variant with the interrelated indicators. For this purpose, we selected 4 parameters, p1 to p4, for which estimates were produced by 7 experts on a scale from 0.1 to 1, under the condition that the total estimation of all indicators by a single expert equals unity (Fig. 9).

Fig. 9. Estimates by 7 experts based on 4 interrelated indicators
At each step of the iteration, for each indicator, we calculated the weighted average (S_j) similarly to the previous tests, but a single CC for each expert was calculated as the average of that expert's per-indicator values (Fig. 10). When calculating each S_j, estimates are multiplied not by the CC for each indicator j but by their average value for each expert. Thus, if one denotes:
- x_ji (j = 1, …, n; i = 1, …, m): the estimate by expert i for indicator j;
- q_kji: the competence coefficient of expert i for indicator j at iteration k;
- q_ki: the average competence coefficient of expert i at iteration k;
- S_kj: the weighted average estimate of indicator j at iteration k;
- dS_kj: the error of the weighted average of indicator j at iteration k;
- dS_k: the error of the weighted average estimate at iteration k;
then, starting from q_0ji = 1/m, the weighted average estimate S_kj is calculated according to the formulae:

q_ki = (1/n) Σ_{j=1..n} q_kji,  S_kj = Σ_{i=1..m} q_ki·x_ji,

dS_kj = |S_kj - S_{k-1,j}|,  dS_k = max_j dS_kj,

where the per-indicator coefficients q_kji for k > 0 are computed from the relative distances to S_{k-1,j}, as in formula (3).
At iteration 5, we obtained an error in the weighted average value of the order of 10^-7 in the experiment (Fig. 11), which also shows the rapid convergence. For the "on-line" assessment of interrelated indicators, formulae (8) to (13) are also used; however, in this case one needs to employ the algorithmic capacities of information systems to process a large number of estimates.
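A sketch of the third variant under the notation above, with one averaged competence coefficient per expert (the function name and the handling of coinciding estimates are our assumptions):

```python
def method_of_distances_linked(X, eps=1e-9, max_iter=100):
    """Variant for interrelated indicators.

    X[j][i] is the estimate of indicator j by expert i; each expert's
    estimates sum to 1 across the n indicators. A single CC per expert is
    used: the average of that expert's per-indicator coefficients q_kji.
    """
    n, m = len(X), len(X[0])
    q_avg = [1.0 / m] * m                         # average CC per expert, k = 0
    S = [sum(q_avg[i] * X[j][i] for i in range(m)) for j in range(n)]
    for _ in range(max_iter):
        # per-indicator CC from inverse relative distances, as in variant 1
        q = [[0.0] * m for _ in range(n)]
        for j in range(n):
            total = sum(abs(X[j][i] - S[j]) for i in range(m))
            for i in range(m):
                if total == 0:                    # all estimates coincide
                    q[j][i] = 1.0 / m
                else:
                    d = abs(X[j][i] - S[j]) / total
                    q[j][i] = (1.0 - d) / (m - 1)
        # average each expert's coefficients over all indicators
        q_avg = [sum(q[j][i] for j in range(n)) / n for i in range(m)]
        S_new = [sum(q_avg[i] * X[j][i] for i in range(m)) for j in range(n)]
        dS = max(abs(a - b) for a, b in zip(S_new, S))  # dS_k
        S = S_new
        if dS < eps:
            break
    return S

# Two interrelated indicators estimated by three experts; each expert's column
# sums to 1, and so does the resulting vector of weighted averages.
S = method_of_distances_linked([[0.3, 0.4, 0.35], [0.7, 0.6, 0.65]])
```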

Discussion of results of studying the method of distances
Based on the results of the 3 variants of its application, the method of distances, which calculates the weighted estimate, demonstrated a result closer to the average estimate produced by the majority of experts than the method of square deviations based on formulae (14) to (16).
At the beginning of the calculation, the competence coefficients and the weighted average are computed in the same way (formulae (4), (5)), but the computation of the competence coefficients at subsequent steps is different: it is based on the square deviation from the weighted average, each relative squared deviation being converted into a CC in the same manner as in formula (3). We performed an experimental calculation by the method of square deviations according to formulae (14) to (16) on the same data as used for the method of distances. This calculation demonstrated a somewhat smaller convergence rate of S_{k+1} - S_k (Fig. 12).
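Since formulae (14) to (16) are not reproduced here, the following sketch encodes our reading of the comparison method: identical to the method of distances except that competence coefficients are derived from squared deviations rather than linear distances:

```python
def method_of_square_deviations(estimates, eps=1e-9, max_iter=100):
    """Comparison method: CC from squared deviations (our reconstruction).

    Identical in structure to the method of distances, except that each
    estimate's relative 'distance' is its squared deviation from the weighted
    average, divided by the sum of all squared deviations.
    """
    m = len(estimates)
    q = [1.0 / m] * m
    S = sum(qi * xi for qi, xi in zip(q, estimates))
    for _ in range(max_iter):
        total = sum((xi - S) ** 2 for xi in estimates)
        if total == 0:
            break
        d = [(xi - S) ** 2 / total for xi in estimates]   # sums to 1
        q = [(1.0 - di) / (m - 1) for di in d]            # sums to 1
        S_new = sum(qi * xi for qi, xi in zip(q, estimates))
        if abs(S_new - S) < eps:
            return S_new
        S = S_new
    return S

# On the same data, squared deviations punish the outlier harder, so the
# result can drift further from the plain mean than the method of distances.
result = method_of_square_deviations([0.7, 0.7, 0.8, 0.7, 0.1])
```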
One can also compare the results of the average and the weighted average estimates for both methods using the chart in Fig. 13. To this end, we denote the result of the method of distances as S1, the result of the method of square deviations as S2, and the average value as S. Fig. 13 shows that the estimate produced by the first method, S1, is closer to the average than that produced by the second method, S2.
The advantage of the method of distances is the possibility to rapidly implement it in modern information estimation systems, especially so when there is a large number of experts. The method also makes it possible to assign the desired precision in calculations.
A limitation of the method of distances is that it presumes a certain trust in the estimate of each expert. Therefore, processing the obtained estimates using the method of distances alone will not suffice. For greater reliability, additional research is needed into determining the credibility of experts at the survey stage. For this purpose, we plan to design and explore a system of additional questions in assessment systems for educational institutions that would establish additional factors of trust in each expert. Constructing such a system of additional questions would make it possible to discard those experts who are unfamiliar with, or have insufficient knowledge of, the object of estimation.
Using the method of distances is also limited by the condition for the possibility to reduce the estimation system to a point-based discrete normalized scale from 0 to 1.
Efficiency of the method of distances, when compared to methods for selecting experts for assessment, can be defined as the magnitude that is directly proportional to the percentage of reduction of costs per unit of permissible percentage of loss in the quality (accuracy) of an indicator.
If one denotes the time needed to study the documents and achievements of each expert for the purpose of selection for each indicator as t1, the time needed for estimation by each expert as t2, and the time needed for discussion of results and their subsequent re-evaluation as t3, and if the acceptable difference in the accuracy of estimates is taken as the 17 % determined in research [4], then the effectiveness of assessment can be determined as follows. According to the undertaken research into the assessment of indicators by experts at three educational establishments in the city of Odesa (Ukraine), t1 on average requires 2 hours, t2 0.05 hours, and t3 5 hours. When using information systems that involve "on-line" assessment, t1 = 0 and t3 = 0, so the difference in the cost of time (in proportion to the wages of highly paid experts) amounts to 7 hours per expert. If a deviation in accuracy of up to 17 per cent is acceptable for the assessment, the reduction in the cost of estimation equals (t1 + t3)/(t1 + t2 + t3)·100 = 99.3 per cent per 17 per cent of loss in accuracy, which gives an effectiveness of the new method of 99.3/17 ≈ 5.8.
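The arithmetic can be checked directly (a sketch of our reading: the 7-hour saving corresponds to eliminating t1 and t3, and efficiency is the cost reduction per unit of acceptable accuracy loss):

```python
# t1: studying each expert's documents; t2: estimation; t3: discussion and
# re-evaluation (average values, in hours, from the surveyed establishments).
t1, t2, t3 = 2.0, 0.05, 5.0
saved = t1 + t3                              # "on-line" assessment removes these
cost_reduction = saved / (t1 + t2 + t3) * 100
efficiency = cost_reduction / 17             # per 17 % acceptable accuracy loss
print(round(cost_reduction, 1), round(efficiency, 1))  # 99.3 5.8
```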
Our study will make it possible to use the method of distances in the practical activities of enterprises through the development of an appropriate information system for the expert estimation of indicators.
Owing to the proposed calculation of each expert's competence coefficients and the selection of the most competent experts when determining the resulting estimate, the method of distances automates the process of selecting experts. This significantly reduces the cost of selection compared with studying the documents and history of achievements of every expert, so the method can be used in information systems.
An information system should offer on-line assessment to all stakeholders as well as to selected experts, and run an automated analysis of the results using the method of distances. This would make it possible for an enterprise's top management to receive constant, operative information about quality, cost, or other performance indicators without spending large amounts of time and resources.

Conclusions
1. We have theoretically substantiated the iterative method of distances that determines the weighted average assessment of indicators based on expert estimates. At each iteration, one calculates each expert's competence coefficients, which define the weight of his/her estimate in the ultimate result.
Three variants of calculations based on the method of distances have been proposed:
- a small number of expert estimates for independent indicators;
- a large number of expert estimates for independent indicators;
- expert estimates for dependent indicators.
2. Our study of the rate of convergence of the iterative process for the three variants of expert estimation showed convergence by 1 to 4 orders of magnitude per iteration, which in most cases requires 2 to 5 iterations for an error of less than 1 per cent of the step of a point-based estimation scale. The resulting convergence within about 5 iterations makes it possible to conclude that the amount of computation for an information system is very small. Thus, for n experts, an information system performs at each iteration operations of the order of 2n^3, which for up to 100 experts would not exceed 10^7 operations over 5 calculation iterations. It is known that a processor clocked at 1 GHz (less than the frequency of modern processors) performs 10^9 operations per second, so processing the experts' estimates would take less than 1 second. For more than 100 experts, we have proposed a second variant of calculation, whose cost does not depend on the number of experts but on the number of values in the estimation scale, which in most assessments does not exceed 100. As regards the dependent indicators, if the number of experts equals m and the number of indicators equals n, then the number of computations at each iteration is n(2m(m-1)+3). For example, about 2 million operations for a single indicator and 1,000 experts would take less than 1 second.
3. A comparative analysis of the method of distances with the method of square deviations revealed almost the same convergence rate, but the method of distances yields an estimate closer to the average estimate given by the experts for each indicator.
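The operation counts in item 2 can be checked directly (a quick arithmetic sketch; the per-iteration costs 2n^3 and n(2m(m-1)+3) are taken from the text above):

```python
# Variant 1: n experts, 5 iterations, 2*n^3 operations per iteration.
n = 100
ops_variant_1 = 5 * 2 * n ** 3
print(ops_variant_1)                 # 10000000, i.e. 10^7 operations

# Dependent indicators: m experts, a single indicator (n = 1).
m = 1000
ops_linked = 1 * (2 * m * (m - 1) + 3)
print(ops_linked)                    # 1998003, about 2 million operations

# At 10^9 operations per second (1 GHz), both take well under 1 second.
```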
Thus, the problems of the complicated process of selecting experts and the time-consuming procedure of conducting estimation are solved by using the method of distances in modern information assessment systems, allowing the rapid, independent, and cost-efficient assessment of various performance indicators at enterprises.