A COMPARATIVE ANALYSIS OF RESULTS OF THE GROUP EXPERT ASSESSMENT OF METROLOGICAL ASSURANCE OF MEASUREMENTS



Introduction
To obtain reliable results from a group expert assessment, it is first necessary to select the method and the experts involved in the study correctly. The generalized opinion of the experts is obtained by applying methods of mathematical statistics to the collected expert data and by verifying their consistency. The simplest methods of expert evaluation are mainly used as components of more complex methods for evaluating complex systems. In the practice of expert evaluation, a questionnaire is used most often [1].
The expert data obtained should be checked for consistency. If the data are inconsistent, they must be analyzed in order either to reject them or to harmonize them further by specifying the criteria or indicators used.
To obtain reliable estimates from a group expert assessment in any field, it is necessary to select the most suitable method. Since these methods are based on different algorithms for implementing expert evaluation, the chosen method needs to be adapted to the needs of a particular field [1]. The authors applied the same methods in research on increasing the efficiency of complex systems and on the competence evaluation of experts in the sphere of technical regulation [1, 2]. The relevance of the work is confirmed by the growing use of expert methods for solving problems and making decisions for complex organizational and technical systems. In modern science and technology, these systems are characterized by both quantitative and qualitative indicators. Increasing their effectiveness is predominantly a process of finding the best solution under given conditions, limitations and criteria. Therefore, the urgent issue is the development of improved methods of group expert assessment, taking into account the competence of the experts involved, as well as the development of software tools on the basis of improved methods in order

Literature review and problem statement
A pressing task today is the adaptation of the reformed system of technical regulation in Ukraine to European requirements and the involvement of highly skilled professionals in this area. To identify problematic questions in the modern field of technical regulation, it is expedient to use a group expert assessment with the involvement of highly skilled specialists.
A thorough analysis of the appropriate methods for the competence assessment of experts in standardization and metrological assurance (MA) was the subject of previous studies [1, 2]. The most common expert methods are quite simple and have the following features and shortcomings:
- the method of ranking [3] and its modifications [4-6] do not provide sufficient accuracy when ranking more than 15-20 objects;
- the method of direct evaluation [3] and its modifications [4, 7] cannot be used when an expert has incomplete knowledge of the investigated properties of an object;
- the method of comparison, which includes two varieties:
- the method of successive comparison [6] is the most labour-intensive and complex;
- the method of pairwise comparison [6] and its modifications [8-10] are quite simple in comparison with other methods, are characterized by a high level of reliability of the estimation results, and allow a large number of objects to be investigated with great accuracy;
- the method of competence evaluation of experts on the basis of the fuzzy set theory [11] has the drawback of divergences between the finite set of competences that characterize the states of an object and the characteristics suggested by a certain expert. This narrows the application scope of this approach.
Methods of scenario analysis are equally well known:
- analysis of the root cause, scenario, influence on activity, and cause-and-effect relations do not provide for numerical estimations [7];
- the basic method of analytic hierarchy process [12, 13] and its modifications [6, 8] are applicable only to a small number of alternatives and do not make it possible to combine different opinions of expert groups.
For the research of complex objects or systems, it is expedient to use the analytic hierarchy process (AHP). AHP is a mathematical tool for a systems approach to complex decision-making problems. This method became widely used owing to the works [13, 14], which revealed the possibilities of the procedure more fully; since then, AHP has been actively developed and widely applied in practice.
The next task is to conduct a group expert assessment of MA for different types of measurements with the involvement of experts in metrology.
Expert groups (working groups) are formed to organize group expert assessments, which include collecting opinions, processing materials and analyzing the evaluation results. Before a group expert assessment is organized, the main directions of development of the object or field of research are specified, and a matrix (a table) that reflects the general goal, sub-goals and means of achievement is constructed.
The known forms of collecting opinions are individual, collective and mixed, each of which has a number of varieties, advantages and disadvantages [3-8, 15]:
- questionnaire (often involving interviewing and discussion): allows the experts' opinions to be collected with less effort, but is more time-consuming;
- interviewing: provides an opportunity to determine the degree of openness of the respondent and to get information directly, quickly and completely. The disadvantage of the method is a greater need for time and resources than for questionnaires;
- discussion: an effective method of debating the issue under study; it involves a collective discussion of the problem, during which the truth is found, and requires the comprehensive readiness of each participant. The disadvantage of the method is a time constraint (it cannot last more than 3 hours);
- brainstorming: works most effectively in groups and in collective work, is easy to understand, does not require special training of participants, and makes it possible to quickly "generate" new ideas. The disadvantages of the method are the complexity of organizing the group and the fact that it does not always produce strategically correct solutions; the method is unsuitable for solving complex problems that require special knowledge about the object of research or technical training;
- meeting: held in a small circle of competent and highly skilled specialists according to the scheme "report - question - debate - decision"; it makes it possible to analyze problems and to find ways and methods of solving them. The disadvantage of the method is the restrictions on time and on the number of participants involved;
- business game: used as a method of active training of participants in order to develop their decision-making skills in non-standard situations, as well as a means of testing abilities.
In many cases, several of these varieties are used together, which often gives a greater effect and objectivity. This approach is used when the problem is somewhat ambiguous, when individual opinions diverge, or when experts discuss problems collectively. However, the questionnaire (interviewing) is used most often in the practice of group expert assessment. Many papers on the application of expert methods [16-18] describe the use of specially designed software tools (software). The software makes it possible to significantly increase the productivity of the applied methods of group expert assessment and to eliminate errors in calculating the results.

The aim and objectives of the study
The conducted studies aimed to develop an effective method and software for conducting a group expert assessment of MA for different types of measurements, which would make it possible, if necessary, to change the evaluation criteria used.
To achieve the aim, the following objectives were set:
- to select and improve the most appropriate method of group expert assessment for certain tasks in the field of MA;
- to carry out a group expert assessment of MA for different types of measurements with the involvement of experts in metrology and to conduct a comparative analysis of the results obtained;
- to develop improved software tools to increase the productivity of the method of group expert assessment and to eliminate errors in calculating the results.

Materials and methods of research for development of improved methods for group expert assessment
In the practice of expert evaluation, a questionnaire (interviewing) is used. To do this, a survey questionnaire is developed. The survey questionnaire can be developed in the form of tables, but its content must be determined by the specifics of the object or field under study. At the same time, questions must be drawn up according to a certain structural and hierarchical scheme: from general questions to specific ones, and from complex to simple. When conducting a survey of experts, it is necessary to ensure an unambiguous understanding of individual questions, as well as the independence of expert judgments. Next, the data obtained from the group expert assessment are processed to characterize the generalized opinion and the degree of consistency of individual expert assessments. Processed expert data are a source of material for the synthesis of predictive hypotheses and development scenarios for the object or sphere under study.
The method of group expert assessment that is to be improved is described in [1], and one example of its implementation is given in [16].
To implement the improved method, the following are calculated:
- the average score $\bar{x}_i$ for each of the N analyzed questions, taking into account the competence coefficient (CC) of each of the M experts involved in the evaluation;
- the reference value of the expert evaluation $x_{ref}$ for all of the problematic questions, as a simple average over all the estimated questions (in numerical scores);
- the degree of deviation of the estimated average scores $\bar{x}_i$ from the reference value $x_{ref}$ for each of the defined questions (in numerical scores).
Based on the obtained degrees of deviation of the estimated average scores from the reference value, the obtained values are ranked in descending order ($D_i$).
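The calculation steps listed above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the text names the quantities but not their exact formulas, so the weighting scheme (a competence-weighted mean normalized by the sum of the CCs), the signed form of the deviation, and all function names and data are assumptions.

```python
from typing import List, Sequence


def weighted_average_scores(scores: Sequence[Sequence[float]],
                            competence: Sequence[float]) -> List[float]:
    """Competence-weighted mean score per question.

    scores[j][i] is the score given by expert j to question i;
    competence[j] is the competence coefficient (CC) of expert j.
    Assumption: weights are normalized by the sum of the CCs.
    """
    total_cc = sum(competence)
    n_questions = len(scores[0])
    return [
        sum(cc * row[i] for cc, row in zip(competence, scores)) / total_cc
        for i in range(n_questions)
    ]


def reference_value(averages: Sequence[float]) -> float:
    """Simple average over all questions, as described in the text."""
    return sum(averages) / len(averages)


def deviations(averages: Sequence[float], x_ref: float) -> List[float]:
    """Assumption: deviation is the signed difference from the reference."""
    return [x - x_ref for x in averages]


def rank_descending(values: Sequence[float]) -> List[int]:
    """Question indices ordered from the largest value to the smallest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical data: 3 experts (rows) scoring 3 questions (columns).
scores = [[7, 5, 9], [6, 4, 8], [9, 5, 7]]
competence = [1.0, 0.5, 0.5]

averages = weighted_average_scores(scores, competence)
x_ref = reference_value(averages)
order = rank_descending(deviations(averages, x_ref))
print(averages, x_ref, order)
```

Passing equal coefficients (for example `[1, 1, 1]`) gives the averages without taking competence into account, which is the comparison the paper draws between the two variants.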
Next, the indicators that characterize the consistency of the data obtained on the sub-questions are calculated:
- Kendall's coefficient of concordance W, taking into account tied (connected) ranks, defined by the formula given in [1, 16], where S is the sum of the squares of deviations from the average; $T_i$ is the total number of identical ranks for the i-th expert over all the sub-questions considered; $t_q$ is the number of identical ranks in the q-th group for the i-th expert; Q is the number of groups of identical ranks for the i-th expert; q is the index of a group of identical ranks for the i-th expert;
- Pearson's chi-squared test with respect to tied ranks.
The obtained value of Kendall's coefficient of concordance W is analyzed, and a conclusion on the degree of consistency of the data is made in accordance with the Margolin scale [1, 16]. If necessary, the score values for certain sub-questions under study are adjusted.
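For orientation, the standard textbook form of the tie-corrected coefficient of concordance and of the associated chi-squared statistic is given below. It is consistent with the symbols S, $T_i$, $t_q$ and Q used above, but it is stated here as an assumption rather than as the authors' exact expressions; the rank sum $R_k$ of the k-th sub-question and its mean $\bar{R}$ are auxiliary symbols introduced only for this illustration.

```latex
W = \frac{12\,S}{M^{2}\left(N^{3}-N\right) - M\sum_{i=1}^{M} T_i},
\qquad
S = \sum_{k=1}^{N}\left(R_k - \bar{R}\right)^{2},
\qquad
T_i = \sum_{q=1}^{Q}\left(t_q^{3} - t_q\right),

\chi^{2} = M\,(N-1)\,W
         = \frac{12\,S}{M N (N+1) - \dfrac{1}{N-1}\sum_{i=1}^{M} T_i}.
```

Without tied ranks, every $T_i$ is zero and W reduces to the usual $12S / \left(M^{2}(N^{3}-N)\right)$.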
The value obtained for Pearson's chi-squared test is compared with the tabulated critical value $\chi^2_T(0.05;\ M-1)$ for the significance level of 0.05. If the value of Pearson's chi-squared test is less than the tabulated critical value, correction of the numerical scores for certain sub-questions under study should be considered.
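The consistency check can be sketched as follows. The code uses the standard tie-corrected Kendall's W and the statistic chi-squared = M(N - 1)W, which are assumed here to match the formulas of [1, 16]; the function names and the sample data are hypothetical. Comparing the statistic with a tabulated critical value is left to the reader, since the table itself is not part of this sketch.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def tie_term(ranks: Sequence[float]) -> float:
    """T_i: sum of (t^3 - t) over the groups of identical ranks of one expert."""
    return sum(t ** 3 - t for t in Counter(ranks).values())


def kendall_w(scores: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (W, chi-squared) for scores[expert][sub_question]."""
    m, n = len(scores), len(scores[0])
    rank_rows = [average_ranks(row) for row in scores]
    rank_sums = [sum(row[k] for row in rank_rows) for k in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    ties = sum(tie_term(row) for row in rank_rows)
    w = 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)
    return w, m * (n - 1) * w


# Hypothetical scores: three experts in full agreement on three sub-questions.
w_full, chi2_full = kendall_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
# Hypothetical scores with one tie in the first expert's data.
w_tied, chi2_tied = kendall_w([[1, 1, 2], [1, 2, 3]])
print(w_full, chi2_full, w_tied, chi2_tied)
```

The sketch does not guard against the degenerate case in which every expert gives identical scores to all sub-questions, where the denominator becomes zero.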
The final stage is:
- formation of a list of questions for further detailed consideration;
- formation of a list of rejected questions for further consideration;
- presentation of the results on a special chart (petal diagram or histogram) with the reference value of the expert assessment plotted on it.
When a histogram with the final results is formed, the Pareto principle is applied and the Lorenz curve is plotted [1].
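One possible way to apply a Pareto cutoff together with a cumulative (Lorenz-type) curve is sketched below. The text does not specify which quantity the cutoff is applied to or how the boundary item is treated, so the rule used here (keep items, in descending order, until the cumulative share first reaches 80 %) is an assumption, as are the function name and the data.

```python
from typing import Dict, List, Tuple


def pareto_cut(values: Dict[str, float],
               share: float = 0.8) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Split items by cumulative share of the total.

    Returns the names selected before the cumulative share reaches `share`
    (including the item that reaches it) and the cumulative curve as
    (name, cumulative share) pairs in descending order of value.
    """
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(v for _, v in ranked)
    cumulative = 0.0
    reached = False
    selected: List[str] = []
    curve: List[Tuple[str, float]] = []
    for name, value in ranked:
        cumulative += value
        curve.append((name, cumulative / total))
        if not reached:
            selected.append(name)
        if cumulative / total >= share:
            reached = True
    return selected, curve


# Hypothetical non-negative values for four sub-questions.
selected, curve = pareto_cut({"a": 50, "b": 30, "c": 15, "d": 5})
print(selected)
print(curve)
```

The `curve` list holds the points that would be drawn over the histogram columns, and `selected` corresponds to the highlighted columns.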
The algorithm for implementing the improved method of group expert assessment, taking into account the competence of experts, is shown in Fig. 1. The above algorithm can be easily implemented with the use of widespread mathematical software packages (for example, Microsoft Excel, USA).
Special expert assessment methods taking into account the competence of the experts involved are developed to increase the reliability of the group expert assessment and to significantly reduce the time spent on conducting it.
The software makes it possible:
- to select the highest-priority issues in certain fields among those that are defined for consideration by the expert group;
- to reject the questions that are not of primary or any importance for further analysis.
The data obtained through the questionnaire were processed using the Microsoft Excel (USA) universal software. The evaluation results are shown in Figs. 2 and 3.
The column chart with the sub-questions under consideration ranked in descending order of their significance is shown in Fig. 2. The x-axis shows the sub-questions, and the y-axis shows the estimated average scores $\bar{x}_i$ for the sub-questions. The chart uses the Pareto principle, or the "20/80" principle, which generally means that 20 % of the effort gives 80 % of the result (yellow columns), and the remaining 80 % of the effort gives only 20 % of the result (blue columns).
The analysis of the results of evaluating the state of MA for measurements of time and frequency by the improved method shows that 12 sub-questions (32 %) are prioritized for further detailed analysis in order to make the necessary decisions (yellow columns), and 26 sub-questions (68 %) have no primary or any significance for further analysis.
The primary sub-questions for further in-depth study for measurements of time and frequency are (in order of importance):
- the number of specialists conducting or participating in the testing (experience of testing) of measuring instruments (MI) (X1_2);
- availability of mobile laboratories equipped with working standards, MI and equipment at the enterprise (X4_8);
- availability of methods (methodologies) that require development or revision (X3_5);
- the total number of specialists engaged in metrology works (X1_1);
- availability of verification schemes (methods, standards) at workplaces (X3_6);
- the use of calibration methods (methodologies) of MI (X6_4);
- estimation of the suitability of the software for automated collection and processing of data obtained during the verification (calibration) of MI (X6_6);
- the use of the form of verification procedures (X5_3);
- the use of the methods (standards) of verification (verification procedures) of MI (X3_2);
- authorization or accreditation of the enterprise for metrological works (X5_1);
- the state of estimation of uncertainty in the calibration of MI (X6_5);
- provision of working standards, MI and equipment with repair and maintenance (X4_7).
As a result of the expert evaluation in general, one can note the positive state of MA for measurements of time and frequency.
The application of the improved method did not reveal new priority issues.

Fig. 3. Diagram for average scores for questions with/without taking into account the competence of experts
The petal diagram for the average scores of expert assessments for questions with and without taking into account the competence of experts is shown in Fig. 3. Red dotted lines mark reference values. The green line marks the average for the questions taking into account the competence of experts, and the blue line marks the average for the questions without taking into account the competence of experts. The results obtained show a small variation of the average values, indicating the presence of balance. Taking the competence of experts into account shifted the data but did not affect the final result of the evaluation in general, because the assessments of the questions under consideration were rather homogeneous.

Improved software tools for group expert assessment
In [17, 18], it was noted that the application of software tools has certain peculiarities, which stem from the methods underlying them.
The block diagram of the proposed special software tools for conducting a group expert assessment is shown in Fig. 4, where:
1 - a module for specifying a set of competence criteria for a technical expert and their numerical (score) values;
2 - a module for entering objective data on the criteria for the set of M experts whose competence is compared;
3 - a module for calculating the average scores, the average relative and average normalized scores for each expert's data, the total average score, the total relative average score, and the average normalized score for all M technical experts according to all criteria for the competence evaluation of experts (CEE);
4 - a module for calculating Kendall's coefficient of concordance taking into account tied ranks, and obtaining a conclusion on the established degree of consistency of the data;
5 - a module for verifying the average normalized value for each expert for compliance with Pearson's chi-squared test, taking into account tied ranks at a significance level of 0.05;
6 - a module holding the set of critical values of χ² for the significance level of 0.05;
7 - a module for ranking the obtained normalized averages for experts (competence values) in descending order, rejecting the normalized averages for experts that do not satisfy the Pareto principle, and displaying the results on a special chart;
8 - a module for forming a group of experts, based on the results obtained on the consistency of the data, for conducting the expert research of certain objects in the chosen field of activity;
9 - a module for specifying the set of N problematic questions that need to be analyzed and the numerical (score) values of the questions;
10 - a module for inputting the questionnaire data obtained from experts according to the set list of N problematic questions that need to be analyzed;
11 - a module for calculating the average scores for each of the identified questions with/without taking into account the CC of each of the M experts involved in the assessment, the reference value of the expert assessment for all problematic questions, and the degree of deviation of the assessed average scores from the reference value for each of the identified questions;
12 - a module for calculating Kendall's coefficient of concordance taking into account tied ranks, and obtaining a conclusion on the established degree of consistency of the data;
13 - a module for checking the average scores for all sub-questions according to all experts' estimates for compliance with Pearson's chi-squared test, taking into account tied ranks at a significance level of 0.05;
14 - a module for ranking numerical (score) values for sub-questions in descending order ($D_i$), applying the Pareto principle and displaying the results on a special chart;
15 - a module for the final formulation and presentation of the list of considered sub-questions on a special chart for further detailed consideration.
Fig. 4. Proposed software tools for group expert assessment
Module 1 specifies the set and numerical (score) values of the CEE criteria for the study of a particular object in the chosen field of activity. In module 2, objective data are entered according to the established CEE criteria for a set of experts whose competence is assessed (compared).
Module 3 calculates the average scores, the average relative and average normalized scores for the data of each technical expert, the total average score, the total relative average score, and the average normalized score for all M technical experts according to all CEE criteria. Module 4 calculates Kendall's coefficient of concordance, taking into account tied ranks, and produces a conclusion on the degree of consistency of the data. Module 5 checks the average normalized values for each expert for consistency with Pearson's chi-squared test, taking into account tied ranks, by comparison with the critical values of χ² for the significance level of 0.05. Module 6 contains the sets of critical values of χ² for a significance level of 0.05.
Module 7 ranks the obtained normalized averages for experts (competence values) in descending order, rejects the normalized averages for experts that do not satisfy the Pareto principle, and displays the results on a special chart. Module 8 performs the final formation of a group of experts, taking into account the results obtained on the consistency of the data, for conducting the expert research of certain objects in the chosen field of activity.
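The calculations assigned to modules 3 and 7 can be illustrated with a short sketch. The text does not define the relative and normalized scores, so the definitions used here are assumptions made only for illustration: the relative score is the expert's average divided by the maximum possible score, and the normalized score is the expert's average divided by the sum of all experts' averages. The function name and the data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def competence_summary(
    criteria_scores: Dict[str, Sequence[float]],
    max_score: float,
) -> Tuple[Dict[str, float], Dict[str, float], Dict[str, float], List[str]]:
    """Per-expert average, relative and normalized scores, plus a ranking.

    criteria_scores maps an expert label to that expert's scores on the
    competence criteria. The relative and normalized definitions are
    assumptions of this sketch, not taken from the source.
    """
    averages = {e: sum(v) / len(v) for e, v in criteria_scores.items()}
    relative = {e: a / max_score for e, a in averages.items()}
    total = sum(averages.values())
    normalized = {e: a / total for e, a in averages.items()}
    # Experts ordered from the highest normalized score to the lowest.
    ranking = sorted(normalized, key=normalized.get, reverse=True)
    return averages, relative, normalized, ranking


# Hypothetical scores of three experts on three competence criteria.
averages, relative, normalized, ranking = competence_summary(
    {"E1": [8, 6, 10], "E2": [4, 6, 5], "E3": [7, 7, 7]},
    max_score=10,
)
print(averages, relative, normalized, ranking)
```

The resulting ranking is the input to the Pareto-based rejection step of module 7 and to the formation of the expert group in module 8.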
Module 9 specifies the set and numerical (score) values of the N problematic questions that need to be analyzed. Module 10 handles the input of the questionnaire data obtained from the experts according to the set list of N problematic questions that need to be analyzed.
Module 11 calculates the average scores for each of the identified questions with/without taking into account the CC of each of the M experts involved in the assessment, the reference value of the expert evaluation for all problematic questions as a simple average, and the degree of deviation of the estimated average scores from the reference value for each of the identified questions.
Module 12 calculates Kendall's coefficient of concordance, taking into account tied ranks, and produces a conclusion regarding the established degree of consistency of the data. Module 13 checks the average scores for all questions according to all experts' estimates for compliance with Pearson's chi-squared test, taking into account tied ranks at a significance level of 0.05.
Module 14 ranks the obtained score values for sub-questions in descending order ($D_i$), applies the Pareto principle and displays the results on a special chart. Module 15 carries out the final formation and presentation of the list of the considered sub-questions on a special chart for further detailed consideration.
The considered software tools can be implemented with the use of any modern personal computer, and calculations by the algorithm shown in Fig. 1 can be implemented with the use of, in particular, the Microsoft Excel program (USA).
The application of the improved method and software tools for group expert assessment is designed to eliminate the restrictions on the number of experts and the number of questions (factors) that need to be analyzed. The approach is based on integrating the competence assessment of the experts involved into the software tools for group expert assessment. This provides a flexible mechanism for adjusting, if necessary, the applied evaluation criteria, which increases the accuracy and performance of such an assessment.
The characteristics obtained for a particular question by the improved evaluation method and software tools allow a better-grounded selection of the highest-priority problematic questions in a particular field of activity. This increases the reliability of the expert assessments on problematic questions.

Discussion of the results of the state of metrological assurance of measurements
The results of the comparison of average values by the questions are shown in Table 1. In Table 1, the relative score is the ratio of the score for a specific type of measurement to the maximum score (9 points), and the reference value is the arithmetic mean over all the questions. The analysis of the results (Table 1) shows an average level of the MA state for all types of measurements (relative scores from 0.52 to 0.66). In general, the state of MA of measurements can be estimated conditionally by the levels (gradations) given in Table 2. The averages for the questions (Fig. 5) were also estimated for all types of measurements. The red dotted line marks the reference value. The blue line marks the average value for all the questions. The obtained results show a small variation of the averages, from 4.70 to 5.98, indicating the presence of balance.
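The relative score can be checked with simple arithmetic. The sketch below assumes that the extreme averages 4.70 and 5.98 are the values to which the relative scores 0.52 and 0.66 correspond; the function name is hypothetical.

```python
MAX_SCORE = 9  # maximum score on the scale used in Table 1


def relative_score(average: float, max_score: float = MAX_SCORE) -> float:
    """Average score expressed as a fraction of the maximum score."""
    return average / max_score


# Extremes of the averages reported in the text.
low = round(relative_score(4.70), 2)
high = round(relative_score(5.98), 2)
print(low, high)
```

Under that assumption, the two ranges quoted in the text agree with each other.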
The values of the Kendall's coefficient of concordance W and the Pearson's chi-squared test for all types of measurements are shown in Table 3.
The analysis of the obtained results (Table 3) shows that:
- the value of Kendall's coefficient of concordance ranges from 0.34 to 0.46 (poor consistency on the Margolin scale);
- all obtained values of Pearson's chi-squared test meet the set requirements for the significance level of 0.05.

Conclusions
1. The most effective methods of group expert assessment that are suitable for assessing the state of metrological assurance of measurements are considered. An improved method of group expert assessment is proposed; it takes into account the competence of the experts involved on the basis of previously established criteria and has been tested on many objects in the metrology field, in particular for determining the state of metrological assurance of measurements. It is found that an important element of the practical application of the methods of group expert assessment is the mandatory verification of the consistency of the expert estimates obtained. The most suitable means for this purpose is to verify the consistency of expert data by applying Kendall's coefficient of concordance and Pearson's chi-squared test.
2. A group expert assessment of metrological assurance for various types of measurements involving experts in metrology is carried out. The results are processed using the universal software Microsoft Excel (USA). The analysis of the results showed the priority of 32 % of the sub-questions. No new prioritized questions were identified in comparison with the previously developed method, which was improved. A comparative analysis of the results showed an average level of the state of MA (from 0.52 to 0.66) for all types of measurements. The ratio of the averages for the questions for all types of measurement is estimated. The results showed a small variation of the averages, from 4.70 to 5.98, indicating a good balance.
3. The developed block diagram of the special software tools is based on integrating the competence assessment of the experts involved, has no restrictions on the number of experts and the number of questions (factors) that need to be analyzed, and also makes it possible, if necessary, to adjust the applied evaluation criteria. This contributes not only to the implementation of the improved method of group expert evaluation and to the elimination of errors in calculating the results, but also to the improvement of the accuracy and performance of such an assessment.

Fig. 2. The results of the evaluation by the improved method on the sub-questions taking into account the competence of the experts

Fig. 1. The algorithm for implementing the improved method of group expert assessment taking into account the competence of experts

Fig. 5. Average for the questions for all types of measurements

Table 1
Comparison of the results obtained by the questions

Table 2
Levels (gradations) of the state of metrological assurance and their evaluation

Table 3
The values of the Kendall's coefficient of concordance W and the Pearson's chi-squared test