CONTROL OVER GRAPE YIELD IN THE NORTHEASTERN REGION OF UKRAINE USING MATHEMATICAL MODELING



Introduction
One of the problems in viticulture is obtaining a harvest sufficient to supply the population. According to the scientific literature [1, 2], 80 % of successfully cultivated agricultural crops depend on the basic climatic and agrochemical conditions of cultivation. Furthermore, the timing of germination, flowering and ripening of plants is very important for estimating the harvest of food plants. Such estimation is not possible without taking these requirements into account and determining the factor that most accurately expresses the biomass harvest. Measures to regulate the productivity of fruit and berry crops must be managed in accordance with criteria that match the purposes of control.
Forecast estimates of productivity, as a criterion of control over the grape harvest, are traditionally of great practical interest both to producer countries and to potential consumers. At the same time, the accuracy of productivity forecasts has remained insufficient until now, especially for crops whose cultivation is not typical of regions with a high risk of spring and autumn frosts and an unstable amount of seasonal precipitation. To properly solve the problem of control over harvest, it is necessary to investigate the climatic processes that take place and their impact on the reduction in grape productivity under the conditions of the Northeastern region of Ukraine.
Practical work on grape cultivation in the Northeastern forest-steppe area of Ukraine is accompanied by instrumental observations of climatic parameters, among which considerable attention is paid to the thermal regime. Accumulating experience in both cultivation and the comparison of instrumental observation results is important here. Every year, this makes it possible to study how the thermal regime of the vegetation period changes and, accordingly, to predict a possible decrease or increase in the harvest. However, under climatically unstable conditions and given the multifactor nature of the task, such forecasting is rather subjective without modern mathematical tools and computing facilities.

Literature review and problem statement
Control over the harvest that can be obtained in the near future (short-term forecast) takes many forms and is, in essence, probabilistic in nature. Methods for predicting productivity that are used in contemporary information systems (IS) to manage this process are typically divided into extrapolation, simulation and intelligent methods.
Extrapolation and interpolation cover a broad range of prediction methods in agriculture. These methods are based on studying the dynamics of actual productivity over a number of previous years (not less than 5, usually 10-20) [3], the dynamics of change in soil fertility under the influence of natural processes [4], the fertilizer system or land-reclamation works, weather forecasts, etc. [5]. To forecast with these methods, different mathematical models are employed, constructed using system dynamics. Among the information systems and automatically controlled complexes that apply extrapolation and interpolation as mathematical software for making agrotechnical decisions, we may note such geo-information systems as "SSTools", "SMS" (Russia) and "N-Sensor", "PFadvantage" (USA) [6, 7]. However, their drawbacks are the invariance of the data used, precise geolocation, and a large error in the forecast models caused by possible significant data bias in the form of outliers. Also, the ability of these systems to account for seasonal changes in the indices is insufficient to form a reliable solution, while the "random component" does not make any sense. As a result, these systems cannot consider a possible interrelation between the examined indicator and other attributes, and the solution they form may be only hypothetical.
The mathematical software of IS for forecasting crop productivity by simulation, which investigates Markov and semi-Markov processes, relies on the extrapolation of tendencies [8]. This solution implies analyzing harvests by groups of farms, brigades or sections with different productivity, and analyzing the probability of their transition from one group to another over time based on probability tables [9].
However, a shortcoming of such mathematical software is its low adequacy in forecasting processes of nonstationary, multi-periodic character, such as the vegetation of fruit and berry crops. The low predictive capability stems from the fact that such models are based on so-called marginal probabilities. Such a model is considered correct if the examined time series is a random process and the productivity in the subsequent year does not depend on the productivity achieved in the current year. Practice shows that this condition is satisfied extremely rarely [10].
At present, the classical methods of physical-statistical forecasting are used less and less in modern IS in their original form. Given the success of numerical dynamic forecasting methods, statistical methods are increasingly adapted in IS to optimize practical forecasting [11]. Thus, forecasting by the tendencies of change in soil fertility, employed in the systems "AREFS" and "AIS Geoset-2000" (Russia), depends on which of several situations occurs. When the harvest in old vineyards falls because soil fatigue intensifies, the forecast is a reduction rather than an increase in the harvest. The situation is different when the content of organic matter in the soil decreases after the soil has been kept fallow for many years. In this case, correlation-regression data pairs are compiled and the productivity is forecast. Such forecasting can also be carried out using discrete simulation of the system dynamics of the examined process [12]. This makes it possible not only to forecast an increase or reduction in the harvest more reliably, but also to determine the specific weight of each factor used.
Intelligent systems MARS and DSSAT (Russia) accumulate the experience of qualified agricultural specialists and promptly present it to a user as an intelligent solution to a particular production problem [13-15]. Thus, the intelligent forecasting system MARS uses expert estimates and nonlinear dynamics as the basic mathematical apparatus for constructing decision rules [14]. The approach is as follows: an expert board (3-5 people) is selected from highly skilled agricultural specialists, each of whom makes a forecast and enters it into a specially designed electronic form. These data are then processed by the system and, using a neural network, the forecast and its statistical evaluation are derived [16]. The system also assesses the risks associated with forecasting agricultural production. However, the impossibility of avoiding a subjective approach during forecasting is a drawback of this system.
The information systems widely used recently for predicting the yield of fruit and berry crops, including grapes, are based on weather data and use regression analysis. The mathematical software of such systems is based on the connection between the harvest of the forecast period and the harvests of previous years, as well as processes in the soil, meteorological conditions, solar radiation, etc. [17]. These methods are simple to use and free of the deficiencies of simulation and expert methods. The result obtained assumes that the examined conditions and tendencies remain the same, which is why the models can be adapted to particular regions where an agricultural crop, including grapes, is cultivated. Thus, in monograph [18], the authors propose an approach to control based on the connection between the harvest of the forecast period and the harvests of previous years, as well as current climatic processes. This combination of practical activity and observations raises the level of control and makes the results more sustainable. However, the model proposed by the authors, used as mathematical software for the developed system, forecasts productivity with insufficient accuracy.

The aim and tasks of the study
The aim of the present study is to develop mathematical software for an information system that determines the probability of a reduction in the yield of grapes from weather data.
To achieve this aim, the following tasks are to be solved:
- to determine the most prognostically significant indicators that influence the yield of the grape crop, and to synthesize a mathematical model using the method of binary logistic regression;
- to estimate the reliability and adequacy of the developed mathematical model.

Development of the mathematical model for determining a probability of reduction in the yield of grapes
The mathematical software of an IS is the totality of mathematical methods or models employed to solve problems in the informatization of a process.
The authors have accumulated sufficient experience in developing software for different information systems in medicine [19], psychology [20] and ecology [21].
Thus, the mathematical software of the IS for determining the probability of reduction in the yield of grapes will consist of a mathematical model for determining that probability, synthesized using the method of binary logistic regression.
The structural diagram of the IS for determining the probability of reduction in the yield of grapes is shown in Fig. 1.
This system includes biological and technical subsystems. The biological subsystem includes:
- a decision maker (DM);
- a grape culture, the biological subject of observation;
- climatic factors, which characterize the meteorological conditions of growth of the grape culture.
The technical subsystem includes:
- devices for registering climatic indicators: aneroid barometer, thermometers, precipitation gauge, wind vane, electronic snow measuring rod;
- information input and output modules;
- information processing unit;
- information storage unit (database);
- unit for determining the probability of reduction in harvest (mathematical software of the system);
- conclusion making unit.
The unit for determining the probability of reduction in harvest is the core of the technical subsystem. The result obtained at the output of this unit directly influences the decision to be taken by the DM.
To construct the model, we analyzed data from a 16-year cycle (2001 to 2016) of meteorological observations and productivity records collected at a domestic meteorological station in the Kharkiv region, Ukraine. The station is located on a garden plot in open terrain, at 195 m above sea level and about 200 m from a highway. There are no objects with elevated thermal emission near the station site. Data registered at the station agree with the indicators of the "Climatic cadastral survey of Ukraine" to within 2 % [22, 23].
The basic evaluated indicators were:
- sums of active temperatures during different phases of the vegetation of grapes (Fig. 2);
- annual sum of active temperatures (Fig. 3);
- total number of days of vegetation in each examined year (Fig. 4);
- average monthly volume of precipitation during different phases of the vegetation of grapes (Fig. 5);
- annual total precipitation (Fig. 6);
- general radiation background, the wind rose, the indicator of harvest.
The first five categories of indicators were measured quantitatively; general radiation background, the wind rose and the indicator of harvest were defined as qualitative characteristics.
All observations were divided into 2 groups: group 1, years with normal and good yield (11 years); group 2, years with low yield (5 years).
Fig. 1. Structural diagram of the information system for determining the probability of reduction in the yield of grapes

The annual sum of active temperatures is the indicator that determines whether a grape culture can be cultivated in the region. For all types of varieties, the sum of active temperatures must exceed 3400 °C [24]. At a sum of active temperatures below 3000 °C, the vegetation of the culture in the Northeastern region of Ukraine is practically impossible.
In turn, the sums of active temperatures during the different phases of grape vegetation show how much warmth arrives in each phase. This characteristic is no less important than the total sum of temperatures.
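The sum of active temperatures for a phase or a whole season can be illustrated with a minimal sketch. It assumes the common agroclimatic convention of adding daily mean temperatures only on days at or above a biological threshold; the 10 °C default, the function name and the sample data are assumptions made for illustration and are not taken from the article.

```python
from typing import Iterable


def sum_active_temperatures(daily_means: Iterable[float], threshold: float = 10.0) -> float:
    """Sum daily mean temperatures (deg C) over days at or above `threshold`.

    The 10 deg C default is an assumed convention, not a value from the article.
    """
    return sum(t for t in daily_means if t >= threshold)


# Hypothetical daily mean temperatures for a short phase (deg C)
phase = [8.5, 10.0, 12.5, 15.0, 9.0, 18.0]
print(sum_active_temperatures(phase))  # 55.5
```

Applying the same function to the daily means of each vegetation phase would give per-phase sums of the kind plotted in Fig. 2, and applying it to the whole season would give the annual sum.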
All indicators were encoded and assigned to a 7-dimensional vector, which accounts for the presence, direction and magnitude of each indicator. Next, in accordance with the method of stepwise logistic regression, the predictors of the model and their coefficients were determined. The predictors were determined step by step, with the -2 log-likelihood measure (-2 Log) estimated at each step [25]. The coefficients of the model were computed using the least squares method (LSM) [26].
The initial value of -2 Log was 11.780. After the variable "Radiation background" was added at random at the first iteration step, the value of -2 Log became 8.495 (Table 1). This value is 3.285 less than the initial one.
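The -2 Log measure and the size of the drop reported for the first step can be sketched as follows. The function is the textbook definition of -2 times the log-likelihood of binary outcomes; its name and signature are illustrative assumptions, and only the two numeric values 11.780 and 8.495 come from the text.

```python
import math
from typing import Sequence


def minus_two_log_likelihood(y: Sequence[int], p: Sequence[float]) -> float:
    """Return -2 * log-likelihood of binary outcomes y given predicted probabilities p."""
    ll = 0.0
    for yi, pi in zip(y, p):
        # Each observation contributes log(p) if the event occurred, log(1 - p) otherwise
        ll += math.log(pi) if yi == 1 else math.log(1.0 - pi)
    return -2.0 * ll


# Values reported in the text: initial model and model after the first step
initial, after_step_1 = 11.780, 8.495
drop = initial - after_step_1
print(round(drop, 3))  # 3.285
```

A smaller -2 Log value means the predicted probabilities are closer to the observed outcomes, which is why a drop is read as an improvement of the model.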

Fig. 6. Annual value of total precipitation
Such a reduction indicates that the regression model approximates the actual process better. At the fourth iteration, we added the variable "Sum of active temperatures in the period from June 1 to June 30". In this case, the quality of the model continued to improve. At the fourteenth iteration, we randomly added the variable "Average monthly volume of precipitation in May". Adding this variable did not improve the quality of the model, and neither did additional iterations. Thus, the predictors of the model for determining the probability of reduction in the yield of grapes are:
- X1: radiation background;
- X2: sum of active temperatures in the period from June 1 to June 30;
- X3: annual total precipitation in the previous year.
We also recalculated the coefficients of the model and its constant at each iteration.
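The step-by-step choice of predictors can be sketched under stated assumptions. The article reports adding variables at random and keeping those that lower -2 Log; the sketch below swaps in the more common greedy forward variant, which adds at each step the candidate giving the largest drop. The function name, the stopping threshold and the toy deviance table are hypothetical; only the values 11.780 and 3.285 are taken from the text.

```python
from typing import Callable, FrozenSet, List, Sequence


def forward_select(
    candidates: Sequence[str],
    deviance: Callable[[FrozenSet[str]], float],
    min_drop: float = 0.01,
) -> List[str]:
    """Greedy forward selection on -2 log-likelihood.

    At each step the candidate with the lowest resulting deviance is added;
    selection stops when the best achievable drop is below `min_drop`.
    """
    selected: List[str] = []
    remaining = list(candidates)
    current = deviance(frozenset())
    while remaining:
        scored = [(deviance(frozenset(selected + [c])), c) for c in remaining]
        best_dev, best = min(scored)
        if current - best_dev < min_drop:
            break
        selected.append(best)
        remaining.remove(best)
        current = best_dev
    return selected


# Toy deviance: each variable lowers -2 Log by a fixed amount.
# Only 11.780 and 3.285 come from the text; the other gains are hypothetical.
GAINS = {"radiation": 3.285, "june_temps": 2.0, "prev_precip": 1.5, "may_precip": 0.0}


def toy_deviance(subset: FrozenSet[str]) -> float:
    return 11.780 - sum(GAINS[name] for name in subset)


print(forward_select(list(GAINS), toy_deviance))
# ['radiation', 'june_temps', 'prev_precip']
```

In a real workflow the `deviance` callback would refit the logistic regression for each candidate subset; here it is a lookup so that the selection logic can be read in isolation.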
At the next step, we checked the significance of the fitted coefficients of the mathematical model using the Wald test (Table 2). The test showed that all variables are significant (p<0.05) and selected correctly.
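The Wald test for a single coefficient can be sketched as follows, assuming the usual form in which the squared ratio of the coefficient to its standard error is compared with a chi-square distribution with one degree of freedom. The function name and the input values are hypothetical, since Table 2 with the actual coefficients and standard errors is not reproduced here.

```python
import math
from typing import Tuple


def wald_test(coef: float, std_err: float) -> Tuple[float, float]:
    """Return (Wald statistic, p-value) for the hypothesis coef = 0."""
    w = (coef / std_err) ** 2
    # Survival function of chi-square with 1 degree of freedom
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p


# Hypothetical coefficient and standard error
w, p = wald_test(coef=1.96, std_err=1.0)
print(p < 0.05)  # True
```

A coefficient is treated as significant when the p-value falls below the chosen level, 0.05 in the article.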
Based on the chosen predictors and computed coefficients, the model took the standard form of binary logistic regression, $P = 1/(1 + e^{-z})$, where $z$ is a linear combination of the predictors $X_1$, $X_2$, $X_3$ and a constant. The number P obtained from the model can be interpreted as the probability of a reduction in the productivity of grape cultures in the examined territory. The value P=0.5 was selected as the cut-off. If P<0.5, the probability of reduction in the yield of grape crops is low; if P>0.5, it is high. The reliability of the obtained equation was estimated by assessing the correctness of the model prediction with the Nagelkerke R square measure. In this estimation, the actually observed indicators are compared with those predicted by the logistic regression, together with its statistical significance. The value of R square, which shows the share of the variance of the dependent variable explained by all predictors of the model, lies between 0 and 1.
The predictive value of the obtained model was determined using analysis of the receiver operating characteristic (ROC) curve [27].
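How a probability P and the 0.5 cut-off are applied can be shown with a minimal sketch of the logistic transformation. The coefficients and predictor values below are hypothetical placeholders and are not the fitted values of the article; treating P exactly equal to 0.5 as "low" is also an assumption, since the text only defines the cases P<0.5 and P>0.5.

```python
import math
from typing import Sequence


def reduction_probability(x: Sequence[float], coefs: Sequence[float], intercept: float) -> float:
    """Logistic model: P = 1 / (1 + exp(-z)), z = intercept + sum(b_i * x_i)."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    # Numerically stable evaluation for large negative z
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(p: float, cutoff: float = 0.5) -> str:
    """Label the risk of yield reduction; a tie at the cut-off is treated as 'low' (assumption)."""
    return "high" if p > cutoff else "low"


# Hypothetical coefficients and encoded predictor values
p = reduction_probability(x=(1.0, 2.0, 3.0), coefs=(0.5, -0.25, 0.5), intercept=-1.0)
print(classify(p))  # high
```

With these placeholder values z = -1.0 + 0.5 - 0.5 + 1.5 = 0.5, so P is above the cut-off and the year would be flagged as high risk.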

Discussion of results of evaluating the constructed mathematical model
The obtained model as a whole and its individual coefficients are statistically significant (p<0.05), the Nagelkerke R square is 0.897 (p=0.001), and the correctness of the model prediction is 89.7 %. Fig. 7 shows a diagram of the classification of observations. In the diagram, the numbers "1" and "2" designate the gradations of the predicted dependent variable: "1" for good yield, "2" for low yield. Each bar corresponds to a specific predicted probability, and its height to the number of observations for which this probability was forecast. In the classification diagram, the number "2" on the right side and the number "1" on the left side correspond to correct predictions. Values of the predicted probability, computed by the regression equation, are plotted along the horizontal axis; frequencies are plotted along the vertical axis. The closer the predicted probability is to unity, the more probable the reduction in the yield of the grape crop. The better the prediction quality of the constructed model, the more densely the observations are grouped at the appropriate ends of the horizontal axis of the histogram.
[Regression equation terms: 1.877 X; 0.115 X; +0.546 X; 256.668]
The numerical result of the binary classification, which shows the number of correct and incorrect predictions, is given in Table 3. According to Table 3, the test correctly classified all 11 observations of group 1 and 3 of the 5 observations of group 2. Thus, the overall prediction accuracy reached 87.5 %.
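The accuracy figure can be checked directly from the counts quoted from Table 3. The sketch below treats a low-yield year (group 2) as the positive class, which is an assumption about labeling; sensitivity and specificity are derived from the same counts by arithmetic and are not reported in this form in the text.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts from the text: 11 of 11 good-yield years and 3 of 5 low-yield years classified correctly
m = classification_metrics(tp=3, fn=2, tn=11, fp=0)
print(m)  # {'accuracy': 0.875, 'sensitivity': 0.6, 'specificity': 1.0}
```

The split shows that the two misclassified observations are both low-yield years predicted as normal.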
ROC analysis is a convenient and visual tool for estimating model effectiveness [28]. It compares two operating characteristics, sensitivity and specificity. The area under the ROC curve is used as an integral measure of model effectiveness.
The ROC curves were analyzed taking into account that the ROC curve of an "ideal classifier" runs through the upper left corner of the graph (Fig. 8). Therefore, the closer the dark-blue curve is to the upper left corner, the higher the predictive ability of the model, that is, its effectiveness. The green line corresponds to a "useless" classifier, that is, complete indistinguishability between the two classes, and indicates an ineffective model.
The area under the ROC curve amounted to 0.97 (0.89, 1.00), which testifies to the good quality of the model.
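The area under the ROC curve can be computed without plotting, using its equivalence to the share of positive-negative pairs that the model ranks correctly (ties count as one half). This is a minimal sketch of that pairwise definition; the function name and the sample data are hypothetical and do not reproduce the article's observations.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = low yield) and predicted probabilities
y = [1, 1, 0, 0, 0]
s = [0.9, 0.4, 0.5, 0.2, 0.1]
print(round(roc_auc(y, s), 3))  # 0.833
```

A value of 1.0 corresponds to the "ideal classifier" and 0.5 to the "useless" one described above.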

Fig. 8. ROC-curves
The overall agreement between the influence of the risk factors identified in the model and the actually registered occurrence of the unfavorable outcome was assessed using the Hosmer-Lemeshow goodness-of-fit test (HL), in which the p value becomes higher as the differences between the frequencies of observed outcomes and those predicted by the regression model become smaller. The obtained significance level of the test (HL=4.994, p=0.459) testifies to the adequacy of the developed model to the actual data.
The validity of the developed model, which makes it possible to determine the probability of reduction in the yield of a grape crop, is confirmed by testing the independent predictors on the examination model (Table 4). All computations were performed using IBM SPSS 19.0 (USA) [29]. It was established that, according to the chi-squared test, all predictors affect the forecast of a reduction in yield at the level p<0.05.
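The Hosmer-Lemeshow statistic can be sketched as follows, assuming the usual construction: observations are sorted by predicted probability, split into groups of roughly equal size, and the observed and expected numbers of events are compared in each group. The function name, the default number of groups and the sample data are assumptions; the grouping used by SPSS in the article is not stated, so this sketch is not expected to reproduce HL=4.994.

```python
from typing import Sequence, Tuple


def hosmer_lemeshow(y: Sequence[int], p: Sequence[float], groups: int = 10) -> Tuple[float, int]:
    """Return (HL statistic, number of non-empty groups).

    The statistic is conventionally compared with a chi-square distribution
    with (number of groups - 2) degrees of freedom.
    """
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    used = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        used += 1
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        # Groups with mean probability exactly 0 or 1 have zero variance and are skipped
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, used


# Hypothetical outcomes and predicted probabilities
stat, used = hosmer_lemeshow([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5], groups=1)
print(stat, used)  # 0.0 1
```

A small statistic, and therefore a large p value, means that predicted and observed event counts agree within each group, which is how the article reads the reported result.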

Discussion of results to control the grape yield using mathematical modeling
The studies conducted and the mathematical software developed for the IS make it possible to obtain a positive result in managing such an unstable process as the grape harvest. To support a decision on how to treat a grape culture, the conclusion formation unit of the IS will output a scheme of agrotechnical measures based on the determined probability of reduction in harvest.
For example, given a negative forecast from the unit for determining the probability of reduction in harvest, the following scheme of agrotechnical measures can be obtained:
- fumigation of vineyards to raise the temperature at the microlevel;
- observation of the pollination of clusters during and after flowering and, to improve the quality of production, a decrease in their number;
- additional feeding of plants with ecologically safe fertilizers to optimize their condition and vitality.
Such a list of measures may help preserve up to 70 % of the harvest in an unfavorable year and, accordingly, decrease financial losses.
As an example, the system may form the following measures if the forecast proves positive:
- creation of artificially large loads on the crop;
- powerful continuous irrigation;
- the use of ecologically clean fertilizers.
Such a list of measures allows a plant to approach the flowering phase in optimum condition.
It should also be remembered that a hot summer in the previous year is one of the aspects of a positive prediction. Given this factor, the common list of measures recommended by the system will be complemented with a special agricultural method: covering the vineyard with a net that protects against hail and sun. This will protect the crop from sunburn.
Thus, the constructed prediction model can indirectly help not only to obtain stable high harvests of grapes, but also to increase the profitability of economic activity.
Comparison of the prediction results with the actual grape yield data for 2001 to 2016 confirms the relevance and the economic and ecological expediency of introducing this crop under the conditions of the Northeastern forest-steppe region of Ukraine. However, control over this process is still affected by climate and by variability in meteorological conditions.
We may note that France, a famous wine-growing country, considers grape cultivation successful at 50 % reliability of a positive result [30]. The original data given in Table 3 are even more favorable (97.5 % reliability of a positive result), which is a robust indicator for the Northeastern region of Ukraine.

Conclusions
1. We identified the three most significant indicators: radiation background, the sum of active temperatures during flowering, and the annual total precipitation in the previous year. They make it possible to estimate the risk of reduction in the harvest of grapes grown under the conditions of the Northeastern forest-steppe region of Ukraine. By employing the method of binary logistic regression, we obtained a mathematical model for determining the probability of reduction in the yield of grapes. The proposed model may be used as the mathematical software of an information system for deciding whether agrotechnical methods need to be changed in order to improve the grape yield.
2. The developed mathematical model is reliable and adequate to the actual data, and all predictors of the model, according to the chi-squared test, affect the prediction of a reduction in grape yield. The indicators included in the model make it possible, at a confidence level of 95 % and an accuracy of 87.5 %, to determine a possible risk of reduction in the harvest of grapes.

Fig. 2. Values of the sums of active temperatures during different phases of the vegetation of grape

Table 2
Basic indicators of the obtained model

Table 1
History of iterations in determining the predictors of the model and their coefficients

Table 3
Classification results of the model of binary logistic regression

Table 4
Combined tests for the predictors and the model