Search for impact factor characteristics in construction of linear regression models
The object of research is the task of constructing a linear regression model, which arises when predicting the values of a dependent variable from a set of independent factor characteristics. This task often arises in the analysis of indicators of the economic activity of enterprises. Constructing a regression equation that adequately reflects the relationship between the factor attributes and the resultant attributes is a multi-stage and time-consuming procedure. An important stage in this process is the choice of the most influential factor characteristics. The adequacy of the regression model and the effectiveness of the analysis of enterprise activities depend on how well this stage is performed and on the correct choice of the system of attributes. A number of methods and algorithms for choosing the most influential factor attributes have been proposed in the literature. Some of them are based on correlation and regression analysis; others are heuristic. Studies have shown that, when applied to specific problems, different methods for selecting the most influential factor attributes generally lead to different results. Moreover, most methods are either computationally complex or unstable with respect to the conditions of use. The main criterion for the effectiveness of a factor selection algorithm is the adequacy of the constructed regression model.
The study analyzes the process of constructing multiple linear regression models. Its main steps are identified, and the basic concepts and calculation formulas are given. The authors propose an algorithm for selecting the most influential factor attributes when constructing linear regression models. The proposed approach is based on the properties of partial correlation coefficients. Compared with known algorithms, the developed algorithm reduces the computational complexity of selecting factor attributes.
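To illustrate the general idea of factor selection driven by partial correlation coefficients, a minimal Python sketch follows. It is not the authors' algorithm, whose details are not given in this abstract; it is a generic greedy forward selection. The function names (`residuals`, `partial_corr`, `select_factors`), the `threshold` value of 0.5, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np


def residuals(target, controls):
    """Residuals of `target` after least-squares regression on `controls` plus an intercept."""
    design = np.column_stack([np.ones(len(target)), controls])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def partial_corr(X, y, j, selected):
    """Partial correlation between column j of X and y, controlling for the columns in `selected`."""
    controls = X[:, selected]  # shape (n, 0) when nothing is selected yet
    rx = residuals(X[:, j], controls)
    ry = residuals(y, controls)
    denom = np.linalg.norm(rx) * np.linalg.norm(ry)
    # Guard against a candidate that is (numerically) a linear combination of the controls.
    return 0.0 if denom < 1e-12 else float(rx @ ry / denom)


def select_factors(X, y, threshold=0.5, max_factors=None):
    """Greedy forward selection (illustrative assumption, not the published method).

    At each step, add the candidate with the largest absolute partial correlation
    with y given the factors already selected; stop when it falls below `threshold`.
    """
    m = X.shape[1]
    limit = m if max_factors is None else max_factors
    selected = []
    while len(selected) < limit:
        candidates = [j for j in range(m) if j not in selected]
        if not candidates:
            break
        scores = {j: abs(partial_corr(X, y, j, selected)) for j in candidates}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break
        selected.append(best)
    return selected
```

Calling `select_factors(X, y)` on an `n × m` array of factor attributes `X` and a length-`n` vector `y` returns the column indices of the retained factors in the order they were added. The threshold would need to be tuned, or replaced by a significance test, for real data.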
The developed algorithm was verified experimentally on the task of constructing dependencies, in the form of multiple linear regression, between different performance indicators of two enterprises. As a result of the calculations, one or two influential features were selected from a system of 17 factor attributes for each indicator. The multiple linear regression equations constructed in this way have a reliability exceeding 90 %.
Copyright (c) 2019 Fedir Geche, Oksana Mulesa, Viktor Hrynenko, Veronika Smolanka
This work is licensed under a Creative Commons Attribution 4.0 International License.
ISSN (print) 2664-9969, ISSN (on-line) 2706-5448