Search for impact factor characteristics in construction of linear regression models

DOI:

https://doi.org/10.15587/2312-8372.2019.175020

Keywords:

multiple linear regression, partial correlation coefficients, factor attributes, model adequacy

Abstract

The object of research is the task of constructing a linear regression model, which arises when predicting the values of a dependent variable from a set of independent factor characteristics. This task often arises in the analysis of indicators of the economic activity of enterprises. Constructing a regression equation that adequately reflects the relationship between the factor attributes and the resultant attribute is a multi-stage and time-consuming procedure. An important stage in this procedure is the choice of the most influential factor characteristics. The adequacy of the regression model and the effectiveness of the analysis of enterprise activity depend on how well this stage is carried out and on the correct choice of a system of attributes. A number of methods and algorithms for choosing the most influential factor attributes are proposed in scientific sources. Some of them are based on correlation and regression analysis, while others are heuristic. Studies have shown that applying different selection methods to the same specific problem generally leads to different results. Moreover, most methods are either computationally complex or unstable with respect to the conditions of use. The main criterion for the effectiveness of a factor selection algorithm is the adequacy of the constructed regression model.

The study analyzes the process of constructing multiple linear regression models. Its main steps are identified, and the basic concepts and calculation formulas are given. The authors propose an algorithm for selecting the most influential factor attributes in constructing linear regression models. A feature of the proposed approach is that it is based on the properties of partial correlation coefficients. The developed algorithm makes it possible to reduce the computational complexity of selecting factor attributes in comparison with known algorithms.
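The abstract does not spell out the steps of the proposed algorithm, so the following is only a minimal sketch of one generic way to select factors using partial correlation coefficients, not the authors' method. It assumes a greedy forward scheme: at each step, the candidate factor with the largest absolute partial correlation with the resultant attribute, given the factors already chosen, is added. The function names `partial_corr` and `select_factors`, the stopping `threshold` of 0.3, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def partial_corr(data, i, j, controls):
    """Partial correlation of columns i and j, controlling for `controls`.

    Uses the inverse of the correlation matrix P of the involved columns:
    r_ij.controls = -P[0, 1] / sqrt(P[0, 0] * P[1, 1]).
    With no controls this reduces to the ordinary Pearson correlation.
    """
    idx = [i, j] + list(controls)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(corr)
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


def select_factors(X, y, threshold=0.3, max_factors=None):
    """Greedy forward selection (hypothetical scheme, see lead-in).

    X: (n_samples, n_factors) array of factor attributes.
    y: (n_samples,) array of the resultant attribute.
    Returns the indices of the selected columns of X, in selection order.
    """
    data = np.column_stack([X, y])
    y_col = X.shape[1]
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and (max_factors is None or len(selected) < max_factors):
        # Score every remaining candidate against y, given the selected factors.
        scores = {
            k: abs(partial_corr(data, k, y_col, selected)) for k in remaining
        }
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break  # no remaining factor adds enough explanatory power
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic illustration: only columns 1 and 3 actually influence y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = 3.0 * X_demo[:, 1] - 2.0 * X_demo[:, 3] + 0.1 * rng.normal(size=200)
chosen = select_factors(X_demo, y_demo, threshold=0.3)
print(chosen)
```

The threshold-based stopping rule is one design choice among several; a significance test on the partial correlation coefficient could be used instead.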

The developed algorithm was verified experimentally on the task of building dependencies, in the form of multiple linear regression, between different performance indicators of two enterprises. As a result of the calculations, one or two influential attributes were selected for each indicator from a system of 17 factor attributes. The multiple linear regression equations constructed in this way have a reliability exceeding 90 %.
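As a rough illustration of how such an equation can be built and checked once the factors are selected, the sketch below fits a multiple linear regression by ordinary least squares and computes the coefficient of determination R². Treating the "reliability" mentioned above as R² is an assumption made here for illustration, since the abstract does not define the measure. The function names and the synthetic data are likewise hypothetical.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Returns the coefficient vector [b0, b1, ..., bk].
    """
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def r_squared(X, y, coef):
    """Coefficient of determination of the fitted model on (X, y)."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic illustration with two selected factors (not data from the study).
rng = np.random.default_rng(1)
X_sel = rng.normal(size=(100, 2))
y_obs = 1.0 + 2.0 * X_sel[:, 0] - 1.0 * X_sel[:, 1] + 0.1 * rng.normal(size=100)
coef = fit_linear_regression(X_sel, y_obs)
print(coef, r_squared(X_sel, y_obs, coef))
```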

Author Biographies

Fedir Geche, Uzhhorod National University, 3, Narodna sq., Uzhhorod, Ukraine, 88000

Doctor of Technical Sciences, Professor, Head of Department

Department of Cybernetics and Applied Mathematics

Oksana Mulesa, Uzhhorod National University, 3, Narodna sq., Uzhhorod, Ukraine, 88000

PhD, Associate Professor

Department of Cybernetics and Applied Mathematics

Viktor Hrynenko, Uzhhorod National University, 3, Narodna sq., Uzhhorod, Ukraine, 88000

Postgraduate Student

Department of Cybernetics and Applied Mathematics

Veronika Smolanka, Uzhhorod National University, 3, Narodna sq., Uzhhorod, Ukraine, 88000

Postgraduate Student

Department of Cybernetics and Applied Mathematics

Published

2019-06-30

How to Cite

Geche, F., Mulesa, O., Hrynenko, V., & Smolanka, V. (2019). Search for impact factor characteristics in construction of linear regression models. Technology Audit and Production Reserves, 3(2(47)), 20–25. https://doi.org/10.15587/2312-8372.2019.175020

Section

Mathematical Modeling: Original Research