MONITORING DATA AGGREGATION OF DYNAMIC SYSTEMS USING INFORMATION TECHNOLOGIES

Authors

Dmytro Shevchenko, Mykhaylo Ugryumov, Sergii Artiukh

DOI:

https://doi.org/10.30837/ITSSI.2023.23.123

Keywords:

data dimensionality reduction; deep learning; autoencoders

Abstract

The subject matter of the article is models, methods and information technologies for monitoring data aggregation. The goal of the article is to determine the best deep learning model for reducing the dimensionality of dynamic system monitoring data. The following tasks were solved: analysis of existing dimensionality reduction approaches; description of the general architecture of vanilla and variational autoencoders; development of their architectures; development of software for training and testing the autoencoders; and a study of the quality of the autoencoders on the dimensionality reduction problem. The following models and methods were used: data processing and preparation, and data dimensionality reduction. The software was developed in Python, with Scikit-learn, Pandas, PyTorch, NumPy, argparse and other auxiliary libraries. Obtained results: the work presents a classification of models and methods for dimensionality reduction and general reviews of vanilla and variational autoencoders, including a description of the models, their properties, their loss functions and their application to the dimensionality reduction problem. Custom autoencoder architectures were also created, including visual representations of the autoencoder architecture and descriptions of each component. Software for training and testing the autoencoders was developed, and the dynamic system monitoring data set and the steps for pre-processing it were described. The metric for evaluating model quality is also described, and the configuration of the autoencoders and their training are considered. Conclusions: the vanilla autoencoder recovers the data much better than the variational one. Given that the two architectures are identical apart from the features specific to each autoencoder type, it can be noted that the vanilla autoencoder compresses the data better, preserving more of the useful variables for later recovery from the bottleneck. Additionally, by training with different bottleneck sizes, one can determine the size at which the data is recovered best, i.e. at which the most important variables are preserved. Overall, the autoencoders work effectively for the dimensionality reduction task, and the data recovery quality metric shows that they reconstruct the data well, with an error appearing only in the third or fourth decimal place. In conclusion, the vanilla autoencoder is the best deep learning model for aggregating monitoring data of dynamic systems.
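To make the approach concrete, below is a minimal sketch of a vanilla autoencoder with a configurable bottleneck, trained in PyTorch with the mean squared reconstruction error and the Adam optimizer, as the abstract and references suggest. The layer widths, bottleneck size, number of epochs and the synthetic stand-in data are illustrative assumptions, not the configuration used in the article.

# Minimal vanilla autoencoder sketch for dimensionality reduction.
# Assumes a tabular monitoring data set already scaled to [0, 1];
# all sizes below are illustrative, not the article's architecture.
import torch
import torch.nn as nn


class VanillaAutoencoder(nn.Module):
    def __init__(self, n_features: int, bottleneck: int):
        super().__init__()
        # Encoder: compress the monitoring vector into the bottleneck.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, bottleneck),
        )
        # Decoder: reconstruct the original vector from the bottleneck.
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 64),
            nn.ReLU(),
            nn.Linear(64, n_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def train(model, loader, epochs: int = 50, lr: float = 1e-3):
    """Train with the mean squared reconstruction error as the loss."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        for (batch,) in loader:
            optimizer.zero_grad()
            loss = criterion(model(batch), batch)
            loss.backward()
            optimizer.step()
    return model


if __name__ == "__main__":
    # Synthetic stand-in for the monitoring data (1000 samples, 20 features).
    data = torch.rand(1000, 20)
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(data), batch_size=32, shuffle=True
    )
    model = train(VanillaAutoencoder(n_features=20, bottleneck=4), loader)
    with torch.no_grad():
        mse = nn.MSELoss()(model(data), data).item()
    print(f"reconstruction MSE: {mse:.6f}")

Repeating the run with different values of the bottleneck parameter and comparing the resulting reconstruction MSE is the kind of procedure the abstract describes for finding the bottleneck size at which the most important variables are preserved.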

Author Biographies

Dmytro Shevchenko, V. N. Karazin Kharkiv National University

Postgraduate Student, Theoretical and Applied Systems Engineering Department

Mykhaylo Ugryumov, V. N. Karazin Kharkiv National University

Doctor of Engineering Sciences

Sergii Artiukh, State Organization "Grigoriev Institute for Medical Radiology and Oncology of the National Academy of Medical Sciences of Ukraine"

Candidate of Medical Sciences (PhD)

References


Grusha, V. M. (2017), "Chlorophyll fluorometer data normalization and dimensionality reduction" ["Normalizaciya ta zmenshennya rozmirnosti dany'x xlorofil-fluorometriv"], Computer facilities, networks and systems, No. 16, P. 76–86.

Martseniuk, V. P., Dronyak, Y. V. and Tsikorska, I. V. (2019), "Reduction of dimension for prediction of progress in problems of medical education: an approach based", Medical Informatics and Engineering, No. 4, P. 16–24.

Kozak, Ye. B. (2021), "A complex algorithm for creating control automata based on machine learning" ["Kompleksny’j algory`tm stvorennya keruyuchy’x avtomativ na bazi mashy`nnogo navchannya"], Technical engineering, No. 2 (88), P. 35–41. DOI: https://doi.org/10.26642/ten-2021-2(88)-35-41.

Bakurova, A. et al. (2021), "Neural network forecasting of energy consumption of a metallurgical enterprise", Innovative Technologies and Scientific Solutions for Industries, No. 1 (15), P. 14–22. DOI: https://doi.org/10.30837/itssi.2021.15.014.

Korablyov, M. and Lutskyy, S. (2022), "System-information models for Intelligent Information Processing", Innovative Technologies and Scientific Solutions for Industries, No. 3 (21), P. 26–38. DOI: https://doi.org/10.30837/itssi.2022.21.026.

Xie, H., Li, J. and Xue, H. (2018), "A survey of dimensionality reduction techniques based on random projection", arXiv.org. DOI: https://doi.org/10.48550/arXiv.1706.04371

Espadoto, M. et al. (2021), "Toward a quantitative survey of dimension reduction techniques," IEEE Transactions on Visualization and Computer Graphics, No. 27 (3), P. 2153–2173. DOI: https://doi.org/10.1109/tvcg.2019.2944182

Velliangiri, S., Alagumuthukrishnan, S. and Thankumar Joseph, S. I. (2019), "A review of dimensionality reduction techniques for efficient computation", Procedia Computer Science, No. 165, P. 104–111. DOI: https://doi.org/10.1016/j.procs.2020.01.079

McInnes, L., Healy, J. and Melville, J. (2020), "UMAP: Uniform manifold approximation and projection for dimension reduction", arXiv.org. DOI: https://doi.org/10.48550/arXiv.1802.03426

Chorowski, J. et al. (2019), "Unsupervised speech representation learning using WaveNet autoencoders", IEEE/ACM Transactions on Audio, Speech, and Language Processing, No. 27 (12), P. 2041–2053. DOI: https://doi.org/10.1109/TASLP.2019.2938863

Jia, W. et al. (2022), "Feature dimensionality reduction: A Review", Complex & Intelligent Systems, No. 8 (3), P. 2663–2693. DOI: https://doi.org/10.1007/s40747-021-00637-x

May, P. and Rekabdarkolaee, H.M. (2022), "Dimension reduction for spatially correlated data: Spatial predictor envelope", arXiv.org. DOI: https://doi.org/10.48550/arXiv.2201.01919

Matchev, K.T., Matcheva, K. and Roman, A. (2022), "Unsupervised machine learning for exploratory data analysis of Exoplanet Transmission Spectra", arXiv.org. DOI: https://doi.org/10.48550/arXiv.2201.02696

Björklund, A., Mäkelä, J. and Puolamäki, K. (2022), "SLISEMAP: Supervised dimensionality reduction through local explanations", Machine Learning, No. 112 (1), P. 1–43. DOI: https://doi.org/10.1007/s10994-022-06261-1

Bhandari, N. et al. (2022), "A comprehensive survey on computational learning methods for analysis of Gene Expression Data", arXiv.org. DOI: https://doi.org/10.48550/arXiv.2202.02958

Bank, D., Koenigstein, N. and Giryes, R. (2021), "Autoencoders", arXiv.org. DOI: https://doi.org/10.48550/arXiv.2003.05991

Hinton, G.E. et al. (2012), "Improving neural networks by preventing co-adaptation of feature detectors", arXiv.org. DOI: https://doi.org/10.48550/arXiv.1207.0580

Ioffe, S. and Szegedy, C. (2015), "Batch normalization: Accelerating deep network training by reducing internal covariate shift", International Conference on Machine Learning. DOI: https://doi.org/10.48550/arXiv.1502.03167

Kingma, D.P. and Ba, J. (2017), "Adam: A method for stochastic optimization", arXiv.org. DOI: https://doi.org/10.48550/arXiv.1412.6980


Published

2023-04-21

How to Cite

Shevchenko, D., Ugryumov, M., & Artiukh, S. (2023). MONITORING DATA AGGREGATION OF DYNAMIC SYSTEMS USING INFORMATION TECHNOLOGIES. INNOVATIVE TECHNOLOGIES AND SCIENTIFIC SOLUTIONS FOR INDUSTRIES, 1(23), 123–131. https://doi.org/10.30837/ITSSI.2023.23.123