Improving a neural network model for semantic segmentation of images of monitored objects in aerial photographs
Keywords: semantic segmentation of images, convolutional neural network, aerial photograph, unmanned aerial vehicle
This paper considers a neural network model for the semantic segmentation of images of monitored objects in aerial photographs. Unmanned aerial vehicles monitor objects by analyzing (processing) aerial photographs and video streams. The results of aerial photography are currently processed by an operator manually; however, there are objective difficulties associated with an operator handling a large number of aerial photographs, which makes it advisable to automate this process. An analysis of existing models showed that the U-Net model (Germany), a convolutional neural network, is the most suitable basis for the task of semantically segmenting images of monitored objects in aerial photographs. This model was improved by introducing a wavelet layer and selecting optimal values of the training parameters: a learning rate (step) of 0.001, 60 epochs, and the Adam optimization algorithm. Training was conducted on a set of segmented images obtained from aerial photographs (with a resolution of 6,000×4,000 pixels) using the Image Labeler application in the mathematical programming environment MATLAB R2020b (USA). As a result, a new model for semantically segmenting the images of monitored objects in aerial photographs, named U-NetWavelet, was built.
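The abstract does not specify how the wavelet layer is constructed. One plausible reading, consistent with the use of a discrete wavelet transform, is a single-level 2-D Haar decomposition that halves the spatial resolution while preserving detail subbands; the sketch below (an illustrative assumption, not the authors' exact layer) shows how such a transform reduces an image to half size, which is how a layer of this kind can adapt a large aerial photograph to the network's input dimensions:

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar DWT: splits an even-sized image into four
    half-resolution subbands (approximation LL and details LH, HL, HH)."""
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass approximation (half-size image)
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

# A constant 4x4 "image": all energy goes to the LL band, details vanish.
img = np.ones((4, 4))
ll, lh, hl, hh = haar_dwt2(img)
```

Applying such a decomposition repeatedly would shrink a 6,000×4,000 aerial photograph toward the fixed input size of the network while keeping the discarded high-frequency content available in the detail subbands.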
The effectiveness of the improved model was investigated on an example of processing 80 aerial photographs. Accuracy, sensitivity, and segmentation error were chosen as the main indicators of the model's efficiency. The modified wavelet layer makes it possible to adapt the size of an aerial photograph to the parameters of the neural network's input layer and to improve the efficiency of image segmentation in aerial photographs, while the use of a convolutional neural network allows this process to be automated.
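The abstract names accuracy, sensitivity, and segmentation error as the efficiency indicators but does not give their formulas. A common pixel-wise definition for binary segmentation masks is sketched below (an assumption for illustration, not necessarily the paper's exact definitions):

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Pixel-wise accuracy, sensitivity (recall) and segmentation error
    for a binary mask, compared against a ground-truth mask."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.sum(pred & truth)        # object pixels correctly labeled
    fn = np.sum(~pred & truth)       # object pixels missed by the model
    correct = np.sum(pred == truth)  # all correctly labeled pixels
    accuracy = correct / pred.size
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    error = 1.0 - accuracy
    return accuracy, sensitivity, error

# Toy 2x4 masks: 6 of 8 pixels agree, 2 of 3 true object pixels found.
pred  = np.array([[1, 1, 0, 0], [1, 0, 0, 0]])
truth = np.array([[1, 1, 1, 0], [0, 0, 0, 0]])
acc, sens, err = segmentation_metrics(pred, truth)
# acc = 0.75, sens = 2/3, err = 0.25
```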
Copyright (c) 2021 Vadym Slyusar, Mykhailo Protsenko, Anton Chernukha, Vasyl Melkin, Olena Petrova, Mikhail Kravtsov, Svitlana Velma, Nataliia Kosenko, Olga Sydorenko, Maksym Sobol
This work is licensed under a Creative Commons Attribution 4.0 International License.