Improving the model of object detection on aerial photographs and video in unmanned aerial systems
DOI: https://doi.org/10.15587/1729-4061.2022.252876

Keywords: neural network, object detection, VisDrone 2021, Microsoft COCO, YOLOv5x, unmanned aerial system

Abstract
This paper considers a model of object detection on aerial photographs and video using a neural network in unmanned aerial systems. The development of artificial intelligence and computer vision systems for unmanned systems (drones, robots) requires improved models for detecting and recognizing objects in images and video streams. In unmanned aircraft systems, the results of video and aerial photography are processed manually by an operator; since processing a large number of videos and aerial photographs by hand presents objective difficulties, it is advisable to automate this process. An analysis of neural network models revealed that the YOLOv5x model (USA) is the most suitable base model for object detection on aerial photographs and video. The Microsoft COCO dataset (USA), which contains more than 200,000 images across 80 categories, is used to train this model. To improve the YOLOv5x model, the neural network was additionally trained on the VisDrone 2021 image set (China) with the following optimal training parameters: the SGD optimization algorithm, an initial learning rate (step) of 0.0005, and 25 epochs. As a result, a new model of object detection on aerial photographs and video, named VisDroneYOLOv5x, was obtained. The effectiveness of the improved model was studied using aerial photographs and videos from the VisDrone 2021 set. To assess the effectiveness of the model, the following main indicators were chosen: precision, sensitivity (recall), and the average precision estimate. Using a convolutional neural network has made it possible to automate the process of object detection on aerial photographs and video in unmanned aerial systems.
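The evaluation indicators named in the abstract (precision, sensitivity/recall, and average precision) can be sketched as follows. This is a minimal illustration of how such detection metrics are commonly computed from true-positive, false-positive, and false-negative counts and from a precision-recall curve; the function names and the all-point interpolation scheme are assumptions for illustration, not taken from the paper.

```python
from typing import Sequence

def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and sensitivity (recall) from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def average_precision(recalls: Sequence[float],
                      precisions: Sequence[float]) -> float:
    """All-point interpolated AP: area under the precision-recall curve
    after replacing each precision with the maximum precision achieved
    at any recall greater than or equal to it (the monotone envelope)."""
    env = list(precisions)
    # Walk right to left so each point carries the best precision to its right.
    for i in range(len(env) - 2, -1, -1):
        env[i] = max(env[i], env[i + 1])
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, env):
        ap += (r - prev_r) * p  # rectangle area for this recall segment
        prev_r = r
    return ap
```

For example, a detector with 8 true positives, 2 false positives, and 2 missed objects yields precision 0.8 and recall 0.8; averaging `average_precision` over all object classes gives the mean average precision (mAP) commonly reported for YOLO-family models.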
Copyright (c) 2022 Vadym Slyusar, Mykhailo Protsenko, Anton Chernukha, Vasyl Melkin, Oleh Biloborodov, Mykola Samoilenko, Olena Kravchenko, Halyna Kalynychenko, Anton Rohovyi, Mykhaylo Soloshchuk
This work is licensed under a Creative Commons Attribution 4.0 International License.