Improving a model of object recognition in images based on a convolutional neural network
DOI: https://doi.org/10.15587/1729-4061.2021.233786
Keywords: image processing, object recognition, convolutional neural networks, unmanned aerial vehicle
Abstract
This paper considers a model of object recognition in images based on convolutional neural networks; the efficiency of training the deep layers of such networks has been studied. Determining the optimal characteristics of a neural network is objectively difficult, which gives rise to the problem of overfitting. Eliminating overfitting by selecting only the optimal number of epochs is insufficient, since it does not provide high accuracy.
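By way of illustration, a common way to detect the onset of overfitting, rather than fixing the epoch count in advance, is to monitor a validation metric and stop once it no longer improves. The following is a minimal, generic sketch in Python; the `EarlyStopping` class, its thresholds, and the synthetic losses are illustrative assumptions, not the authors' procedure.

```python
# Minimal early-stopping sketch (illustrative, not the authors' procedure).
# Training stops once validation loss has failed to improve for `patience`
# consecutive epochs, instead of relying on a fixed epoch count alone.

class EarlyStopping:
    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum change that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True to stop training."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


if __name__ == "__main__":
    # Synthetic losses: steady improvement, then a plateau (onset of overfitting).
    losses = [0.90, 0.70, 0.55, 0.50, 0.51, 0.52, 0.53, 0.54]
    stopper = EarlyStopping(patience=3)
    for epoch, loss in enumerate(losses, start=1):
        if stopper.step(loss):
            print(f"Stopped at epoch {epoch}; best val loss {stopper.best:.2f}")
            break
```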
Requirements for the image set used for model training and verification have been defined; these requirements are best met by the INRIA image set (France).
GoogLeNet (USA) was found to be a pre-trained model capable of recognizing objects in images, but its recognition reliability is insufficient, so the effectiveness of object recognition in images needs to be improved. It is advisable to use the GoogLeNet architecture to build a specialized model that, by changing parameters and retraining some of the layers, allows objects in images to be recognized more reliably.
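A minimal sketch of this transfer-learning idea, assuming the torchvision implementation of GoogLeNet; the choice of framework, `num_classes`, and which layers to unfreeze are assumptions for illustration, not the authors' exact configuration.

```python
# Hedged sketch: adapt a pre-trained GoogLeNet by freezing most layers and
# retraining only the deepest Inception block plus a new classifier head.
# Framework (PyTorch/torchvision), num_classes, and the unfreezing choice
# are assumptions for illustration.
import torch.nn as nn
from torchvision import models

num_classes = 2  # e.g. "object" vs. "background"; assumed for the example

model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)

# Freeze the pre-trained feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze the deepest Inception block so its weights can be fine-tuned.
for param in model.inception5b.parameters():
    param.requires_grad = True

# Replace the final classifier; the new layer is trainable by default.
model.fc = nn.Linear(model.fc.in_features, num_classes)
```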
Ten models were trained by varying the following parameters: the learning rate, the number of epochs, the optimization algorithm, the learning-rate decay policy, the gamma or power coefficient, and the pre-trained base model.
A convolutional neural network has been developed to improve the precision and efficiency of object recognition in images. The optimal training parameters were determined: a learning rate of 0.000025, 100 epochs, a power coefficient of 0.25, etc. A 3 % increase in precision was obtained, which confirms the proper choice of the network architecture and the selection of its parameters; this allows the network to be used for practical tasks of object recognition in images.
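For illustration, the reported settings can be reproduced with a polynomial learning-rate decay, interpreting the power coefficient as the exponent of a Caffe-style "poly" schedule; this interpretation, the Adam optimizer, the toy model, and the dummy data below are assumptions of the sketch, not the authors' confirmed setup.

```python
# Illustrative sketch of the reported settings: learning rate 0.000025,
# 100 epochs, power coefficient 0.25, here interpreted as the exponent of
# a Caffe-style "poly" learning-rate schedule. Optimizer, model, and data
# are stand-ins.
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(10, 2)  # stand-in for the actual network
optimizer = torch.optim.Adam(model.parameters(), lr=0.000025)

max_epochs, power = 100, 0.25
# lr(epoch) = base_lr * (1 - epoch / max_epochs) ** power
scheduler = LambdaLR(optimizer, lr_lambda=lambda e: (1 - e / max_epochs) ** power)

x, y = torch.randn(8, 10), torch.randn(8, 2)  # dummy training batch
loss_fn = torch.nn.MSELoss()

for epoch in range(max_epochs):
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
    scheduler.step()
```

With a power of 0.25, the rate stays near its base value for most of training and falls off sharply only in the final epochs.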
License
Copyright (c) 2021 Bogdan Knysh, Yaroslav Kulyk
This work is licensed under a Creative Commons Attribution 4.0 International License.
The terms and conditions for the transfer of copyright (identification of authorship) are set out in the License Agreement. In particular, the authors retain authorship of their manuscript and grant the journal the right of first publication of the work under the terms of the Creative Commons CC BY license. At the same time, they may conclude additional agreements on their own for the non-exclusive distribution of the work in the form in which it was published by this journal, provided that a link to the first publication of the article in this journal is preserved.
A license agreement is a document in which the author warrants that he or she owns all copyright for the work (manuscript, article, etc.).
By signing the License Agreement with TECHNOLOGY CENTER PC, the authors retain all rights to the further use of their work, provided that they link to our edition in which the work was published.
Under the terms of the License Agreement, the Publisher TECHNOLOGY CENTER PC does not take away the authors' copyrights; it receives permission from the authors to use and disseminate the publication through the world's scientific resources (its own electronic resources, scientometric databases, repositories, libraries, etc.).
In the absence of a signed License Agreement, or in the absence in that agreement of identifiers allowing the author's identity to be established, the editors have no right to work with the manuscript.
It is important to remember that there is another type of agreement between authors and publishers, under which copyright is transferred from the authors to the publisher. In that case, the authors lose ownership of their work and may not use it in any way.