Improving a model of object recognition in images based on a convolutional neural network

Authors

Bogdan Knysh, Yaroslav Kulyk

DOI:

https://doi.org/10.15587/1729-4061.2021.233786

Keywords:

image processing, object recognition, convolutional neural networks, unmanned aerial vehicle

Abstract

This paper considers a model of object recognition in images based on convolutional neural networks and studies the efficiency of training the deep layers of such networks. There are objective difficulties in determining the optimal characteristics of a neural network, which gives rise to the problem of overfitting (over-training) the network. Eliminating overfitting by selecting only the optimal number of epochs is insufficient, since it does not provide high accuracy.
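
The epoch-count issue noted above is commonly handled by monitoring a held-out validation set and stopping once its loss stops improving. The following is a minimal early-stopping sketch in PyTorch; the model, data loaders, and patience value are illustrative assumptions, not part of the paper's method.

```python
import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader,
                              criterion, optimizer,
                              max_epochs=100, patience=10):
    """Stop training when the validation loss stops improving,
    instead of relying on a fixed, pre-chosen number of epochs."""
    best_loss = float("inf")
    best_state = None
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

        # Evaluate on the held-out validation set.
        model.eval()
        val_loss, batches = 0.0, 0
        with torch.no_grad():
            for images, labels in val_loader:
                val_loss += criterion(model(images), labels).item()
                batches += 1
        val_loss /= max(batches, 1)

        if val_loss < best_loss:
            best_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # further epochs would only overfit the training set

    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```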

The requirements for the image set used for model training and verification have been defined; the INRIA image set (France) meets these requirements best.

GoogLeNet (USA) was found to be a pre-trained model capable of recognizing objects in images, but its recognition reliability is insufficient. It therefore becomes necessary to improve the effectiveness of object recognition in images. It is advisable to use the GoogLeNet architecture to build a specialized model that, by changing the parameters and retraining some of the layers, could recognize objects in images more accurately.
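
As a rough illustration of the transfer-learning approach described above (reusing the GoogLeNet architecture and retraining only some layers), a minimal PyTorch sketch is given below. The frozen-layer choice, class count, and optimizer settings are assumptions for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load GoogLeNet with weights pre-trained on ImageNet.
model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)

# Freeze all layers first, so only what is explicitly replaced
# or unfrozen below gets retrained.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classifier with one sized for the target task.
# NUM_CLASSES is a placeholder; the abstract does not fix it.
NUM_CLASSES = 2
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Optimize only the parameters that remain trainable (the new classifier).
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=2.5e-5, momentum=0.9)
```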

Ten models were trained by varying the following parameters: the learning rate, the number of epochs, the optimization algorithm, the learning rate decay policy, the gamma or power coefficient, and the pre-trained model.
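
The abstract names the parameters that were varied but not the exact ten combinations; the sketch below merely shows how such a set of training configurations could be enumerated, using placeholder values.

```python
from itertools import product

# Placeholder values: the abstract names the varied parameters
# but not the exact ten combinations used by the authors.
learning_rates = [1e-4, 5e-5, 2.5e-5]
optimizers = ["SGD", "Adam"]

configs = [
    {
        "learning_rate": lr,
        "epochs": 100,
        "optimizer": opt,
        "lr_policy": "poly",     # how the learning rate changes during training
        "power_or_gamma": 0.25,  # exponent for "poly", decay factor for "exp"/"step"
        "pretrained_model": "GoogLeNet",
    }
    for lr, opt in product(learning_rates, optimizers)
]

for i, cfg in enumerate(configs, start=1):
    print(f"model {i}: {cfg}")
```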

A convolutional neural network has been developed that improves the precision and efficiency of object recognition in images. The optimal training parameters were determined: a learning rate of 0.000025, 100 epochs, a power coefficient of 0.25, etc. A 3 % increase in precision was obtained, which confirms the proper choice of architecture for the developed network and the selection of its parameters, and makes the network suitable for practical tasks of object recognition in images.
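
The reported power coefficient of 0.25 is consistent with the "poly" learning-rate policy used in Caffe/DIGITS-style training, lr = base_lr * (1 - iter / max_iter)^power. The sketch below evaluates that schedule with the reported base rate of 0.000025; the iteration budget is an assumption for illustration only.

```python
def poly_learning_rate(base_lr, iteration, max_iterations, power):
    """'poly' policy as used in Caffe/DIGITS-style training:
    lr = base_lr * (1 - iter / max_iter) ** power."""
    return base_lr * (1.0 - iteration / max_iterations) ** power

# Reported optimum: base rate 0.000025, power 0.25.
# The iteration budget below is an illustrative assumption.
base_lr, power, max_iterations = 2.5e-5, 0.25, 10_000
for it in (0, 2_500, 5_000, 7_500):
    print(it, poly_learning_rate(base_lr, it, max_iterations, power))
```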

Author Biographies

Bogdan Knysh, Vinnytsia National Technical University

PhD, Associate Professor

Department of Electronics and Nanosystems

Yaroslav Kulyk, Vinnytsia National Technical University

PhD, Associate Professor

Department of Automation and Intelligent Information Technologies

References

  1. Bilinskiy, Y. Y., Knysh, B. P., Kulyk, Y. A. (2017). Quality estimation methodology of filter performance for suppression noise in the MathCAD package. Herald of Khmelnytskyi National University, 3, 125–130. Available at: http://ir.lib.vntu.edu.ua/bitstream/handle/123456789/23238/47857.pdf?sequence=2&isAllowed=y
  2. Gall, J., Razavi, N., Van Gool, L. (2012). An Introduction to Random Forests for Multi-class Object Detection. Outdoor and Large-Scale Real-World Scene Analysis, 243–263. doi: https://doi.org/10.1007/978-3-642-34091-8_11
  3. Viola, P., Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001. doi: https://doi.org/10.1109/cvpr.2001.990517
  4. Hu, W., Hu, W., Maybank, S. (2008). AdaBoost-Based Algorithm for Network Intrusion Detection. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 38 (2), 577–583. doi: https://doi.org/10.1109/tsmcb.2007.914695
  5. Shang, W., Sohn, K., Almeida, D., Lee, H. (2016). Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units. Proceedings of The 33rd International Conference on Machine Learning, 48, 2217–2225. Available at: http://proceedings.mlr.press/v48/shang16.html
  6. Simonyan, K., Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR. Available at: https://arxiv.org/pdf/1409.1556.pdf
  7. Redmon, J., Divvala, S., Girshick, R., Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). doi: https://doi.org/10.1109/cvpr.2016.91
  8. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D. et. al. (2015). Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). doi: https://doi.org/10.1109/cvpr.2015.7298594
  9. Prathap, G., Afanasyev, I. (2018). Deep Learning Approach for Building Detection in Satellite Multispectral Imagery. 2018 International Conference on Intelligent Systems (IS). doi: https://doi.org/10.1109/is.2018.8710471
  10. Wu, K., Chen, Z., Li, W. (2018). A Novel Intrusion Detection Model for a Massive Network Using Convolutional Neural Networks. IEEE Access, 6, 50850–50859. doi: https://doi.org/10.1109/access.2018.2868993
  11. Maggiori, E., Tarabalka, Y., Charpiat, G., Alliez, P. (2017). Can semantic labeling methods generalize to any city? The inria aerial image labeling benchmark. 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS). doi: https://doi.org/10.1109/igarss.2017.8127684
  12. Knysh, B., Kulyk, Y. (2021). Development of an image segmentation model based on a convolutional neural network. Eastern-European Journal of Enterprise Technologies, 2 (2 (110)), 6–15. doi: https://doi.org/10.15587/1729-4061.2021.228644
  13. Krizhevsky, A., Sutskever, I., Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems, 1097–1105. Available at: https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
  14. Zeiler, M. D., Fergus, R. (2014). Visualizing and Understanding Convolutional Networks. Lecture Notes in Computer Science, 818–833. doi: https://doi.org/10.1007/978-3-319-10590-1_53
  15. Deep Learning: GoogLeNet Explained. Towards Data Science. Available at: https://towardsdatascience.com/deep-learning-googlenet-explained-de8861c82765
  16. Tao, A., Barker, J., Sarathy, S. (2016). DetectNet: Deep Neural Network for Object Detection in DIGITS. NVIDIA Developer Blog. Available at: https://developer.nvidia.com/blog/detectnet-deep-neural-network-object-detection-digits
  17. Kingma, D. P., Ba, J. (2015). Adam: a method for stochastic optimization. ICLR 2015. Available at: https://arxiv.org/pdf/1412.6980.pdf
  18. Kvetny, R. N., Masliy, R. V., Kyrylenko, O. M. (2020). Detection and classification of traffic objects using the DIGITS environment. Optoelectronic Information-Power Technologies, 1 (39), 14–20. doi: https://doi.org/10.31649/1681-7893-2020-39-1-14-20
  19. Wilson, A. C., Roelofs, R., Stern, M., Srebro, N., Recht, B. (2017). The marginal value of adaptive gradient methods in machine learning. 31st Conference on Neural Information Processing Systems (NIPS 2017). Available at: https://arxiv.org/pdf/1705.08292v2.pdf
  20. Guo, Z., Chen, Q., Wu, G., Xu, Y., Shibasaki, R., Shao, X. (2017). Village Building Identification Based on Ensemble Convolutional Neural Networks. Sensors, 17 (11), 2487. doi: https://doi.org/10.3390/s17112487
  21. Erdem, F., Avdan, U. (2020). Comparison of Different U-Net Models for Building Extraction from High-Resolution Aerial Imagery. International Journal of Environment and Geoinformatics, 7 (3), 221–227. doi: https://doi.org/10.30897/ijegeo.684951
  22. Nvidia Aerial Drone Dataset. Available at: https://nvidia.box.com/shared/static/ft9cc5yjvrbhkh07wcivu5ji9zola6i1.gz

Published

2021-06-30

How to Cite

Knysh, B., & Kulyk, Y. (2021). Improving a model of object recognition in images based on a convolutional neural network. Eastern-European Journal of Enterprise Technologies, 3 (9 (111)), 40–50. https://doi.org/10.15587/1729-4061.2021.233786

Issue

Vol. 3 No. 9 (111) (2021)

Section

Information and controlling system