Investigation of computer vision techniques for indoor navigation systems
DOI: https://doi.org/10.30837/2522-9818.2025.2.005
Keywords: system; localization; navigation; blindness; recognition; computer vision; classification.
Abstract
The subject of this article is the development and implementation of computer vision methods that can be integrated into an indoor navigation system designed for individuals with visual impairments. The goal of the study is to enhance such a system with advanced object recognition capabilities in enclosed environments by combining modern technologies, including artificial intelligence, spatial analysis, voice control, and Bluetooth-based localization. To achieve this, a number of tasks were carried out. These included an analysis of the problem domain and justification of the study’s relevance, a comparison of existing solutions, and the development of a generalized model of the navigation system with a voice interface, enabling real-time search for locations and items. A specialized dataset was prepared, containing images of key obstacle classes typically encountered in indoor environments – such as shopping carts, barrier tape, forklifts, and people. A new two-stage object recognition method was proposed to detect these classes in complex scenes. Additionally, a comparative analysis of deep learning architectures for object detection was conducted, followed by experimental studies to assess training quality and system robustness. The research employed various image preprocessing methods – bilateral filtering, Gaussian blurring, enhancement of specific color channels, motion blur removal, and noise reduction using averaging filters – as well as neural network-based methods for data analysis and statistical evaluation approaches. The results demonstrate that the proposed method significantly improves object detection performance on real-world images, achieving an average intersection-over-union (IoU) of 68% and a confidence level of 69%, which is 79% and 89% higher, respectively, compared to baseline recognition results on noisy inputs. 
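The reported average intersection-over-union (IoU) of 68% refers to the standard overlap metric between predicted and ground-truth bounding boxes. As an illustrative sketch (not the authors' implementation), IoU for axis-aligned boxes in `(x1, y1, x2, y2)` form can be computed as:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Example: half-overlapping boxes give IoU = 50 / 150 ≈ 0.333
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

An IoU threshold (commonly 0.5) is what turns this continuous overlap score into the hit/miss decisions underlying detection metrics.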
However, the findings also revealed the need to integrate additional sensors, such as LiDAR, to reliably detect low-contrast or reflective obstacles like glass storefronts, which are difficult to identify using computer vision alone. In conclusion, the study confirms that the proposed two-stage preprocessing and recognition pipeline significantly enhances navigation system performance for users with visual impairments, while also highlighting the importance of combining vision-based methods with complementary sensing technologies to ensure safe and reliable operation in complex indoor environments.
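One of the preprocessing steps named above, noise reduction with an averaging filter, can be sketched in plain Python as a mean (box) filter over a grayscale grid. This is a minimal illustration of the idea, not the authors' pipeline, which would typically use a library implementation such as OpenCV:

```python
def box_filter(img, k=3):
    """Average (box) filter over a 2D grayscale image given as a list of rows.

    Border pixels are averaged over their valid neighbors only, so a uniform
    image stays uniform after filtering.
    """
    h, w = len(img), len(img[0])
    r = k // 2  # neighborhood radius
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the k-by-k neighborhood clipped to the image bounds.
            vals = [img[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out

# A noisy spike is smoothed toward its neighbors' mean.
noisy = [[0, 0, 0],
         [0, 90, 0],
         [0, 0, 0]]
print(box_filter(noisy)[1][1])  # center becomes 90 / 9 = 10.0
```

Bilateral filtering follows the same neighborhood-averaging pattern but additionally weights neighbors by intensity similarity, which is why it smooths noise while preserving edges.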
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Our journal abides by the Creative Commons copyright rights and permissions for open access journals.
Authors who publish with this journal agree to the following terms:
Authors hold the copyright without restrictions and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-commercial and non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their published work online (e.g., in institutional repositories or on their website) as it can lead to productive exchanges, as well as earlier and greater citation of published work.