Development of an object-oriented image comparison algorithm for efficient search
DOI:
https://doi.org/10.30837/2522-9818.2025.2.079
Keywords:
image processing; object detection; deep learning; image descriptor; image retrieval; big data; image storage; search optimization; information technology.
Abstract
The object of research is content-based image retrieval (CBIR). The subject of the study is the models and methods used for CBIR and for managing large volumes of media content in extensive image storage systems. The goal of the research is to develop an algorithm for comparing object-oriented image descriptors: the descriptors are built with advanced computer vision models for object detection, and efficient methods are constructed for comparing and searching them. The proposed descriptor and comparison algorithm aim to improve the efficiency and accuracy of image search and management.
The tasks are: to analyze modern approaches and solutions for creating and comparing image descriptors and their use in CBIR; to develop metrics and algorithms for comparing image descriptors that use information about detected objects, such as their types, sizes, and locations, for image search in large data repositories; to conduct experiments that evaluate the proposed image search algorithm and compare its efficiency with existing solutions.
The methodology includes: a comprehensive review of advanced image descriptor generation methods, including hash-based, handcrafted, and deep learning-based descriptors; an analysis of how existing descriptors are used in CBIR systems, focusing on their advantages and limitations; an evaluation of the best image search algorithms, including deep learning-based approaches; the development of an object descriptor comparison algorithm for tag-based search, image-based search, and other tasks.
The results obtained are as follows: an object-based image descriptor was developed using state-of-the-art machine learning models for object detection; metrics and comparison algorithms for the proposed descriptor were developed, enabling its use for CBIR in large data repositories; a series of experiments was conducted to assess the efficiency and search quality of the proposed descriptor and algorithms in large-scale image storage systems. The experiments compared their performance with existing methods and revealed their advantages and limitations. The advantages are: faster descriptor generation; faster descriptor comparison than hashed, handcrafted, and deep learning-based descriptors; efficient image filtering in storage; higher search quality and speed for image-based queries. The limitation is that the descriptor's effectiveness depends on the quality of the model and data used for object detection: images without detected objects do not appear in search results, which may limit search completeness.
Conclusions: The developed algorithm for comparing object-oriented image descriptors is an effective tool for solving various CBIR tasks. The obtained results are satisfactory, as the proposed image search algorithm outperforms most alternatives in terms of speed and search quality. A promising direction for future research is the development of a CBIR system based on the proposed descriptor and algorithms, enhanced by parallel and distributed computing and refined for specific applications. This would allow its use not only for general-purpose images but also for more precise scientific domains.
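To make the idea of comparing descriptors by object type, size, and location more concrete, the following is a minimal sketch and not the algorithm from the paper. The abstract does not specify the metric, so everything here is an assumption made for illustration: the `DetectedObject` fields, the equal weighting of location and size, the greedy one-to-one matching, and the function names `object_similarity`, `descriptor_similarity`, and `filter_by_tags`. Bounding boxes are assumed to be normalized to the range [0, 1].

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class DetectedObject:
    """One detected object (hypothetical layout, coordinates normalized to [0, 1])."""
    label: str   # object type reported by the detector
    cx: float    # bounding-box center, x
    cy: float    # bounding-box center, y
    w: float     # bounding-box width
    h: float     # bounding-box height


# In this sketch, an image descriptor is simply the list of its detected objects.
Descriptor = List[DetectedObject]


def object_similarity(a: DetectedObject, b: DetectedObject) -> float:
    """Score two objects in [0, 1]; objects of different types score 0."""
    if a.label != b.label:
        return 0.0
    # Location term: 1 when centers coincide, 0 at opposite image corners.
    location = 1.0 - math.hypot(a.cx - b.cx, a.cy - b.cy) / math.sqrt(2.0)
    # Size term: ratio of the smaller box area to the larger one.
    area_a, area_b = a.w * a.h, b.w * b.h
    larger = max(area_a, area_b)
    size = min(area_a, area_b) / larger if larger > 0 else 0.0
    # Equal weights are an arbitrary choice for this sketch.
    return 0.5 * location + 0.5 * size


def descriptor_similarity(query: Descriptor, candidate: Descriptor) -> float:
    """Greedily match each query object to its best unused candidate object."""
    if not query or not candidate:
        return 0.0  # an image with no detections cannot match anything
    used: Set[int] = set()
    total = 0.0
    for q in query:
        best_score, best_index = 0.0, None
        for i, c in enumerate(candidate):
            if i in used:
                continue
            score = object_similarity(q, c)
            if score > best_score:
                best_score, best_index = score, i
        if best_index is not None:
            used.add(best_index)
            total += best_score
    # Dividing by the larger object count penalizes unmatched objects.
    return total / max(len(query), len(candidate))


def filter_by_tags(index: Dict[str, Descriptor], tags: Set[str]) -> List[str]:
    """Tag-based search: keep images whose descriptor contains every tag."""
    return [name for name, desc in index.items()
            if tags <= {obj.label for obj in desc}]
```

In this sketch an image with an empty descriptor scores 0 against every query and never passes a tag filter, which mirrors the limitation noted in the abstract: images without detected objects do not appear in search results.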
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.