Development of a combined image recognition model
The object of research is the processes of identification and classification of objects in computer vision tasks. Currently, artificial neural networks demonstrate the best results in image recognition. However, training a neural network is an ill-conditioned problem: even a large data set may carry little information about the problem being solved. Training data therefore plays a key role in the synthesis of the parameters of a specific neural network model. Selecting a representative training set is one of the most difficult tasks in machine learning and is not always possible in practice.
The paper proposes a new combined image recognition model based on non-force interaction theory. The model has the following key features:
– it is designed to handle large amounts of data;
– it selects useful information from an arbitrary stream;
– it allows new objects to be added naturally;
– it is tolerant of errors and allows the behavior of the system to be reprogrammed quickly.
In all experimental studies, the recognition accuracy of the proposed model was higher than that of the known recognition methods it was compared with. The average recognition accuracy of the proposed model was 71.3 %, compared with 59.9 % for local binary patterns, 65.2 % for principal component analysis, and 65.6 % for linear discriminant analysis. This accuracy, combined with the computational complexity of the method, makes it acceptable for use in systems operating in conditions close to real time. The approach also makes it possible to control the recognition accuracy by adjusting two parameters: the number of sectors in the local binary pattern histograms used to describe images, and the number of image fragments used at the classification stage by the introformation approach. The number of image fragments largely determines the classification time, since the agreement of the system's actions must be calculated pairwise in each of the possible directions.
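The role of the two tuning parameters can be illustrated with a minimal sketch of a fragment-wise local binary pattern descriptor. It is not the implementation used in the paper. It assumes the basic 8-neighbour LBP operator, treats the "number of sectors" as the number of histogram bins (`n_bins`), and treats "image fragments" as cells of a regular grid (`grid`). The function names `lbp_codes` and `fragment_histograms` are hypothetical, and the classification stage based on the introformation approach is not shown.

```python
import numpy as np


def lbp_codes(image):
    """Basic 8-neighbour LBP code for each interior pixel of a 2-D grayscale array."""
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes


def fragment_histograms(image, grid=(2, 2), n_bins=16):
    """Concatenate normalised LBP histograms computed over a grid of fragments.

    Assumption: n_bins stands in for the "number of sectors" and grid for the
    "number of image fragments"; both names are illustrative only.
    """
    codes = lbp_codes(image)
    features = []
    for row_block in np.array_split(codes, grid[0], axis=0):
        for fragment in np.array_split(row_block, grid[1], axis=1):
            hist, _ = np.histogram(fragment, bins=n_bins, range=(0, 256))
            features.append(hist / max(hist.sum(), 1))
    return np.concatenate(features)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64))
    descriptor = fragment_histograms(image, grid=(4, 4), n_bins=16)
    print(descriptor.shape)  # one n_bins-long histogram per fragment
```

In this sketch the descriptor length is the number of fragments multiplied by `n_bins`, so raising either parameter gives a finer description at a higher cost. If fragments are then compared pairwise, as the abstract describes, the number of comparisons grows with the number of fragments, which is consistent with the stated effect on classification time.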
Copyright (c) 2019 Mykola Voloshyn
This work is licensed under a Creative Commons Attribution 4.0 International License.
ISSN (print) 2664-9969, ISSN (on-line) 2706-5448