DEVELOPMENT OF AN INTELLIGENT PROCESSING SYSTEM MODULE FOR SCANNED DOCUMENTS BASED ON THE COMBINED IMAGE SEGMENTATION METHOD

Authors

Alesya Ishchenko

DOI:

https://doi.org/10.30837/2522-9818.2019.8.044

Keywords:

image segmentation, scanned documents, document processing, intelligent system

Abstract

The subject of this study is a segmentation module built on a combined image segmentation method and embedded in the intelligent system for processing scanned documents used at the Odessa printing company “Studio “Print””. The aim of the work is to develop an image segmentation module that improves the efficiency of this intelligent system. To this end, the module uses the combined method of segmenting scanned document images, which reduces image processing time. The article addresses the following tasks: analysis of existing image segmentation methods used in intelligent systems for processing scanned documents; development of the procedures of a segmentation module, based on the combined image segmentation method, for such a system. The following methods are used: digital image processing, filtering and morphological image analysis, mathematical analysis, and neural networks. Results: image processing with the intelligent system built on the proposed segmentation module confirms that the module's procedures are operable. The average processing time per scanned document image was 5.3 seconds, compared with the 42 seconds obtained previously, which indicates that the efficiency of the system has increased. Conclusions: introducing the developed image segmentation module into the intelligent system for processing scanned documents at the printing company “Studio “Print”” reduced the processing time of scanned document images roughly eightfold while maintaining sufficient quality, which increased the efficiency of the system.
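As a quick check of the figures reported in the abstract, the sketch below recomputes the speedup from the two average processing times. The variable names are illustrative assumptions; only the values 42 and 5.3 seconds come from the abstract.

```python
# Average per-image processing times reported in the abstract (seconds).
previous_time_s = 42.0   # before the segmentation module was introduced
module_time_s = 5.3      # with the proposed segmentation module

# Speedup factor and relative reduction in processing time.
speedup = previous_time_s / module_time_s
time_reduction = 1.0 - module_time_s / previous_time_s

print(f"speedup: {speedup:.1f}x")               # speedup: 7.9x
print(f"time reduction: {time_reduction:.0%}")  # time reduction: 87%
```

The ratio of about 7.9 is consistent with the roughly eightfold reduction stated in the conclusions.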

Author Biography

Alesya Ishchenko, Odessa National Polytechnic University

Senior Lecturer of the Department of Applied Mathematics and Information Technologies

Published

2019-06-24

How to Cite

Ishchenko, A. (2019). DEVELOPMENT OF AN INTELLIGENT PROCESSING SYSTEM MODULE FOR SCANNED DOCUMENTS BASED ON THE COMBINED IMAGE SEGMENTATION METHOD. INNOVATIVE TECHNOLOGIES AND SCIENTIFIC SOLUTIONS FOR INDUSTRIES, (2 (8)), 44–53. https://doi.org/10.30837/2522-9818.2019.8.044

Issue

No. 2 (8) (2019)

Section

INFORMATION TECHNOLOGY