DEVELOPMENT OF AN INTELLIGENT PROCESSING SYSTEM MODULE FOR SCANNED DOCUMENTS BASED ON THE COMBINED IMAGE SEGMENTATION METHOD
DOI:
https://doi.org/10.30837/2522-9818.2019.8.044
Keywords:
image segmentation, scanned documents, document processing, intelligent system
Abstract
The subject of the research is a segmentation module built on a combined image segmentation method and embedded in the intelligent system for processing scanned documents used at the Odessa printing company Studio "Print". The aim of the work is to develop an image segmentation module that improves the efficiency of this intelligent system. To this end, the combined method for segmenting images of scanned documents is used, which reduces image processing time. The article addresses the following tasks: analysis of the existing image segmentation methods used in intelligent systems for processing scanned documents; development of the procedures of a segmentation module, based on the combined image segmentation method, for such a system. The following methods are used: digital image processing, filtering and morphological image analysis, mathematical analysis, and neural networks. The following results were obtained: processing images with the intelligent system built on the proposed segmentation module confirms that the module's procedures are operable. The average processing time for images of scanned documents was 5.3 seconds, compared with 42 seconds obtained previously, which indicates that the efficiency of the system has increased. Conclusions: introducing the developed image segmentation module into the intelligent system for processing scanned documents at Studio "Print" reduced the processing time of scanned document images by a factor of about 8 while maintaining sufficient quality, which increased the efficiency of the system.
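The "about 8 times" figure follows directly from the two average processing times quoted in the abstract. The short check below is only an illustrative sketch: the function and variable names are hypothetical and are not taken from the article, and the only inputs are the two reported averages.

```python
# Average processing times per scanned document image, as reported
# in the abstract (seconds). All names here are illustrative only.
previous_time_s = 42.0
new_time_s = 5.3


def speedup(before: float, after: float) -> float:
    """Return how many times shorter `after` is than `before`."""
    if before <= 0 or after <= 0:
        raise ValueError("processing times must be positive")
    return before / after


def time_saved_fraction(before: float, after: float) -> float:
    """Return the share of the original processing time that was removed."""
    if before <= 0 or after <= 0:
        raise ValueError("processing times must be positive")
    return 1.0 - after / before


factor = speedup(previous_time_s, new_time_s)
saved = time_saved_fraction(previous_time_s, new_time_s)

print(f"speedup: {factor:.1f}x")    # speedup: 7.9x
print(f"time saved: {saved:.0%}")   # time saved: 87%
```

Rounded to a whole number, the ratio 42 / 5.3 gives the factor of 8 stated in the conclusions.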
License
Copyright (c) 2019 Alesya Ishchenko
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.