Combined method for scanned documents images segmentation using sequential extraction of regions
DOI:
https://doi.org/10.15587/1729-4061.2018.142735
Keywords:
image segmentation, scanned document, block method, graphics, photographic image, text fragment, connected component, Bloomberg method
Abstract
We propose a combined method for segmenting scanned document images. In contrast to known methods, it first separates the graphics and photograph regions from the text regions and the background. This step relies on an analysis of connected components, whose properties differ for graphics, photographs, and text regions. A block method is then used to classify the selected regions as photographs or graphics. It was established that splitting the already selected regions into blocks degrades segmentation quality less than applying the block method directly to the original image. To extract the text regions, whose shape is more complex, from the background, the neighborhood of each pixel is processed.
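The connected-component analysis mentioned above can be illustrated with a minimal sketch. It assumes a binarized page stored as a list of rows (1 for foreground, 0 for background) and 4-connectivity. The function name `connected_components` and the area rule `MIN_ILLUSTRATION_AREA` are hypothetical and are not taken from the paper, which does not state its component criteria in the abstract.

```python
from collections import deque


def connected_components(binary):
    """Label 4-connected foreground (1) pixels; return one record per component."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area, top, left, bottom, right = 0, r, c, r, c
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append({"area": area, "bbox": (top, left, bottom, right)})
    return components


# Toy page: one large blob (illustration-like) and two small ones (character-like).
page = [
    [1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1],
]
components = connected_components(page)

# Hypothetical rule (an assumption): large components are illustration candidates.
MIN_ILLUSTRATION_AREA = 10
illustrations = [c for c in components if c["area"] >= MIN_ILLUSTRATION_AREA]
```

In this toy case only the 12-pixel blob passes the area rule; a real criterion would likely combine several component properties.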
To detect the boundaries of illustrations in scanned document images, we applied the Bloomberg method. To classify an illustration as a photograph or graphics, we propose splitting it into blocks of pixels. Each block is described by a vector of two features: the mean value of the local gradient magnitude, and the mean value of a function that localizes linear objects (graphics and text characters) in scanned document images. The resulting feature vectors were classified using a support vector machine.
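As a rough illustration of the block features, the sketch below computes only the first one, the mean local gradient magnitude per block, for a grayscale image stored as a list of rows. The gradient is approximated with unnormalized central differences clamped at the image borders, which is an assumption; the second feature (the line-localizing function) and the support vector machine are not shown. The names `gradient_magnitude` and `block_features` are hypothetical.

```python
def gradient_magnitude(img, y, x):
    """Gradient magnitude at (y, x) from central differences, clamped at borders."""
    rows, cols = len(img), len(img[0])
    gx = img[y][min(x + 1, cols - 1)] - img[y][max(x - 1, 0)]
    gy = img[min(y + 1, rows - 1)][x] - img[max(y - 1, 0)][x]
    return (gx * gx + gy * gy) ** 0.5


def block_features(img, block):
    """Return {(block_row, block_col): mean gradient magnitude} for each block."""
    rows, cols = len(img), len(img[0])
    features = {}
    for by in range(0, rows, block):
        for bx in range(0, cols, block):
            values = [
                gradient_magnitude(img, y, x)
                for y in range(by, min(by + block, rows))
                for x in range(bx, min(bx + block, cols))
            ]
            features[(by // block, bx // block)] = sum(values) / len(values)
    return features


# Toy illustration: a flat left half and a right half containing a sharp edge.
illustration = [[100, 100, 100, 100, 0, 0, 255, 255] for _ in range(4)]
features = block_features(illustration, block=4)
```

The block that contains the sharp edge gets a larger mean gradient than the flat block, which is the kind of contrast a classifier could exploit once the second feature is added to each vector.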
When extracting the text regions, we applied low-pass (low-frequency) filtering and thresholding.
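A minimal sketch of this step is given below under stated assumptions: the low-pass filter is taken to be a 3x3 averaging filter, the threshold is chosen with Otsu's method, and text is assumed to be darker than the background. The abstract does not specify the filter or the threshold rule, so these choices and the names `box_filter`, `otsu_threshold`, and `extract_text_mask` are illustrative only.

```python
def box_filter(img):
    """3x3 averaging (low-pass) filter; border pixels average in-image neighbours."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            window = [
                img[ny][nx]
                for ny in range(max(y - 1, 0), min(y + 2, rows))
                for nx in range(max(x - 1, 0), min(x + 2, cols))
            ]
            out[y][x] = round(sum(window) / len(window))
    return out


def otsu_threshold(img):
    """Return the grey level t that maximises between-class variance (<= t vs > t)."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (total_sum - sum0) / w1
        var = w0 * w1 * (mean0 - mean1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def extract_text_mask(img):
    """Smooth, then mark pixels at or below the threshold as text (assumed dark)."""
    smoothed = box_filter(img)
    t = otsu_threshold(smoothed)
    return [[1 if v <= t else 0 for v in row] for row in smoothed]


# Toy scan: light background (220) with a one-pixel-wide dark stroke (20).
scan = [[220, 220, 20, 220, 220] for _ in range(5)]
mask = extract_text_mask(scan)
```

Because the smoothing spreads the stroke into its neighbours, the resulting mask covers a band around the stroke rather than the stroke alone, which is how low-pass filtering tends to merge nearby characters into a text region.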
The combined method was implemented and used to segment test images of scanned newspaper articles from the MediaTeam document database of the University of Oulu (Finland). It was established that the combined method increases segmentation speed while maintaining high processing quality.
License
Copyright (c) 2018 Marina Polyakova, Alesya Ishchenko, Natalya Volkova, Oleg Pavlov
This work is licensed under a Creative Commons Attribution 4.0 International License.