Development of text extraction technique using optical character recognition and morphological reconstruction to eliminate artifacts of image’s background
DOI:
https://doi.org/10.15587/1729-4061.2022.252803
Keywords:
Morphological Reconstruction, Optical Character Recognition (OCR), document images, non-uniform illumination images
Abstract
Text recognition in images is useful in a wide range of computer vision applications, such as robot navigation, document analysis, and image search. Optical character recognition (OCR) offers a simple way to add text recognition functionality to many industrial and educational applications. The best OCR results are obtained when the text image has a uniform background and resembles a scanned document. In contrast, accurate recognition becomes difficult when the image has a non-uniform background, which requires further preprocessing to obtain an acceptable OCR result. This work discusses three scenarios. First, OCR is tested on an ordinary business card, as an image with a uniform background. Next, text recognition is discussed for a keypad image containing digits on a non-uniform background. Here, two preprocessing algorithms are used to enhance OCR, overcome the negative effect of the non-uniform background, and detect text with high accuracy. The two algorithms are morphological reconstruction, which eliminates artifacts and produces cleaner images for further processing by OCR, and region of interest (ROI)-based OCR, which targets explicit regions in a tested image. Finally, the developed OCR method is tested on different scanned bills, and the variation in the obtained results is discussed. The effectiveness of morphological-based OCR over the ROI-based method was verified on a dataset of scanned electricity bill images, with an accuracy of 98.2 % for morphological-based OCR versus only about 89.3 % for ROI-based OCR.
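The idea of using morphological reconstruction to suppress a non-uniform background before OCR can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the pipeline, so the choice of opening by reconstruction followed by subtraction, the square structuring element, the assumption of bright text on a darker background, and all function names (`erode`, `dilate`, `reconstruct`, `remove_background`) are illustrative assumptions. The code is plain Python on a tiny list-of-lists image. A real system would use an image processing library and pass the cleaned image to an OCR engine.

```python
def _window(img, y, x, r):
    """Yield the pixels in the (2r+1)x(2r+1) square around (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    return (img[yy][xx]
            for yy in range(max(0, y - r), min(h, y + r + 1))
            for xx in range(max(0, x - r), min(w, x + r + 1)))

def erode(img, r):
    """Grayscale erosion: each pixel becomes the minimum of its window."""
    return [[min(_window(img, y, x, r)) for x in range(len(img[0]))]
            for y in range(len(img))]

def dilate(img, r):
    """Grayscale dilation: each pixel becomes the maximum of its window."""
    return [[max(_window(img, y, x, r)) for x in range(len(img[0]))]
            for y in range(len(img))]

def reconstruct(marker, mask):
    """Reconstruction by dilation: grow the marker under the mask until nothing changes."""
    cur = [[min(m, k) for m, k in zip(m_row, k_row)]
           for m_row, k_row in zip(marker, mask)]
    while True:
        grown = dilate(cur, 1)  # one geodesic step with a 3x3 window
        nxt = [[min(g, k) for g, k in zip(g_row, k_row)]
               for g_row, k_row in zip(grown, mask)]
        if nxt == cur:
            return cur
        cur = nxt

def remove_background(img, r=1):
    """Assumed preprocessing: subtract the opening by reconstruction.

    Structures wider than the (2r+1) window are treated as background and
    removed; thin bright strokes (assumed to be text) are kept.
    """
    background = reconstruct(erode(img, r), img)
    return [[p - b for p, b in zip(p_row, b_row)]
            for p_row, b_row in zip(img, background)]

# Hypothetical test image: a horizontal brightness ramp (uneven illumination)
# with one bright pixel standing in for a thin text stroke.
ramp = [0, 10, 20, 30, 40, 50, 50, 50]
image = [list(ramp) for _ in range(5)]
image[2][3] += 100

cleaned = remove_background(image, r=1)
for row in cleaned:
    print(row)
```

In this toy image the illumination ramp is flattened to zero while the stroke pixel remains, which is the kind of cleaner input an OCR engine handles more reliably. The window radius `r` is a tuning parameter: it has to be larger than the stroke width and smaller than the background structures to be removed.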
License
Copyright (c) 2022 Wasan M Jwaid
This work is licensed under a Creative Commons Attribution 4.0 International License.