An overview of statistical and neural-based line segmentation methods for offline handwriting recognition task
DOI:
https://doi.org/10.15587/2706-5448.2024.298405
Keywords:
handwriting text line segmentation, line splitting, text detection, recognition algorithms, deep neural networks
Abstract
The object of the research is the line segmentation task. Offline handwriting recognition technology is used to recognize handwritten text in documents stored as images. The text recognizer module accepts separate lines as input, so an important preprocessing step is detecting all handwritten text and splitting it into distinct lines.
This paper examines the handwritten text line segmentation task, its requirements, problems, and challenges. It reviews the two main approaches to this task used in modern recognition systems: statistical projection-based methods and neural-based methods. Multiple works and research papers on each approach are reviewed, and their strengths and weaknesses are analyzed with respect to the described tasks, constraints, and input data peculiarities. The overall results are gathered in a single table for comparison.
Based on the latest works that utilize deep neural networks, the paper describes new possibilities for using these methods in recognition systems that were unavailable with traditional statistical segmentation approaches.
Constructive conclusions are drawn from the review, describing the main pros and cons of the two approaches for the line segmentation task. These results can be used to select suitable methods for handwriting recognition systems, improving their performance and quality, and to guide further research in this area.
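As a rough illustration of the statistical projection-based family mentioned in the abstract, the sketch below splits a binarized page into lines using a horizontal projection profile. It is a minimal example under stated assumptions, not the method of any reviewed paper: the function name `segment_lines`, the fixed `threshold` parameter, and the toy `page` array are hypothetical, and the input is assumed to be already binarized (1 = ink, 0 = background) with roughly horizontal, non-touching lines.

```python
from typing import List, Tuple


def segment_lines(image: List[List[int]], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges of text lines, with bottom exclusive.

    Assumes a binary image where 1 marks ink and 0 marks background.
    A row belongs to a line when its ink count exceeds `threshold`
    (a hypothetical fixed threshold chosen for illustration).
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in image]

    lines: List[Tuple[int, int]] = []
    start = None  # top row of the line currently being traced, if any
    for y, count in enumerate(profile):
        if count > threshold and start is None:
            start = y                      # entering a text line
        elif count <= threshold and start is not None:
            lines.append((start, y))       # leaving a text line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy 7x5 "page" with two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
print(segment_lines(page))  # [(1, 3), (5, 7)]
```

A fixed threshold like this one tends to fail on skewed, curved, or touching lines, which is the kind of input peculiarity the review weighs when comparing projection-based and neural-based methods.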
License
Copyright (c) 2024 Oleg Yakovchuk, Walery Rogoza
This work is licensed under a Creative Commons Attribution 4.0 International License.
The assignment of copyright and the conditions for its transfer (identification of authorship) are set out in the License Agreement. In particular, the authors retain the right of authorship of their manuscript and grant the journal the right of first publication of the work under the terms of the Creative Commons CC BY license. They may also enter into separate additional agreements for the non-exclusive distribution of the work in the form in which it was published by this journal, provided that the reference to the first publication of the article in this journal is preserved.