Increasing the accuracy of handwriting text recognition in medical prescriptions with generative artificial intelligence
DOI:
https://doi.org/10.15587/2706-5448.2023.284998
Keywords:
handwriting recognition, generative artificial intelligence, recognition algorithms, deep neural networks
Abstract
The object of research is a system for recognizing handwritten text in medical prescriptions. Individual handwriting peculiarities, the variety of writing styles, and the specific nature of medical prescriptions pose many challenges for recognition algorithms, causing errors and reducing recognition accuracy.
The work presents a new system with additional post-processing components that increase the accuracy of the final recognition results. An algorithm for combining words into lines and blocks is proposed, which makes it possible to group words while preserving the contextual connections between them. In addition, a generative neural network based on a large language model is used to analyze the recognition result and correct possible errors. Testing shows an improvement in recognition accuracy of 0.13 %. Successful cases of using generative artificial intelligence are analyzed, as well as cases in which the results deteriorated because of grammatical errors in the initial input data.
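To make the word-grouping idea concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes each recognized word comes with a bounding box, and it groups words into lines by comparing vertical centres; the `Word` class, the `group_into_lines` function, and the `tolerance` parameter are hypothetical names introduced for illustration. Grouping lines into blocks is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float       # left edge of the bounding box (assumed input format)
    y: float       # top edge of the bounding box
    height: float  # bounding-box height


def group_into_lines(words, tolerance=0.5):
    """Group words into text lines by vertical proximity.

    A word joins an existing line when its vertical centre lies within
    `tolerance` * (mean word height of that line) of the line's mean centre.
    Lines are returned top to bottom, words within a line left to right.
    """
    lines = []
    # Visit words from top to bottom so lines are created in reading order.
    for word in sorted(words, key=lambda w: w.y + w.height / 2):
        center = word.y + word.height / 2
        for line in lines:
            line_center = sum(w.y + w.height / 2 for w in line) / len(line)
            line_height = sum(w.height for w in line) / len(line)
            if abs(center - line_center) <= tolerance * line_height:
                line.append(word)
                break
        else:
            # No existing line is close enough: start a new one.
            lines.append([word])
    return [" ".join(w.text for w in sorted(line, key=lambda w: w.x)) for line in lines]


# Hypothetical recognizer output, deliberately out of reading order.
words = [
    Word("500", 120, 12, 20),
    Word("Amoxicillin", 10, 10, 20),
    Word("mg", 170, 14, 20),
    Word("daily", 90, 52, 20),
    Word("twice", 10, 50, 20),
]
print(group_into_lines(words))
```

Passing whole lines rather than isolated words to a language model gives it the neighbouring words it needs to judge whether a token is plausible in context, which is the motivation the abstract gives for the grouping step.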
The obtained results show that using generative artificial intelligence as an additional step for processing recognition results can improve the accuracy of text recognition systems. The results of the study can be used in further experiments to improve recognition results in other text recognition tasks and in related fields.
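One way to check whether a post-processing step helps or hurts is to score the text before and after correction against a reference transcription. The sketch below assumes a character-level accuracy based on Levenshtein edit distance; the abstract does not state which metric the authors used, and the sample strings and function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def char_accuracy(recognized: str, reference: str) -> float:
    """Return 1 - distance / len(reference), clamped at zero (assumed metric)."""
    if not reference:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1.0 - edit_distance(recognized, reference) / len(reference))


# Hypothetical example strings, not data from the study.
reference = "Amoxicillin 500 mg twice daily"
raw = "Amoxicilin 500 mg twlce daily"         # recognizer output with two errors
corrected = "Amoxicillin 500 mg twice daily"  # output after a correction step

print(char_accuracy(raw, reference), char_accuracy(corrected, reference))
```

Running the same comparison per sample also surfaces the deterioration cases: any sample whose score drops after correction is one where the correction step changed text that was already right.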
License
Copyright (c) 2023 Oleg Yakovchuk, Maksym Vasin
This work is licensed under a Creative Commons Attribution 4.0 International License.
The assignment of copyright and the conditions for its transfer (identification of authorship) are set out in the License Agreement. In particular, the authors retain authorship of their manuscript and grant the journal the right of first publication of the work under the terms of the Creative Commons CC BY license. They also have the right to enter independently into additional agreements for the non-exclusive distribution of the work in the form in which it was published by this journal, provided that a link to the first publication of the article in this journal is preserved.