Accuracy evaluation and error analysis of dependency parsing of texts in Ukrainian
DOI: https://doi.org/10.30837/2522-9818.2025.2.102
Abstract
The subject of our research is dependency parsing of Ukrainian sentences within the Universal Dependencies framework. The goal of the work is to evaluate the accuracy of existing transition-based and graph-based parsing architectures, with and without deep word embeddings, on a Ukrainian dataset, and to analyze the error profiles of these parsers. The article addresses two tasks. The first is to evaluate the accuracy of several modern dependency parsing approaches on a hand-annotated gold standard dataset, using labeled and unlabeled attachment scores as the accuracy metrics. The second is to analyze and categorize the errors made by standard parsers; resolving these errors could allow us to build a more accurate parser in the future. The error rate for each category is compared to the baseline error rate, and the statistical significance of each comparison is validated using the chi-square test. The key results are as follows. For Ukrainian, parsing accuracy increases greatly with the use of deep word embeddings. The transition-based parser with deep word embeddings achieves the highest labeled attachment score on the test dataset, 84.66%. For the same parser, higher error rates are associated with non-projective dependencies, greater sentence length, and greater distance to the head. In addition, for pronouns and numerals the labeled attachment error rate is significantly higher than the baseline, while the unlabeled error rate is at the baseline. Conclusions: parsing accuracy for the Ukrainian dataset is lower than that reported for other languages, but the overall trend of accuracy improvement with deep word embeddings is consistent with existing research. To improve overall parsing accuracy, we must focus on problem areas such as non-projective dependencies, longer sentences, and greater distance between the head and the dependent.
In future work we intend to explore ways to improve parsing accuracy by supplementing neural parsing with other approaches, like formal rules or pre- and post-processing.
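The two measurements named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the representation of a token as a `(head, relation)` pair, and the choice to compare one error category against all remaining tokens in a 2x2 table are assumptions made for illustration only; the abstract does not specify how the contingency tables were built.

```python
import math


def attachment_scores(gold, pred):
    """Return (UAS, LAS) for token-aligned gold and predicted parses.

    Each token is a (head_index, relation_label) pair (hypothetical layout).
    UAS counts tokens with the correct head; LAS also requires the correct label.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return uas_hits / len(gold), las_hits / len(gold)


def chi_square_2x2(cat_errors, cat_total, rest_errors, rest_total):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Compares the error rate inside one category with the error rate of the
    remaining tokens. Returns (statistic, p_value).
    """
    table = [
        [cat_errors, cat_total - cat_errors],
        [rest_errors, rest_total - rest_errors],
    ]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            if expected == 0:
                raise ValueError("expected count is zero; test is undefined")
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy four-token sentence: one wrong head, one wrong label.
gold = [(2, "nsubj"), (0, "root"), (4, "amod"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "amod"), (2, "iobj")]
print(attachment_scores(gold, pred))  # (0.75, 0.5)

# Made-up counts: 30 errors in 100 category tokens vs 100 errors in 1000 others.
stat, p = chi_square_2x2(30, 100, 100, 1000)
print(f"chi2={stat:.2f}, p={p:.2e}")
```

The counts in the usage example are invented and do not reproduce any figure from the article. A full evaluation would read token-aligned CoNLL-U files and would typically rely on an established evaluation script rather than this sketch.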
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.