Justifying the selection of a neural network linguistic classifier
DOI: https://doi.org/10.30837/ITSSI.2023.25.005

Keywords: text classification; neural networks; LSTM; CNN; classification accuracy; model comparison; sequential data.

Abstract
The subject matter of this article is the exploration of neural network architectures for improving the accuracy of text classification within natural language processing. Text classification has grown markedly in importance in recent years owing to its central role in applications such as sentiment analysis, content filtering, and information categorization. Given the escalating demand for precision and efficiency in text classification methods, evaluating and comparing diverse neural network models is essential for determining optimal strategies.

The goal of this study is to address the challenges and opportunities inherent in text classification and to shed light on the comparative performance of two well-established neural network architectures: the Long Short-Term Memory (LSTM) network and the Convolutional Neural Network (CNN). To achieve this goal, the following tasks were solved: a comprehensive analysis of the two models was performed across several key aspects, including classification accuracy, training and prediction time, model size, data distribution, and overall ease of use. By systematically assessing these attributes, the study aims to characterize the strengths and weaknesses of each model and to enable researchers and practitioners to make informed decisions when selecting a neural network classifier for text classification tasks. The methods used are a comprehensive analysis of the neural network models and an assessment of their classification accuracy, training and prediction time, model size, and data distribution.

The following results were obtained: the LSTM model demonstrated superior classification accuracy across all three training sample sizes compared with the CNN. This highlights LSTM's ability to adapt effectively to diverse data types and to maintain high accuracy even with substantial data volumes.
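LSTM's advantage on sequential text stems from its gated cell state, which carries information across arbitrarily many tokens. The following is a minimal NumPy sketch of a single LSTM step and a toy read-out; it is illustrative only and not code from the study, and all weights, sizes, and the two-class read-out are random placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. The cell state c_prev carries long-term
    information across the sequence; gates decide what to keep."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b        # pre-activations for all four gates
    f = sigmoid(z[:H])                # forget gate: what to erase from memory
    i = sigmoid(z[H:2 * H])           # input gate: what to write to memory
    o = sigmoid(z[2 * H:3 * H])       # output gate: what to expose
    g = np.tanh(z[3 * H:])            # candidate memory content
    c = f * c_prev + i * g            # updated long-term memory
    h = o * np.tanh(c)                # new hidden state
    return h, c

# Toy run: fold a sequence of word vectors into one hidden state, then
# score two classes with a linear read-out (untrained random weights).
rng = np.random.default_rng(0)
H, D = 8, 5                            # hidden size, embedding size
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(6, D)):      # 6 tokens, each a D-dim embedding
    h, c = lstm_step(x, h, c, W, U, b)
logits = rng.normal(size=(2, H)) @ h   # 2-class read-out
```

Because the same cell state is threaded through every step, a token early in the document can still influence the final hidden state, which is the property the study credits for LSTM's accuracy on long inputs.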
Furthermore, the study revealed that computing power significantly influences model performance, emphasizing the need to consider available resources when selecting a model.

Conclusions. Based on the study's findings, the Long Short-Term Memory (LSTM) model emerged as the preferred choice for text data classification. Its ability to handle sequential data, recognize long-term dependencies, and consistently deliver high accuracy makes it a robust solution for text analysis across various domains. This choice is further supported by the model's fast training and prediction and its compact size, which make it a suitable candidate for practical deployment.
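For contrast with the sequential LSTM, the CNN approach to text treats a document as an embedding matrix and extracts local n-gram features with convolution followed by max-over-time pooling. The sketch below is a minimal illustration of that pattern, not the study's implementation; the filter widths and all weights are arbitrary placeholders.

```python
import numpy as np

def cnn_features(E, filters):
    """Slide each filter of width w over the T x d embedding matrix E and
    keep only the strongest response (max-over-time pooling), so the
    feature vector length is fixed regardless of document length."""
    T, _ = E.shape
    feats = []
    for F in filters:                              # F: (w, d) filter
        w = F.shape[0]
        resp = [np.tanh(np.sum(E[t:t + w] * F)) for t in range(T - w + 1)]
        feats.append(max(resp))
    return np.array(feats)

rng = np.random.default_rng(1)
E = rng.normal(size=(10, 6))                       # 10 tokens, 6-dim embeddings
filters = [rng.normal(size=(w, 6)) for w in (2, 3, 4)]  # n-gram-style filters
v = cnn_features(E, filters)                       # fixed-size classifier input
```

Each filter fires on a short local pattern wherever it occurs, which makes CNNs fast and position-invariant but, unlike the LSTM above, blind to dependencies longer than the filter width — the trade-off underlying the study's comparison.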
License
Copyright (c) 2023 Олеся Барковська, Ксенія Воропаєва, Олександр Руських
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Our journal abides by the Creative Commons copyright rights and permissions for open access journals.
Authors who publish with this journal agree to the following terms:
Authors hold the copyright without restrictions and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-commercial and non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their published work online (e.g., in institutional repositories or on their website) as it can lead to productive exchanges, as well as earlier and greater citation of published work.