Evaluation and optimization of the Naive Bayes algorithm for intrusion detection systems using the USB-IDS-1 dataset
DOI:
https://doi.org/10.15587/1729-4061.2024.317471
Keywords:
intrusion detection systems (IDS), Naive Bayes method, Python, machine learning, Denial of Service (DoS) attacks, USB-IDS-1 dataset
Abstract
This study examines the application of the Naive Bayes machine learning algorithm to improve the accuracy of intrusion detection systems (IDS). The primary focus is the algorithm's performance in detecting various types of network attacks, particularly Denial of Service (DoS) attacks. The research proposes Naive Bayes as a way to improve intrusion detection systems that struggle to keep pace with evolving cyber threats. The efficiency scores of the Naive Bayes classification model were evaluated for two different dependency scenarios, and the strengths and weaknesses of the model were identified. Owing to its simplicity and efficiency, the Naive Bayes classifier demonstrated satisfactory results in detecting network intrusions, especially in binary classification scenarios where the goal is to distinguish normal from malicious traffic. However, its performance declined in multi-class classification tasks, where multiple types of attacks must be differentiated. The study also highlights the importance of data quality and quantity in training machine learning models, since both affect model efficiency. The USB-IDS-1 dataset, while useful, is limited in the variety of attacks it covers. Using datasets with a wider range of attack types could significantly improve the accuracy of IDS. The findings can be applied in domains such as network security, cybersecurity, and data science. The Naive Bayes classifier can be integrated into IDS to enhance their ability to detect and respond to cyber threats. However, it is essential to consider the limitations of the algorithm and the specific conditions of the environment in which it operates. To maximize the effectiveness of the Naive Bayes classifier, promising directions are to optimize and normalize the data to improve model accuracy and to combine Naive Bayes with other machine learning algorithms to address its limitations.
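The binary classification scenario described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and it does not use the USB-IDS-1 data: it assumes a Gaussian variant of Naive Bayes, two hypothetical numeric flow features, and synthetic "benign" and "dos" samples generated only for demonstration. The class name `GaussianNaiveBayes` and the helper `make_flows` are likewise introduced here for illustration.

```python
import math
import random
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes for numeric features (illustrative only)."""

    def fit(self, rows, labels):
        # Group the training rows by class label.
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        total = len(rows)
        self.priors = {}
        self.stats = {}
        for label, members in grouped.items():
            self.priors[label] = len(members) / total
            per_feature = []
            for column in zip(*members):
                mean = sum(column) / len(column)
                var = sum((v - mean) ** 2 for v in column) / len(column)
                # A small variance floor avoids division by zero.
                per_feature.append((mean, var + 1e-9))
            self.stats[label] = per_feature
        return self

    def predict(self, row):
        # Pick the class with the highest log-posterior, treating
        # features as conditionally independent given the class.
        best_label, best_score = None, -math.inf
        for label, per_feature in self.stats.items():
            score = math.log(self.priors[label])
            for value, (mean, var) in zip(row, per_feature):
                score += -0.5 * math.log(2 * math.pi * var)
                score -= (value - mean) ** 2 / (2 * var)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def make_flows(rng, label, centre, count):
    """Draw synthetic two-feature samples around a centre (not real USB-IDS-1 flows)."""
    return [
        ([rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)], label)
        for _ in range(count)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = make_flows(rng, "benign", (0.0, 0.0), 200) + make_flows(rng, "dos", (6.0, 6.0), 200)
rng.shuffle(data)
train, test = data[:300], data[300:]

model = GaussianNaiveBayes().fit([x for x, _ in train], [y for _, y in train])
accuracy = sum(model.predict(x) == y for x, y in test) / len(test)
print(f"binary accuracy on synthetic data: {accuracy:.2f}")
```

The same class accepts more than two labels, so a multi-class variant only requires additional labelled groups. With overlapping attack classes the independence assumption tends to cost more accuracy, which is consistent with the decline the abstract reports for multi-class tasks.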
License
Copyright (c) 2024 Nurbek Konyrbaev, Yevheniy Nikitenko, Vadym Shtanko, Valerii Lakhno, Zharasbek Baishemirov, Sabit Ibadulla, Asem Galymzhankyzy, Erkebula Myrzabek
This work is licensed under a Creative Commons Attribution 4.0 International License.