Evaluation and optimization of the naive bayes algorithm for intrusion detection systems using the USB-IDS-1 dataset

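Authors

Nurbek Konyrbaev, Yevheniy Nikitenko, Vadym Shtanko, Valerii Lakhno, Zharasbek Baishemirov, Sabit Ibadulla, Asem Galymzhankyzy, Erkebula Myrzabek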

DOI:

https://doi.org/10.15587/1729-4061.2024.317471

Keywords:

intrusion detection systems (IDS), Naive Bayes method, Python, machine learning, Denial of Service (DoS) attacks, USB-IDS-1 dataset

Abstract

This study investigates the application of the Naive Bayes machine learning algorithm to improve the accuracy of intrusion detection systems (IDS). The primary focus is to assess the algorithm's performance in detecting various types of network attacks, particularly Denial of Service (DoS) attacks. The research proposes using Naive Bayes to improve intrusion detection systems that struggle to keep pace with evolving cyber threats. The study evaluated the efficiency scores of the Naive Bayes classification model for two different dependency scenarios and identified the model's strengths and weaknesses. Owing to its simplicity and efficiency, the Naive Bayes classifier demonstrated satisfactory results in detecting network intrusions, especially in binary classification scenarios, where the goal is to distinguish normal from malicious traffic. However, its performance declined in multi-class classification tasks, where multiple types of attacks need to be differentiated. The study also highlighted the importance of data quality and quantity in training machine learning models, because both affect model efficiency. The USB-IDS-1 dataset, while useful, is limited in the variety of attacks it covers. Using datasets with a wider range of attack types could significantly improve the accuracy of IDS. The findings of this research can be applied in domains such as network security, cybersecurity, and data science. The Naive Bayes classifier can be integrated into IDS to enhance their ability to detect and respond to cyber threats. However, it is essential to consider the limitations of the algorithm and the specific conditions of the deployment environment. To maximize the effectiveness of the Naive Bayes classifier, promising directions are to optimize and normalize the data to improve model accuracy and to combine Naive Bayes with other machine learning algorithms to address its limitations.
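As a rough illustration of the two settings the abstract contrasts, the following minimal sketch trains a Gaussian Naive Bayes classifier once with binary labels and once with multi-class labels. It is not the authors' implementation: the Gaussian likelihood, the two features (packets per second, mean packet size), the class names, and all numeric values are hypothetical assumptions chosen for readability, and none of them are taken from the USB-IDS-1 dataset or from the study's results.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels, var_floor=1e-9):
    """Estimate a log prior, feature means, and feature variances per class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((x - m) ** 2 for x in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(rows)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one flow."""
    def log_posterior(params):
        log_prior, means, variances = params
        # Naive assumption: features are conditionally independent per class,
        # so the per-feature Gaussian log densities are simply summed.
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical toy flows: (packets per second, mean packet size in bytes).
flows = [
    (10.0, 500.0), (12.0, 480.0), (9.0, 520.0),    # benign traffic
    (900.0, 60.0), (950.0, 64.0), (880.0, 58.0),   # flood-style DoS
    (40.0, 70.0), (45.0, 75.0), (38.0, 68.0),      # slow-rate DoS
]
multi_labels = ["benign"] * 3 + ["flood"] * 3 + ["slow"] * 3
binary_labels = [
    "benign" if label == "benign" else "attack" for label in multi_labels
]

binary_model = fit_gaussian_nb(flows, binary_labels)
multi_model = fit_gaussian_nb(flows, multi_labels)

sample = (42.0, 72.0)  # hypothetical unseen flow
print(predict(binary_model, sample))  # attack
print(predict(multi_model, sample))   # slow
```

The same fitting routine serves both settings; only the label set changes. In the multi-class case the classifier has to separate attack types from one another as well as from benign traffic, which is the harder task the abstract refers to. The toy data here is far too small and clean to show any drop in accuracy.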

Author Biographies

Nurbek Konyrbaev, Korkyt Ata Kyzylorda University

PhD, Associate Professor, Head of Department

Department of Computer Science

Institute of Engineering and Technology

Yevheniy Nikitenko, National University of Life and Environmental Sciences of Ukraine

Associate Professor

Department of Computer Systems, Networks and Cybersecurity

Vadym Shtanko, National University of Life and Environmental Sciences of Ukraine

PhD Student

Department of Computer Systems, Networks and Cybersecurity

Valerii Lakhno, National University of Life and Environmental Sciences of Ukraine

Professor

Department of Computer Systems, Networks and Cybersecurity

Zharasbek Baishemirov, Abai Kazakh National Pedagogical University; Kazakh-British Technical University

PhD, Professor

Department of Mathematics and Mathematical Modelling

Postdoctoral Researcher

Department of Science

Professor

School of Applied Mathematics

Sabit Ibadulla, Korkyt Ata Kyzylorda University

PhD

Department of Computer Science

Institute of Engineering and Technology

Asem Galymzhankyzy, Korkyt Ata Kyzylorda University

Master, Teacher

Department of Computer Science

Institute of Engineering and Technology

Erkebula Myrzabek, Korkyt Ata Kyzylorda University

Student

Department of Computer Science

Institute of Engineering and Technology


Published

2024-12-25

How to Cite

Konyrbaev, N., Nikitenko, Y., Shtanko, V., Lakhno, V., Baishemirov, Z., Ibadulla, S., Galymzhankyzy, A., & Myrzabek, E. (2024). Evaluation and optimization of the naive bayes algorithm for intrusion detection systems using the USB-IDS-1 dataset. Eastern-European Journal of Enterprise Technologies, 6(2 (132)), 74–82. https://doi.org/10.15587/1729-4061.2024.317471