Development of cost-sensitive artificial neural network optimization in solving imbalanced multi-class classification problems

Authors

DOI:

https://doi.org/10.15587/1729-4061.2026.356119

Keywords:

cost-sensitive, artificial neural network, imbalanced classification, multi-class classification, PCA, machine learning

Abstract

The object of the study is the classification of imbalanced multi-class air quality data based on the air pollution standard index (ISPU), with numerical environmental features as inputs and categorical classes as outputs.

This study addresses the problem of imbalanced multi-class classification of air quality data based on the air pollution standard index (ISPU), where conventional classification techniques tend to be biased toward majority classes and fail to identify minority classes accurately. To overcome this limitation, a cost-sensitive learning strategy is applied in combination with adaptive error weighting, grid search with k-fold cross-validation, and principal component analysis (PCA). The dataset consists of 1,147 samples distributed unevenly across three classes. The results demonstrate that the proposed method achieves an accuracy and F1-score of 98.55% and an area under the ROC curve (AUC) of 0.999, while significantly improving minority class sensitivity. This performance is explained by the cost-sensitive method, which increases the penalty for minority class errors, and by PCA, which enhances feature representation and learning stability. Compared to existing methods, the proposed method provides more balanced and robust classification performance without modifying the original data distribution. The method can be applied to ISPU-based air quality classification and to other imbalanced multi-class classification problems, although its parameters must be tuned carefully for data with different characteristics.
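The idea of increasing the penalty for minority class errors can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so the sketch assumes the common inverse-frequency scheme w_c = N / (K * n_c) and a weighted cross-entropy loss. The class counts below are hypothetical (only the total of 1,147 samples and the three classes come from the abstract), and the function names are illustrative rather than the authors' implementation.

```python
import math


def balanced_class_weights(counts):
    """Inverse-frequency weights w_c = N / (K * n_c) (assumed scheme)."""
    total = sum(counts.values())
    k = len(counts)
    return {c: total / (k * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p[y], normalised by the sum of sample weights."""
    num = 0.0
    den = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        num += -w * math.log(max(p[y], 1e-12))  # clip to avoid log(0)
        den += w
    return num / den


# Hypothetical class distribution; only the total (1,147) is from the abstract.
counts = {0: 900, 1: 200, 2: 47}
weights = balanced_class_weights(counts)

# Toy batch: two majority samples predicted well, one minority sample missed.
probs = [[0.9, 0.05, 0.05], [0.9, 0.05, 0.05], [0.8, 0.1, 0.1]]
labels = [0, 0, 2]

uniform = {c: 1.0 for c in counts}
plain_loss = weighted_cross_entropy(probs, labels, uniform)
cost_sensitive_loss = weighted_cross_entropy(probs, labels, weights)
print(weights)
print(plain_loss, cost_sensitive_loss)
```

Under these assumptions the missed minority sample dominates the cost-sensitive loss, so gradient-based training is pushed to correct it, while the data distribution itself is left unchanged (no resampling).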

Author Biographies

Bosker Sinaga, Universitas Putra Indonesia "YPTK"

Bachelor of Computer Science, Master of Computer Science, PhD Student

Department of Information Technology

Yuhandri Yuhandri, Universitas Putra Indonesia "YPTK"

Bachelor of Computer Science, Master of Computer Science, Doctor, Professor

Department of Information Technology

Gunadi Widi Nurcahyo, Universitas Putra Indonesia "YPTK"

Bachelor of Computer Science, Master of Computer Science, Doctor, Associate Professor

Department of Information Technology


Published

2026-04-30

How to Cite

Sinaga, B., Yuhandri, Y., & Nurcahyo, G. W. (2026). Development of cost-sensitive artificial neural network optimization in solving imbalanced multi-class classification problems. Eastern-European Journal of Enterprise Technologies, 2(9 (140)), 40–63. https://doi.org/10.15587/1729-4061.2026.356119

Issue

Section

Information and controlling system