Development of cost-sensitive artificial neural network optimization in solving imbalanced multi-class classification problems
DOI:
https://doi.org/10.15587/1729-4061.2026.356119
Keywords:
cost-sensitive, artificial neural network, imbalanced classification, multi-class classification, PCA, machine learning
Abstract
The object of the study is the classification process of imbalanced multi-class data in air quality analysis based on the air pollution standard index (ISPU), which involves numerical environmental features and categorical output classes.
This study addresses imbalanced multi-class classification of ISPU-based air quality data, where conventional classifiers tend to be biased toward majority classes and fail to identify minority classes accurately. To overcome this limitation, a cost-sensitive learning strategy with adaptive error weighting is combined with grid search under k-fold cross-validation and principal component analysis (PCA). The dataset consists of 1,147 samples distributed unevenly across three classes. The proposed method achieves an accuracy and F1-score of 98.55% and an area under the ROC curve (AUC) of 0.999, while significantly improving minority class sensitivity. This performance is explained by the cost-sensitive method, which increases the penalty for minority class errors, and by PCA, which enhances feature representation and learning stability. Compared to existing methods, the proposed method provides more balanced and robust classification performance without modifying the original data distribution. It can be applied to ISPU-based air quality classification and to other imbalanced multi-class classification problems, although it requires careful parameter optimization for different data characteristics.
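The idea of penalizing minority class errors more heavily can be illustrated with a class-weighted cross-entropy loss. The sketch below is only an illustration under stated assumptions: the abstract does not give the weighting formula, so inverse-frequency weights are assumed here, and the function names and the toy label counts are hypothetical rather than taken from the study.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting rule: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def sample_cost(probs, label, weights):
    """Cost of one sample: class weight times negative log-likelihood."""
    return weights[label] * -math.log(probs[label])


def weighted_cross_entropy(batch_probs, labels, weights):
    """Weighted mean of per-sample costs, normalised by the total weight."""
    total = sum(sample_cost(p, y, weights) for p, y in zip(batch_probs, labels))
    return total / sum(weights[y] for y in labels)


# Hypothetical toy labels: 8 majority, 3 intermediate, 1 minority sample.
labels = [0] * 8 + [1] * 3 + [2]
weights = inverse_frequency_weights(labels)

# The same poor prediction (probability 0.1 on the true class) is
# penalised more when the true class is the minority class.
majority_cost = sample_cost([0.1, 0.45, 0.45], 0, weights)
minority_cost = sample_cost([0.45, 0.45, 0.1], 2, weights)

# Uniform predictions over three classes for every sample.
uniform_loss = weighted_cross_entropy([[1 / 3] * 3] * len(labels), labels, weights)
```

With these toy counts the minority class weight is eight times the majority class weight, so an equally wrong prediction on a minority sample costs eight times as much. The data distribution itself is left unchanged, which is the property the abstract contrasts with resampling-style methods.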
License
Copyright (c) 2026 Bosker Sinaga, Yuhandri Yuhandri, Gunadi Widi Nurcahyo

This work is licensed under a Creative Commons Attribution 4.0 International License.