Development of heart attack prediction model based on ensemble learning
DOI:
https://doi.org/10.15587/1729-4061.2021.238528
Keywords:
heart attack prediction, machine learning, ensemble learning, stacking ensemble technique
Abstract
With the advent of the data age, the continuous improvement and widespread adoption of medical information systems have led to exponential growth in biomedical data, such as medical imaging, electronic medical records, biometric tags, and clinical records, which have substantial research value. However, medical research based on statistical methods is limited by the class and size of the research community, so it cannot effectively mine large-scale medical information. Supervised machine learning techniques can address this problem. Heart attack is one of the most common diseases and one of the leading causes of death, so a system that can accurately and reliably support early diagnosis is an essential step in treating such diseases. Researchers have used various data mining and machine learning techniques to analyze medical data, helping professionals predict heart disease. This paper presents various features related to heart disease and a prediction model based on ensemble learning. The proposed system preprocesses the data, selects attributes, and then uses a logistic regression algorithm as the meta-classifier to build the ensemble learning model. In addition, individual machine learning algorithms (Support Vector Machine, Decision Tree, Random Forest, Extreme Gradient Boosting) are applied to the Framingham Heart Study dataset and compared with the proposed methodology. The results show the feasibility and effectiveness of the proposed ensemble-learning prediction method, which offers accuracy suitable for medical recommendations and better accuracy than any single traditional machine learning algorithm.
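The stacking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn is available, uses synthetic data from `make_classification` as a stand-in for the Framingham Heart Study dataset, substitutes scikit-learn's `GradientBoostingClassifier` for Extreme Gradient Boosting, and omits the attribute-selection step. All hyperparameters and variable names are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary data (assumption: NOT the Framingham dataset).
X, y = make_classification(
    n_samples=600, n_features=15, n_informative=6,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners named in the abstract; gradient boosting stands in for XGBoost.
base_learners = [
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gboost", GradientBoostingClassifier(random_state=0)),
]

# Logistic regression is the meta-classifier: it is trained on the
# cross-validated (out-of-fold) outputs of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
stack_accuracy = stack.score(X_test, y_test)

# Fit each base learner on its own so it can be compared with the ensemble.
base_accuracy = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in base_learners
}

print(f"stacking: {stack_accuracy:.3f}")
for name, acc in base_accuracy.items():
    print(f"{name}: {acc:.3f}")
```

On synthetic data the ranking of the models carries no clinical meaning; the sketch only shows how the base learners and the logistic regression meta-classifier fit together. With imbalanced classes, accuracy alone can be misleading, so metrics such as recall or ROC AUC may be worth reporting alongside it.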
Supporting Agency
- First of all, I would like to thank Associate Professor Ibrahim Ahmed Saleh for his meticulous care and help in my life and academic work. Teacher Ibrahim has noble morals, kindness to others, rigorous scholarship, and profound knowledge. He not only taught me the skills of learning but also taught me principles of life that will benefit me for a lifetime. At the end of this work, I would like to extend my sincerest gratitude to Teacher Ibrahim again. Thanks also to the University of Mosul, College of Computer Science and Mathematics, for their care and support of my daily experiments and life.
License
Copyright (c) 2021 Omar Shakir Hasan, Ibrahim Ahmed Saleh
This work is licensed under a Creative Commons Attribution 4.0 International License.