DEVELOPMENT OF HEART ATTACK PREDICTION MODEL BASED ON ENSEMBLE LEARNING

Early prediction and diagnosis is an essential and influential step in treating heart disease. Researchers have used various data mining and machine learning techniques to analyze medical data, helping professionals predict heart disease. This paper presents various features related to heart disease and a prediction model based on ensemble learning. The proposed system involves preprocessing the data, selecting attributes, and then using the logistic regression algorithm as a meta-classifier to build the ensemble learning model. Furthermore, machine learning algorithms (Support Vector Machines, Decision Tree, Random Forest, Extreme Gradient Boosting) are used for prediction on the Framingham Heart Study dataset and compared with the proposed methodology. The results show the feasibility and effectiveness of the proposed prediction method based on ensemble learning, which provides accuracy suitable for medical recommendations and better accuracy than single traditional machine learning algorithms.


Introduction
Doctors have many tools and methods to predict patients' health risks, but these still cannot fully cope with the complexity of the human body [1]. If a person has chest pain and other suspected heart disease symptoms, traditional detection methods do not always detect whether the patient is having a heart attack [2]. To obtain better accuracy and results in medical diagnosis, the health field has been linked with artificial intelligence, especially in clinical diagnosis [3]. Machine learning makes it possible to analyze the effect of different features and their weights on heart attack, which helps to predict the disease, reduce the risk, and prevent heart attacks [4,5]. Heart attack is one of the most dangerous, difficult to predict, and common diseases these days, affecting millions of people worldwide every year [6,7]. If the disease is discovered early, the patient can be saved from death and serious complications [8]. Many healthcare organizations face the enormous challenge of providing quality services [9]. This study is intended to help doctors make accurate, fast, and reliable predictions; it investigates the probability of a patient having a heart attack based on the patient's medical attributes such as age, blood pressure, gender, etc. The Framingham heart dataset was selected from the UCI repository and contains 16 features. Machine learning algorithms are trained on these features to predict heart attacks and to support the doctor's decisions. Because it is difficult or impractical to perform the required tasks with traditional algorithms alone, combining several machine learning algorithms helps to improve heart attack prediction.
Heart attack is a dangerous disease, and most existing studies used single traditional machine learning algorithms to predict it, which does not give the desired results. Therefore, it is necessary to apply the stacking technique to achieve better performance.

Literature review and problem statement
In recent years, several papers have been proposed for predicting heart attack; in this section, researchers' works for predicting heart attacks using machine learning algorithms are discussed and summarized.
The paper [10] used electronic clinical data, analyzed each feature in the dataset and its effect on the results, and plotted the dataset before and after wrangling. The authors used the logistic regression algorithm for classification and a random search technique to find the best parameters for building a prediction model. This study classifies persons as having or not having heart disease according to their clinical medical records. The "Sklearn" library is used to calculate the score. The accuracy was 87 %, which is acceptable for predicting heart risk. However, only one machine learning algorithm was used with a small dataset, and no other machine learning algorithms were trained and tested on the same dataset. It therefore cannot be concluded that this algorithm is the best classifier for that dataset, which is the disadvantage of this approach.
The paper [11] proposes rough set theory to select significant features and uses the random forest algorithm for classification to predict heart disease. The Statlog (Heart) dataset from the UCI repository, which contains 270 instances, was used. The dataset was preprocessed by removing noisy and irrelevant data. The accuracy of this method reached 84 %. The disadvantage of this approach is that the parameters used were not reported, although they directly affect the performance of the algorithms and have more impact than the feature extraction methods, since such datasets often have a limited number of features. Another disadvantage is that the random forest algorithm was not applied alone so that its results could be compared with those of the random forest with rough sets.
The paper [12] used the Cleveland dataset from the UCI repository, which consists of 303 instances and 76 features. The dataset was pre-processed by handling missing values, removing noise, and extracting the most important features, and supervised machine learning algorithms such as KNN, decision trees, random forest, and naive Bayes were then applied to it. The KNN algorithm achieved the highest accuracy of 90.78 %. The disadvantage of this approach is that the result is measured only by accuracy, and other measures, such as the ROC and the confusion matrix, are not used.
The paper [13] applied machine learning algorithms such as SVM, KNN, logistic regression, decision tree, random forest, and naive Bayes to the Cleveland dataset from the UCI repository to predict heart diseases. The KNN algorithm achieved the highest accuracy of 87 %. The disadvantage of this approach is that the parameters were selected by trying multiple values, whereas parameter optimization methods such as grid search and random search could have been used to select the best parameters.
In [14], the author worked on the heart disease dataset from the UCI repository, performed data pre-processing and feature selection, and applied the selected features to a hybrid of random forest with a linear model for heart disease prediction. The advantage of the proposed method is its high accuracy of 92 %. The disadvantage is that the proposed method is presented as the best classifier, although the results showed that it had the lowest sensitivity of all the algorithms compared.
In [15], the NN-FCA approach is proposed. FCA consists of two stages, the first stage is feature selection, and the second stage is feature correlation. Then, the Neural Network algorithm for classification on the KNHANES-VI dataset was used. The advantage of this method is high performance of the Neural Network algorithm for predicting heart disease. The disadvantage of the proposed methodology is that the dataset features are not large enough to conduct operations of correlation among the dataset features, and traditional machine learning algorithms can perform well on this dataset without this complication.
In [16], the authors suggested a fuzzy expert system for predicting heart disease that comprises three main steps: fuzzification, rule base, and defuzzification. For defuzzification, the centroid technique was applied. The system has 13 input parameters and one output parameter, and the dataset was taken from the UCI repository. The advantage of this approach is that the heart attack prediction system is simple to use, and patients can use it directly by themselves. The accuracy is 93.33 %. The disadvantage of this approach is the complexity added by fuzzy logic, while the results showed no significant difference compared to previous studies on the same dataset and to earlier systems.
In [17], the authors used a local dataset, selected the most significant features using a correlation matrix, and applied three algorithms (neural network, support vector machine, and KNN) to this dataset for heart attack prediction. The neural network showed the best performance of the three, with an accuracy of 93 %. The advantage of the proposed approach is the stability of the three algorithms when applied to datasets of varying sizes. The disadvantage of this work is that the local dataset used is not a certified global dataset.
In [18], the authors suggested a hybrid genetic neural network algorithm. ECG signal dataset taken from MIT-BIH arrhythmia was used. The dataset has been pre-processed, including data cleaning to remove noisy data and pattern identification to identify the pattern of ECG data. For the prediction process, the neural network is used with a genetic algorithm to optimize neural weights. The advantage of the proposed methodology is speeding up the neural network prediction of heart attacks by applying a genetic algorithm on the neural network. The disadvantage of this approach is the complexity added to the neural network.
The paper [19] applied traditional machine learning algorithms such as the naive Bayes classifier, logistic regression, random forest, support vector machine, decision tree classifier, and KNN to predict heart diseases. The classifiers are trained and tested on the dataset available in the UCI repository. A comparison of the accuracy results showed that the random forest algorithm obtained the best accuracy and is therefore considered the best classifier for predicting heart disease. The advantage is that traditional machine learning algorithms are used without added complexity and achieve an accuracy of 91.17 %, which is acceptable. The disadvantage is that no feature extraction techniques were applied to the dataset, although they could help give better results.
Previous studies rely on small or local data sets, which makes the results unreliable. In addition, some do not compare their results with other algorithms to determine the efficiency of the proposed model. Disease prediction requires several measures to identify the best model and to ensure that the results are not biased towards the majority class. Parameters are chosen by trying multiple values, which does not necessarily give the best values. Some methods increase the complexity of the system without delivering better performance.
All these give reason to develop a highly efficient and stable model for heart attack prediction.

The aim and objectives of the study
This work aims to determine a robust and accurate method for heart attack prediction using ensemble learning and compare it to single traditional classification algorithms.
To achieve the aim, the following objectives were set:
- to design a model using a stacking ensemble technique that combines the decision tree, logistic regression, SVM, and XGBoost algorithms and uses the logistic regression algorithm as a meta-classifier to predict heart attack with better accuracy;
- to verify that the designed model has high prediction accuracy by comparing it with models of single traditional algorithms and with previous research on the same data set.

Materials and methods
The Framingham heart study dataset obtained from the UCI repository is used. The data is incomplete and cannot be used for training directly. Hence, it needs pre-processing, such as missing value imputation using the mean method and noise removal. The dataset contains 16 attributes and 4,239 instances. The attributes are described in Table 1. In this paper, machine learning algorithms are used to build prediction models, and models are selected based on accuracy. The outputs of the chosen models are used as new features for training the final ensemble model, which is compared with single traditional algorithms based on accuracy. The accuracy results of the different machine learning classifiers and the proposed method were obtained using the Python programming language. The research was performed on a 7th generation Intel Core i7 8750H processor (up to 4.1 GHz) with 16 GB of RAM. Below is a review of the algorithms used.
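The mean imputation step mentioned above can be illustrated with a minimal sketch. The records, the column meanings, and the function name `impute_mean` are hypothetical and are not taken from the Framingham data or from the authors' code; missing values are assumed to be marked as `None`.

```python
from statistics import mean

def impute_mean(rows, missing=None):
    """Replace missing entries in each column with the mean of the observed values."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not missing]
        col_means.append(mean(observed))  # raises an error if a column is entirely missing
    return [[col_means[j] if r[j] is missing else r[j] for j in range(n_cols)]
            for r in rows]

# Hypothetical toy records: [age, systolic blood pressure, glucose]
records = [
    [39, 106.0, 77.0],
    [46, 121.0, None],
    [48, None, 70.0],
    [61, 150.0, 103.0],
]
filled = impute_mean(records)
for row in filled:
    print(row)
```

Rows without missing values are returned unchanged; only the `None` cells are filled with the column mean of the observed entries.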
K-Nearest Neighbor (K-NN). K-NN is one of the most basic algorithms in machine learning [20]. It is used for both classification and regression [21]. The idea of the K-NN algorithm is that an N-dimensional input vector corresponds to a point in the feature space, and the output is the category label or predicted value corresponding to that feature vector [22]. A distinctive property of K-NN is that it has no explicit learning process [23]. It uses the training data to partition the feature vector space and uses the result of the partition as the final model [24].
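As an illustration of the K-NN idea, the following is a minimal sketch that classifies a query point by majority vote among its nearest training points. The toy data, the Euclidean distance, and the default `k=3` are assumptions chosen for the example, not settings reported in this study.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set with two well separated classes.
train_X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, [1.1, 1.0]))  # prints 0
print(knn_predict(train_X, train_y, [4.0, 4.0]))  # prints 1
```

There is no training step: the stored training data itself acts as the model, which matches the description above.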
Logistic Regression. It is a generalized linear regression analysis model. It applies the sigmoid function to the output of the original linear regression, thereby mapping the interval from negative to positive infinity to the range of 0 to 1, which corresponds to the probability that a sample is judged to be a positive example. It is therefore often used in data mining, automatic diagnosis of diseases, economic trend prediction, and other fields. In essence, this algorithm is a common binary classification model: the category of an object of unknown category is obtained by inputting its sequence of attribute features [25,26].
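The mapping from a linear score to a probability can be sketched as follows. The weights, bias, and the 0.5 decision threshold are hypothetical values chosen for illustration, not parameters fitted in this study.

```python
from math import exp

def sigmoid(z):
    """Map any real number to the interval (0, 1) in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    """Probability of the positive class for feature vector x."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)

def predict(weights, bias, x, threshold=0.5):
    """Binary decision obtained by thresholding the predicted probability."""
    return 1 if predict_proba(weights, bias, x) >= threshold else 0

# Hypothetical parameters and input.
weights, bias = [0.8, -0.5], -0.2
print(predict_proba(weights, bias, [1.0, 0.4]))
print(predict(weights, bias, [1.0, 0.4]))
```

A linear score of zero corresponds to a probability of exactly 0.5, so the sign of the linear score decides the predicted class at the default threshold.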
Random Forest.
It is an ensemble learning algorithm in supervised learning. In essence, it is an ensemble classifier that contains many randomly generated decision trees and combines several weak (base) classifiers to obtain a strong classifier with significantly superior classification performance [27]. The random forest algorithm requires the decision trees to differ substantially from one another and to be uncorrelated. Since there is no strong dependency between the weak classifiers, the trees can be generated in parallel [28]. Random forest uses bootstrap sampling to extract multiple samples from the original data. A weak classifier (a decision tree) is first trained on each extracted sample, and these decision trees are then combined to obtain the final classification or prediction result through voting [29].
XGBoost.
XGBoost is the abbreviation of "Extreme Gradient Boosting". It is an improved ensemble learning method based on decision trees, which combines weak base classifiers into a stronger classifier. The algorithm consists of multiple decision trees, and the outputs of all trees are added together to give the final prediction [30]. XGBoost uses Newton's method to find the extreme value of the loss function: it carries out a second-order Taylor expansion of the loss function and adds a regularization term to the objective function to find the overall optimal solution. The regularization term balances the reduction of the objective function against the complexity of the model to avoid overfitting [31].
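The second-order expansion described above is commonly written as follows. The notation is an assumption borrowed from the usual presentation of gradient tree boosting and does not appear in this paper: $l$ is the loss, $\hat{y}_i^{(t-1)}$ the prediction after $t-1$ trees, $f_t$ the tree added at step $t$, $T$ its number of leaves, $w_j$ its leaf weights, and $\gamma$, $\lambda$ the regularization coefficients. Constant terms are dropped.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \right] + \Omega(f_t),
\qquad
\Omega(f_t) = \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^2,

g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^2},
\qquad
w_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
```

Here $I_j$ is the set of samples that fall into leaf $j$. The optimal leaf weight $w_j^{*}$ is a Newton step on the regularized objective, and the term $\Omega(f_t)$ is the penalty that discourages overly complex trees.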
Decision Tree. A decision tree is a tree structure in which each internal node represents a test on an attribute, each branch represents a test outcome, and each leaf node represents a category [32]. A classification tree is a kind of supervised learning. Generally, a decision tree contains a root node, several internal nodes, and several leaf nodes [33]. The leaf nodes correspond to the decision results, and every other node corresponds to an attribute test [34]. The sample set contained in each node is divided among its child nodes according to the results of the attribute test [35]. The full set of samples is contained in the root node [36]. The path from the root node to each leaf node corresponds to a sequence of decision tests [37,38]. The purpose of decision tree learning is to produce a decision tree with strong generalization ability, that is, a strong ability to deal with unseen examples [39]. Often, a decision tree is built based on a data set [40].
Support Vector Machine (SVM). SVM is a linear classifier that implements binary classification of data through supervised learning [41]. Its decision boundary is the maximum-margin hyperplane found for the training samples. The samples are divided into two categories by constructing this separating hyperplane [42]. The sample points closest to the hyperplane are called support vectors, and the distance between these points and the separating hyperplane is called the margin [43]. By maximizing the margin between the support vectors and the separating hyperplane, the algorithm performance is optimized, thereby enhancing the reliability of the classifier's prediction [44].
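The notion of the margin can be made concrete with a small sketch that measures the distance from each sample to a given hyperplane w·x + b = 0. The hyperplane and the points are hypothetical, labels are assumed to be coded as +1 and -1, and the code only evaluates a fixed hyperplane; it does not solve the SVM optimization problem.

```python
from math import sqrt

def signed_distance(w, b, x):
    """Signed Euclidean distance from point x to the hyperplane w.x + b = 0."""
    norm = sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, X, y):
    """Smallest label-signed distance over all samples (labels are +1 or -1)."""
    return min(label * signed_distance(w, b, x) for x, label in zip(X, y))

# Hypothetical separable data and a candidate hyperplane x1 + x2 - 5 = 0.
X = [[1.0, 1.0], [2.0, 1.0], [4.0, 3.0], [3.0, 4.0]]
y = [-1, -1, 1, 1]
w, b = [1.0, 1.0], -5.0

m = margin(w, b, X, y)
support = [x for x, label in zip(X, y)
           if abs(label * signed_distance(w, b, x) - m) < 1e-9]
print(m)        # distance of the closest points to the hyperplane
print(support)  # the points that attain this distance
```

A positive margin means that every sample lies on the correct side of the hyperplane; an SVM searches for the hyperplane that makes this value as large as possible.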
The evaluation of the proposed model was carried out using several criteria, namely, accuracy, recall, precision, F1 score, and ROC. The most important criterion is accuracy.
The following assumptions are used to explain the evaluation measures [45]. Assume there are two classes, p (positive) and n (negative), to be classified. When the data set is unbalanced, that is, when the numbers of positive and negative samples differ significantly, accuracy alone cannot be used to evaluate model performance. Precision and recall are better indicators for unbalanced data sets [46,47].
Precision: the proportion of correctly predicted positive examples among all examples that are predicted to be positive [48].
Recall Rate: the proportion of correctly predicted positive examples among all samples that are actually positive (those correctly predicted as positive plus those incorrectly predicted as negative). It is also called sensitivity or true positive rate (TPR) [47].
F1 score: it is usually practical to combine precision and recall into one index, the F1 value, especially when a simple method is needed to compare the performance of two classifiers. The F1 value is the harmonic mean of precision and recall [48].
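The measures defined above can be sketched in code using the standard definitions in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The counts in the example are hypothetical and are not results from this study; they are chosen to show how accuracy can look high on an unbalanced data set while recall stays low.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of actual positives that are predicted as positive (sensitivity, TPR)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

def accuracy(tp, tn, fp, fn):
    """Share of all samples that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical unbalanced case: 60 actual positives among 1000 samples.
tp, fp, fn, tn = 40, 10, 20, 930
p, r = precision(tp, fp), recall(tp, fn)
print(accuracy(tp, tn, fp, fn))  # 0.97
print(p, r, f1_score(p, r))
```

In this toy case the accuracy is 0.97 although one third of the positive cases are missed, which is why precision, recall, and F1 are reported alongside accuracy.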
Receiver Operating Characteristic (ROC): it is a graph with the false positive rate (FPR) on the horizontal axis and the true positive rate (TPR) on the vertical axis, which shows the relationship between the true positive rate and the false positive rate of a classifier as the decision threshold varies. The ROC curve is a very important indicator of classifier performance, and it represents the degree to which the model predicts accurately [49].
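How the ROC curve and the area under it can be obtained from predicted scores is sketched below. The labels and scores are hypothetical, the function names are chosen for this example, and the code assumes that both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold step by step."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        s = ranked[i][0]
        # Samples with the same score are added together (one threshold step).
        while i < len(ranked) and ranked[i][0] == s:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical labels and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
pts = roc_points(y_true, scores)
print(pts)
print(auc(pts))
```

An area of 1.0 corresponds to a classifier that ranks every positive sample above every negative one, and 0.5 corresponds to random ranking.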
Confusion Matrix: the performance of a classification algorithm is assessed on the basis of a confusion matrix. The easiest way to assess the performance of a classification model is to compare the number of positive cases that are correctly and incorrectly classified with the number of negative cases that are correctly and incorrectly classified. In the confusion matrix, as shown in Fig. 1, the columns represent the predicted labels while the rows represent the actual labels [50].
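A binary confusion matrix with the layout described above (rows for actual labels, columns for predicted labels) can be built as in the following sketch. The label vectors are hypothetical, and labels are assumed to be coded as 0 (negative) and 1 (positive).

```python
def confusion_matrix(y_true, y_pred):
    """2x2 matrix: rows are actual labels, columns are predicted labels (order 0, 1)."""
    m = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

# Hypothetical actual and predicted labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
(tn, fp), (fn, tp) = cm
print(cm)              # [[3, 1], [1, 3]]
print(tn, fp, fn, tp)  # 3 1 1 3
```

The four cells are exactly the counts from which accuracy, precision, recall, and the F1 score are computed.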
The stacking ensemble technique combines a group of different classifiers to produce a robust, high-level learner model, and it usually performs better than any single trained model. It has been used for supervised learning tasks (including regression and classification) and unsupervised learning (density estimation), and it has also been used to estimate the error rate of bagging. To make the final prediction, the following steps are used [51]:
- level zero data: all learners (classification algorithms) are trained on the dataset;
- level one data: the predictions produced by the classification algorithms are taken as new data;
- final prediction: a new learning process takes the level one data as input and produces the final prediction as output.
In Fig. 2, the first layer consists of single models. Cross-validation is used so that each model produces predictions for every held-out fold, and the fold predictions are then concatenated into a complete set of predictions that serves as the input features of the next layer. The second layer can use a classifier or traditional fusion methods such as averaging, voting, or weighting. In this study, the ensemble stacking technique is used to improve on the accuracy of single classifier algorithms for heart attack prediction. Single models were trained (KNN, logistic regression, random forest, XGBoost, decision tree, and SVM), and the prediction models of SVM, XGBoost, decision tree, and random forest were then combined using the ensemble stacking technique. The outputs of these prediction models form new features that are fed to the meta-classifier, which produces the final prediction; this method showed the highest prediction accuracy compared to the single classification algorithms. Fig. 3 shows the framework of the proposed method for heart attack prediction.
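The stacking procedure described above (level zero models, out-of-fold level one data, and a logistic regression meta-classifier) can be sketched in a self-contained way. This is only an illustration under stated assumptions: the two base learners (a one-feature threshold rule and a nearest-centroid rule) are simple stand-ins for the SVM, XGBoost, decision tree, and random forest models used in the study, the toy data are hypothetical, and the fold assignment by index is chosen so that every training fold contains both classes.

```python
from math import exp, dist

def fit_stump(X, y, feature=0):
    """Stand-in base learner 1: threshold on one feature with the lowest training error."""
    best = None
    for thr in sorted({row[feature] for row in X}):
        for sign in (1, -1):
            pred = [1 if sign * (row[feature] - thr) >= 0 else 0 for row in X]
            err = sum(p != t for p, t in zip(pred, y))
            if best is None or err < best[0]:
                best = (err, thr, sign)
    _, thr, sign = best
    return lambda row: 1.0 if sign * (row[feature] - thr) >= 0 else 0.0

def fit_centroid(X, y):
    """Stand-in base learner 2: predict the class whose mean vector is closer."""
    def centre(label):
        pts = [row for row, t in zip(X, y) if t == label]
        return [sum(col) / len(pts) for col in zip(*pts)]
    c0, c1 = centre(0), centre(1)
    return lambda row: 1.0 if dist(row, c1) < dist(row, c0) else 0.0

def fit_logistic(Z, y, lr=0.5, epochs=500):
    """Meta-classifier: logistic regression trained by stochastic gradient descent."""
    sigmoid = lambda z: 1.0 / (1.0 + exp(-z))  # scores stay small in this toy example
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for row, t in zip(Z, y):
            g = sigmoid(b + sum(wi * zi for wi, zi in zip(w, row))) - t
            w = [wi - lr * g * zi for wi, zi in zip(w, row)]
            b -= lr * g
    return lambda row: sigmoid(b + sum(wi * zi for wi, zi in zip(w, row)))

def stack_fit(X, y, base_fitters, k=3):
    """Build level one data from out-of-fold predictions, then train the meta-classifier."""
    n = len(X)
    level_one = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        test_idx = [i for i in range(n) if i % k == fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        for j, fitter in enumerate(base_fitters):
            model = fitter(X_tr, y_tr)
            for i in test_idx:
                level_one[i][j] = model(X[i])  # prediction for a held-out sample
    meta = fit_logistic(level_one, y)
    base_models = [fitter(X, y) for fitter in base_fitters]  # refit on all data
    return lambda row: meta([m(row) for m in base_models])

# Hypothetical two-feature data: six negative and six positive samples.
X = [[1.0, 1.2], [1.5, 0.8], [0.8, 1.9], [1.2, 1.4], [2.0, 1.0], [1.7, 2.1],
     [3.9, 4.1], [4.4, 3.6], [3.6, 4.5], [4.2, 3.9], [5.0, 4.2], [4.7, 5.1]]
y = [0] * 6 + [1] * 6

model = stack_fit(X, y, [fit_stump, fit_centroid])
print(int(model([1.1, 1.3]) >= 0.5), int(model([4.5, 4.0]) >= 0.5))
```

The key design point is that the meta-classifier is trained only on predictions made for samples that the base models did not see during fitting, which keeps the level one data from leaking training labels. With scikit-learn available, the same structure is provided by `sklearn.ensemble.StackingClassifier`, where the meta-classifier is passed as `final_estimator`.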

1. Investigating the performance of the stacking model for heart attack prediction
The heart attack prediction model is based on ensemble learning using the stacking technique shown in Fig. 3. The performance of the model was tested and measured using recall, F1 score, and accuracy as shown in Table 2, the confusion matrix as shown in Fig. 4, and the ROC as shown in Fig. 5, and high performance was obtained. Several measures are used to verify the method's performance, since research in the medical field needs to present results that are not biased towards a specific measure. Our approach used different measures, as shown in Table 2.

2. Comparing the performance of the proposed method with other machine learning methods
Compared to single traditional algorithms, the proposed method achieved better accuracy, as shown in Table 2. Compared to the methods of previous research on the same data set, the proposed method also showed better accuracy, as shown in Table 4.
As shown in Table 4, many of the researchers relied on the Framingham heart study dataset and used different

Discussion of experimental results of studying the heart attack prediction model based on ensemble learning
The results of the study show the superiority of the proposed method over the traditional methods, as shown in Table 2, and over previous studies that used the same dataset, as shown in Table 4, as well as studies that used different datasets.
The performance of the algorithms cannot be judged by the accuracy measure alone, so the performance of the proposed model was measured using the confusion matrix as shown in Fig. 4, the ROC as shown in Fig. 5, and recall, F1 score, and accuracy as shown in Table 2.
The method of aggregating the results of more than one learning model and building a predictive model from these results, as shown in Fig. 3, achieves high performance because the final result depends on the combined votes of the base models on which the meta-classifier was trained.
The limitation of the study is that most existing research on heart attack prediction depends on one of two separate kinds of data sets, one using the electronic record of the patient and the other using ECG signals, whereas the machine learning algorithms, including the proposed method, do not deal with the ECG directly. We therefore suggest using deep learning to extract important features from the ECG and combining them with the electronic patient record dataset.
This study can be developed further by applying the method to different datasets and comparing the results among them, by combining ECG data with the electronic patient record data, and by finding a mathematical model that explains these results.

Conclusions
1. The presented work mainly uses machine learning algorithms based on electronic medical record data. A heart attack prediction model was developed by combining the decision tree, logistic regression, SVM, and XGBoost algorithms based on the ensemble learning technique and using logistic regression as a meta-classifier, thereby making up for the limitations of traditional single machine learning algorithms for heart attack prediction. The model performance was measured using recall, F1 score, accuracy, and ROC, and high performance of 0.95, 0.97, 96.69 %, and 0.98, respectively, was achieved.
2. The accuracy of the stacking ensemble technique reached 96.69 % with stability, as shown in Table 3. The proposed methodology was compared with models of single traditional machine learning algorithms and with methods from previous research that used the Framingham heart study dataset and various other datasets. The experimental results showed that the proposed method has higher prediction accuracy and better performance, which demonstrates its superiority and reliability. This allows the model to be used to help doctors in practical heart attack prediction tasks.

Acknowledgments
First of all, I would like to thank Associate Professor Ibrahim Ahmed Saleh for his meticulous care and help in my life and academics. Teacher Ibrahim has noble morals, kindness to others, rigorous scholarship, and profound knowledge. He not only taught me the skills of learning, but also the principles of life, which will benefit me for life. At the end of this topic, I would like to extend my sincerest gratitude to Teacher Ibrahim again. Thanks to the University of Mosul, college of computer science and mathematics for their care for my daily experiments and life.