DEVELOPMENT OF PREDICTION MODEL OF STEEL FIBER-REINFORCED CONCRETE COMPRESSIVE STRENGTH USING RANDOM FOREST ALGORITHM COMBINED WITH HYPERPARAMETER TUNING AND K-FOLD CROSS-VALIDATION

Concrete is a brittle material. Common actions such as impact, loading, and fatigue ultimately produce cracking, which eventually leads to the failure of concrete. For this reason, the brittleness of concrete has been a challenge in civil engineering from the beginning. This difficulty has been addressed by advances in concrete material technology, such as the incorporation of dispersed fibers; the resulting material is termed fiber-reinforced concrete (FRC) [1–3]. Incorporating fibers into brittle materials is not a novel concept. Horsehair and straw, for example, were once used to strengthen brittle construction materials such as clay bricks. Asbestos fibers were used in cement paste matrices in the 1900s; however, because of their negative effect on human health, alternative types of fiber became necessary in the 1960s and 1970s. The papers [4, 5] were the first to bring the term steel fiber-reinforced concrete (SFRC) to attention in the early 1960s. Today, fibers can be made from steel, plastic, glass, and natural materials; nonetheless, steel fiber is considered the most widely used variety [2, 3]. The bridging effect of discrete fibers in FRC can improve the mechanical properties of standard concrete. As a result, FRC improves the flexural and tensile properties of concrete, as well as its hardness and crack resistance. In particular, SFRC, the most widely applied type, has shown major gains, with flexural strength significantly improved over standard concrete when steel fibers are added [6]. In addition, using the proper amount of steel fibers in concrete improves its durability and freeze resistance [7].
Because of the SFRC's flexibility and ductility, thin and long-span structural members can be constructed at a lower cost than ordinary concrete. Furthermore, because steel fibers inhibit fracture opening, SFRC is more durable than traditional concrete in the presence of humidity and temperature fluctuations [1,8].
The compressive strength test determines the characteristic strength of concrete and is usually performed after 7, 28, or 90 days. It serves as a quality-control measure for concrete production in a factory or workshop and as a standard during structural design to determine the working stress of concrete.
SFRC has more predictive parameters than normal concrete, such as fiber type, volume fraction, and aspect ratio. As a result, traditional linear or nonlinear regression analyses struggle to predict the compressive and flexural properties of SFRC. The problem of predicting the strength of SFRC can be addressed using machine learning methods. Random forest (RF), a machine learning algorithm, has been used for years to tackle a variety of civil engineering challenges, particularly in the field of building materials. Its most notable feature is its capacity to learn the problem directly from examples. Furthermore, RF may respond correctly or nearly correctly to incomplete data, predict from noisy or poor-quality data, and generalize to novel examples. These abilities make RF a very useful tool.

Literature review and problem statement
The application of machine learning and artificial intelligence algorithms to the prediction of the compressive strength of SFRC has been investigated in several studies. In [9], the researchers studied several mechanical properties, including compressive strength, of fifty concrete mixes. In these mixtures, certain proportions of cement were replaced by metakaolin. They added PVA fibers at aspect ratios of 45, 60, 90, and 120 and volume fractions of 0, 1, 2, and 3 % to the concrete mix. They employed two models, neural networks and linear regression, to predict the compressive strength of this type of concrete. Although their models achieved good performance, they were trained on the whole dataset and were not tested on unseen data.
The paper [10] used a neural network to predict the compressive strength of SFRC. Five training algorithms were used to train the model: Levenberg-Marquardt backpropagation (LM), batch backpropagation (BBP), incremental backpropagation (IBP), quick propagation (QP), and genetic algorithm (GA). For compressive strength prediction, the best-performing training algorithm was IBP. The dataset was split into two parts, 80 % for training and 20 % for testing. Cross-validation was not included in their work.
The K-nearest neighbor algorithm was also used as a prediction model for the compressive strength of SFRC in [11]. Steel fiber was added at 0.25 percent intervals from 0.50 percent to 2.00 percent, and 150×300 mm cylindrical specimens were cast, with at least three batches for each percentage. The authors generated 100 samples for their model within the range of the mean and standard deviation of those batches. To build and verify their model, they split the dataset in three different ways: 60-40 %, 70-30 %, and 80-20 %. Although they achieved good prediction performance, their method included neither hyperparameter tuning nor a cross-validation procedure.
The paper [12] compared the performance of a neural network and multiple regression in predicting the compressive strength of fiber-reinforced concrete. Independent variables such as the water/cement ratio, the cross-sectional area of the test specimen, and the Young's modulus of the fiber were used to build the models. This paper did not include any method for hyperparameter optimization or cross-validation, and the models were not evaluated on testing data.
Two artificial neural networks were utilized to predict the compressive strength of ultra-high-performance steel fiber-reinforced concrete (UHPFRC) [13]. The models were built from 133 samples. The predicted values of the compressive strength of UHPFRC were more accurate than those of existing analytical models. Despite these results, the cross-validation procedure was not used. Cross-validation trains and evaluates a model on numerous train-test splits, which helps guard against overfitting.
In [14], the researcher used three algorithms to build a high-accuracy model for the prediction of compressive strength and slump of ultra-high-performance concrete (UHPC) with steel fiber. The prediction performance of those models was compared using the coefficient of correlation (R), mean absolute error (MAE), and root mean square error (RMSE). The goal of training those models was to obtain the most accurate objective function for use in multi-objective mixture design and optimization of UHPC reinforced with steel fibers. The method did not include any training-testing splitting procedure or hyperparameter tuning.
Random forest (RF) is one of the most highly developed ensemble algorithms, with the appealing features of variable importance measures (VIMs), minimal model parameters, and strong resistance to overfitting. The theoretical part behind the RF algorithm was first introduced to the world by the paper [15].
Since extensive strength testing to measure compressive strength is laborious, this paper proposes a supervised machine learning approach based on RF models to predict the compressive strength of FRC. RF models can achieve more robust predictive accuracy than traditional statistical learning methods. Although the above-mentioned models achieved good prediction performance, more reliable prediction accuracy can be obtained by combining hyperparameter tuning with cross-validation, which we implement in this paper. The above-mentioned studies did not treat the problem of overfitting properly, since some of them used a single percentage-based training-testing split while others trained on the whole dataset. Overfitting occurs when a model fits the training data too closely: because the fitted curve passes through every point, the model may fail to predict future data correctly. To overcome this problem, we use a technique named k-fold cross-validation.
Therefore, RF models are used to capture the nonlinear relationship between the compressive strength of FRC and multiple independent variables. The constructed models are evaluated using k-fold cross-validation to counter overfitting, combined with a grid search hyperparameter tuning strategy to find the optimal model. The output of the proposed models is compared with that of multiple linear regression models.

The aim and objectives of the study
This study aims to predict the compressive strength of SFRC using a random forest algorithm.
The following objectives have been set to achieve the goal:
- to build a random forest machine learning-based system to predict the compressive strength of SFRC and to achieve more precise and reliable prediction by combining hyperparameter tuning with the grid search procedure;
- to test the optimized model with higher prediction accuracy, obtained from the point above, on the testing dataset and to compare its results with a traditional statistical linear regression model.

1. System methodology
In this paper, we developed an efficient system that enables us to predict the compressive strength of SFRC. Fig. 1 shows the system methodology.
In this system, we use a Random Forest (RF) algorithm combined with hyperparameter tuning and cross-validation to forecast the compressive strength of SFRC. As shown in Fig. 1, the first stage of the system is collecting experimental data from previous research. The second stage is splitting the data into two parts. The third stage is hyperparameter tuning using k-fold cross-validation and grid search to extract the optimal hyperparameters, which give the best-performing RF prediction model. The fourth stage is applying the RF algorithm over the hyperparameter domain defined in the third stage. The fifth stage is identifying the optimized RF model with the best parameters. The sixth stage is comparing the prediction results of the optimal RF model with those of the linear regression model on the training dataset. Finally, the best RF model is used to predict the compressive strength of SFRC for the testing dataset, and the results are compared with those of the linear regression model for the same observations using performance metrics.

2. Dataset
The dataset used in this research (Table 1) consists of 133 different observations collected from different literature sources [16]. The predictors used from the literature to build the models are: steel fiber volume fraction (Vf), maximum size of aggregate (Dmax), steel fiber length (lf) and diameter (df), class of SFRC (CC), water (W), cement (C), fine aggregate (FA), coarse aggregate (CA), and chemical admixture (Sp). The target variable is SFRC compressive strength. The same type of steel fiber (hooked end) was used in all concrete mixture observations.

3. Data splitting procedure
After obtaining the dataset, we randomly split it into two separate parts: the training set and the testing set. In this study, the ratio of training to testing data is 9:1. The training set consists of 120 observations used to train our model, while the testing set of 13 observations is used to test model performance. The training sample is drawn from the initial dataset using a stratified sampling procedure [15] so that its outcome distribution is similar to that of the entire dataset.
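The 120/13 split described above can be illustrated with a minimal sketch. The paper does not specify how the stratification was carried out, so the scheme below is an assumption: observations are sorted by compressive strength, cut into 13 contiguous strata, and one test observation is drawn from each stratum. The function name `stratified_split` and the synthetic `targets` values are hypothetical and stand in for the real dataset.

```python
import random

def stratified_split(targets, n_test, seed=0):
    """Split indices into train/test so the test set spans the target range.

    Assumed scheme (not taken from the paper): sort by target, cut into
    n_test contiguous strata, and draw one test index from each stratum.
    """
    rng = random.Random(seed)
    n = len(targets)
    order = sorted(range(n), key=lambda i: targets[i])
    test = []
    for s in range(n_test):
        lo = s * n // n_test          # first position of stratum s
        hi = (s + 1) * n // n_test    # one past the last position
        test.append(rng.choice(order[lo:hi]))
    held_out = set(test)
    train = [i for i in range(n) if i not in held_out]
    return train, sorted(test)

# Synthetic placeholder targets (133 values), not the paper's data.
targets = [20.0 + (37 * i) % 101 for i in range(133)]
train_idx, test_idx = stratified_split(targets, n_test=13)
print(len(train_idx), len(test_idx))
```

Drawing one observation per stratum guarantees that both low-strength and high-strength mixes appear in the testing set, which is the purpose of stratification for a continuous outcome.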

4. Hyperparameter tuning with grid search and k-fold cross-validation
To avoid overfitting, RF must be carefully tuned. Because hyperparameter settings in an ML model can have a significant impact on its predictive performance, it is critical to select appropriate hyperparameters for the model. Hyperparameter tuning in ML models is most commonly done by systematic trial and error (grid search). Users can adjust two hyperparameters in RF to control the model's uncertainty: a) ntree, the number of trees (or iterations), that is, the number of decision trees in the forest; if the number is too high, RF will overfit; b) mtry, the number of variables randomly selected as candidates at each split.
We tuned two parameters in this study, mtry and ntree, both of which affect our RF prediction model. Using grid search and cross-validation (CV), the output for every combination of hyperparameter values is recorded, as in the study [17]. Grid search evaluates every point of the grid defined over the parameter ranges and, from these results, suggests the parameter setting.
Each axis of the grid represents an RF parameter, and each point in the grid represents a unique combination of RF parameters. The objective function must be evaluated at each grid point. k-fold CV [18], a popular validation approach, was used in this paper during the RF hyperparameter tuning phase to reduce bias in data selection. It is widely used in machine learning and is the most popular type of CV. Although there is no strict rule for choosing the value of k, k=10 (or 5) is a common standard in practical machine learning. According to the paper [18], the bias of the estimate is smaller when five or ten folds are used.
In this work, we fixed the number of folds at ten. Consequently, ten rounds of training and validation were carried out on distinct partitions, and the results were averaged to reflect the RF performance on the training set. The WEKA (Waikato Environment for Knowledge Analysis) program was used to process all of the data in this study.
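The tenfold procedure can be sketched in a few lines of Python. This is only an illustration of the technique and not the WEKA workflow used in the study. The helper names (`k_fold_indices`, `cross_val_rmse`), the mean-predicting stand-in model, and the demo data are all hypothetical.

```python
import math
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_rmse(fit, predict, X, y, k=10, seed=0):
    """Return the mean validation RMSE over k folds and the per-fold values."""
    scores = []
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        sq_err = [(y[i] - predict(model, X[i])) ** 2 for i in fold]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(scores) / k, scores

# Hypothetical stand-in model: always predicts the training-set mean.
def fit_mean(X, y):
    return sum(y) / len(y)

def predict_mean(model, x):
    return model

# Placeholder data with 120 rows, matching only the size of the training set.
X_demo = [[float(i)] for i in range(120)]
y_demo = [30.0 + 0.25 * i for i in range(120)]
mean_rmse, fold_rmse = cross_val_rmse(fit_mean, predict_mean, X_demo, y_demo, k=10)
print(round(mean_rmse, 3), len(fold_rmse))
```

With 120 training observations and k=10, each round trains on 108 observations and validates on the remaining 12, and every observation is used for validation exactly once.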

5. Applying the RF model
The paper [15] proposed random forest, a non-parametric, tree-based ensemble method. In contrast to typical statistical techniques, random forests use many simple decision trees rather than parametric models. A more robust prediction system can be created by combining the results of the individual decision trees. The major goal of this research is to predict the compressive strength of SFRC, so only regression modeling is considered. Non-parametric random forest regression develops a set of K trees {DT1(S), DT2(S), ..., DTK(S)}, which forms the forest, where S={s1, s2, …, sn} is an n-dimensional input vector. The ensemble produces K outputs, one per tree, Yk (k=1, 2, ..., K). The final result is the mean of all tree outputs. The training procedure is as follows:
a) draw a bootstrap sample from the available dataset; the sample is chosen randomly and with replacement;
b) from the bootstrap sample, grow a tree with the following modification: at each node, pick the best split from a randomly chosen subset of mtry descriptors, where mtry is the number of distinct independent variables examined at each node. mtry is therefore an important tuning parameter for the RF model. The tree is allowed to reach its maximum size without being pruned;
c) repeat step (b), each time on a new bootstrap sample of the data, until the user-specified number of decision trees (ntree) is reached.
RF builds K regression trees and averages their results for regression; the outcomes of all individual trees are combined to produce the final predicted values [15]. After K trees DT_k(x) have been grown, the RF regression predictor is defined as:
$$\hat{f}_{RF}^{K}(x) = \frac{1}{K}\sum_{k=1}^{K} DT_k(x)$$
A new training set (bootstrap sample) is created for each regression tree by drawing from the original training set with replacement. As a result, each regression tree is built from a different randomly drawn sample of the original dataset. The observations left out of a bootstrap sample, the out-of-bag (OOB) sample, are used to assess accuracy [15].
Random forest is a valuable method in the prediction of the compressive strength of different types of concrete.
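The training procedure and the averaging predictor described in this section can be sketched as follows. This is a compact, unoptimized illustration under simplifying assumptions, not the WEKA implementation used in the study: splits minimize the sum of squared errors, thresholds are midpoints between consecutive distinct values, and OOB evaluation is omitted. All function names and the toy data are hypothetical.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, mtry, rng):
    """Grow an unpruned regression tree; each node tries mtry random features."""
    if min(y) == max(y):                       # pure node becomes a leaf
        return y[0]
    best = None                                # (score, feature, threshold)
    for j in rng.sample(range(len(X[0])), mtry):
        values = sorted({row[j] for row in X})
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if left and right:
                score = sse(left) + sse(right)
                if best is None or score < best[0]:
                    best = (score, j, t)
    if best is None:                           # sampled features constant here
        return sum(y) / len(y)
    _, j, t = best
    L = [(r, v) for r, v in zip(X, y) if r[j] <= t]
    R = [(r, v) for r, v in zip(X, y) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in L], [v for _, v in L], mtry, rng),
            build_tree([r for r, _ in R], [v for _, v in R], mtry, rng))

def predict_tree(node, x):
    """Walk from the root to a leaf; internal nodes are tuples, leaves numbers."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if x[j] <= t else right
    return node

def fit_forest(X, y, ntree, mtry, seed=0):
    """Steps (a)-(c): one bootstrap sample and one unpruned tree per iteration."""
    rng = random.Random(seed)
    n = len(y)
    forest = []
    for _ in range(ntree):
        boot = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
        forest.append(build_tree([X[i] for i in boot], [y[i] for i in boot], mtry, rng))
    return forest

def predict_forest(forest, x):
    """RF regression predictor: the mean of the individual tree predictions."""
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)

# Toy data (hypothetical): two features, target depends linearly on the first.
X_toy = [[float(i), float(30 - i)] for i in range(30)]
y_toy = [2.0 * i for i in range(30)]
forest = fit_forest(X_toy, y_toy, ntree=25, mtry=1)
pred = predict_forest(forest, [10.0, 20.0])
print(round(pred, 2))
```

Each tree on its own fits its bootstrap sample exactly; averaging over trees is what smooths the prediction, which is the reason ntree and mtry are treated as tuning parameters.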

6. Optimized RF with the best parameters
To evaluate the impact of the RF hyperparameters (mtry and ntree), the model was optimized using CV over a sufficiently fine search grid. The parameters were tuned as follows: mtry was varied over the range [1:9] in steps of 2, while ntree was varied over the range [200:1,000] in steps of 200. The optimal model is the one with the lowest RMSE.
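The search grid described above contains 5 × 5 = 25 candidate settings, which can be enumerated and searched as in the sketch below. The scoring function `toy_cv_rmse` is a hypothetical placeholder and not data from this study. It is constructed so that its minimum falls at mtry=9 and ntree=400, purely so that the example mirrors the setting reported in the paper. In practice it would be replaced by the cross-validated RMSE of an RF trained with the given parameters.

```python
from itertools import product

mtry_grid = list(range(1, 10, 2))          # 1, 3, 5, 7, 9
ntree_grid = list(range(200, 1001, 200))   # 200, 400, 600, 800, 1000

def grid_search(cv_rmse, mtry_values, ntree_values):
    """Score every (mtry, ntree) pair and return the pair with the lowest RMSE."""
    results = {(m, n): cv_rmse(m, n) for m, n in product(mtry_values, ntree_values)}
    best = min(results, key=results.get)
    return best, results

def toy_cv_rmse(mtry, ntree):
    """Hypothetical placeholder score, NOT measured values from the study."""
    return 5.0 + 0.4 * (9 - mtry) + abs(ntree - 400) / 1000

best_params, all_scores = grid_search(toy_cv_rmse, mtry_grid, ntree_grid)
print(len(all_scores), best_params)
```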

7. Comparing the RF optimized model with the linear regression model for training and testing datasets
In order to compare the best RF model with linear regression, we used performance metrics. The root mean square error (RMSE), coefficient of determination (R2), and mean absolute error (MAE), which are commonly used for this purpose [19–21], were used to evaluate the predictive performance of the RF and regression models. These metrics are calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i^{obs} - y_i^{pred}\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{obs} - y_i^{pred}\right)^2}{\sum_{i=1}^{n}\left(y_i^{obs} - \bar{y}^{obs}\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i^{obs} - y_i^{pred}\right|$$
Here n denotes the number of data points used to calculate the error; $y_i^{obs}$ and $y_i^{pred}$ are the observed and predicted compressive strength of the i-th observation; and $\bar{y}^{obs}$ is the average of all observed values.
For the best RF model and the linear regression model, boxplots of RMSE, MAE, and R2 were generated from the results of the tenfold cross-validation cycles. Boxplots visualize the skewness and distribution of model results by displaying data quartiles (or percentiles) and averages. In addition to these boxplots, pair plots were generated to visually analyze the model's prediction performance on the testing data. In general, pair plots are made by plotting the actual values on the x-axis and the predicted values on the y-axis. To show how the predicted values differ from the true values, a 45-degree line is drawn from the origin.
Fig. 2 shows the cross-validated RMSE for RF; the RF model's RMSE is more sensitive to mtry than to ntree. Using 10-fold CV, the optimal settings for the RF model were ntree=400 and mtry=9 (Fig. 2). The optimized RF model's RMSE, R2, and MAE for the 120 training observations are 5.21, 0.8, and 3.59, respectively.
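The three performance metrics named above can be computed directly from paired observed and predicted values. The sketch below assumes the standard definitions, with R2 taken as one minus the ratio of the residual to the total sum of squares. The function names and the three-point example are illustrative and are not values from the study.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Illustrative values only.
obs_demo = [1.0, 2.0, 3.0]
pred_demo = [1.0, 2.0, 4.0]
print(rmse(obs_demo, pred_demo), mae(obs_demo, pred_demo), r2(obs_demo, pred_demo))
```

In this example only the third prediction is off, by one unit, so the MAE is 1/3, the RMSE is the square root of 1/3, and R2 is 1 - 1/2 = 0.5.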

1. Optimized RF model with the best parameters extracted from combining hyperparameter tuning and cross-validation procedure
Additionally, linear regression (LR) was used as a baseline against which to compare the performance of our best RF model. For linear regression, we used the training dataset with 10-fold cross-validation; the R2, RMSE, and MAE of the predicted values for the training dataset were 0.6, 7.7517, and 5.6966, respectively. Fig. 3 shows the variation in prediction performance over the tenfold cross-validation on the training set for the best RF model and linear regression.
High R2 values, as well as low RMSE and MAE values, imply higher model performance. Fig. 4 shows all RF predicted values for testing data using pair plots. The MAE, RMSE, and R2 between the observed and predicted values of the RF optimized model for the test data were 3.8039, 5.6627, and 0.88, respectively. Furthermore, for the test data, the MAE, RMSE, and R2 of the predicted values using the traditional multiple linear regression method were 5.6583, 8.6867 and 0.64, respectively.
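To put the test-set figures above on a common scale, the relative error reduction of RF over linear regression can be computed from the reported values. This is plain arithmetic on the numbers quoted in the text, and the helper name `relative_reduction` is hypothetical.

```python
# Test-set errors as reported in the text above.
rf_errors = {"MAE": 3.8039, "RMSE": 5.6627}
lr_errors = {"MAE": 5.6583, "RMSE": 8.6867}

def relative_reduction(baseline, model):
    """Fraction by which the model error is lower than the baseline error."""
    return (baseline - model) / baseline

reductions = {name: relative_reduction(lr_errors[name], rf_errors[name])
              for name in rf_errors}
for name, value in reductions.items():
    print(f"{name}: {value:.1%} lower for RF than for linear regression")
```

On these figures, both error measures are roughly one third lower for RF than for linear regression on the testing set.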

2. Comparing the prediction performance of our RF best model with linear regression for the testing dataset
In terms of R2, RMSE, and MAE values, the RF approach outperformed the traditional multiple linear regression approach (Table 2). On the evaluation datasets, the RF model generated the best results. The aforementioned methodology was employed in this paper to predict the SFRC compressive strength. As illustrated in Fig. 1, the predictive models were built using a training set with integrated variables and then applied to the testing set. Based on the original training dataset, optimal parameter values were determined using grid search and tenfold CV. The best parameter setting for the RF model was the one with the lowest RMSE among all the models. The RMSE values vary significantly across different mtry values. This indicates that the accuracy of RF models is more sensitive to mtry than to ntree, as shown in Fig. 2.
Boxplots (Fig. 3) depict the range and variation of the performance measures, R2, RMSE, and MAE, for the training performance of the optimized RF and linear regression models. These boxplots are created using estimated values from tenfold cross-validation. In terms of the median and mean values of the three metrics, it is evident that the non-linear RF model with the best parameters outperforms the linear model (linear regression). Furthermore, the RF model has lower variance than the linear regression model, as shown by its smaller interquartile range for the three metrics. The nonlinear interactions of compressive strength with the input variables explain why nonlinear models perform better.
Table 2 displays the prediction performance of the RF model and linear regression on the testing dataset. Fig. 4 depicts the distribution of predicted vs. observed compressive strength values for each model. In a successful model, the deviation between the predicted and observed values should be low. This is reflected in a dense distribution of points around a 45-degree line through the origin.
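The quantities a boxplot displays (quartiles, median, mean, and interquartile range) can be computed from the ten per-fold scores, as in the sketch below. The per-fold results are not listed in the text, so `demo_scores` is a hypothetical placeholder, and the "inclusive" quantile method is an assumption about how the quartiles are defined.

```python
import statistics

def boxplot_summary(scores):
    """Quartiles, mean, and interquartile range of a list of per-fold scores."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(scores),
        "mean": statistics.fmean(scores),
        "iqr": q3 - q1,
    }

# Hypothetical per-fold scores, NOT results from the study.
demo_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = boxplot_summary(demo_scores)
print(summary)
```

Comparing the `iqr` entries of two models is the numerical counterpart of comparing the heights of their boxes in Fig. 3.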
As presented in Table 2 and Fig. 4, the non-linear RF model is more robust than linear regression. The reasonable accuracy and reliability of RF in the prediction of compressive strength are thus verified. A higher R2 value for the RF model means that the model explains more of the variance in the data, while lower RMSE and MAE values mean higher prediction accuracy. The mechanism underlying more effective nonlinear model learning is complex, involving the tuning of hyperparameters and other aspects of the training and prediction process.
The limitation of this work is that the dataset used to build our predictive model was relatively small, with just 133 cases. The precision and reliability of RF models can be improved by using a larger dataset. In theory, there could be other factors or variables that were overlooked due to data gathering issues. The quality and type of data used in supervised machine learning-based systems have a significant impact on their performance, which may vary with the data, the number and quality of training datasets used, and the scale at which testing is run. In future work, researchers can investigate whether the accuracy of the suggested model can be enhanced by further optimizing the hyperparameters, by extending or reducing the training dataset, or by introducing a new prediction model. To measure the performance of random forest in forecasting compressive strength values of fiber-reinforced concrete that lie beyond the range of the existing training dataset, more experiments with larger datasets are recommended.

Conclusions
1. Grid search combined with cross-validation is an efficient method for finding optimum hyperparameters. The best RF model achieved lower RMSE and MAE and higher R2 for the training dataset (RMSE=5.21, R2=0.8, and MAE=3.59) compared with the traditional linear regression model (RMSE=7.7517, R2=0.6, and MAE=5.6966). This indicates that RF outperforms linear regression in predicting the compressive strength of fiber-reinforced concrete. The reason for the good performance of the RF model is its non-parametric nature, which means that the RF model does not assume that the data follow a normal distribution.
2. The validation of our RF model with testing data indicates that the RF model has an excellent performance. The predicted and observed values are near the 45-degree diagonal line, and the residuals between them are low. The results of our best RF model and the linear regression model for the testing data were also compared and showed that, for this type of dataset, the RF model (RMSE=5.6627, R2=0.88, and MAE=3.8039) was superior to linear regression (RMSE=8.6867, R2=0.64, and MAE=5.6583).