IMPLEMENTATION OF DRIVER'S DROWSINESS ASSISTANCE MODEL BASED ON EYE MOVEMENTS DETECTION

The accuracies of NB and SVM using an eye detection dataset with 90 % training and 10 % testing are 96 % and 97 %, respectively.


Introduction
Driving is a complicated job, which requires mental awareness as well as physical and behavioral resources. The need for complete mental awareness makes it a risky job, since people have a limited ability to remain mindful for a long time. The lack of awareness and drowsiness are capable of causing serious injuries and the loss of the lives of drivers and travelers. There are many causes that lead to road accidents, such as the condition of the roads, the condition of the vehicle, the weather, drivers' skills, drivers' drowsiness, etc. [1]. As estimated by the National Highway Traffic Safety Administration, 7 % of all road accidents and 16.5 % of fatal road accidents in the United States are related to drowsy drivers [2]. The drowsiness state is usually indicated as sleepiness, in which persons/drivers have the tendency to fall asleep.

Literature review and problem statement
Drowsy vehicle drivers are more likely to be involved in road accidents than other drivers. The economic burden is considerable, and these accidents may also lead to serious injuries and death. In order to reduce the economic and health impact of drowsiness, accurate and simple models are desired for objectively measuring drowsiness [11]. Numerous physiological, subjective, and behavioral measures of driver's drowsiness exist; however, each has limitations.
In driver's drowsiness detection systems based on physiological signals, reliability and accuracy are high compared with other drowsiness detection methods; however, the intrusive nature of measuring physiological signals remains a major problem that prevents these methods from being used in the real world [5].
Evidence that road accidents were caused by drowsy drivers generally comes from subjective sources such as drivers' self-reports after the event and police crash reports [6]. The gathered evidence refers to typical vehicle and driver behavior, and certain attributes often appear in these events: high speed with or without braking, vehicles leaving the road, accidents occurring on high-speed roads, and drivers making no attempt to avoid the crash. All these attributes of vehicles involved in accidents caused by drowsy drivers create specific driving patterns that can be utilized for detecting a potentially drowsy driving situation. The most widely used vehicle-based methods for detecting driver's drowsiness are the standard deviation of lane position and steering wheel movement [7].
The behavioral-based methods detect particular behavioral evidence given by drivers when they are in a state of drowsiness. Typically, these methods concentrate on facial expressions that could indicate attributes like frequent yawning, rapid and constant blinking, or head swinging; such expressions might indicate that the driver is sleep-deprived or drowsy [8]. Generally, the behavioral-based methods utilize a video camera for acquiring images and an integration of machine learning and computer vision approaches for detecting events of interest, measuring them, and then deciding whether the driver is drowsy. When the captured image sequence and the obtained parameters (for example, the time elapsed in the closed-eyes state) suggest that the driver is drowsy, an audible alarm warns the driver. The behavioral-based methods are regarded as non-invasive and cost-effective while still presenting notable challenges. Besides the challenges related to the underlying computer vision, image processing, and machine learning algorithms, these methods need to run in real time and exhibit high robustness to lighting changes, bumpy roads, improperly mounted cameras, etc. [9].
The eye represents one of the important features of the human face. The features extracted via eye movement, gaze, and texture attract high interest for possible utilization in expressing human needs, emotional states, and mental processes. Since the human eye moves constantly, this helps in developing robust methods of nonintrusive eye detection and tracking. Generally, the eye can be classified into several states: opened, partially opened, and closed. The final two categories can be utilized as an indicator of the driver's sleepiness: when the eyes remain in these states for a long period of time, the driver is exhibiting unusual behavior. A system of eye state detection should be capable of reliably detecting and distinguishing these diverse eye states [10]. Therefore, studies concentrate on detecting the eye state to specify whether a driver is drowsy. Most of these studies extract features, followed by training and utilization of machine learning algorithms of distinct abilities, weaknesses, and strengths.
Detecting the state of the driver's eye is regarded as a simple and effective way to measure the driver's drowsiness. Thus, many experts and researchers all over the world have made great efforts on this issue by following the general procedure of extracting facial features from a camera feed. In addition to feature extraction, other processes are required for determining the drowsiness level; typically, machine learning algorithms are applied, such as the Naive Bayes classifier (NB), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Artificial Neural Networks (ANN), etc. These algorithms are trained utilizing features and labeled outputs to build models that can be used for predicting drowsiness. In [12], a model of driver's drowsiness detection was proposed based on tracking the human eye. This model was presented to address the issues resulting from changes in driver posture and illumination. Several measurements were computed, including the eyelid closure percentage, the maximum duration of the closure, the blink frequency, the average level of eye opening, and the opening and closing velocities of the eyes. These measurements are integrated using Fisher's linear discriminant functions with a stepwise method for reducing the correlations and extracting a distinct index. The results obtained with these measurements in simulation experiments illustrate the feasibility of this drowsiness detection model with 86 % accuracy.
In this model, utilizing other feature extraction techniques might improve the drowsiness detection rate, and classifying the features could further improve the detection accuracy. In [13], an online driver monitoring model was implemented, and considerable features of the eyes region were extracted in both spatial and frequency domains, including two new features: black ratio and circularity. Four SVM classification models were built depending on relevant feature combinations. Although this model used several techniques to extract the features, which were then classified, only one combination showed a high detection rate of 91.3 %, obtained when the wavelet transform domain, texture features, black ratio, and circularity were utilized. [14] presents a model towards real-time drowsiness detection. This model depends on a deep learning method that can run in Android applications with considerable accuracy. In this model, a minimal network structure is implemented depending on facial landmark key point detection for recognizing whether the driver is drowsy. Since the utilized model is small, the obtained results are not high: the obtained accuracy rate is 81 %. Another real-time driver monitoring model was presented in [15], in which the Histogram of Oriented Gradients (HOG) method is used for detecting the driver's face in the acquired video frames. Then, an ensemble of regression trees is used for obtaining bounding boxes for the eyes. After that, these bounding boxes are resized and passed to a convolutional neural network (CNN) to check the driver's eye state. The eye state predictions for the past 20 frames are stored in a buffer. When the buffer indicates that the driver's eyes were closed, the driver is alerted, which allows alerting drowsy drivers in time and thereby prevents many crashes. The obtained accuracy of this model is 93.3 %. According to the obtained results, it is observed that CNN can give better performance when the number of processing layers is increased. In [16], a real-time drowsiness detection system was implemented for preventing the driver from falling asleep while driving using a sound alert. In this system, the region of the driver's eyes in each video frame is found by a cascade classifier trained with the AdaBoost algorithm in the Viola-Jones method.
After the detection of the eyes region, color segmentation and thresholding based on the area of the binarized sclera are used to determine whether the eyes are opened or closed. The use of simple thresholding limited the obtained results; applying a machine learning algorithm rather than thresholding could yield higher accuracies. In [17], a driver's drowsiness detection model was proposed based on monitoring the driver's eye state. In this model, many image and video processing operations were applied to the video frames in order to detect the eye state. Several significant drowsiness features were extracted from the eyes: the blink frequency, the eyelid closure percentage, and the longest duration of eye closure. After that, the extracted features were passed to several machine learning algorithms: Logistic Regression (LR), KNN, SVM, and ANN. The obtained results showed that KNN and ANN (with an Adadelta optimizer and a three-hidden-layer network) were the best algorithms. The KNN model obtained an accuracy of 72.25 %, while the ANN model obtained 71.61 %. Despite the various improvements included in this model, the obtained result is only marginally suitable for real-world use; therefore, this model can be regarded as a starting point for ubiquitous driver drowsiness detection. In [18], real-time machine learning-based models are presented for correctly identifying the early stages of drowsiness, which differ from one individual to another. Random forest (RF), multilayer perceptron (MLP), and support vector machine (SVM) models are trained on driver images to infer whether the driver is drowsy. These models utilize face landmarks to estimate the eye aspect ratio and apply machine learning techniques to classify the driver state. Among these techniques, SVM has the best accuracy of 94.9 %.
In spite of differences in camera position and frames per second, this model presented high results; however, there is still room for improvement.
In the literature, the existing detection systems provide somewhat less accurate results because of low clarity in the acquired videos and the low efficiency of the techniques used for feature extraction and classification. Therefore, a model of driver's drowsiness detection is proposed in this paper to solve these issues.

The aim and objectives of the study
The aim of the study is to create an accurate and low-cost system capable of quickly alerting drowsy drivers in order to ensure their safety. The developed system is based on a behavioral method: detecting the visual features of the drivers using eye movements (closed or opened).
To accomplish this aim, the following objectives are achieved:
- to extract more robust features of the eyes region using the Advanced Local Binary Pattern (Advanced LBP) algorithm;
- to utilize two machine learning classification models depending on the relevant features;
- to develop a driver's drowsiness assistance model with high accuracy to monitor and alarm drivers.

Proposed model
This section describes the design and implementation of the proposed intelligent system for driver's drowsiness detection based on a behavioral method (detecting the driver's visual features using eye movements, closed or opened). The description of the designed system includes the details of each stage implemented in the applicable system. Fig. 1 explains the proposed model, which depends on detecting the eyes (closed/opened) for alerting the driver.
In the proposed model, the driver's eyes will be determined through a short video that is recorded with a duration of five minutes to see if the driver is drowsy or not. In order to carry out the process of eye detection, several processes are required.

1. Frame selection
The camera installed in front of the driver is used for recording video. The driver's cases are recorded, and the video sequences contain sequences created inside a controlled environment (static lighting) and others captured in a real car (variable lighting). A video of at least five minutes is recorded in the normal and abnormal state; this video is divided into frames, with each recorded video having 30 frames per second in each case. Only one image is taken from every five frames.
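The sampling scheme above (keep one frame out of every five from a 30 f/s recording) can be sketched as follows; the function name is illustrative:

```python
def select_frames(num_frames, step=5):
    """Return indices of the frames kept for analysis: one out of every `step`."""
    return list(range(0, num_frames, step))

# A 5-minute clip at 30 f/s yields 9000 frames; keeping one of every five
# leaves 1800 images for eye analysis.
kept = select_frames(5 * 60 * 30, step=5)
print(len(kept))  # 1800
```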

2. RGB to grayscale image conversion
To extract the eyes region of the driver, the images need to be converted into grayscale images, and then the eyes region is detected using the Viola-Jones algorithm. In the application of behavior-based methods, computing directly on the RGB values makes the processing time very high. Therefore, a transformation to gray-level pixel values should be done; thus, the captured RGB color images are converted into grayscale. Grayscale images contain only black-to-white variations. The color images are converted into grayscale using the following equation:

y = α1·R + α2·G + α3·B,

where R, G and B denote the 8-bit values of red, green and blue, respectively; the weight coefficients α1, α2 and α3 are equal to 0.299, 0.587 and 0.114, respectively (these coefficients are nonnegative and sum to one); and y denotes the equivalent gray value of the RGB pixel. The gray image pixel values range from 0 to 255, where 0 indicates black and 255 indicates white. These value variations play a significant role in segmentation and in differentiating between the pixels.
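A minimal sketch of the weighted conversion, assuming 8-bit RGB input stored as a NumPy array (the paper's implementation is in Matlab; this Python version only illustrates the equation):

```python
import numpy as np

def rgb_to_gray(rgb):
    """Weighted luminance conversion: y = 0.299*R + 0.587*G + 0.114*B."""
    weights = np.array([0.299, 0.587, 0.114])
    return (rgb.astype(np.float64) @ weights).round().astype(np.uint8)

# A pure red, pure green, and pure blue pixel map to weight * 255 each.
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
print(rgb_to_gray(pixels))  # [[ 76 150  29]]
```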

3. Eye detection
The eye is then detected and its condition is determined in the normal and abnormal state. Haar features of several types (edge features, linear features, and central features with bi-adjacency, tri-adjacency, and quadra-adjacency matrices) are utilized for extracting eye features using the integral image, and the AdaBoost algorithm is utilized to build two types of classifiers (weak and strong), which are aggregated to reinforce the detection.
The AdaBoost algorithm is utilized to select features, then to train weak classifiers, and finally to aggregate these classifiers into a strong classifier. All the samples obtained from Haar features are weighted to be utilized in classification in each round of training of the global classification. These weights are then passed to a lower classification stage for extra training, and finally the last decision is obtained. The utilization of the AdaBoost classifier is beneficial in removing needless features and controlling the training data. The steps of eye detection using the Viola-Jones algorithm are as follows:
- step 1: Input the image file and store the coordinates of all the detected faces. When eyes are detected, they are first checked for their presence within the coordinates of a face;
- step 2: All the eyes of the detected faces are considered by the algorithm;
- step 3: If (p1, q1) and (p2, q2) are coordinates of the face and (r1, s1) and (r2, s2) are coordinates of the eyes, then the eyes are present on the face when p1<(r1 and r2)<p2 and q1<(s1 and s2)<q2. Otherwise, the eyes are discarded (they lie outside the face);
- step 4: For the detection of eyes, compute the distance between the eyes; this distance signifies whether the eyes are a paired set of eyes of the face or not;
- step 5: If one eye is detected, it is checked against other eyes for being a pair of eyes by their intermediate distance and approximately equal sizes. If the two detected eyes are rectangles whose topmost, leftmost points are (x1, y1) and (x2, y2), calculate the distance between these two points;
- step 6: The region of interest (ROI) is constructed for all the pairs of eyes.
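The face-containment check of step 3 can be sketched as follows; the rectangle layout and the function name are illustrative assumptions:

```python
def eyes_inside_face(face, eyes):
    """Check that both detected eye points lie within the face rectangle.

    face = (p1, q1, p2, q2) with (p1, q1) the top-left and (p2, q2) the
    bottom-right corner; eyes = ((r1, s1), (r2, s2)).
    """
    p1, q1, p2, q2 = face
    return all(p1 < r < p2 and q1 < s < q2 for r, s in eyes)

face = (100, 80, 300, 320)  # hypothetical face bounding box
print(eyes_inside_face(face, ((150, 140), (250, 145))))  # True
print(eyes_inside_face(face, ((150, 140), (350, 145))))  # False: second eye outside
```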

4. Feature extraction using Advanced Local Binary Pattern
Here, the image of the eyes is split into regions. In every region, a 3×3 mask is applied to compute the binary patterns. These patterns are serialized to derive the eye descriptors. The descriptors represent the eye feature, which is also called the texture feature. Generally, this algorithm is applied to grayscale eye images. The Advanced LBP consists of several steps:
- step 1: Input the detected eye region. For each pixel in this region, select P neighboring pixels at a radius R. When the value of R is 1, the number of neighboring pixels P is 8;
- step 2: Compute the intensity difference of the current pixel with the P neighboring pixels;
- step 3: Threshold the intensity differences, where all the negative differences are assigned 0 and all the positive differences are assigned 1, forming a bit vector;
- step 4: Convert the P-bit vector to its corresponding decimal value and replace the intensity value at the current pixel with this decimal value. To make the descriptor rotation invariant, the Advanced LBP value for every pixel is taken as the minimum over the bitwise circular right shifts:

AdvancedLBP = min{CRS(LBP, i)}, i = 0, 1, …, 7,

where LBP is the decimal value formed from the thresholded differences between g_i and g_c, g_i and g_c represent the intensities of the neighboring and current pixel, respectively, and CRS represents the bitwise circular right shifting;
- step 5: Compute the histogram, over the cell, of the frequency of each resulting value. This histogram can be seen as a 256-dimension feature vector;
- step 6: Normalize the histogram (optionally);
- step 7: Concatenate the histograms of all cells and output the feature vector.
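A minimal Python sketch of steps 1-5, assuming P=8, R=1 and treating equal intensities as 1 (the text does not specify how zero differences are thresholded); the minimum over circular shifts corresponds to the CRS operation:

```python
import numpy as np

def advanced_lbp(gray):
    """Rotation-invariant LBP on a grayscale patch (P=8 neighbours, R=1).

    Each pixel's code is the minimum over all circular shifts of its 8-bit
    neighbour pattern (steps 1-4); step 5 builds the 256-bin histogram.
    """
    h, w = gray.shape
    # Neighbour offsets in circular order around the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = gray[y, x]
            bits = [1 if gray[y + dy, x + dx] >= centre else 0
                    for dy, dx in offsets]
            # Minimum over the 8 circular shifts gives rotation invariance.
            codes[y - 1, x - 1] = min(
                sum(b << i for i, b in enumerate(bits[k:] + bits[:k]))
                for k in range(8))
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return codes, hist

patch = np.array([[5, 5, 5],
                  [5, 4, 5],
                  [5, 5, 5]], dtype=np.uint8)
codes, hist = advanced_lbp(patch)
print(codes)  # [[255]] - all eight neighbours are brighter than the centre
```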

5. Classification of eye state detection
The proposed model utilized two of the most popular machine learning classifiers (NB and SVM) to decide whether the person is in a drowsy state. Several Naive Bayes and SVM classifier configurations were examined separately, where many known parameters of the learning algorithms were defined, and the values of these parameters were optimized during the experimental phase to achieve the best performance score.
The Naive Bayes classifier represents a family of simple probabilistic classifiers based on Bayes' theorem. Suppose T denotes a training set. Every sample is denoted by an m-dimensional feature vector H involving m independent features (h1, h2, h3, …, hm). Assume there are n classes (L1, L2, L3, …, Ln); classification then consists of deriving the maximum P(Li|H). This can be derived from Bayes' theorem as follows:

P(Li|H) = P(H|Li)·P(Li)/P(H),

where P(H) holds an equal value for all the classes, so only the numerator P(H|Li)·P(Li) needs to be maximized. In NB, the simplifying assumption is that the attributes are conditionally independent (no dependency among the attributes). Therefore, the class assignments of the testing samples depend on the following equation:

P(H|Li) = P(h1|Li)·P(h2|Li)·…·P(hm|Li).

For instance, when a new sample arrives and the posterior probability P(L2|H) possesses the maximum value among all P(Lz|H) for all z classes, then it belongs to the L2 class according to the NB rule.
Here, two probabilistic classifier models that use Bayes' theorem in the learning pattern are built: a Gaussian and a Bernoulli model of NB type. Then, every model is tested separately and the best model is selected based on the results. The performance of the Gaussian model shows high accuracy. The model used in the proposed system depends on the conditional probability, as described in the following Gaussian equation:

P(hi|L) = (1/√(2πσ²))·exp(−(hi − μ)²/(2σ²)),

where μ represents the mean, σ is the standard deviation, hi denotes the features, and L denotes the class. Here, two class labels can be identified: the first indicates the normal case and the second the drowsy case.
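A minimal NumPy sketch of the Gaussian NB decision rule above, computed in log space for numerical stability; the toy feature vectors are illustrative, not the paper's LBP features:

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate per-class mean, standard deviation, and prior."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        params[label] = (Xc.mean(axis=0), Xc.std(axis=0) + 1e-9,
                         len(Xc) / len(X))
    return params

def predict_gaussian_nb(params, x):
    """Pick the class maximizing P(L) * prod_i N(h_i; mu, sigma),
    i.e. the Gaussian likelihood with conditionally independent features."""
    best, best_score = None, -np.inf
    for label, (mu, sigma, prior) in params.items():
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * sigma**2)
                                + ((x - mu) / sigma)**2)
        score = np.log(prior) + log_lik
        if score > best_score:
            best, best_score = label, score
    return best

# Toy eye-feature vectors: label 1 = normal (eyes open), 2 = drowsy.
X = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
y = np.array([1, 1, 2, 2])
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, np.array([0.85, 0.85])))  # 1
print(predict_gaussian_nb(model, np.array([0.15, 0.15])))  # 2
```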
The linear SVM classifier is one of the most frequently utilized classifiers. It linearly separates the vectors of the training feature matrix to identify the optimal separating hyperplane, which gives a margin with a maximum distance between the two classes, as described in the following equation:

f(x) = sign(Σi μi·yi·K(xi, x) + bi),

where μi indicates the dual coefficients returned after training, xi denotes the training samples, yi indicates the outputs, bi indicates the bias, and K(xi, xj) represents the kernel function; for the linear case, K(xi, xj) is the inner product of xi and xj, so that f(x) reduces to sign(wt·x + bi), where wt indicates the weight vector.
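A sketch of the linear SVM classification using scikit-learn rather than the authors' Matlab implementation; the toy data and the class labels (1 = normal, 2 = drowsy) are illustrative:

```python
import numpy as np
from sklearn.svm import SVC

# Toy feature vectors standing in for the LBP histogram features.
X = np.array([[0.9, 0.8], [0.8, 0.9], [0.7, 0.9],
              [0.1, 0.2], [0.2, 0.1], [0.3, 0.1]])
y = np.array([1, 1, 1, 2, 2, 2])   # 1 = normal, 2 = drowsy

clf = SVC(kernel="linear")  # linear kernel K(x_i, x_j) = x_i . x_j
clf.fit(X, y)
print(clf.predict([[0.85, 0.85], [0.15, 0.15]]))  # [1 2]
```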

6. Implementation of the proposed model
The proposed model is implemented in two cases; the first case when the system is offline and the second case when the system is online.
In the case of offline implementation, the dataset of the driver's eyes was collected by recording a video clip; the eyes are then detected as explained in the previous paragraphs, and training and testing of the samples are conducted. The accuracy is obtained using two classification methods, NB and SVM. Fig. 2 shows the structure of the offline implementation.
In the case of online implementation, the data are collected directly: the camera located in front of the driver records a video with a duration of five minutes, and then the eyes are detected. Every half minute represents a complete reading and is stored in a folder; then the features are extracted, after which classification is performed using SVM instead of NB because experiments showed it to be more accurate. Fig. 3 shows the structure of the online implementation.
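The half-minute readings could be formed as in the following sketch; labeling each reading by a majority vote over its frames is an assumption, since the text only states that every half minute represents one complete reading:

```python
def split_into_readings(predictions, fps=30, seconds=30):
    """Group per-frame eye-state predictions (1 = normal, 2 = abnormal) into
    half-minute readings and label each reading by majority vote."""
    size = fps * seconds
    readings = [predictions[i:i + size]
                for i in range(0, len(predictions), size)]
    return [max(set(r), key=r.count) for r in readings]

# 1 minute of frames: first 30 s mostly open, second 30 s closed.
frames = [1] * 800 + [2] * 1000
print(split_into_readings(frames))  # [1, 2]
```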
The extracted features are eventually merged and presented to the SVM classifier, leading to one of two potential cases for the driver. Two classes were identified: 1 for normal and 2 for abnormal.

Experimental results
This section contains the results of the set of features utilized for evaluating the performance of the proposed driver drowsiness detection model. The developed model has been implemented in the Matlab programming language, and the tests have been conducted in the Windows 10 operating system on an HP laptop with a Core i7 processor, 4 GB RAM, and a 1.70 GHz CPU. After the video acquisition process, the video is converted into frames, and then the driver's eyes are detected. To extract the features, the image of the eyes is divided into small regions using LBP, and the results are concatenated into a single feature vector.

1. Frame extraction
Using the camera installed inside the car, the driver's image is obtained. The camera creates a video; then, for every five frames, only one image of the person is taken in the normal and abnormal case, as shown in Fig. 4.
The Viola-Jones algorithm depends on several fundamental concepts: firstly, representing the frames in a new processing format called integral frames; secondly, extracting simple yet efficient Haar-like features; and thirdly, applying a cascade of classifiers. This provides fast elimination of the background regions of the frame, preserving only regions corresponding to a face.

2. Eye detection
After converting the RGB image into a grayscale image, the eye is determined using the Viola-Jones method. Fig. 5 shows an example of eye region detection for the driver. In this stage, it is noticeable that excellent results were obtained, since the Viola-Jones algorithm has low computational cost, high detection accuracy and speed.

3. Feature extraction using LBP
The LBP algorithm was applied to the eye detection dataset, in which the eye image is divided into many regions. In each region of the eye image, a 3×3 filter is applied to calculate the binary patterns of the region, and the features of the eye in the normal and abnormal state are then extracted. Table 1 illustrates a sample of the features extracted using LBP. The extracted features are now ready for classification, and the NB and SVM algorithms are used.

Table 1. Example of feature extraction of eye detection

4. Classification
The accuracies of using both NB and SVM algorithms are calculated after determining the percentage of training and testing. Table 2 shows the classification accuracy of both NB and SVM methods of eye detection obtained from (70 %, 80 %, 90 %) training and (30 %, 20 %, 10 %) testing, respectively.

Fig. 4. Example of the driver dataset: a - normal; b - abnormal

The best-obtained accuracies (when training 90 % and testing 10 %) using the NB and SVM classifiers are 96 % and 97 %, respectively.
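The train/test evaluation can be reproduced in outline with scikit-learn; the synthetic two-cluster data below merely stands in for the real LBP feature table, so the printed accuracies will not match the paper's 96 % and 97 %:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for the LBP feature table: two separated clusters.
X = np.vstack([rng.normal(0.8, 0.1, (100, 8)),   # normal
               rng.normal(0.2, 0.1, (100, 8))])  # drowsy
y = np.array([1] * 100 + [2] * 100)

# 90 % training / 10 % testing split, as in the best-performing setting.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.10,
                                          random_state=0)
for name, clf in [("NB", GaussianNB()), ("SVM", SVC(kernel="linear"))]:
    acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {acc:.2f}")
```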

Discussion of the proposed system
This proposed driver's assistance model, which is based on behavioral methods for measuring drowsiness, requires a camera mounted in the car for observing the features of the eye state. Based on these obtained features, further processing determines whether the driver is drowsy by applying two machine learning algorithms, NB and SVM. These algorithms are trained on features and labeled outputs to build a drowsiness prediction model with high accuracy. The main challenging issue with this model is obtaining a dataset that includes all the expected variations across races and skin pigmentation. This challenge is a particular issue owing to the confidentiality and security problems that emerge when datasets are published for commercial and academic use; therefore, there is a shortage of standardized datasets. The results of the proposed model have been compared with several previous related models, as illustrated in Table 3.
As the comparison shows, the proposed model is more accurate than the other related models.
In the proposed behavioral-based method, the LBP algorithm was used to extract the features of the eye. For future work, it would be interesting to examine other feature extraction algorithms, such as Speeded Up Robust Features (SURF), and compare their performance with LBP. The investigation of other algorithms will allow further generalization of the feature extraction concepts and their impact. Additionally, an expansion of this work will include the implementation of a mechanism for detecting the driver's yawning, which can be combined with the eye state for detecting the driver's vigilance.

Conclusions
1. Utilizing the Advanced LBP for extracting powerful features enables the model to be more robust to variable lighting conditions and noise.
2. The obtained accuracies of NB and SVM with 90 % training and 10 % testing are 96 % and 97 %, respectively. Therefore, SVM can be utilized instead of NB in the online implementation since it provides higher accuracy.
3. Selecting effective classifiers led to the high accuracy of the model, and the obtained decision is reliable enough for the alarm activation.