DEEP LEARNING-BASED IRAQI BANKNOTES CLASSIFICATION SYSTEM FOR BLIND PEOPLE



Introduction
Modern systems increasingly focus on improving people's quality of life. Consequently, new technologies and models are now used extensively in many areas of society, such as electronic education, medical therapy, and other fields. One medical application is the use of computer vision to assist visually impaired people in their daily financial transactions. The ability of visually impaired people to detect and recognize banknotes is limited. For this reason, many visually impaired people bring a sighted friend or family member along to assist them in their daily financial activities [1]. With repeated use, the tactile markings on a banknote's surface fade away, making it frustrating for blind people to recognize and count banknotes by touch. Several studies in the literature have proposed approaches to help blind people recognize and distinguish a given banknote. However, the resulting solutions are generally considered insufficiently practical and unreliable [2][3][4].
Helping blind people hear and understand the denomination of a banknote is the ultimate objective of the current research. Achieving this objective draws on several topics, including human-computer interaction (HCI), image classification, and computer vision. Since the problem is essentially one of image classification, deep learning is the most suitable approach: it matches our case, which requires classifying an image against a given set of banknote images and producing a prediction from a model learned by the deep learning method [5, 6].
Motivated by this issue, the current research attempts to develop a new real-time banknote classification system based on deep learning. The developed model recognizes the denomination of Iraqi dinar banknotes and converts it into specific vocal commands.

Sohaib Rajab Awad
To improve prediction accuracy, deep learning-based approaches have been widely employed in paper currency classification. A group of researchers presented a system for recognizing Pakistani paper currency using a Convolutional Neural Network (CNN) together with a Support Vector Machine (SVM) [13] to guide visually impaired people in money transactions. The experimental results revealed that the presented model recognizes Pakistani paper currency with an accuracy of 96.85 %. Furthermore, a portable model for detecting and recognizing Euro banknotes to assist blind people was presented in [14]. That model was built on the Viola-Jones algorithm running on a Raspberry Pi, with recognition and detection rates of 97.5 % and 84 %, respectively. On the other hand, in [15], a computer vision-based Korean paper currency recognition system for assisting the blind, using the Pixel-Based Adaptive Segmenter (PBAS) technique, was presented; it delivers 98 % detection accuracy. A survey on banknote recognition systems based on Machine Learning (ML) and Deep Learning (DL) methods was carried out in [16]. The authors concluded that DL-based systems are more accurate than ML-based approaches. DL systems can therefore improve the quality of life of visually impaired people by providing them with more security, especially during financial activities, and reducing their reliance on others. However, the high prediction accuracy of the above-mentioned approaches comes at the cost of increased algorithmic complexity. The YOLOv3 algorithm was proposed in [17] to assist visually impaired people in their financial activities. The model applies detection and recognition approaches to classify Iraqi paper currency and achieves an accuracy of 97.41 %.
Nonetheless, when extra noisy information is attached to the paper currency, the model in [17] is unable to accurately distinguish banknote images, because the bounding box fails to predict the presence of more than one object in the image. Based on the aforementioned, the main advantage of most of these works is that they propose banknote recognition models that keep an acceptable trade-off between accuracy and prediction time. The main remaining challenge, however, is a banknote recognition model that maintains a balance among accuracy, complexity, robustness to unrelated objects, real-time applicability, and prediction time. Therefore, in this study, we develop a new real-time CNN model that predicts Iraqi banknotes with high accuracy and low complexity. The developed model is able to classify banknote images even when the paper contains other unrelated features. DL approaches are utilized to first classify the Iraqi paper currency and then convert the recognized denomination into specific vocal commands in order to assist visually impaired people in their financial activities.

The aim and objectives of the study
This research paper aims to assist the visually impaired in their financial transactions by developing a multi-class classification system for Iraqi banknotes based on deep learning. Each classified banknote is mapped to a specific voice command matched to the banknote image captured through the user's camera. These vocal commands help visually impaired people figure out the value of the captured banknote. The proposed deep learning model can classify banknote images even when they contain other unrelated features.

Literature review and problem statement
In the literature, several techniques have been introduced for recognizing images. Some rely on image processing algorithms, as in [5], which used several image processing techniques in an Indian paper currency recognition model with an accuracy of 90 %. Others rely on deep learning and machine learning algorithms with feature extraction, as in [6], which used DL to classify intestinal hemorrhage images with an accuracy of 95 %. For instance, in [7], a system built around bionic eyeglasses, combining visual detection and recognition functions, was introduced. The banknote image dataset was collected using a mobile camera, and the relevant shapes and features were extracted using adaptive filters. The proposed system needed more time to become usable and trusted by its users. Meanwhile, a simple system for Egyptian paper currency recognition based on image processing was presented in [8]. The main techniques used include image histogram enhancement, foreground segmentation, and region-of-interest extraction. Cross-correlation was utilized to match the captured image against the dataset. The results showed that the method recognizes Egyptian paper money in a short time with an accuracy of 89 %.
That system has a short recognition time, but its accuracy falls short of the desired level. Also, the authors in [9] proposed an Egyptian paper currency recognition system based on the Oriented FAST and Rotated BRIEF (ORB) algorithm. ORB was used to extract features from the input image, and the Hamming distance was then used to match the binary descriptors obtained in the feature extraction phase. The results revealed that the model recognizes unknown banknote images with an accuracy of 96 %; the accuracy is high, but the recognition time is longer than desired. On the other hand, in [10], the author proposed a non-parametric Saudi currency recognition system. Classification is based on matching correlation coefficients between pre-built models and the captured image. The issue here is that the recognition experiments were conducted only for specific clean (noise-free) labels and not for all applicable categories, although the system achieves high recognition accuracy.
Similarly, a research group from the University of Pune developed an image dataset of Indian banknotes on an Android platform [11]. The dataset was applied in an automatic mobile recognition system for smartphones using a modified Scale Invariant Feature Transform (SIFT) algorithm, intended to let visually impaired people detect and recognize Indian paper currency. Likewise, in [12], an Indian paper currency recognition system for the blind was presented based on the SIFT algorithm together with the Hamming distance technique. The reported accuracy reached 93.71 %, with a running time of 0.73 seconds. Although these studies demonstrated good prediction accuracy, more precise models for identifying and classifying paper currency under various surrounding conditions are still sought.
To achieve this aim, the following objectives are accomplished using deep learning with a convolutional neural network (CNN):
- proposing a recognition system for the Arabic side of Iraqi banknotes;
- proposing a recognition system for the English side of Iraqi banknotes;
- proposing a recognition system for both the Arabic and English sides of Iraqi banknotes.

Materials and methods
The developed system holds the necessary functionalities, such as a camera module and an audio jack, that allow its users to identify the denomination of Iraqi banknotes in real time. The camera captures the provided Iraqi banknote as an image. The captured image is analyzed and classified by deep learning methods, yielding the Iraqi dinar denomination as an output. Finally, the classified banknote is translated into the voice command equivalent to that paper currency. These commands are pronounced through the audio jack to help the user figure out the corresponding value of the banknote. In this research, deep learning is used to classify Iraqi dinar denominations by feeding in a dataset of Iraqi dinar banknotes in seven different classes (250, 500, 1,000, 5,000, 10,000, 25,000, and 50,000 IQD). Table 1 illustrates each of the seven Iraqi paper currency classes for both sides, Arabic and English. It is worth noting that this work is dedicated to classifying the banknote denomination and does not involve any counterfeit detection for the given paper currency.

Table 1
The seven categories of Iraqi paper currency

The recognition of Iraqi paper currency is achieved using a supervised deep learning (DL) algorithm. The DL algorithm is principally based on the convolutional neural network (CNN), which has been widely applied in different applications such as medical therapy, face recognition, and voice classification [18][19][20]. Accordingly, a large dataset is collected by aggregating all of the currency categories for both the front (Arabic) and back (English) sides of Iraqi paper currency. The dataset is divided into two groups, one for training and the other for testing, as shown in Fig. 1. Before training, a preprocessing technique is applied to the image samples to improve the contrast.
Another preprocessing step resizes the images so that they all have a uniform size. These preprocessing operations are important steps for CNN classification. The next step is designing and applying the convolutional neural network. The design was refined through three extensive experiments until acceptable recognition accuracy was achieved; as a result, the number of CNN layers is 19. Afterward, the reference model is re-used for future prediction by comparing the real-time query currency image against the stored model. Finally, the output of the comparison process represents the outcome, which corresponds to one of the seven designed categories of Iraqi banknotes, as illustrated in Fig. 1.
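The two preprocessing steps can be sketched as follows. This is a minimal illustration only: the paper does not specify the exact enhancement or interpolation methods, so a simple min-max contrast stretch and nearest-neighbour resizing are assumed here.

```python
import numpy as np

def preprocess(img, out_size=64):
    """Enhance contrast and resize a grayscale image to out_size x out_size.

    Contrast: min-max stretch to the full [0, 255] range (assumed method).
    Resize: nearest-neighbour sampling (assumed, simplest interpolation).
    """
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    stretched = (img - lo) / (hi - lo + 1e-9) * 255.0
    # Index grids mapping each output pixel to its nearest source pixel
    rows = np.arange(out_size) * img.shape[0] // out_size
    cols = np.arange(out_size) * img.shape[1] // out_size
    return stretched[np.ix_(rows, cols)]
```

In the deployed system, each camera frame would be passed through such a routine before being fed to the CNN, so that all inputs share one size and a comparable intensity range.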
Furthermore, it is worth mentioning that DL has been used rather than classical ML because it yields a higher recognition rate, and there is no need for the hand-crafted feature extraction methods that are essential for machine learning algorithms, as shown in Fig. 1. After the training process, the resulting CNN model is stored in the working environment of the embedded system to be used. Training takes place on a personal computer (PC) or laptop with sufficient CPU power and RAM to build the model within an acceptable time. Fig. 1 illustrates the proposed solution for banknote recognition, including preprocessing (image enhancement and resizing to unify all input image sizes) for the subsequent processing operations. The developed network consists of CNN layers whose configuration was fixed after extensive experiments. The layers are organized into four rounds, each containing four layers, as follows: first, a convolution layer with several edge-detection filters to highlight all possible edges and shapes in the banknote image; second, a normalization layer to stabilize and speed up the learning process; third, a ReLU activation layer to highlight the essential features in each shape or edge; and finally, a max-pooling layer to downsample the image features and discard unnecessary data. These rounds are repeated four times. After that, fully connected layers feed the features into a multi-layer neural network, and the output layer contains seven nodes, each corresponding to one class (denomination). More specifications for each layer are given in Table 2.
It is worth mentioning that there are four convolution operations, each with a different number of filters: Conv_1 has 10, Conv_2 has 20, Conv_3 has 64, and Conv_4 has 30 filters. In general, the sequence of layers can be arranged freely as long as it achieves high accuracy. In other words, different arrangements of the layers listed in Table 2 were tried, yielding several recognition rates (low and high); the organizational hierarchy of the CNN layers illustrated in Table 2 gave the highest accuracy. Therefore, the order convolution, then normalization, then ReLU, then max-pooling is not obligatory in future work; it is simply the design adopted in this paper.
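As a sanity check of this four-round design, the spatial sizes can be traced for a 64×64 input (the augmented image size reported later in the paper). The kernel sizes, strides, and padding below are not taken from Table 2; they are assumptions (3×3 'same' convolutions and 2×2 max-pooling) made purely for illustration.

```python
def conv_out(size, kernel=3, stride=1, pad=1):
    """Output spatial size of a convolution (assumed 3x3, stride 1, 'same' padding)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output spatial size of max-pooling (assumed 2x2, stride 2)."""
    return (size - kernel) // stride + 1

filters = [10, 20, 64, 30]  # Conv_1 .. Conv_4 filter counts stated in the paper
size = 64                   # input images resized to 64x64
for f in filters:
    size = pool_out(conv_out(size))  # one round: conv -> normalization -> ReLU -> max-pool
# Spatial size after the four rounds: 64 -> 32 -> 16 -> 8 -> 4
features_into_fc = size * size * filters[-1]  # features entering the fully connected layers
```

Under these assumptions the fully connected layers receive 4 × 4 × 30 = 480 features; with different kernel or padding choices in the actual Table 2 design, the figure would change accordingly.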
To evaluate the proposed convolutional neural network (CNN) model, three experiments were conducted to measure the recognition accuracy. The dataset consists of 3,961 image samples in seven categories or classes, each representing one Iraqi banknote denomination. The details of the seven classes are given in Table 3: each Iraqi banknote has two sides, one with Arabic writing and the other with English writing, and Table 3 lists the number of image samples per side for all seven categories. Each experiment has seven classes; experiment 1, for example, uses Arabic_250, Arabic_500, Arabic_1000, Arabic_5000, Arabic_10000, Arabic_25000, and Arabic_50000. Experiment 1 was based on the Arabic side, experiment 2 on the English side, and experiment 3 on both sides of the paper currency. Generally, to evaluate any prediction system, the confusion matrix (CM) [21, 22], as in Table 4, is exploited. For two classes, the standard CM has four entries: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). In this work, the accuracy metric in (1) is used to evaluate the performance of the proposed model.
To derive a CM that yields the accuracy of the seven-class prediction system, the two-class CM is extended and formula (1) is then applied. Since there are seven categories, the confusion matrix must have seven classes; this is called a multiclass confusion matrix. Accuracy is extracted with the same concept as in the two-class case by applying formula (1); the issue is how to extract the overall TP, TN, FP, and FN parameters for the seven classes. Table 5 gives an example of a three-class confusion matrix and the corresponding evaluation of the four parameters. The four parameters required by formula (1) are computed from Table 5, and the accuracy is then calculated as (TP + TN)/(TP + TN + FP + FN). Accordingly, the accuracy of the seven classes is computed in this paper. Fig. 2 depicts the CM of the target classes against the classes output by the proposed model. The results of experiment 1 show a total recognition accuracy of 98.5 % on the Arabic-side dataset of 1,975 image samples, with the classes 1,000Ar, 25,000Ar, and 500Ar reaching 100 % accuracy, while the worst accuracy, 96.7 %, is recorded for the 10,000Ar class. The testing samples were taken randomly to avoid any bias; their number is 20 % of the total Arabic-side dataset of 1,975 images. Training took 12 minutes for 480 iterations over 40 epochs (12 iterations per epoch), and the Adam optimization algorithm was used.
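The extraction of TP, FP, FN, and TN from a multiclass confusion matrix, and the overall accuracy of formula (1), can be sketched as follows. The three-class matrix is in the spirit of Table 5, but its values are illustrative, not the paper's.

```python
def class_counts(cm, k):
    """TP, FP, FN, TN for class k of a square multiclass confusion matrix.

    cm[i][j] holds the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    tp = cm[k][k]
    fp = sum(cm[i][k] for i in range(n)) - tp  # predicted as k but belong elsewhere
    fn = sum(cm[k][j] for j in range(n)) - tp  # true class k predicted elsewhere
    tn = sum(sum(row) for row in cm) - tp - fp - fn
    return tp, fp, fn, tn

def overall_accuracy(cm):
    """Formula (1) across all classes: correct predictions (diagonal) over all samples."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

# Illustrative 3-class matrix (rows: true class, columns: predicted class)
cm = [[5, 1, 0],
      [0, 6, 2],
      [1, 0, 7]]
```

For the seven-class experiments, the same two functions apply unchanged to a 7×7 matrix such as the one in Fig. 2.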

1. Iraqi banknotes Arabic side recognition
As can be inferred from Fig. 2, one of the worst prediction accuracies is 96.9 % for the 5,000Ar class, i.e., a misclassification rate of 3.1 %. This is due to the fact that the 250Ar and 5,000Ar banknotes share similar background colors and features. For the 5,000Ar class, 65 banknote images were used as testing data (20 % of the overall 5,000Ar dataset), and the model confused two samples with the 250Ar class because of the correlated colors and features of the two banknotes. It is worth mentioning that this can be mitigated by increasing the number of training samples for such correlated cases.

2. Iraqi banknotes English side recognition
Fig. 3 shows the CM for the English side of the Iraqi paper currency, containing the target classes and the classes output by the proposed system. The results of experiment 2 show an overall recognition accuracy of 97 % on the English-side dataset of 1,986 images, in which only the class 1,000En reaches 100 % accuracy, while the worst accuracy, 89.7 %, occurs for the class 50,000En, as shown in Fig. 3. The testing samples were taken randomly to avoid any bias; their number is 20 % of the total English-side dataset of 1,986 images. Training again took 12 minutes for 480 iterations over 40 epochs (12 iterations per epoch), using the Adam optimizer.
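The per-class figures quoted above follow directly from the confusion-matrix rows; for instance, the 5,000Ar class with 65 test images and two confusions yields 96.9 %. A minimal sketch:

```python
def per_class_accuracy(correct, total):
    """Percentage of a class's test samples that were predicted correctly."""
    return 100.0 * correct / total

# 5,000Ar class: 65 test images, 2 confused with 250Ar (figures from Fig. 2)
acc_5000ar = round(per_class_accuracy(65 - 2, 65), 1)  # 96.9
```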

3. Iraqi banknotes both Arabic and English sides recognition
Regarding the third objective, both the Arabic and English sides were used. As shown in Fig. 4, the total recognition rate is 98.6 % for the seven output classes, with testing performed on 20 % of the samples of each category (the full dataset is 3,961 images). For example, the class 500ArEn has 130 images among the testing samples, and all of them were recognized correctly without any error.
For the class 50,000ArEn, the total number of testing images is 79, of which 75 are predicted correctly and the remaining four are misclassified into other classes such as 250ArEn, 10,000ArEn, and 1,000ArEn. The total accuracy for the 50,000ArEn class is therefore 94.9 %, and so on. It is worth mentioning that combining both the Arabic and English sides increases the recognition rate: the overall recognition rate is 98.5 % for the front (Arabic) side alone and 97 % for the back (English) side alone, while combining both sides boosts it to 98.6 %.
The training settings are: a maximum of 40 epochs for 480 iterations (12 iterations per epoch), gradient decay factor 0.9, mini-batch size 128, verbose frequency 50, initial learning rate 1.0e-03, validation frequency 50, and the Adam optimization algorithm.
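These schedules are internally consistent: with a mini-batch of 128 and an 80/20 split, the Arabic-side set (1,975 images) and the English-side set (1,986 images) both give 12 full mini-batches per epoch, i.e. 480 iterations over 40 epochs, while the combined set (3,961 images) gives 24 per epoch, i.e. the 960 iterations mentioned in the discussion. A quick check, assuming partial mini-batches are dropped:

```python
def iterations(total_images, epochs=40, batch_size=128):
    """Total training iterations: full mini-batches per epoch times epochs."""
    train_images = total_images * 4 // 5  # 80 % training split (integer arithmetic)
    return (train_images // batch_size) * epochs

arabic_side = iterations(1975)   # 480
english_side = iterations(1986)  # 480
both_sides = iterations(3961)    # 960
```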

Discussion of experimental results of the proposed classification model
As shown, the results offer a promising solution for blind people to hear and understand the recognized category of a given banknote. The proposed DL model can categorize banknote images even when they contain other irrelevant features; accordingly, the proposed system could also be embedded in a banknote counting machine for category detection. For comparison, the correct recognition rate for Iraqi currency is considered. The proposed CNN outperforms the existing DL approach based on YOLOv3 [17]. Table 6 details the comparison and shows that the proposed work has a higher recognition accuracy. This paper was selected for comparison because it addresses the same currency, Iraqi paper currency, so the comparison is fair. The problem identified in the literature review and problem statement section, namely raising the recognition accuracy, is thereby addressed using the proposed CNN algorithm with the specific arrangement of layers explained in the materials and methods section.

Table 6
Performance comparison with the state-of-the-art similar approach

Method                                Recognition accuracy
Deep Learning-based YOLOv3 [17]       97.41 %
The proposed multi-layer CNN model    98.6 %

Moreover, the proposed CNN-based system has been extended with voice pronunciation of the recognized banknote. This extension is achieved by applying an expert method, a rule-based (if-else) scheme, to select and play the voice guidance according to the classified banknote. This improvement is particularly beneficial for blind people; it could also ease financial dealings for sighted people in the modern lifestyle. On the other hand, it is worth mentioning that the proposed work has two weaknesses: light reflection and brightness, both of which reduce the correct recognition rate. Therefore, adjusting the camera lighting is an essential step for obtaining accurate results. Training finished at the final iteration, iteration 960, using the Adam optimizer, with augmented images of size 64×64 pixels. Training was performed on a Core i5 workstation with 8 GB of RAM using Matlab 2018. It is recommended that the embedded system hosting this software be powerful enough, in terms of capacity and CPU speed, to process operations in real time without delay. It is also recommended to improve and control the light intensity whenever the user issues a real-time query, in order to obtain the correct resulting class.
2. The CNN model was trained to predict the English side of the Iraqi paper currency using 80 % of 1,986 images. The model delivers 97 % recognition accuracy on the testing data. This can be attributed to the fact that the English side carries less information than the Arabic one; considering the Arabic side reduces the confusion among the banknotes and thus improves the prediction accuracy.
3. The proposed model was trained by combining the Arabic and English sides to build the final deep learning model. The study concludes that the best recognition accuracy (98.6 %) is achieved when using both sides. Moreover, with both sides considered, the CNN-based model is able to categorize the captured paper currency even when unrelated objects (noise) are present in the captured image. Based on these contributions, the developed model can be employed in real-time banknote image recognition applications, significantly helping to guide blind people by providing the equivalent currency category as voice commands during purchases or money exchange.
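The rule-based (if-else) voice stage described in the discussion can be sketched as follows. The ordering of the classes and the idea of returning a spoken phrase (rather than playing a pre-recorded clip) are illustrative assumptions, not details from the paper.

```python
# Denominations of the seven Iraqi banknote classes, in assumed model output order
DENOMINATIONS = [250, 500, 1000, 5000, 10000, 25000, 50000]

def voice_prompt(class_index):
    """Expert if-else rule: map a predicted class index to the phrase to speak."""
    if class_index < 0 or class_index >= len(DENOMINATIONS):
        raise ValueError("unknown banknote class")
    value = DENOMINATIONS[class_index]
    if value >= 1000:
        return f"{value // 1000} thousand Iraqi dinars"
    else:
        return f"{value} Iraqi dinars"
```

In the deployed system, each returned phrase would be rendered through the audio jack, for example via one pre-recorded audio clip per class.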