Optical Character Recognition on Bank Cheques Using 2D Convolution Neural Network

Abstract— Banking systems worldwide depend heavily on manpower and written documents, which makes conventional banking processes tedious and time-consuming. Existing methods for processing transactions made through cheques cause delays because the details have to be entered manually. Recent developments in image processing and computer vision enable machines to analyse data and extract information from images. In the banking sector, automation has become one of the important aspects. Automatic cheque processing has been a goal of researchers for decades. Handwritten character recognition is a method in which pattern recognition combined with machine learning is used to design an optical character recognizer. The machine learning model used to develop the automatic cheque processing system is a 2D convolution neural network. The model used for prediction was trained on the EMNIST dataset for alphabets and digits. Training resulted in a validation accuracy of about 97% for letters and 98% for digits. When the same model was tested with input images for the cheque OCR, the accuracy was 95.71%.

Keywords— machine learning, optical character recognition, 2D convolution neural network, OCR image processing, handwritten character recognition, pattern recognition.
1. Introduction

Handwritten character recognition is a field of research in artificial intelligence, computer vision, and pattern recognition. A computer performing handwriting recognition is said to be able to acquire and detect characters in paper documents, pictures, touch-screen devices and other sources and convert them into machine-encoded form. Its application is found in optical character recognition and more advanced intelligent character recognition systems. Most of these systems nowadays implement machine learning mechanisms such as neural networks.
Optical character recognition (also optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast). Advanced systems capable of producing a high degree of recognition accuracy for most fonts are now common, and with support for a variety of digital image file format inputs. [13]

Machine learning is a branch of artificial intelligence inspired by psychology and biology that deals with learning from a set of data and can be applied to solve a wide spectrum of problems. A supervised machine learning model is given instances of data specific to a problem domain and an answer that solves the problem for each instance. When learning is complete, the model is able not only to provide answers to the data it has learned on but also to yet unseen data with high precision. [1]

Neural networks are collections of mathematical models that represent some of the observed properties of biological nervous systems and draw on the analogies of adaptive biological learning. Neural networks can be applied in various fields such as OCR pattern recognition and language translation. Unlike the original Perceptron model, shown by Minsky and Papert to have limited computational capability, the neural network of today consists of a large number of highly interconnected processing elements (nodes) that are tied together with weighted connections (links). Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons.
This is true for neural networks as well. Machine learning typically occurs by example, through training on or exposure to a set of input/output data (patterns), where the training algorithm adjusts the link weights. The link weights store the knowledge necessary to solve specific problems. [11]

In banking applications, neural networks can be used to recognize characters on bank cheques. The important text details on the cheque are segmented out and input to the machine learning OCR, which stores the output in text format. The text obtained can be fed directly to the bank's system, so the manual typing of cheque details into the banking system is automated.
The cheque is scanned and the required information from the cheque is obtained as text that can be fed to the banking software.

2. Literature Survey

Ivor Uhliarik et al. [1]: The work primarily discusses machine learning methods; the scope is limited to the corresponding algorithms. Specifically, as the multilayer perceptron (MLP) was chosen as the method for feature extraction and classification, the work deals with the algorithms required for MLP implementation. Its main advantage is that it achieves increased accuracy and top performance compared to other systems. The drawback of this work is that the cost of computation is high.
Yichang Shih et al. [9]: The work primarily recognizes handwritten Sanskrit characters using machine learning. The approach combines the robustness of nonparametric modeling with the expressiveness of parametric modeling, along with their computational efficiency during training and testing respectively. The key aspect of this work is that handwriting recognition is done for a language other than English. However, its limitation is that the accuracy was 85%, which could still be improved.

Y. LeCun et al. [3]: The paper demonstrates how learning networks and backpropagation networks can be applied to the recognition of handwritten zip codes.
The main feature of this work is that the constrained backpropagation used not only helps with shift invariance but also vastly reduces the entropy. The defect in this model is that the image segmentation performed was not able to separate characters that overlap.

Adam Coates et al. [2]: In this work, character recognition was done using unsupervised feature learning. The main advantage is that the work makes use of a larger-scale algorithm for learning the features automatically from unlabeled data. However, the disadvantage is that the image has not been filtered to a great extent.

OCR in banking applications using neural networks has been found to be another useful application of machine learning for efficiently and accurately recognizing characters on bank cheques to automate the whole banking procedure. The method trains on the dataset given to it and uses its prior training experience to recognize characters on bank cheques. The efficiency of the method lies in the fact that the software has to be trained many times initially in optimal conditions, and the best possible result with respect to accuracy is extracted and used.
From then on, the neural network can recognize characters using the prior experience it has had with the characters that were used to train it.

3. Proposed Work

The objective of the machine learning model is to provide an efficient system for recognizing characters on bank cheques, to do away with the manual data entry methods prevalent in banks. The use of neural networks is intended to give high accuracy and efficiency in identifying the handwritten characters. The image is decomposed into segments of fixed pixel size, and these segmented characters are fed to the neural network, which predicts the characters from the cheque and stores them in text format.
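The decomposition of a cheque field into character segments can be sketched as follows. Table 1 names Otsu thresholding and connected component labeling for this stage; the code below is a minimal NumPy illustration of those two techniques, not the authors' implementation. The function names, the 8-connectivity choice, the left-to-right ordering, and the synthetic test field are all assumptions made for the example.

```python
from collections import deque

import numpy as np


def otsu_threshold(gray):
    """Return the gray level (0-255) that maximizes between-class variance.

    `gray` is assumed to be a 2D uint8 array.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)                   # between-class variance per t
    valid = denom > 0                         # skip thresholds with an empty class
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def character_boxes(mask):
    """Label 8-connected foreground components with a breadth-first search.

    Returns (top, left, bottom, right) boxes, bottom/right exclusive,
    sorted left to right (an assumed reading order).
    """
    height, width = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        seen[r0, c0] = True
        queue = deque([(r0, c0)])
        top, left, bottom, right = r0, c0, r0, c0
        while queue:
            r, c = queue.popleft()
            top, bottom = min(top, r), max(bottom, r)
            left, right = min(left, c), max(right, c)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < height and 0 <= cc < width
                    if inside and mask[rr, cc] and not seen[rr, cc]:
                        seen[rr, cc] = True
                        queue.append((rr, cc))
        boxes.append((int(top), int(left), int(bottom) + 1, int(right) + 1))
    return sorted(boxes, key=lambda box: box[1])


# Synthetic stand-in for a cropped cheque field: light paper, two dark strokes.
field = np.full((12, 20), 220, dtype=np.uint8)
field[2:10, 3:6] = 30
field[2:10, 11:16] = 30

threshold = otsu_threshold(field)
ink = field <= threshold                      # dark pixels are treated as ink
boxes = character_boxes(ink)
crops = [field[t:b, l:r] for (t, l, b, r) in boxes]
print(threshold, boxes)
```

Each crop would still need to be resized to the 28×28 input format of the EMNIST-trained network before prediction; that step is omitted here. Overlapping or touching characters merge into a single component under this scheme.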
3.1 Algorithm

Input: Bank cheque image

Step 1: Segmentation. Segment the text parts of the bank cheque, such as the name, amount in words, amount in digits, and date, by cropping those parts from the cheque image.

Step 2: Character Extraction. Segment each word or number obtained in Step 1 into individual characters.

Step 3: Prediction. Feed the segmented characters into the 2D convolution neural networks for character recognition. In this step the neural networks, which are trained on the EMNIST character and digit datasets, recognize the characters. The letter network was trained for 15 iterations and the digit network for 9 iterations.
Output: The result obtained from the neural network is concatenated and stored in a text file, from where it can be fed to the banking system.

Table 1: Description of Architectural Model

| ARCHITECTURE | DESCRIPTION |
|---|---|
| INPUT LAYER | A bank cheque image is captured at 1601 × 719 pixels. |
| SEGMENTATION AND EXTRACTION LAYER: Cropping | The input image is cropped into rectangular fields that capture the payable amount (words and digits), account number, date, cheque number, signature, and recipient's name. |
| SEGMENTATION AND EXTRACTION LAYER: Otsu Thresholding | Otsu thresholding is used for background removal in the above regions. |
| SEGMENTATION AND EXTRACTION LAYER: Connected Component Labeling | Connected component labeling is done on the output regions and the boundary locations of the components are extracted. |
| SEGMENTATION AND EXTRACTION LAYER: Component Extraction | The characters are extracted from the original image using the boundary locations obtained in the previous step and are given to the CNN for prediction. |
| PREDICTION LAYER | A 2-dimensional convolutional neural network is used to recognize the characters extracted at the end of the previous step and output them. |
| OUTPUT | The characters recognized by the prediction layer are stored in a text file for further use. |

The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 and converted to a 28×28 pixel image format and dataset structure that directly matches the MNIST dataset.

It is important in picture processing to select an adequate gray-level threshold for extracting objects from their background. This paper makes use of Otsu thresholding, which is an automatic threshold selection method for region-based segmentation. The Otsu method is a type of global thresholding that depends only on the gray values of the image. It was proposed by Otsu in 1979 and is widely used because it is simple and effective [10].

The proposed work uses a 2D convolution neural network. Convolution is an important operation in signal and image processing.
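As a rough illustration of the 2D convolution operation on which such a network is built, the sketch below slides a small kernel over an image and sums the element-wise products at each position. The function name `conv2d_valid`, the 3×3 difference kernel, and the synthetic 28×28 input (chosen only to match the EMNIST image size) are assumptions for the example; the paper does not specify its kernels or layer configuration.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Discrete 2D convolution without padding ("valid" output size)."""
    k_rows, k_cols = kernel.shape
    flipped = kernel[::-1, ::-1]      # true convolution flips the kernel on both axes
    out_rows = image.shape[0] - k_rows + 1
    out_cols = image.shape[1] - k_cols + 1
    out = np.zeros((out_rows, out_cols), dtype=np.float64)
    for i in range(out_rows):
        for j in range(out_cols):
            window = image[i:i + k_rows, j:j + k_cols]
            out[i, j] = np.sum(window * flipped)
    return out


# Synthetic 28x28 image: dark left half, bright right half (a vertical edge).
image = np.zeros((28, 28))
image[:, 14:] = 1.0

# Hypothetical kernel that responds to horizontal intensity changes.
kernel = np.array([[1.0, 0.0, -1.0]] * 3)

response = conv2d_valid(image, kernel)
print(response.shape)
```

The response is nonzero only in the columns where the kernel window straddles the edge, which is how a convolution layer localizes strokes. Convolution layers in most deep learning libraries skip the kernel flip and compute cross-correlation; because the kernels are learned, the distinction does not matter in practice.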
Convolution operates on two signals (in 1D) or two images (in 2D). It is an important concept in many areas of mathematics and engineering.

3.3 Performance Analysis

| Trained Data | Accuracy in % |
|---|---|
| Letters (alpha) | 97.45 |
| Digits | 98.87 |

The convolution model employed for the character recognition task was trained on the EMNIST dataset, which consists of handwritten digits and alphabets.
The model was trained separately for alphabets and digits: alphabet training reached a validation accuracy of 97.45% and digit training reached a validation accuracy of 98.87%. The same model was applied to character and digit recognition on the bank cheques, and an accuracy of about 95% was obtained on the test cases that were taken.

The machine learning model was tested for its performance on the cheques as shown in Figure 2.
The accuracy varied from cheque to cheque because different writing styles were given as input to the learning model. The best-case accuracy was 100%, when every letter and digit was predicted accurately; the worst-case accuracy was 89.74%; and the average accuracy, computed over the characters and digits in all the cheques, was about 95.71%.

4. Conclusion and Future Scope

The neural network currently used in the system performs handwritten character recognition for block letters and digits using pattern recognition. This gives us an optical character recognition tool built using machine learning and a 2D convolution neural network. This OCR can be used in banking applications to automate the whole banking transaction procedure. This reduces human effort and can speed up the entire banking process. The proposed work produced an accuracy of about 95.71% when tested with the bank cheques as shown in Table 2; however, the model's validation accuracy during training was about 97% for letters and 98% for digits.
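For reference, summary figures of the kind reported here (best case, worst case, and an average pooled over all characters) can be derived from per-cheque counts along the lines of the sketch below. The function name and the counts are hypothetical and are not the paper's test data; the pooled average is one plausible reading of "considered for the characters and digits in all the cheques".

```python
def accuracy_summary(cheques):
    """Summarize recognition accuracy.

    `cheques` is a list of (characters_total, characters_correct) pairs,
    one pair per cheque.
    """
    per_cheque = [correct / total for total, correct in cheques]
    pooled = sum(c for _, c in cheques) / sum(t for t, _ in cheques)
    return {
        "best": max(per_cheque),      # cheque with the highest accuracy
        "worst": min(per_cheque),     # cheque with the lowest accuracy
        "pooled": pooled,             # all characters from all cheques together
    }


# Hypothetical counts for three cheques (illustrative only).
counts = [(40, 40), (38, 34), (42, 40)]
summary = accuracy_summary(counts)
for name, value in summary.items():
    print(f"{name}: {100 * value:.2f}%")
```

Pooling weights every character equally, so a long cheque influences the average more than a short one; averaging the per-cheque accuracies instead would weight every cheque equally.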
The future scope of the system includes increasing the accuracy of character recognition, support for cursive handwriting, and better image segmentation. The system can also be extended to applications in other sectors that require reading text as their main operation.