Artificial Intelligence, Machine Vision, Neural Network

TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
PULCHOWK CAMPUS
DEPARTMENT OF ELECTRONICS AND COMPUTER ENGINEERING

CERTIFICATE OF APPROVAL

The undersigned certify that they have read, and recommended to the Institute of Engineering for acceptance, a project report entitled "VLPD-R" submitted by Love Shankar Shrestha, Promisha Mishra, Ravi Bhagat and Tanka Bahadur Pun in partial fulfillment of the requirements for the Bachelor's degree in Electronics & Computer Engineering.

_________________________________________________
Supervisor, Dr. Sanjeeb Prasad Panday
Lecturer, Department of Electronics and Computer Engineering

_________________________________________________
Co-Supervisor, Mr. Anil Verma
Lecturer, Department of Electronics and Computer Engineering

_________________________________________________
Internal Examiner, Coordinator, Dr. Aman Shakya
Deputy Head, Department of Electronics and Computer Engineering

_________________________________________________
External Examiner, Mr. Subhash Dhakal
Ministry of Science and Technology, Nepal

DATE OF APPROVAL: 26.08.2013

COPYRIGHT

The author has agreed that the Library, Department of Electronics and Computer Engineering, Pulchowk Campus, Institute of Engineering may make this report freely available for inspection. Moreover, the author has agreed that permission for extensive copying of this project report for scholarly purposes may be granted by the supervisors who supervised the project work recorded herein or, in their absence, by the Head of the Department wherein the project report was done. It is understood that recognition will be given to the author of this report and to the Department of Electronics and Computer Engineering, Pulchowk Campus, Institute of Engineering in any use of the material of this project report.
Copying, publication or other use of this report for financial gain without the approval of the Department of Electronics and Computer Engineering, Pulchowk Campus, Institute of Engineering and the author's written permission is prohibited. Requests for permission to copy or to make any other use of the material in this report, in whole or in part, should be addressed to:

Arun Timalsina
Head
Department of Electronics and Computer Engineering
Pulchowk Campus, Institute of Engineering
Lalitpur, Kathmandu
Nepal

ACKNOWLEDGEMENT

It is an immense pleasure for us to acknowledge the guidance, encouragement and assistance received from several individuals during the project period. Our heartfelt gratitude goes to our project supervisor, Dr. Sanjeeb Prasad Panday, and our co-supervisor, Mr. Anil Verma, who have inspired, encouraged and provided invaluable advice to accomplish this project. We would also like to thank them for showing us examples related to the topic of our project.

We are equally indebted to Prof. Dr. Arun Timalsina, Head of the Department of Electronics and Computer Engineering, for providing us with the opportunity and environment for the project. Our words of appreciation fall short in praising the guidance of Assistant Dean Dr. Subarna Shakya, Professor in the Department of Electronics and Computer Engineering. We also extend our thanks to Dr. Aman Shakya, Deputy Head of the Department of Electronics and Computer Engineering and our B.E. project coordinator.

We would like to convey our acknowledgement to Mr. Ashok Kumar Pant for guiding us during the project session and giving us his invaluable time. Finally, we would like to offer our gratitude to all our teachers, whose ideas were the basis for our project research, and to thank all our friends who gave us their suggestions, ideas and support for this project.
Love Shankar Shrestha (16219)
Promisha Mishra (16223)
Ravi Bhagat (16224)
Tanka Bahadur Pun (16243)

ABSTRACT

This project deals with the development of an application that recognizes the Vehicle License Plate (VLP) and can be used for traffic systems, parking areas and border crossings in Nepal. The project uses Artificial Intelligence (AI), Machine Vision and Neural Networks (NN) along with image processing to construct a Vehicle License Plate Recognition (VLPR) system for Nepal. Specifically, the system first takes an image of the vehicle from a camera and then localizes the VLP in the image. Once the VLP is detected, it is segmented into individual characters and the characters are recognized. The focus is on the design of the algorithms used for extracting the license plate from the image, segmenting the characters of the plate and identifying the individual characters.

Keywords: Artificial Intelligence, Machine Vision, Neural Network, Image-processing, Optical Character Recognition

TABLE OF CONTENTS

LETTER OF APPROVAL
COPYRIGHT
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS AND SYMBOLS
1. INTRODUCTION
  1.1. Background
  1.2. Problem Statement
  1.3. Objectives
    1.3.1. General Objective
    1.3.2. Specific Objective
  1.4. Scope of Work
  1.5. Organization of Report
2. LITERATURE REVIEW
  2.1. Related Work
  2.2. Feature of Nepali Vehicle License Plate
  2.3. Image Processing
    2.3.1. Image Acquisition and Preprocessing
    2.3.2. Plate Localization
    2.3.3. Segmentation
  2.4. Feature Extraction
  2.5. Neural Network
3. METHODOLOGY
  3.1. Generic Description
  3.2. System Design
  3.3. Technical Description
    3.3.1. Pre-processing and VLP Localization
    3.3.2. Thinning
    3.3.3. Feature Extraction
      3.3.3.1. Fast Fourier Transform (FFT)
      3.3.3.2. Density over Different Zones
      3.3.3.3. Area of Image
      3.3.3.4. Moment Invariants
  3.4. Artificial Neural Network (ANN)
    3.4.1. Multilayer Perceptron
    3.4.2. Feedforward Back Propagation Neural Network
    3.4.3. Backpropagation
    3.4.4. Training FFNet
  3.5. Training Neural Network
    3.5.1. Supervised Learning
    3.5.2. Error Correction Learning
  3.6. Validating Neural Networks
4. IMPLEMENTATION
  4.1. Global Thresholding
  4.2. Region Based Segmentation (Horizontal and Vertical)
  4.3. Back Propagation
5. RESULT AND DISCUSSION
  5.1. Result
  5.2. Observation and Discussions
  5.3. Output
6. CONCLUSION AND FUTURE ENHANCEMENT
  6.1. Conclusion
  6.2. Future Enhancement
7. EPILOGUE
  7.1. References
  7.2. Glossary

LIST OF FIGURES

Figure 2-1: Vehicle Identifier
Figure 2-2: VLP in 4:3 ratio
Figure 2-3: VLP in 4:1 ratio
Figure 3-1: Use Case Diagram for VLP Segmentation
Figure 3-2: Use Case Diagram of Optical Character Recognition
Figure 3-3: Use Case Diagram for Main Program
Figure 3-4: Level 0 DFD of the System
Figure 3-5: Level 1 DFD for Process "Preprocessing"
Figure 3-6: Level 1 DFD of process "Training"
Figure 3-7: Level 1 DFD for Process "Recognition"
Figure 3-8: Technical Description of the System
Figure 3-9: Horizontal Projection of VLPD
Figure 3-10: Vertical Projection of the Image
Figure 3-11: First Sub-iteration
Figure 3-12: Second Sub-iteration
Figure 3-13: Original pattern and Skeleton as a result of Zhang-Suen thinning algorithm
Figure 3-14: Feed Forward Multilayer Perceptron
Figure 3-15: Operation on Layer's Node
Figure 5-1: Accuracy Rate of different stages of the system
Figure 5-2: Recognition accuracy of individual character
Figure 5-3: Input Image for VLPR System
Figure 5-4: VLP Localization
Figure 5-5: VLP Vertical Segmentation
Figure 5-6: VLP Horizontal Segmentation

LIST OF TABLES

Table 2-1: Major Categories of Vehicle
Table 5-1: Accuracy rate corresponding to different stages
Table 5-2: Recognition Result of individual character

LIST OF ABBREVIATIONS AND SYMBOLS

Φ       Activation Function
δ       Delta as error
AI      Artificial Intelligence
ANN     Artificial Neural Network
ART     Adaptive Resonance Theory
DOCR    Devanagari Optical Character Recognition
DFD     Data Flow Diagram
FFT     Fast Fourier Transform
FFNet   Feedforward Neural Network
INGO    International Non-governmental Organization
ITS     Intelligent Transport System
LL      Letter Letter
MLP     Multilayer Perceptron
NGO     Non-governmental Organization
NN      Neural Network
NN      Number Number
NNNN    Number Number Number Number
OCR     Optical Character Recognition
VLP     Vehicle License Plate
VLPD    Vehicle License Plate Detection
VLPR    Vehicle License Plate Recognition
VLPD-R  Vehicle License Plate Detection & Recognition
VTMR    Vehicle and Transport Management Rule

1. INTRODUCTION

1.1. Background

VLPR has been intensively studied in many countries [1]. Because the license plates currently in use vary from country to country, the requirements of an automatic license plate recognition system differ for each country. This project focuses on the development of a license plate localization and recognition system for vehicles in Nepal. The system is based on digital image manipulation and can easily be applied to commercial purposes such as car park systems, for documenting access to parking services, securing the use of parking houses, preventing car theft and more.

In the current era of information technology, the use of automatic and intelligent systems is becoming more and more widespread. Intelligent Transport System (ITS) technology has received so much attention that many systems are being developed and applied all over the world. VLP recognition has turned out to be an important research issue. It has a significant role in traffic monitoring systems, including controlling the traffic volume, ticketing vehicles without human control, vehicle tracking and so on. In some countries, VLPR systems installed at country borders automatically detect and monitor border crossings. Each vehicle can be registered in a central database and compared to a blacklist of stolen vehicles.

1.2. Problem Statement

In most countries, the attributes of vehicle license plates are strictly maintained. For example, the size of the plate, the color of the plate, the font face, size and color of each character, the spacing between subsequent characters, the number of lines in the VLP, the script and so on are maintained very specifically.
However, in Nepal, VLPs are not yet standardized, especially in the size of the plate and the font of the characters, which makes the system less accurate. The numeral "5" alone is written in more than three styles, which makes localization and subsequent recognition of vehicle number plates extremely difficult.

The problem of vehicle number plate recognition is an interestingly difficult one. The task becomes more complicated when dealing with plate images that are inclined at various angles and contain noise. Because such systems usually run in real time, they require not only accuracy but also fast processing.

The most vital and most difficult part of any vehicle number plate recognition system is the detection and extraction of the vehicle number plate, which directly affects the system's overall accuracy. The presence of noise, blurring in the image, uneven illumination, dim light and foggy conditions makes the task even more difficult. Localization of the VLP is also affected by the distance between the camera and the vehicle, and it sometimes becomes difficult when the image is taken at an angle. These problems lead to inaccuracy in further steps [2].

The next problem in a VLPR system is recognition of the characters. In Devanagari script "5" is written in more than 5 styles. Similar is the case with "8" and "9". Lack of standardization in Devanagari script is the cause of this problem.

1.3. Objectives

The core objective of this project is to make the vehicle number plate recognition system automatic so that it will help the traffic system and other aspects of the national security system. The objectives of this project are of two types, general and specific.

1.3.1. General Objective

The general objective of this project is to recognize the number of the different types of vehicle in Nepal, such as government-owned, Non-governmental Organization (NGO), International Non-governmental Organization (INGO), public and private.
The color of the number plate differs from sector to sector, and this difference is the key feature that helps us recognize the number plate easily and efficiently.

1.3.2. Specific Objective

The specific objectives of the project are as follows:

I) Detection of the number plate.
II) Recognition of the vehicle number using Devanagari Optical Character Recognition.

1.4. Scope of Work

The scope of this project is to build an automatic system that can recognize a vehicle by taking an image of the vehicle with its VLP as input and producing the vehicle registration as output. The main feature of the system is that it can separate the VLP from the image and obtain the characters in Devanagari. Recognition of Devanagari is always a difficult task, and the worst scenario is performing such a task in error-prone surroundings. A VLPR system plays a major role in monitoring traffic rules and maintaining law enforcement on public roads. This area is challenging because it requires the integration of many areas of computer science, including object detection (plate localization) and character recognition.

Such recognition systems have many applications. Some examples where the system fits are discussed below.

• Traffic Systems: VLPR systems can be used in traffic systems to recognize the number plate of a vehicle and store it in a database. Wanted or stolen vehicles can then be searched for easily, and the density of running vehicles in an area can be measured.
• Parking: A VLPR system can be used in parking places to keep a record of the vehicles. Combined with some additional technology, it can automatically admit pre-paid members and calculate the parking fee for non-members (by comparing the exit and entry times).
• Border Crossings: A VLPR system can be used to monitor border crossings and keep track of vehicles that exit the country.
  Each vehicle's information can be registered in a central database and linked to additional information.

1.5. Organization of Report

Chapter 1: It deals with the general introduction of the project, the system, the problems, the scope and the organization of the report.

Chapter 2: It covers the literature review related to the project system. It includes the features of the Nepali VLP, Devanagari script, image processing, neural networks and optical character recognition.

Chapter 3: It describes the process and methodology applied to the raw image, the overall system view and the technical description of the system. It presents the system diagram, data flow diagram and data model on which the system is built. It also explains the interaction among the different components of the system.

Chapter 4: It focuses on the implementation model and application overview of the system, describing the algorithms followed during the system design.

Chapter 5: It explains the results and discussions of the project. It also shows the error calculation and accuracy rates of recognizing the VLP and the registration number in it.

Chapter 6: It explains the conclusion and future enhancement of the system.

Chapter 7: It deals with the epilogue part of the report. It contains the glossary, references and the appendix part of the report.

2. LITERATURE REVIEW

2.1. Related Work

This section focuses on related work done previously by several researchers. Many methods for license plate detection and recognition can be found in the literature. A major concern is how long a method takes to compute and recognize the license plate, which is critical when the method is applied to real-time applications. However, there is always a trade-off between computational time and performance rate: achieving a more accurate result and increasing the performance of the system requires more computational time.
The problem of automatic VLP recognition has been studied since the 1990s. The first approach was based on the characteristics of boundary lines [1, 3]. The input image was first processed by algorithms such as a gradient filter to enrich the boundary line information, resulting in an edge image. This image was binarized and then processed by certain algorithms, such as the Hough transform, to detect lines. Eventually, pairs of parallel lines were considered as plate candidates.

Another approach was morphology-based [4, 5, 6]. This approach focuses on properties of plate images such as their brightness, symmetry, angles, etc. Using these properties, the method detects similar properties in a given image and locates the position of license plate regions.

The third approach was texture-based. In this approach, a VLP was considered as an object with different textures and frames [1, 7]. Texture window frames of different sizes were used to detect plate candidates. Each candidate was passed to a classifier to confirm whether it was a plate or not. This approach was commonly used in tasks that find text in images. In addition, a number of other methods related to this problem focus on detecting VLPs in video data (objects that appear in a sequence of consecutive images).

The fourth approach was based on the statistical properties of text [7]. In this approach, text regions were discovered using statistical properties of text such as the variance of gray level, the number of edges, the edge densities in the region, etc. This approach was commonly used for finding text in images, and could well be used for discovering and designating candidate number plate areas, as they include letters and numerals [8]. In addition, a number of other methods related to this problem focus on detecting VLPs using AI and genetic algorithms [1, 9].
These systems used edge detection and edge statistics, followed by artificial intelligence techniques, to detect the location of the candidate number plate area.

All of the systems discussed above have some kind of limitation. For example, they are plate-size dependent or color dependent, or they work only under certain conditions or environments, such as indoor images.

2.2. Feature of Nepali Vehicle License Plate

The license plate carries the unique identification number provided to each vehicle. Its registration number is bound to the chassis number of the vehicle. The plate number is issued by the zonal-level Transport Management Office, a government agency under the Department of Transport Management [10]. The vehicle number plates are placed on the front as well as the back of the vehicle. The plates are required to be in either Devanagari or Latin script. In practice, the registration plates of Nepal are bilingual. As per the latest guidelines issued by the Traffic Police Division, the plate must not be reflective or digitally printed [10]. Vehicles are described by 6 major categories, 4 identifiers and 2 physical forms.

For the purpose of vehicle registration, the Vehicle & Transport Management Act, 2049 (1992) and the Vehicle & Transport Management Rule, 2054 (1997) of Nepal classify vehicles into the following 5 main categories on the basis of size and capacity:

• Heavy and medium-sized vehicle: This includes bus, truck, dozer, dumper, loader, crane, fire engine, tanker, roller, pick-up, van, mini bus, mini truck, minivan etc. having the capacity to carry more than 14 people (for passenger vehicles) or more than 4 tons (for cargo vehicles).
• Light vehicle: This includes car, jeep, van, pick-up, micro bus, etc. having the capacity to carry less than 14 people or less than 4 tons.
• Two-wheeler: This includes vehicles having two wheels, like motorcycle, scooter etc.
• Tractor and power-trailer
• Three-wheeler: This includes vehicles having three wheels, like tempo, auto-rickshaw etc.
Each of the above-mentioned categories is further divided into 5 sub-categories on the basis of ownership and service type, which are as follows:

Table 2-1: Major Categories of Vehicle

Type of vehicle | Heavy size | Middle size | Motorcycle, scooter
Government | ग | झ | ब
Private | क | च | प, त
Local | ख | ज | थ
Tourist | य | य |
Government Organization/Institution, Diplomatic, Constitutional | घ ञ सि डी सि डी सि डी झ

• Private vehicle: Vehicles that are used entirely for personal purposes. They use a red license plate with the letters written in white.
• Public vehicle
• Government vehicle: Vehicles owned by government agencies and constitutional bodies such as ministries, departments and directorates, along with the police, military etc., fall under this category. They use a white plate with the letters written in red.
• National Corporation vehicle: Vehicles registered under the name of public corporations fully or partially owned by the government fall under this category. These vehicles use a yellow plate with the letters written in blue.
• Tourist vehicle

The vehicles are provided with the letters shown in Table 2-1 so that a visual distinction can be made. Every development region in Nepal follows the same scheme for vehicle classification, except that each region has its own identifier, as described below.

The license plate of Nepal is more detailed than those of other countries. The current license plate format for vehicles in Nepal consists of 4 parts (the 4 identifiers) composed of letters and digits in the LL NN LL NNNN format:

Figure 2-1: Vehicle Identifier (parts numbered 1 to 4)

The identifiers of the vehicle shown in Figure 2-1 are described as follows:

• The first part indicates the zone code, signifying the zone in which the vehicle is registered.
• The second part is the set number, which is prefixed when the four-digit number in the last part runs out.
• The third part indicates the type of vehicle (private, public, governmental, national corporation, tourist etc.) as well as the class of vehicle (two-wheeler, light vehicle, heavy and medium-sized vehicle etc.).
• The last part consists of four digits running in sequence.
• The colors of the background and foreground represent the ownership type of the vehicle, such as public, private, government, diplomatic etc.

All 14 zones of Nepal have their own abbreviated code for reference purposes. These codes are normally a single letter in Nepali and two letters in English (sometimes three letters, but the third letter 'a' can be omitted), as shown in Figure 2-1 [4].

The physical standard of the VLP is determined by the Ministry of Infrastructure and Transport. As per the Vehicle and Transport Management Rule (VTMR), the VLP has the two physical forms shown in Figure 2-2 and Figure 2-3.

I. VLP in the 4:3 ratio
Figure 2-2: VLP in 4:3 ratio

II. VLP in the 4:1 ratio
Figure 2-3: VLP in 4:1 ratio

According to the VTMR, the characters and numbers are written inside the number plate leaving ½ inch of space around the border for heavy and medium-sized four-wheelers; for two-wheelers such as motorbikes and scooters, the space is ¼ inch. The distance between a number and a character must be ¼ inch, and the distance between the upper line and the lower line of characters must be ½ inch [10].

2.3. Image Processing

2.3.1. Image Acquisition and Preprocessing

The problem domain is an image file. Image acquisition is the first step in a VLPR system, and there are a number of ways to acquire images. The current literature discusses the different image acquisition methods used by various authors. Yan et al. used an image acquisition card that converts video signals to digital images based on some hardware-based image preprocessing [6]. Naito et al.
developed a sensing system that uses two CCDs (Charge Coupled Devices) and a prism to split an incident ray into two beams of different intensities [7, 8, 9]. Doing so requires an imaging sensor and the capability to digitize the signal produced by the sensor.

After the digital image has been obtained, the next step is pre-processing of the image. The main purpose of pre-processing is to increase the efficiency of the later processes. Pre-processing covers operations such as image enhancement, noise reduction, histogram equalization, edge detection and binarization of the image.

2.3.2. Plate Localization

The next step is vehicle license plate (VLP) localization, also known as license plate detection. VLP localization is concerned with finding the position of the license plate in the captured image. The main idea is to extract the license plate from the whole image for use in the next step. It is the most important phase in a VLPR system. This section discusses some of the previous work on the extraction phase.

Hontani et al. proposed a method for extracting characters without prior knowledge of their position and size in the image [11]. The technique is based on scale shape analysis, which in turn is based on the assumption that characters have line-type shapes locally and blob-type shapes globally. In scale shape analysis, Gaussian filters at various scales blur the given image, and larger shapes appear at larger scales. To detect these scales, the idea of the principal curvature plane is introduced. By means of normalized principal curvatures, characteristic points are extracted from the scale space x-y-t. The position (x, y) indicates the position of the figure, and the scale t indicates the inherent characteristic size of the corresponding figure.
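The scale-space construction mentioned above can be illustrated with a minimal sketch. It assumes a grayscale image held in a NumPy array and only builds the stack of Gaussian-blurred images over the axes (t, y, x); it does not compute the normalized principal curvatures or extract characteristic points. The function names `gaussian_kernel`, `gaussian_blur` and `scale_space`, the test image and the chosen scales are illustrative assumptions, not taken from the cited work.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel covering about +/-3 sigma."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image, sigma):
    """Blur a 2-D grayscale image with a separable Gaussian filter."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # Pad by edge replication so the output keeps the input size.
    padded = np.pad(image.astype(float), radius, mode="edge")
    # Filter along rows, then along columns (the Gaussian is separable).
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def scale_space(image, sigmas):
    """Stack blurred copies of the image, one per scale (axes: t, y, x)."""
    return np.stack([gaussian_blur(image, s) for s in sigmas])


# Hypothetical test image: a thin stroke and a large blob on a dark background.
image = np.zeros((64, 64))
image[10:12, 5:40] = 1.0    # line-type shape, 2 pixels thick
image[30:55, 20:45] = 1.0   # blob-type shape, 25 pixels wide
stack = scale_space(image, sigmas=[1.0, 2.0, 4.0, 8.0])
stroke = stack[:, 10, 20]   # response at a point on the thin stroke
blob = stack[:, 42, 32]     # response at the center of the blob
print(stroke.round(3), blob.round(3))
```

In this sketch the response on the thin stroke fades quickly as the scale grows, while the response at the center of the blob stays high, which is the sense in which larger shapes survive at larger scales.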
All these characteristic points enable the extraction of figures from the given image that have line-type shapes locally and blob-type shapes globally.

Kim et al. [12] used two neural-network-based filters and a post-processor that combines the two filtered images in order to locate the license plates. The two neural networks act as vertical and horizontal filters, which examine small windows of vertical and horizontal cross-sections of an image and decide whether each window contains a license plate. Cross-sections carry sufficient information for distinguishing a plate from the background.

Lee et al. [13] and Park et al. [14] devised a method to extract Korean license plates based on the color of the plate. A Korean license plate is composed of two different colors, one for the characters and the other for the background, and depending on these colors plates are divided into three categories. In this method a neural network is used to extract the color of a pixel from the HLS (Hue, Lightness and Saturation) values of its eight neighboring pixels, and the node of maximum value is chosen as the representative color. After every pixel of the input image has been converted into one of the four groups, horizontal and vertical histograms of white, red and green (the colors found on Korean plates) are calculated to extract a plate region. The horizontal-to-vertical ratio of the plate is used to select a probable plate region.

Dong et al. [15] presented a histogram-based approach for the extraction phase. Kim G. M. [16] used the Hough transform for the extraction of the license plate. The algorithm behind the method consists of five steps. The first step is to threshold the gray-scale source image, which leads to a binary image. In the second step the resulting image is passed through two parallel sequences in order to extract horizontal and vertical line segments respectively. The result is an image with edges highlighted.
In the third step the resultant image is used as input to the Hough transform, which produces a list of lines in the form of accumulator cells. In the fourth step these cells are analyzed and line segments are computed. Finally, the lists of horizontal and vertical line segments are combined, and any rectangular regions matching the dimensions of a license plate are kept as candidate regions. The disadvantage of this method is that it requires a large amount of memory and is computationally expensive.

2.3.3. Segmentation

The next step is segmentation of the license plate, which is carried out in two steps. The first step separates a two-row license plate into its two consecutive rows, each consisting of a series of Devanagari characters, by applying a horizontal projection. The second step separates each character of the license plate. After that, the segmentation is enhanced: besides the characters, the segmented plate also contains undesirable elements such as dots and stretches, as well as redundant space on the sides of characters, and these problems need to be dealt with during segmentation.

Many different approaches have been proposed in the literature. Nieuwoudt et al. [17] used region growing for segmentation of characters. The basic idea behind region growing is to identify one or more criteria that are characteristic of the desired region. After establishing the criteria, the image is searched for any pixels that fulfill the requirements. Whenever such a pixel is encountered, its neighbors are checked, and if any of the neighbors also match the criteria, both pixels are considered as belonging to the same region. Morel et al. [18] used a technique based on partial differential equations (PDE); neural networks and fuzzy logic have also been adopted for segmentation into individual characters.

2.4.
Feature Extraction

The purpose of feature extraction is the measurement of those attributes of patterns that are most pertinent to a given classification task. The task of the human expert is to select or invent features that allow effective and efficient recognition of patterns.

All images are down-sampled before being used. This prevents the neural network from being confused by size and position. Because every image is down-sampled to a consistent size, it does not matter how large the letter is. Down-sampling involves taking the image from a larger resolution to a 24*24 resolution. To see how an image is reduced to 24*24, think of an imaginary grid drawn over the high-resolution image. The grid divides the image into cells, 24 across and 24 down. If any pixel in a cell is filled, then the corresponding pixel in the 24*24 down-sampled image is also filled. The skeleton of the image may contain an arbitrary number of pixels, so the image is down-sampled to the 576 pixels of a 24*24 grid. These 576 pixel values represent the pattern of the image.

A lot of work has been done in the literature on feature extraction from the segmented image of the number plate. Many different features can be extracted from segmented character images [19, 20, 21, 22]. There are three major categories of feature extraction techniques:

o Geometrical and Topological Features: Extracting and Counting Topological Structures, Geometrical Properties, Coding, Graphs, Trees, Strokes, Chain Codes, etc.
o Statistical Features: Zoning, Crossing and Distances, Projections, Distribution Measures, etc.
o Global Transformation and Series Expansion Features: Fourier Transform, Cosine Transform, Wavelets, Moments, Karhunen-Loève Expansion, etc.

2.5. Neural Network

Computers can perform many operations considerably faster than a human being.
Yet there are many tasks where the computer falls considerably short of its human counterpart. There are numerous examples of this. Given two pictures, a preschool child could easily tell the difference between a cat and a dog, yet this same simple task would confound today's computers.

Artificial intelligence (AI) is the field of computer science that attempts to give computers human-like abilities. One of the primary means by which computers are endowed with human-like abilities is the neural network. The human brain is the ultimate example of a neural network. It consists of a network of over a hundred billion interconnected neurons. Neurons are individual cells that can process a small amount of information and then activate other neurons to continue the process.

The work on Devanagari optical character recognition by Anil et al. [23] shows the use of a feed-forward network with back-propagation for the recognition of characters. The paper by Dong Xiao Ni [24] presents the basic biological neuron and the artificial computation model, outlines network architectures and learning processes, and uses a multilayer feed-forward network for optical character recognition. It treats the neural network as a powerful data modeling tool that is able to capture and represent complex input/output relationships. The paper by Anne et al. [25] uses multilayer feed-forward (MLF) neural networks trained with a back-propagation learning algorithm, the most popular type of neural network, which has been applied to a wide variety of problems. Bishnu Chaulagain et al. use a Hidden Markov Model (HMM) approach [26, 27]. The paper also shows the use of the Tesseract engine for character recognition in Devanagari script.

It is not possible to find weights that enable a single-layer perceptron to deal with non-linearly separable problems such as XOR. However, multi-layer perceptrons (MLPs) are able to cope with non-linearly separable problems.
Minsky & Papert (1969) offered a solution to the XOR problem by combining perceptron unit responses using a second layer of units [28]. The most common neural network model is known as a supervised network because it requires a desired output in order to learn [24]. The binary data is fed into a neural network that has been trained to make the association between the character image data and a numeric value that corresponds to the character. The output from the neural network is then translated into ASCII text and saved as a file [24].

3. METHODOLOGY

3.1. Generic Description

The general overview of the VLPD-R system is shown in Figure 3-1 and Figure 3-2. The system covers two cases, which are its main goals:

i) Localization of the license plate
ii) Recognition of characters from the plate

In the first case, the user provides the images and the system processes them, producing the segmented number plate as output. The second case concerns neural network learning and training. "Teach pattern" is a sub use case included by the use case "train neural network", which means that teaching a neural network must include some kind of pattern teaching. The character recognition use case follows; its goal is accomplished by a sub-goal that includes some kind of pattern recalling. This is how the character recognition system works. The neural network system is a secondary actor whose main purpose is to interact with the overall system.

The main program is represented in Figure 3-3. Our main goal is to obtain the characters from the segmented part of the image. The main program has three main sub-cases, namely image processing, localization and optical character recognition. The final output of the system is user-readable text.

Figure 3-1: Use Case Diagram for VLP Segmentation

Figure 3-2: Use Case Diagram of Optical Character Recognition

Figure 3-3: Use Case Diagram for Main Program

3.2.
System Design

The level 0 data flow diagram (DFD) of the system is shown in Figure 3-4. The DFD is further expanded to level 1 for the processes preprocessing, training and recognition, as shown in Figure 3-5, Figure 3-6 and Figure 3-7 respectively.

Figure 3-4: Level 0 DFD of the System (processes: 1. Pre-processing of Image, 2. VLP Localization, 3. Character Extraction, 4. Train, 5. Trained data, 6. Recognize, 7. Output)

Figure 3-5: Level 1 DFD for Process "Preprocessing" (processes: 1.1 Grey Scaling, 1.2 Histogram Equalization, 1.3 Sobel Filter (Horizontal and Vertical), 1.4 Binarization, 1.5 Horizontal and Vertical Projection, 1.6 Segmentation)

Figure 3-6: Level 1 DFD of Process "Training" (processes: 4.1 Get Normalized Image, 4.2 Initialize weight for first time, 4.3 Calculate output by backward propagation, 4.4 Compare teacher output to calculated output, 4.5 Back propagate error, 4.6 Adjust weights)

Figure 3-7: Level 1 DFD for Process "Recognition" (processes: 6.1 Get Normalized Image, 6.2 Calculate Output by back propagation, 6.3 Make Decision)

3.3. Technical Description

The technical description of the system is shown in Figure 3-8.

Figure 3-8: Technical Description of the System (blocks: Scanned Image, Pre-processing, Segmentation, Feature Extraction, Neural Network with Training and Testing, Output Recognized Text)

3.3.1.
Pre-processing and VLP Localization

The image taken from the camera is processed by the preprocessing module. The purpose of this module is to enrich the edge features. Because our detection method is based on boundary features, this improves the success rate of the VLP localization module. The algorithms used in this module are, in sequence, graying, normalization and histogram equalization. After obtaining the gray-scaled image, we use Sobel filters to extract the edge image and then threshold it to a binary image. We use a local adaptive threshold algorithm for binarization. The resulting image is used as input for the VLP localization module [21, 29].

In the VLP localization step, the number plate area is detected. This problem involves algorithms that are able to detect the rectangular area of the number plate in an original image. Humans define a number plate in natural language as a “small plastic or metal plate attached to a vehicle for official identification purposes”, but machines do not understand this definition, just as they do not understand what a “vehicle” or a “road” is. Because of this, we need an alternative definition of a number plate based on descriptors that are comprehensible to machines. Let us define the number plate as a “rectangular area with increased occurrence of horizontal and vertical edges”. The high density of horizontal and vertical edges in a small area is in many cases caused by the contrasting characters of a number plate, but not in every case. This process can therefore sometimes detect a wrong area that does not correspond to a number plate. For this reason we often detect several candidates for the plate with this algorithm and then choose the best one by a further heuristic analysis [2, 21, 29].

3.3.1.1.
Edge Detection

We can use a periodical convolution of the function f(x,y), where x and y are spatial coordinates and f(x,y) is the intensity of light at that point of the image, with specific types of matrices m to detect various types of edges in an image:

\[ f'(x,y) = f(x,y) * m[x,y] = \sum_{i=0}^{w-1} \sum_{j=0}^{h-1} f(i,j) \cdot m[\mathrm{mod}_w(x-i), \mathrm{mod}_h(y-j)] \tag{3.1} \]

where w and h are the dimensions of the image represented by the function f, and m[x,y] represents the element in the x-th column and y-th row of the matrix m.

Horizontal and vertical edge detection

To detect horizontal and vertical edges, we convolve the source image with the matrices m_he and m_ve. The convolution matrices are usually much smaller than the actual image. We can also use bigger matrices to detect rougher edges.

\[ m_{he} = \begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}, \qquad m_{ve} = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix} \]

Sobel edge detector

The Sobel edge detector uses a pair of 3x3 convolution matrices. Like m_he, the first matrix responds to intensity changes between rows (horizontal edges); like m_ve, the second responds to intensity changes between columns (vertical edges).

\[ G_x = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}, \qquad G_y = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} \]

The magnitude of the affected pixel is then calculated using the formula

\[ G = \sqrt{G_x^2 + G_y^2} \]

In practice, it is faster to calculate only an approximate magnitude:

\[ G \approx |G_x| + |G_y| \]

Horizontal and Vertical projection

The vertical projection of the image is a graph that represents the overall magnitude of the image along the y axis. The vertical projection of the edge-filtered image can be used for vertical localization of the number plate. The horizontal projection represents the overall magnitude of the image mapped to the x axis. We can define the horizontal and vertical projections mathematically as:

\[ p_x(x) = \sum_{j=0}^{h-1} f(x,j) \tag{3.2} \]

\[ p_y(y) = \sum_{i=0}^{w-1} f(i,y) \tag{3.3} \]

3.3.1.2. Double-phase statistical image analysis

The statistical image analysis consists of two phases. The first phase covers the detection of a wider area of the number plate.
The output of the double-phase analysis is the exact area of the number plate. The detection of the number plate area consists of a “band clipping” and a “plate clipping”. Band clipping is an operation that detects and clips the vertical area of the number plate (the so-called band) by analysis of the vertical projection of the snapshot. Plate clipping is a subsequent operation that detects and clips the plate from the band (not from the whole snapshot) by a horizontal analysis of that band.

Snapshot

Assume the snapshot is represented by a function f(x,y), where x_0 ≤ x ≤ x_1 and y_0 ≤ y ≤ y_1. The point [x_0, y_0] represents the upper left corner of the snapshot, and [x_1, y_1] represents the bottom right corner. If w and h are the dimensions of the snapshot, then x_0 = 0, y_0 = 0, x_1 = w − 1 and y_1 = h − 1.

Band

The band b in the snapshot f is an arbitrary rectangle b = (x_b0, y_b0, x_b1, y_b1) such that:

\[ (x_{b0} = x_0) \wedge (x_{b1} = x_1) \wedge (y_0 \le y_{b0} \le y_{b1} \le y_1) \]

Plate

Similarly, the plate p in the band b is an arbitrary rectangle p = (x_p0, y_p0, x_p1, y_p1) such that:

\[ (y_{p0} = y_{b0}) \wedge (y_{p1} = y_{b1}) \wedge (x_{b0} \le x_{p0} \le x_{p1} \le x_{b1}) \]

Band clipping

Band clipping is a vertical selection of the snapshot according to the analysis of the graph of the vertical projection. If h is the height of the analyzed image, the corresponding vertical projection p_y(y) contains h values, with y ∈ ⟨0, h − 1⟩. The fundamental problem of the analysis is to compute the peaks in the graph of the vertical projection. The peaks correspond to the bands with possible candidates for number plates. The position of the maximum value of p_y(y), corresponding to the axis of the band, can be computed as:

\[ y_{bm} = \arg\max_{y_0 \le y \le y_1} \{ p_y(y) \} \tag{3.4} \]

The coordinates y_b0 and y_b1 of the band can be detected as the nearest points on either side of the peak where the projection falls to the foot level:

\[ y_{b0} = \max_{y_0 \le y \le y_{bm}} \{ y \mid p_y(y) \le c_y \cdot p_y(y_{bm}) \} \tag{3.5} \]

\[ y_{b1} = \min_{y_{bm} \le y \le y_1} \{ y \mid p_y(y) \le c_y \cdot p_y(y_{bm}) \} \tag{3.6} \]

c_y is a constant that is used to determine the foot of the peak y_bm.
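The edge magnitude, the vertical projection of Equation (3.3) and the band clipping of Equations (3.4) to (3.6) can be sketched as follows. This is only an illustrative sketch, not the project's implementation: the function names are hypothetical, the image is assumed to be a list of rows of grey levels, border pixels are left at zero instead of using the periodic convolution of Equation (3.1), and the lower foot of the band is taken as the nearest point below the peak.

```python
def sobel_magnitude(img):
    """Approximate Sobel magnitude |Gx| + |Gy| of a 2-D list of grey levels.

    Assumption: border pixels are left at 0 instead of wrapping around.
    The kernels are applied without flipping; because absolute values are
    taken, the magnitude is the same as for a true convolution.
    """
    g_x = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    g_y = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sx = sy = 0
            for j in range(3):
                for i in range(3):
                    pixel = img[y + j - 1][x + i - 1]
                    sx += g_x[j][i] * pixel
                    sy += g_y[j][i] * pixel
            out[y][x] = abs(sx) + abs(sy)
    return out


def vertical_projection(img):
    """p_y(y): sum of every row of the image, as in Equation (3.3)."""
    return [sum(row) for row in img]


def clip_band(p_y, c_y):
    """Return (y_b0, y_b1) around the highest peak of the projection p_y."""
    y_bm = max(range(len(p_y)), key=lambda y: p_y[y])   # peak position
    foot = c_y * p_y[y_bm]                              # foot level of the peak
    above = [y for y in range(0, y_bm + 1) if p_y[y] <= foot]
    below = [y for y in range(y_bm, len(p_y)) if p_y[y] <= foot]
    y_b0 = max(above) if above else 0                   # fallback: top row
    y_b1 = min(below) if below else len(p_y) - 1        # fallback: bottom row
    return y_b0, y_b1
```

A band would then be obtained with something like `clip_band(vertical_projection(sobel_magnitude(img)), c_y)`, where `c_y` is the calibrated constant.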
In practice, the constant is calibrated to c_1 = 0.55 for the first phase of detection and c_2 = 0.42 for the second phase.

3.3.1.3. Segmentation

Segmentation separates each line and each character from the given image with properly defined boundaries. In VLPD, segmentation is one of the most important steps of automatic number plate recognition, because the further processing of the image depends on it. In segmentation, boundaries are defined and each line and character is separated so that character-wise manipulation of the license plate can be done.

Horizontal segmentation: This is the horizontal separation of the image into the lines of the license plate. Normally, the VLP on the back side has two rows: the upper row contains the zone code, the lot number of the vehicle and its type, and the lower row contains the numeric value. The image is projected horizontally so that the band of minima describing the splitting region can be obtained.

Figure 3-9: Horizontal Projection of VLPD

The minimum obtained after projection is the point at which the rows are split. The horizontal segmentation separates the upper row and the lower row, as shown in Figure 3-9.

Vertical segmentation: This is the method for separating the characters within each line. The vertical projection of the image, as shown in the Figure ……, provides the lowest points that determine the separation of characters. Since the density value of a pitch-black pixel is 0 and that of a white pixel is 255, the vertical projection calculates the column-wise pixel density, and multiple minimum points for segmentation are obtained. The region between two minimum points is the required region for segmentation of the image. The process is the same as horizontal segmentation, except that here individual characters are segmented and made ready as input to the neural network.
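The projection-based splitting described above can be sketched as follows. This is a minimal sketch under stated assumptions: the plate has already been binarized so that character pixels are 1 and background pixels are 0, a gap is any row or column whose projection does not exceed `gap_level`, and all function names are hypothetical.

```python
def column_projection(img):
    """Column-wise sum of a binary image (1 = character pixel)."""
    return [sum(col) for col in zip(*img)]


def row_projection(img):
    """Row-wise sum of a binary image (1 = character pixel)."""
    return [sum(row) for row in img]


def split_on_gaps(projection, gap_level=0):
    """Return half-open (start, end) ranges where the projection exceeds gap_level."""
    segments, start = [], None
    for idx, value in enumerate(projection):
        if value > gap_level and start is None:
            start = idx                       # a new segment begins
        elif value <= gap_level and start is not None:
            segments.append((start, idx))     # segment ends at the gap
            start = None
    if start is not None:                     # segment touching the border
        segments.append((start, len(projection)))
    return segments


def segment_characters(img):
    """Cut a single text row into character images using the column projection."""
    return [[row[a:b] for row in img]
            for a, b in split_on_gaps(column_projection(img))]
```

Applying `split_on_gaps(row_projection(img))` first would separate the upper and lower rows of a two-row plate; `segment_characters` can then be applied to each row.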
Figure 3-10: Vertical Projection of the Image

3.3.2. Thinning

Thinning is the process of peeling off as many pixels of a pattern as possible without affecting the general shape of the pattern. In other words, after the pixels have been peeled off, the pattern can still be recognized.

Zhang-Suen Thinning Algorithm

This algorithm has two steps, which are applied successively to the image [30]. In each step, the contour points of the region that can be deleted are identified. Contour points are defined as points that have the value “1” and have at least one 8-neighbor with the value “0”. Here P2, ..., P9 denote the eight neighbors of the pixel, numbered clockwise starting from the pixel directly above it.

Step 1: Pixel I(i,j) is marked for deletion if ALL of the following four conditions are true:
1. Its connectivity = 1
2. It has at least 2 object neighbors and not more than 6
3. At least one of P2, P4, P6 is background
4. At least one of P4, P6, P8 is background
The marked pixels are then deleted.

Step 2: Same as Step 1, except that conditions 3 and 4 are changed:
3. At least one of P2, P4, P8 is background
4. At least one of P2, P6, P8 is background
The marked pixels are then deleted.

If at the end of either sub-iteration there are no pixels to be deleted, the skeleton is complete. The first sub-iteration, as shown in Figure 3-11, removes south or east boundary pixels or north-west corner pixels.

Figure 3-11: First Sub-iteration

The second sub-iteration, as shown in Figure 3-12, removes north or west boundary pixels or south-east corner pixels.

Figure 3-12: Second Sub-iteration

Each of the above steps reduces the amount of information to be processed by feature extraction. Figure 3-13 shows the result of the Zhang-Suen thinning algorithm.

Figure 3-13: Original pattern and Skeleton as a result of Zhang-Suen thinning algorithm

3.3.3. Feature Extraction

The information contained in a bitmap representation of an image is not suitable for direct processing by computers. Because of this, there is a need to describe a character in another way.
The description of the character should be invariant to the font type used and to deformations caused by skew. In addition, all instances of the same character should have a similar description. A description of the character is a vector of numerical values, so-called “descriptors” or “patterns”.

Generally, the description of an image region is based on its internal and external representation. The internal representation of an image is based on its regional properties, such as color or texture. The external representation is chosen when the primary focus is on shape characteristics. The description of normalized characters is based on their external characteristics, because we deal only with properties such as character shape. The vector of descriptors then includes characteristics such as the number of lines, bays and lakes, the amount of horizontal, vertical and diagonal edges, etc. Feature extraction is the process of transforming data from a bitmap representation into a form of descriptors, which are more suitable for computers.

3.3.3.1. Fast Fourier Transform (FFT)

The FFT is an efficient method for extracting the signature of an image. It decomposes an image into its real and imaginary components, which represent the image in the frequency domain. If the input signal is an image, then the number of frequencies in the frequency domain is equal to the number of pixels in the image, or spatial, domain. The inverse transform converts the frequencies back to the image in the spatial domain. The underlying 1-D discrete Fourier transform of a signal f(n) of length N is given by:

\[ F(x) = \sum_{n=0}^{N-1} f(n)\, e^{-i 2\pi \frac{x n}{N}} \tag{3.6} \]

The Fourier transform of an image is a simple extension of the 1-D Fourier transform to two dimensions. It is achieved by applying the 1-D transform to each row of the image and then transforming each column of the resulting image. It produces essentially the same kind of result as in the 1-D case.
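The row-then-column construction can be illustrated with the short sketch below. It evaluates the transform of Equation (3.6) directly, so it is a plain discrete Fourier transform with quadratic cost rather than a fast algorithm; the function names are hypothetical and the image is assumed to be a small list of rows.

```python
import cmath


def dft_1d(signal):
    """Direct evaluation of F(x) = sum_n f(n) * exp(-i*2*pi*x*n/N)."""
    n_samples = len(signal)
    return [
        sum(signal[n] * cmath.exp(-2j * cmath.pi * x * n / n_samples)
            for n in range(n_samples))
        for x in range(n_samples)
    ]


def dft_2d(image):
    """2-D transform: 1-D transform of every row, then of every column."""
    rows = [dft_1d(row) for row in image]
    cols = [dft_1d(list(col)) for col in zip(*rows)]   # cols[u][v]
    return [list(r) for r in zip(*cols)]               # back to result[v][u]


def magnitude_signature(image):
    """Magnitudes of the 2-D coefficients, usable as a feature vector."""
    return [abs(c) for row in dft_2d(image) for c in row]
```

For real use a proper FFT routine would replace `dft_1d`; the row-column structure stays the same.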
A picture of smooth water waves travelling in a diagonal direction will transform to a series of spikes along that same diagonal. The Fourier transform is defined over continuous functions. The FFT is a technique for efficiently evaluating the Fourier transform over discrete sets of data.

3.3.3.2. Density over Different Zones

The density of pixels over various regions defines the structure of the image. The image is split into different zones according to the accuracy required. The system here splits the image into 8 different regions, i.e. the 24*24 image into a 3*3 image, and the density of each region is calculated. This provides the data of 8 different regions, which are assumed to be near or equivalent to the target value.

3.3.3.3. Area of Image

The area of the image is the count of the pixels for which f(x,y) = 1. For images of constant size, the area of a given character is almost invariant, which makes it a useful property for describing the image [8].

3.3.3.4. Moment Invariants

Moment invariants are important tools in object recognition problems. These techniques capture properties of the image intensity function. Moment invariants were first introduced to the pattern recognition community in 1962 by Hu [29], who employed the results of the theory of algebraic invariants and derived his seven famous invariants to rotation of 2-D objects. The moment invariants used in this research for extracting statistical patterns of character images are given in [10]. Moment invariants are pure statistical measures of the pixel distribution around the center of gravity of the character and allow capturing the global character shape information. The standard moments m_pq of order (p+q) of an image intensity function f(x,y) are given by

\[ m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p y^q f(x,y)\, dx\, dy, \qquad p, q = 0, 1, 2, \ldots \tag{3.7} \]

A uniqueness theorem states that if f(x,y) is piecewise continuous and has non-zero values only in a finite part of the xy-plane, moments of all orders exist and the moment sequence (m_pq) is uniquely determined by f(x,y). Conversely, (m_pq) uniquely determines f(x,y). For the discrete domain, the 2-D moment of order (p+q) for a digital image f(x,y) of size M x N is given by

\[ m_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} x^p y^q f(x,y) \tag{3.8} \]

The corresponding central moment of order (p+q) is defined as

\[ \mu_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} (x - \bar{x})^p (y - \bar{y})^q f(x,y) \tag{3.9} \]

where

\[ \bar{x} = \frac{m_{10}}{m_{00}}, \qquad \bar{y} = \frac{m_{01}}{m_{00}} \tag{3.10} \]

The normalized central moments, denoted by η_pq, are defined as

\[ \eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}} \tag{3.11} \]

where

\[ \gamma = \frac{p+q}{2} + 1, \qquad p + q = 2, 3, \ldots \tag{3.12} \]

A set of seven invariant moments, which are invariant to translation, scale change, mirroring and rotation, can be derived from the second and third moments [23]. They are given as follows:

\[ \phi_1 = \eta_{20} + \eta_{02} \tag{3.13a} \]

\[ \phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \tag{3.13b} \]

\[ \phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \tag{3.13c} \]

\[ \phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \tag{3.13d} \]

\[ \phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \tag{3.13e} \]

\[ \phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \tag{3.13f} \]

\[ \phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \tag{3.13g} \]

3.4. Artificial Neural Network (ANN)

An ANN is a non-linear, parallel, distributed, highly connected network with the capabilities of adaptivity, self-organization, fault tolerance and evidential response, and it closely resembles the physical nervous system. The physical nervous system is a highly parallel, distributed information processing system with a high degree of connectivity and the capability of self-learning.

Figure 3-14: Feed Forward Multilayer Perceptron (inputs X1 and X2, neurons f1(e) to f6(e))

3.4.1. Multilayer Perceptron

A multilayer perceptron consists of multiple layers of computational units, usually interconnected in a feed-forward way. Each neuron in one layer has directed connections to the neurons of the subsequent layer. The network consists of a layer of input units, one or more layers of hidden units, and one layer of output units, as shown in Figure 3-14 and Figure 3-15. The output of each neuron is the weighted linear sum of all its inputs plus a bias term, passed through some activation function.
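The operation of a single node and of a whole layer can be sketched as below. The sigmoid is assumed as the activation function purely for illustration, since the text only requires "some activation function"; the function names are hypothetical.

```python
import math


def sigmoid(e):
    """Logistic activation, one common differentiable choice."""
    return 1.0 / (1.0 + math.exp(-e))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    e = sum(w * u for w, u in zip(weights, inputs)) + bias
    return sigmoid(e)


def layer_output(inputs, weight_rows, biases):
    """Outputs of one layer: one row of weights and one bias per neuron."""
    return [neuron_output(inputs, row, b) for row, b in zip(weight_rows, biases)]
```

Feeding the result of `layer_output` into the next layer's `layer_output` gives the forward pass of the multilayer perceptron.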
The network weights are adjusted by backpropagating the error of the network.

Figure 3-15: Operation on Layer's Node (inputs U1, U2, U3 with weights W1, W2, W3, adder ∑, activation function Ф, output Y)

3.4.2. Feedforward Back Propagation Neural Network

A Feedforward Neural Network (FFNet) is a biologically inspired classification algorithm. It consists of a (possibly large) number of simple neuron-like processing units, organized in layers. Every unit in a layer is connected with all the units in the previous layer. These connections are not all equal: each connection may have a different strength or weight. The weights on these connections encode the knowledge of the network. The units in a neural network are often also called nodes.

Data enters at the inputs and passes through the network, layer by layer, until it arrives at the outputs. During normal operation, that is, when the network acts as a classifier, there is no feedback between layers. This is why such networks are called feedforward networks.

3.4.3. Backpropagation

Backpropagation, an abbreviation for "backward propagation of errors", is a common method of training artificial neural networks. Given the desired outputs, the network learns from many inputs, similar to the way a child learns to identify a dog from examples of dogs.

It is a supervised learning method and a generalization of the delta rule. It requires a dataset of the desired output for many inputs, making up the training set. It is most useful for feed-forward networks (networks that have no feedback, or simply, that have no connections that loop). Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable.

For better understanding, the backpropagation learning algorithm can be divided into two phases: propagation and weight update.

Phase 1: Propagation

Each propagation involves the following steps:
1. Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
2. Backward propagation of the propagation's output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.

Phase 2: Weight update

For each weight (synapse), follow these steps:
1. Multiply its output delta and input activation to get the gradient of the weight.
2. Move the weight in the opposite direction of the gradient by subtracting a ratio of the gradient from the weight.

This ratio influences the speed and quality of learning; it is called the learning rate. The sign of the gradient of a weight indicates the direction in which the error is increasing, which is why the weight must be updated in the opposite direction.

3.4.4. Training FFNet

The FFNet uses a supervised learning algorithm: besides the input pattern, the neural net also needs to know to which category the pattern belongs. Learning proceeds as follows. A pattern is presented at the inputs. The pattern is transformed in its passage through the layers of the network until it reaches the output layer. Each unit in the output layer belongs to a different category. The current outputs of the network are compared with the outputs as they ideally would have been if this pattern were correctly classified: in the ideal case the unit of the correct category would have the largest output value, and the output values of the other output units would be very small. On the basis of this comparison, all the connection weights are modified slightly so that, the next time this same pattern is presented at the inputs, the value of the output unit that corresponds to the correct category is a little higher than it is now and, at the same time, the output values of all the other, incorrect output units are a little lower than they are now.
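One propagation and weight-update cycle can be sketched for a small network with a single hidden layer and one output unit. This is a minimal sketch under stated assumptions: sigmoid activations, a squared-error measure, and a hypothetical dictionary layout for the weights. The deltas are defined as (target − output) times the activation derivative, so adding `alpha * delta * input` is the same as subtracting the learning rate times the error gradient.

```python
import math


def sigmoid(e):
    return 1.0 / (1.0 + math.exp(-e))


def forward(net, x):
    """Phase 1, step 1: forward propagation; returns hidden and output activations."""
    z = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w_hid"], net["b_hid"])]
    y = sigmoid(sum(w * zj for w, zj in zip(net["w_out"], z)) + net["b_out"])
    return z, y


def train_step(net, x, target, alpha):
    """One backpropagation cycle; returns the squared error before the update."""
    z, y = forward(net, x)
    # Phase 1, step 2: deltas (the sigmoid derivative is y * (1 - y)).
    delta_out = (target - y) * y * (1.0 - y)
    delta_hid = [delta_out * net["w_out"][j] * z[j] * (1.0 - z[j])
                 for j in range(len(z))]
    # Phase 2: move every weight against the error gradient.
    for j in range(len(z)):
        net["w_out"][j] += alpha * delta_out * z[j]
        net["b_hid"][j] += alpha * delta_hid[j]
        for i in range(len(x)):
            net["w_hid"][j][i] += alpha * delta_hid[j] * x[i]
    net["b_out"] += alpha * delta_out
    return 0.5 * (target - y) ** 2
```

Calling `train_step` repeatedly over all training patterns, with a learning rate `alpha` chosen by experiment, corresponds to the training loop described above.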
The differences between the actual outputs and the idealized outputs are propagated back from the top layer to the lower layers, where they are used to modify the connection weights. This is why the term backpropagation network is also often used to describe this type of neural network.

3.5. Training Neural Network

The individual neurons that make up a neural network are interconnected through synapses. These connections allow the neurons to signal each other as information is processed. Not all connections are equal: each connection is assigned a connection weight. If there is no connection between two neurons, their connection weight is zero. These weights determine the output of the neural network. It can therefore be said that the connection weights form the memory of the neural network.

Training is the process by which these connection weights are assigned. Most training algorithms begin by assigning random numbers to the weight matrix. Then the validity of the neural network is examined. Next, the weights are adjusted based on how well the neural network performed. This process is repeated until the validation error is within an acceptable limit.

The type of learning in neural networks is determined by the manner in which the parameters change. There are many ways to train neural networks.

3.5.1. Supervised Learning

In a supervised learning process, the adjustment of weights is done under the supervision of a teacher or ideal output; that is, precise information about the desired or correct network output is available from a teacher for each specific input pattern.

3.5.2. Error Correction Learning

The goal is to minimize the cost of correcting the errors. This leads to the well-known delta rule (or Widrow-Hoff rule), which states that the adjustment made to a synaptic weight of a neuron is proportional to the product of the error signal and the input signal of the synapse in question.

3.6.
Validating Neural Networks Once a neural network has been trained, it must be evaluated to see if it is ready for actual use. This is important so that it can be determined if additional training is required. Page | 40 To correctly validate a neural network, validation data must be aside that is completely separate from the training data. Page | 41 4. IMPLEMENTATION Page | 42 4.1. Global Thresholding The basic global threshold, T is calculated as follows:  Select an initial estimate for T (typically the average grey level in the image)  Segment the image using T to produce two groups of pixels, G1 consisting of pixels with grey levels > T and G2 consisting pixels with grey levels < T  Compute the average grey levels of pixels in G1 to give µ1 and G2 to give µ2.  Compute a new threshold value 𝑇=  µ1+ µ2 2 ………………………………………………………………… (4.1) Repeat steps 2 to 4 until the difference in T in successive iterations is less than a predefined limit T. 4.2. Region Based Segmentation (Horizontal and Vertical) The goal of the segmentation algorithm is to find peaks, which correspond to the spaces between characters. At first, there is a need to define several important values in a graph of the horizontal projection px(X):  Vm – The maximum value contained in the horizontal/ vertical projection px(x).  Va – The average value of the horizontal projection px(x). The algorithm of segmentation iteratively finds the maximum peak in the graph of vertical/ horizontal projection. The peak is treated as a space between characters, if it meets some additional conditions, such as height of peak. The algorithm then zeroizes the peak and iteratively repeats this process until no further space is found. This principle can be illustrated by the following steps: 1. Determine the index of the maximum value of horizontal projection. Xm = arg {max (px)} 0≤x≤w 2. Detect the left foot and the right foot of the peak as: Xl = max {x|px(x)≤ Cx . px(xm)} Xr = min {x|px(x)≤ Cx . px(xm)} 3. 
Zeroize the horizontal projection px(xm) on inverval (xl,xr) Page | 43 4. If px(xm)< Cw. Vm, go to step 7. 5. Divide the plate horizontally in the point xm. 6. Go to step 1. 7. End. Two different constants have been used in the algorithm above. The constant x c is used to determine foots of peak xm . The optimal value of cx is0.7. The constant cw determines the minimum height of the peak related to the maximum value of the projection ( m v ). If the height of the peak is below this minimum, the peak will not be considered as a space between characters. It is important to choose a value of constant w c carefully. An inadequate small value causes that too many peaks will be treated as spaces, and characters will be improperly divided. A big value of w c causes that not all regular peaks will be treated as spaces, and characters will be improperly merged together. The optimal value of w c is 0.86. To ensure a proper behavior of the algorithm, constants x c and w c should meet the following condition: Where P is a set of all detected peaks m x with corresponding foots xl and xr. 4.3. Back Propagation It is a supervised learning method, and is a generalization of the delta rule. It requires a dataset of the desired output for many inputs, making up the training set. It is most useful for feed-forward networks (networks that have no feedback, or simply, that have no connections that loop). Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable. 1. Initialize the weights to small random values. 2. Feed the training sample through the network and determine the final output. 3. Compute the error for each output unit, for unit k it is: δk=(tk-yk)f'(y_ink) where tk=Required Output yk=Actual Output f'(y_ink)=Derivative of f 4. Calculate the weight correction term for each output unit, for unit k it is: Δwjk=αδkzj Where, α is a small constant zj is hidden layer signal. Page | 44 5. 
Propagate the delta terms (errors) back through the weights of the hidden units where the delta input for the jth hidden unit is : Δ_inj=Σk=1m δkwjk The delta term for the jth hidden unit is: δj=δ_injf'(z_inj) 6. Calculate the weight correction term for the hidden units Δwij=α δjxi 7. Update the weights. wjk(new)=wjk(old)+ Δwjk 8. Test for stopping (maximum cycles, small changes etc.) Page | 45 5. RESULT AND DISCUSSION Page | 46 5.1 Result The experiments conducted were targeted for the validation of the model for deployment. The result was analyzed in various stages of the system. First stage of experiment was carried out for license plate localization following with horizontal segmentation and vertical segmentation. After that experiment for character recognition was carried out. The experiments for the recognition of individual character of the license plate were carried out in two different modes. Training the System: In this mode, the errors in each iteration were analyzed. The main issue of this experiment was to test whether the system is reaching to the stable state or not. Recognize the Characters: This mode of experiment was conducted to test whether the model trained is capable of predicting the correct values or not. 5.2. Observation and Discussions Table 5-1 shows accuracy of the system at different stages. The plate localization, horizontal segmentation of localized license plate, vertical segmentation and finally character recognition are considered as different stages of the system. Figure 5-1 shows the accuracy of the system at different stages. 
Table 5-1: Accuracy rate corresponding to different stages

| Stage/Part | Accuracy |
|---|---|
| Plate Localization | 67% |
| Horizontal Segmentation | 98% |
| Vertical Segmentation | 90% |
| Character Recognition | 92% |

Figure 5-1: Accuracy rate of different stages of the system (axes: stages of the system, accuracy rate)

Table 5-2 shows the recognition rate of each character of the vehicle license plate presented to the system. Figure 5-2 shows the graph of the recognition accuracy rate of the system for each individual character.

Table 5-2: Recognition result of individual character

| Class | Accuracy Rate | Class | Accuracy Rate |
|---|---|---|---|
| 0 | 97% | 8 | 88% |
| 1 | 85% | 9 | 82% |
| 2 | 82% | k | 87.5% |
| 3 | 97% | r | 86.6% |
| 4 | 94% | af | 90.7% |
| 5 | 84% | h | 92.3% |
| 6 | 86% | u | 85.4% |
| 7 | 85% | s | 94.1% |

Figure 5-2: Recognition accuracy of individual character (axes: character, recognition accuracy)

5.3. Output

Figure 5-3: Input Image for VLPR System

Figure 5-4: VLP Localization

Figure 5-3 shows the image taken from the camera for the VLPR system. The result shown in Figure 5-4 is the VLP localization: the area surrounded by the blue rectangle indicates the position of the VLP in the image.

Segmentation

Figure 5-5: VLP Vertical Segmentation

Figure 5-6: VLP Horizontal Segmentation

6. CONCLUSION AND FUTURE ENHANCEMENT

6.1. Conclusion

This project is an attempt at Nepali vehicle license plate detection and recognition in the simplest possible manner, using image processing techniques and a neural network to determine the best possible output. The system developed is able to locate the license plate in the supplied image and determine the characters present in it. Image processing techniques such as filtering, thinning, binarization and cropping help in isolating the plate and the images of the individual characters, whose features are used as input to the neural network that recognizes the characters.

6.2. Future Enhancement

VLPD-R is a vast area of research. A project like VLPD-R needs deep research and knowledge to address the different problems associated with it under different conditions. The system is optimized to work with straight (unskewed) images and proper lighting of the license plate; skew and low contrast in the image reduce the accuracy of plate localization. Moreover, a vehicle whose color is similar to that of its license plate creates problems in detecting the plate. The present system requires a series of manual inputs, which is not acceptable in a real-life implementation. Hence, further work to improve the present system is needed. The system should be made to work with any type of image in any lighting condition. Moreover, the system has to be made fully automatic so that it can easily be deployed in real life for various purposes. The system currently works only with still images; it should be extended to operate on video as well.

7. EPILOGUE

7.1. References

[1] Mukesh Kumar, "A Real-Time Vehicle License Plate Recognition (LPR) System", thesis report submitted for the completion of a Master's degree in Electronics Instrumentation and Control Engineering, July 2009.
[2] Ondrej Martinsky, "Algorithmic and Mathematical Principles of Automatic Number Plate Recognition System", thesis report submitted for the completion of a BSc to the Faculty of Information Technology, Brno University of Technology, August 2007.
[3] Vehicle and Transport Management Act of Nepal, 2003.
[4] An article published in the Himalayan Times.
[5] Nepal Traffic Police, License Plate Information, cited from http://traffic.nepalpolice.gov.np/other-notices/number-plate1.html
[6] Tran Duc Duan, Tran Le Hong Du, Tran Vinh Phuoc, Nguyen Viet Hoang, "Building an Automatic Vehicle License-Plate Recognition System", Intl. Conf. in Computer Science - RIVF'05, Feb 21-24, 2005, Can Tho, Vietnam.
[7] An article on Tesseract based Nepali OCR, Research Report published on http://nepalinux.org/index.php?option=com_content&task=view&id=46&Itemid=53
[8] Ashok Kumar Pant, Sanjeeb Prasad Panday and Prof. Dr. Shashidhar Ram Joshi, "Off-line Nepali Handwritten Character Recognition Using Multilayer Perceptron and Radial Basis Function Neural Networks".
[9] Dr. Richard Spillman, "Artificial Intelligence", PLU, Fall 2003.
[10] Gonzalez, Woods and Eddins, "Digital Image Processing", vol-3.
[11] Oivind Due Trier, Anil K. Jain and Torfinn Taxt, "Feature Extraction Methods for Character Recognition - A Survey", Pattern Recognition, Vol. 29, No. 4, pp. 641-662, 1996.
[12] Huang L., Wan G., Liu C., "An Improved Parallel Thinning Algorithm", 2003.
[13] Martin F. Møller, "A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning", University of Aarhus, Denmark, 1990.
[14] Kiri Wagstaff, "ANN Backpropagation: Weight updates for hidden nodes", 2008.
[15] Balázs Enyedi, Lajos Konyha and Kálmán Fazekas, "Real Time Number Plate Localization Algorithms", Journal of Electrical Engineering, Vol. 57, No. 2, 2006, pp. 69-77.
[16] Hamid Mahini, Shohreh Kasaei, Faezeh Dorri and Fatemeh Dorri, "An Efficient Features-Based License Plate Localization Method", IEEE, 2006.
[17] Dong Xiao Ni, "Application of Neural Networks to Character Recognition", Proceedings of Students/Faculty Research Day, CSIS, Pace University, May 4th, 2007.
[18] Augusto Celentano and Vincenzo Di Lecce, "A FFT based technique for image signature generation".
[19] "Vehicle and Transportation Management Act", 2054 B.S.
[20] Kurugollu, B. Sankur and A. E. Harmanci, "Color Image Segmentation Using Histogram Multithresholding and Fusion", 2001.
[21] Bishnu Chaulagain, Brizika Bantawa Rai and Sharad Kumar Raya, "Final Report on Nepali Optical Character Recognition", 2009.
[22] Anne Magaly De Paula Canuto, "Combining Neural Networks and Fuzzy Logic for Applications in Character Recognition", University of Kent at Canterbury, 2001.
[23] M.-K. Hu, "Visual pattern recognition by moment invariants", IRE Transactions on Information Theory, vol. IT-8, pp. 179-187, 1962.

7.2. Glossary

Activation function: A mathematical function used in a neural network to map input values to a bounded range of values, such as -1 to 1.
Back Propagation training: A sound and systematic means of training a multilayer network.
Digital image processing: Manipulation, improvement and analysis of the pictorial information of a digitally represented image.
Epoch: One complete pass of the training set through the network during training.
Feature extraction: The process of extracting the essential characteristics of an input.
Momentum: A method used to accelerate the training process of a back propagation neural network.
Neurons: Interconnected nerve cells that make up most of the brain tissue in a living organism.
Neural network: A computer model that simulates the working of biological neurons.
Output vector: A vector that holds the output values generated by a trained network from an input vector during the process of knowledge retrieval.
Pixel: The smallest unit of an image. A digital image is composed of a 2-D array of pixels.
Segmentation: Differentiation of the object of interest from the background.
Thinning: The process of extracting the shape (skeleton) of an image.
