TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
PULCHOWK CAMPUS
DEPARTMENT OF ELECTRONICS AND COMPUTER ENGINEERING
CERTIFICATE OF APPROVAL
The undersigned certify that they have read, and recommended to the Institute of
Engineering for acceptance, a project report entitled "VLPD-R" submitted by Love
Shankar Shrestha, Promisha Mishra, Ravi Bhagat and Tanka Bahadur Pun in partial
fulfillment of the requirements for the Bachelor’s degree in Electronics & Computer
Engineering.
_________________________________________________
Supervisor, Dr. Sanjeeb Prasad Panday
Lecturer,
Department of Electronics and Computer Engineering
_________________________________________________
Co-Supervisor, Mr. Anil Verma
Lecturer,
Department of Electronics and Computer Engineering
__________________________________________________
Internal Examiner, Coordinator, Dr. Aman Shakya
Deputy Head,
Department of Electronics and Computer Engineering
__________________________________________________
External Examiner, Mr. Subhash Dhakal
Ministry of Science and Technology,
Nepal
DATE OF APPROVAL: 26.08.2013
COPYRIGHT
The author has agreed that the Library, Department of Electronics and Computer
Engineering, Pulchowk Campus, Institute of Engineering may make this report freely
available for inspection. Moreover, the author has agreed that permission for extensive
copying of this project report for scholarly purpose may be granted by the supervisors
who supervised the project work recorded herein or, in their absence, by the Head of the
Department wherein the project report was done. It is understood that the recognition
will be given to the author of this report and to the Department of Electronics and
Computer Engineering, Pulchowk Campus, Institute of Engineering in any use of the
material of this project report. Copying or publication or any other use of this report for
financial gain without the approval of the Department of Electronics and Computer
Engineering, Pulchowk Campus, Institute of Engineering and the author's written
permission is prohibited.
Request for permission to copy or to make any other use of the material in this report in
whole or in part should be addressed to:
Arun Timalsina
Head
Department of Electronics and Computer Engineering
Pulchowk Campus, Institute of Engineering
Lalitpur, Kathmandu
Nepal
ACKNOWLEDGEMENT
It is an immense pleasure for us to acknowledge the guidance, encouragement and
assistance received from several individuals during the project period. Our heart-felt
gratitude goes to our project supervisor, Dr. Sanjeeb Prasad Panday, and our co-supervisor, Mr. Anil Verma, who inspired, encouraged and provided invaluable advice to accomplish this project. We would also like to thank them for showing us examples related to the topic of our project. We are equally indebted to Prof. Dr.
Arun Timalsina, Head of Department of Electronics and Computer Engineering for
providing us an opportunity and environment for the project.
We also gratefully acknowledge the guidance of Assistant Dean Dr. Subarna Shakya,
Professor at the Department of Electronics and Computer Engineering. We likewise
extend our thanks to Dr. Aman Shakya, Deputy Head of the Department of Electronics
and Computer Engineering and our B.E. project coordinator.
We would like to convey our acknowledgement to Mr. Ashok Kumar Pant for guiding us
during the project session and giving us his invaluable time.
Finally, we would like to offer our gratitude to all our teachers whose ideas were the
basis for our project research, and to all our friends who gave us their suggestions, ideas
and support for this project.
Love Shankar Shrestha (16219)
Promisha Mishra (16223)
Ravi Bhagat (16224)
Tanka Bahadur Pun (16243)
ABSTRACT
This project deals with the development of an application that recognizes the Vehicle
License Plate (VLP) and can be used for traffic systems, parking areas and border
crossings in Nepal. The project uses Artificial Intelligence (AI), Machine Vision, and
Neural Networks (NN) along with image processing to construct a Vehicle License
Plate Recognition (VLPR) system for Nepal.
Specifically, the system first captures an image of the vehicle from a camera and then
localizes the VLP within the image. Once the VLP is detected, it is segmented into
individual characters and the characters are recognized. The focus is on the design of
the algorithms used for extracting the license plate from the image, segmenting the
characters of the plate and identifying the individual characters.
Keywords:
Artificial Intelligence, Machine Vision, Neural Network, Image-processing, Optical
Character Recognition
TABLE OF CONTENTS
LETTER OF APPROVAL …………………………………………………………….... I
COPYRIGHT ..................................................................................................................... II
ACKNOWLEDGEMENT ................................................................................................III
ABSTRACT ..................................................................................................................... IV
TABLE OF CONTENTS ................................................................................................... V
LIST OF FIGURES ....................................................................................................... VIII
LIST OF TABLES ........................................................................................................... IX
LIST OF ABBREVIATIONS AND SYMBOLS .............................................................. X
1. INTRODUCTION ..........................................................................................................1
1.1. Background ........................................................................................................2
1.2. Problem Statement .............................................................................................2
1.3. Objectives ...........................................................................................................3
1.3.1. General Objective ....................................................................................3
1.3.2. Specific Objective ....................................................................................3
1.4. Scope of Work ....................................................................................................4
1.5. Organization of Report .......................................................................................5
2. LITERATURE REVIEW ...............................................................................................6
2.1. Related Work ......................................................................................................7
2.2. Feature of Nepali Vehicle License Plate ............................................................8
2.3. Image Processing ..............................................................................................12
2.3.1. Image Acquisition and Preprocessing ....................................................12
2.3.2. Plate Localization...................................................................................12
2.3.3. Segmentation..........................................................................................14
2.4. Feature Extraction ............................................................................................14
2.5. Neural Network ................................................................................................15
3. METHODOLOGY ......................................................................................................17
3.1. Generic Description ..........................................................................................18
3.2. System Design ..................................................................................................19
3.3. Technical Description.......................................................................................24
3.3.1. Pre-processing and VLP Localization ...................................................25
3.3.2. Thinning .................................................................................................30
3.3.3. Feature Extraction ..................................................................................32
3.3.3.1. Fast Fourier Transform (FFT) ................................................33
3.3.3.2. Density over Different Zones .................................................33
3.3.3.3. Area of Image .........................................................................34
3.3.3.4. Moment Invariants .................................................................34
3.4. Artificial Neural Network (ANN) ....................................................................36
3.4.1. Multilayer Perceptron ............................................................................37
3.4.2. Feedforward Back Propagation Neural Network ...................................37
3.4.3. Backpropagation ....................................................................................38
3.4.4. Training FFNet.......................................................................................39
3.5. Training Neural Network .................................................................................39
3.5.1. Supervised Learning ..............................................................................40
3.5.2. Error Correction Learning......................................................................40
3.6. Validating Neural Networks .............................................................................40
4. IMPLEMENTATION ..................................................................................................42
4.1. Global Thresholding .........................................................................................43
4.2. Region Based Segmentation (Horizontal and Vertical) ...................................43
4.3. Back Propagation .............................................................................................44
5. RESULT AND DISCUSSION .....................................................................................46
5.1. Result ................................................................................................................47
5.2. Observation and Discussions............................................................................47
5.3. Output ...............................................................................................................49
6. CONCLUSION AND FUTURE ENHANCEMENT...................................................51
6.1. Conclusion ........................................................................................................52
6.2. Future Enhancement ........................................................................................52
7. EPILOGUE ..................................................................................................................53
7.1. References ........................................................................................................54
7.2. Glossary ............................................................................................................56
LIST OF FIGURES
Figure 2-1: Vehicle Identifier ...........................................................................................10
Figure 2-2: VLP in 4:3 ratio..............................................................................................11
Figure 2-3: VLP in 4:1 ratio..............................................................................................11
Figure 3-1: Use Case Diagram for VLP Segmentation. ...................................................18
Figure 3-2: Use Case Diagram of Optical Character Recognition ...................................19
Figure 3-3: Use Case Diagram for Main Program ............................................................19
Figure 3-4: Level 0 DFD of the System............................................................................20
Figure 3-5: Level 1 DFD for Process "Preprocessing" .....................................................21
Figure 3-6: Level 1 DFD of process "Training" ...............................................................22
Figure 3-7: Level 1 DFD for Process "Recognition" ........................................................23
Figure 3-8: Technical Description of the System .............................................................24
Figure 3-9: Horizontal Projection of VLPD .....................................................................29
Figure 3-10: Vertical Projection of the Image ..................................................................30
Figure 3-11: First Sub-iteration ........................................................................................31
Figure 3-12: Second Sub-iteration ....................................................................................32
Figure 3-13: Original pattern and Skeleton as a result of Zhang-Suen thinning
algorithm ...........................................................................................................................32
Figure 3-14: Feed Forward Multilayer Perceptron ...........................................................37
Figure 3-15: Operation on Layer's Node ..........................................................................37
Figure 5-1: Accuracy Rate of different stages of the system ............................................48
Figure 5-2: Recognition accuracy of individual character ................................................49
Figure 5-3: Input Image for VLPR System.......................................................................49
Figure 5-4: VLP Localization ...........................................................................................49
Figure 5-5: VLP Vertical Segmentation ...........................................................................50
Figure 5-6: VLP Horizontal Segmentation .......................................................................50
LIST OF TABLES
Table 2-1: Major Categories of Vehicle .............................................................................9
Table 5-1: Accuracy rate corresponding to different stages .............................................47
Table 5-2: Recognition Result of individual character .....................................................48
LIST OF ABBREVIATIONS AND SYMBOLS
Φ        Activation Function
δ        Delta (error term)
AI       Artificial Intelligence
ANN      Artificial Neural Network
ART      Adaptive Resonance Theory
DOCR     Devanagari Optical Character Recognition
DFD      Data Flow Diagram
FFT      Fast Fourier Transform
FFNet    Feedforward Neural Network
INGO     International Non-governmental Organization
ITS      Intelligent Transport System
LL       Letter Letter
MLP      Multilayer Perceptron
NGO      Non-governmental Organization
NN       Neural Network
NN       Number Number
NNNN     Number Number Number Number
OCR      Optical Character Recognition
VLP      Vehicle License Plate
VLPD     Vehicle License Plate Detection
VLPR     Vehicle License Plate Recognition
VLPD-R   Vehicle License Plate Detection & Recognition
VTMR     Vehicle and Transport Management Rule
1. INTRODUCTION
1.1. Background
VLPR has been intensively studied in many countries [1]. Due to the variation in license
plates currently in practice, the requirements of an automatic license plate recognition
system differ for each country. This project focuses on the development of a license
plate localization and recognition system for vehicles in Nepal. The system is based on
digital image manipulation and can easily be applied to commercial purposes such as car
park systems, for documenting access to parking services, securing the usage of parking
houses, and preventing car theft.
In the current era of information technology, the use of automatic and intelligent
systems is becoming more and more widespread. Intelligent Transport System (ITS)
technology has received so much attention that many systems are being developed and
applied all over the world, and VLP recognition has turned out to be an important
research issue. VLP recognition has a significant role in traffic monitoring systems,
including controlling traffic volume, ticketing vehicles without human intervention,
vehicle tracking, and so on. In some countries, VLPR systems installed on country
borders automatically detect and monitor border crossings. Each vehicle can be
registered in a central database and compared to a blacklist of stolen vehicles.
1.2. Problem Statement
In most countries, the attributes of vehicle license plates are strictly maintained: the size
of the plate, the color of the plate, the font face, size and color of each character, the
spacing between subsequent characters, the number of lines on the VLP, the script, etc.
are all specified precisely. In Nepal, however, VLPs are not yet standardized, especially
in plate size and character font, which makes the system less accurate. The numeral "5"
alone is written in more than three styles, making localization and subsequent
recognition of vehicle number plates extremely difficult under these conditions.
The problem of vehicle number plate recognition is an interestingly difficult one. The
task becomes more complicated when dealing with plate images inclined at various
angles and corrupted by noise. Because such systems usually operate in real time, they
require not only accuracy but also fast processing. The most vital and most difficult part
of any vehicle number plate recognition system is the detection and extraction of the
vehicle number plate, which directly affects the system's overall accuracy. The presence
of noise, blurring in the image, uneven illumination, dim light and foggy conditions
make the task even more difficult. Localization of the VLP is also a problem due to the
distance between the camera and the vehicle, and sometimes due to the image being
captured at an angle. These problems lead to inaccuracy in further steps [2].
The next problem in a VLPR system is recognition of the characters. In Devanagari
script, "5" is written in more than five styles; the same is true of "8" and "9". This
problem stems from the lack of standardization of the Devanagari script on plates.
1.3. Objectives
The core objective of this project is to automate vehicle number plate recognition so that
it can support traffic management and other aspects of the national security system. The
objectives of this project fall into two types:
1.3.1. General Objective
The general objective of this project is to recognize the numbers of the different types of
vehicles in Nepal, such as government-owned, Non-governmental Organization (NGO),
International Non-governmental Organization (INGO), public and private. Since the
number plates of these different sectors use different colors, plate color is a key cue that
helps recognize the number plate easily and efficiently.
1.3.2. Specific Objective
The specific objectives of the project are as follows:
I) Detection of the number plate.
II) Recognition of the vehicle number using Devanagari Optical Character Recognition.
1.4. Scope of Work
The scope of this project is to build an automatic system that can recognize a vehicle by
taking an image of the vehicle with its VLP as input and extracting the vehicle
registration number.
The main feature of the system is that it can separate the VLP from the image and
extract the characters in Devanagari. Recognition of Devanagari is always a difficult
task, and the worst case is performing such a task in error-prone surroundings.
A VLPR system plays a major role in monitoring traffic rules and maintaining law
enforcement on public roads. This area is challenging because it requires the integration
of several areas of computer science, including object detection (plate localization) and
character recognition. Such recognition systems have many applications; some examples
where the system fits are discussed below.
• Traffic systems: VLPR systems can be used in traffic systems to recognize the
number plate of a vehicle and store it in a database, from which wanted or stolen
vehicles can be searched easily and the density of vehicles running in an area can
be measured.
• Parking: A VLPR system can be used in parking areas to keep records of
vehicles. Combined with additional technology, it can automatically admit
pre-paid members and calculate parking fees for non-members by comparing
entry and exit times.
• Border crossings: A VLPR system can be used to monitor border crossings and
keep track of vehicles that exit the country. Each vehicle's information can be
registered in a central database and linked to additional information.
1.5. Organization of Report
Chapter 1: It deals with the general introduction of the project, system, problems, scope
and organization of the report.
Chapter 2: It contains the literature review related to the project. It includes the
features of Nepali VLPs, the Devanagari script, image processing, neural networks
and optical character recognition.
Chapter 3: It describes the process and methodology applied on the raw image, overall
system view and technical description of the system. It presents the system
diagram, data flow diagram and data model on which the system is built. It
also explains the interaction among the different components of the system.
Chapter 4: It focuses on the implementation model and application overview of the
system describing the algorithm followed during the system design.
Chapter 5: It explains the results and discussions of the project. It also shows the error
calculation and accuracy rates of recognizing the VLP and registration
number in it.
Chapter 6: It explains the conclusion and future enhancement of the system.
Chapter 7: It deals with the epilogue part of the report. It contains the glossary,
references and the appendix part of the report.
2. LITERATURE REVIEW
2.1. Related Work
This section focuses on related work previously done by several researchers. The
literature offers many methods for license plate detection and recognition. A major
concern is how long it takes to compute and recognize the license plates; this is critical
when the system is applied to real-time applications. However, there is always a
trade-off between computational time and performance: achieving a more accurate
result and better system performance requires more computational time.
The problem of automatic VLP recognition has been studied since the 1990s. The first
approach was based on characteristics of boundary lines [1, 3]. The input image was
first processed to enrich the boundary-line information using algorithms such as the
gradient filter, resulting in an edge image. This image was binarized and then processed
by algorithms such as the Hough transform to detect lines. Eventually, pairs of two
parallel lines were considered as plate candidates.
Another approach was morphology-based [4, 5, 6]. This approach focuses on properties
of plate images such as their brightness, symmetry, angles, etc. Using these properties,
the method detects regions with similar properties in an image and locates the position
of license plate regions. The third approach was texture-based, in which a VLP was
considered as an object with distinctive textures and frames [1, 7]. Texture window
frames of different sizes were used to detect plate candidates, and each candidate was
passed to a classifier to confirm whether or not it was a plate. This approach was
commonly used in tasks that find text in images. In addition, there have been a number
of other methods relating to this problem that focus on detecting VLPs in video data
(objects appearing in a chain of sequential images).
The fourth approach was based on statistical properties of text [7]. In this approach, text
regions were discovered using statistical properties of text such as the variance of gray
levels, the number of edges, edge densities in the region, etc. This approach was
commonly used for finding text in images, and could well be used for discovering and
designating candidate number plate areas, as they include alphabets and numerals [8].
In addition, there have been a number of other methods relating to this problem that
focus on detecting VLPs using AI and genetic algorithms [1, 9]. These systems used
edge detection and edge statistics, and then artificial intelligence techniques, to detect
the location of the number-plate area. All of the systems discussed above have
limitations: for example, they are plate-size dependent or color dependent, or work only
in certain conditions or environments, such as with indoor images.
2.2. Feature of Nepali Vehicle License Plate
A license plate carries the unique identification number provided to each vehicle, and
its registration number is bound to the chassis number of the vehicle. The VLP number
is issued by the zonal-level Transport Management Office, a government agency under
the Department of Transport Management [10]. The vehicle number plates are placed on
the front as well as the back of the vehicle. The plates are required to be in either
Devanagari or Latin script; in practice, the registration plates of Nepal are bilingual. As
per the latest guidelines issued by the Traffic Police Division, the plate must not be
reflective or digitally printed [10].
Vehicles are organized into major categories, four identifiers and two physical plate
forms. For the purpose of vehicle registration, the Vehicle & Transport Management
Act, 2049 (1992) and the Vehicle & Transport Management Rule, 2054 (1997) of Nepal
classify vehicles into the following five main categories on the basis of size and
capacity:
• Heavy and medium-sized vehicle: This includes bus, truck, dozer, dumper,
loader, crane, fire engine, tanker, roller, pick-up, van, mini bus, mini truck,
minivan, etc. having the capacity to carry more than 14 people (for passenger
vehicles) or more than 4 tons (for cargo vehicles).
• Light vehicle: This includes car, jeep, van, pick-up, micro bus, etc. having the
capacity to carry fewer than 24 people or less than 4 tons.
• Two-wheeler: This includes vehicles with two wheels, such as motorcycles and
scooters.
• Tractor and power-trailer.
• Three-wheeler: This includes vehicles with three wheels, such as tempos and
auto-rickshaws.
Each of the above categories is further divided into sub-categories on the basis of
ownership and service type, as follows:
Table 2-1: Major Categories of Vehicle

Type of vehicle                        Heavy size   Middle size   Motorcycle, scooter
Government                             ग            झ             ब
Private                                क            च             प, त
Local                                  ख            ज             थ
Tourist                                य            य             य
Government Organization/Institution    घ            ञ
Diplomatic                             सि डी         सि डी          सि डी
Constitutional                         झ
• Private vehicle: Vehicles used entirely for personal purposes. These use a red
license plate with the letters written in white.
• Public vehicle
• Government vehicle: Vehicles owned by government agencies and constitutional
bodies such as ministries, departments and directorates, along with the police,
military, etc. These use a white plate with the letters written in red.
• National Corporation vehicle: Vehicles registered under the name of public
corporations fully or partially owned by the government. These use a yellow
plate with the letters written in blue.
• Tourist vehicle
Vehicles are assigned letters as shown in Table 2-1 so that a visual distinction can be
made. Every development region in Nepal follows the same scheme for vehicle
classification, except that each region has its own identifier, described below.
The four identifiers are described below. The license plate of Nepal is more detailed in
comparison with those of other countries. The current license plate format for vehicles
in Nepal consists of four parts composed of letters and digits, in the LL NN LL NNNN
format.

Figure 2-1: Vehicle Identifier
The identifiers of the vehicle shown in Figure 2-1 are described as:
• The first part indicates the zone code, signifying the zone in which the vehicle is
registered.
• The second part is the set number, which is prefixed when the four-digit number
in the last part runs out.
• The third part indicates the type of vehicle (private, public, governmental,
national corporation, tourist, etc.) as well as the class of vehicle (two-wheeler,
light vehicle, heavy and medium-sized vehicle, etc.).
• The last part is a four-digit number running in sequence.
• The color of the background and foreground represents the ownership type of
the vehicle: public, private, government, diplomatic, etc.
All 14 zones of Nepal have their own abbreviated code for reference purposes. These
codes are normally a single letter in Nepali and two letters in English (sometimes three,
though the third letter 'a' can be omitted), as shown in Figure 2-1 [4].
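As an illustration, the LL NN LL NNNN structure described above can be checked mechanically. The following is a minimal sketch assuming the plate text has already been recognized and transliterated into Latin letters; the regular expression, the field names and the `parse_plate` helper are our own illustrative choices, not part of any official specification:

```python
import re

# Mirrors the LL NN LL NNNN structure: zone code, set number,
# vehicle-type letters, and a four-digit serial number.
PLATE_RE = re.compile(
    r"^(?P<zone>[A-Z]{1,3})\s+"   # zone code, e.g. "BA"
    r"(?P<set>\d{1,2})\s+"        # set number
    r"(?P<type>[A-Z]{1,3})\s+"    # vehicle type/class letters
    r"(?P<serial>\d{4})$"         # four-digit running number
)

def parse_plate(text):
    """Split a transliterated plate string into its four parts,
    or return None if it does not match the expected format."""
    m = PLATE_RE.match(text.strip().upper())
    return m.groupdict() if m else None
```

For example, a transliterated string such as "BA 2 PA 1234" would split into zone "BA", set "2", type "PA" and serial "1234".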
The physical standard of the VLP is determined by the Ministry of Infrastructure and
Transport. As per the Vehicle and Transport Management Rule (VTMR), the two
physical forms of the VLP are shown in Figure 2-2 and Figure 2-3.
I. The VLP in 4:3 ratio.
Figure 2-2: VLP in 4:3 ratio
II. The VLP in 4:1 ratio.
Figure 2-3: VLP in 4:1 ratio
According to the VTMR, the characters and numbers are written inside the number plate
leaving ½ inch of space around the border for heavy and middle-sized four-wheelers;
for two-wheelers such as bikes and scooters, the space is ¼ inch. The distance between
a number and a character must be ¼ inch, and the distance between the upper and lower
lines of characters must be ½ inch [10].
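Since the rule fixes only two plate forms, the aspect ratio of a candidate region is a cheap sanity check during localization. The following is a minimal sketch; the helper name and the 25% tolerance are our own assumptions, chosen only for illustration:

```python
def plate_form(width, height, tol=0.25):
    """Classify a candidate region by aspect ratio against the two
    VLP forms (4:3 and 4:1); returns '4:3', '4:1' or None."""
    ratio = width / height
    if abs(ratio - 4 / 3) <= tol * (4 / 3):
        return "4:3"
    if abs(ratio - 4.0) <= tol * 4.0:
        return "4:1"
    return None
```

A 400×300-pixel candidate would classify as the 4:3 form, while a 400×100-pixel strip would classify as 4:1.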
2.3. Image Processing
2.3.1. Image Acquisition and Preprocessing
The problem domain is the image file. Image acquisition is the first step in a VLPR
system, and there are a number of ways to acquire images; the current literature
discusses different image acquisition methods used by various authors. Yan et al. used
an image acquisition card that converts video signals to digital images, together with
some hardware-based image preprocessing [6]. Naito et al. developed a sensing system
which uses two CCDs (Charge Coupled Devices) and a prism to split an incident ray
into two lights with different intensities [7, 8, 9]. Doing so requires an imaging sensor
and the capability to digitize the signal produced by the sensor. After the digital image
has been obtained, the next step is pre-processing of the image. The main purpose of
pre-processing is to increase the efficiency of the subsequent processes. Pre-processing
covers operations such as image enhancement, noise reduction, histogram equalization,
edge detection and binarization of the image.
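Two of the pre-processing steps listed above, histogram equalization and binarization, can be sketched with NumPy alone. This is a minimal illustration with a fixed global threshold; a production system would more likely use a library such as OpenCV and an adaptive threshold:

```python
import numpy as np

def global_threshold(gray, t=128):
    """Binarize a grayscale image with a single global threshold.
    Pixels at or above t become 1 (foreground), the rest 0."""
    return (gray >= t).astype(np.uint8)

def histogram_equalize(gray):
    """Spread the gray-level histogram over the full 0-255 range
    using the cumulative distribution of pixel intensities."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255 // max(cdf.max() - cdf.min(), 1)
    return cdf[gray].astype(np.uint8)
```

Equalization stretches the contrast of dim or washed-out plate images before thresholding, which makes a single global threshold far more reliable.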
2.3.2. Plate Localization
The next step is vehicle license plate (VLP) localization, which is concerned with
finding the position of the license plate in the captured image; it is also known as
license plate detection. The main idea of VLP localization is to extract the license plate
from the whole image for use in the next step.
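One simple localization cue, consistent with the projection profiles used later in this report (Figures 3-9 and 3-10), is to sum the pixels of a binarized image along rows and columns; bands with high sums are plate-region candidates. A minimal NumPy sketch, with the helper names and the 50% threshold chosen only for illustration:

```python
import numpy as np

def projection_profiles(binary):
    """Row and column sums of a binary image; high-valued bands
    are candidate plate regions."""
    horizontal = binary.sum(axis=1)  # one value per row
    vertical = binary.sum(axis=0)    # one value per column
    return horizontal, vertical

def candidate_band(profile, frac=0.5):
    """Return (start, end) indices of the run of profile values
    exceeding frac * max(profile), or None if there is none."""
    thresh = frac * profile.max()
    idx = np.flatnonzero(profile > thresh)
    return (int(idx[0]), int(idx[-1])) if idx.size else None
```

Applying `candidate_band` to the horizontal profile gives a row band, and to the vertical profile a column band; their intersection is a rough plate candidate for finer verification.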
It is the most important phase in a VLPR system. This section discusses some of the
previous work done during the extraction phase. Hontani et. al. proposed a method for
Page | 12
extracting characters without prior knowledge of their position and size in the image
[11]
.
The technique is based on scale shape analysis, which in turn is based on the assumption
that, characters have line-type shapes locally and blob-type shapes globally. In the scale
shape analysis, Gaussian filters at various scales blur the given image and larger size
shapes appear at larger scales. To detect these scales the idea of principal curvature plane
is introduced. By means of normalized principal curvatures, characteristic points are
extracted from the scale space x-y-t. The position (x, y) indicates the position of the
figure and the scale t indicates the inherent characteristic size of corresponding figures.
All these characteristic points enable the extraction of the figure from the given image
that has line-type shapes locally and blob-type shapes globally. Kim et al. [12] used two
Neural Network-based filters and a post processor to combine two filtered images in
order to locate the license plates.
The two Neural Networks used are vertical and horizontal filters, which examine small
windows of vertical and horizontal cross sections of an image and decide whether each
window contains a license plate. Cross-sections have sufficient information for
distinguishing a plate from the background. Lee et al. [13] and Park et al. [14] devised a
method to extract Korean license plates depending on the color of the plate. A Korean
license plate is composed of two colors, one for the characters and the other for the
background, and the plates are divided into three categories accordingly. In this method a
neural network extracts the color of a pixel from the HLS (Hue, Lightness and
Saturation) values of its eight neighboring pixels, and the node with the maximum value
is chosen as the representative color. After every pixel of the input image is assigned to
one of the four groups, horizontal and vertical histograms of white, red and green (the
colors Korean plates contain) are calculated to extract a plate region. The horizontal-to-vertical
ratio of the plate is used to select a probable plate region. Dong et al. [15] presented a
histogram-based approach for the extraction phase. Kim G. M. [16] used the Hough transform
for the extraction of the license plate. The algorithm behind the method consists of five
steps. The first step is to threshold the gray-scale source image, which yields a binary
image. In the second stage the resulting image is passed through two parallel sequences
to extract horizontal and vertical line segments respectively; the result is an image with
the edges highlighted. In the third step that image is used as input to the Hough
transform, which produces a list of lines in the form of accumulator cells. In the fourth
step these cells are analyzed and line segments are computed. Finally, the lists of
horizontal and vertical line segments are combined, and any rectangular regions matching
the dimensions of a license plate are kept as candidate regions. The disadvantage of this
method is that it requires a large amount of memory and is computationally expensive.
2.3.3. Segmentation
The next step is segmentation of the license plate, which is carried out in two stages. The
first stage separates the two consecutive rows of a two-row license plate, each row
consisting of a series of Devanagari numerals, by applying horizontal projection. The
second stage separates each character of the license plate. After that, the segmentation is
enhanced: besides the characters, the segmented plate also contains undesirable elements
such as dots and stretches, as well as redundant spaces at the sides of the characters, and
these problems must be dealt with during segmentation.
Many different approaches have been proposed in the literature. Nieuwoudt et al. [17]
used region growing for segmentation of characters. The basic
idea behind region growing is to identify one or more criteria that are characteristic for the
desired region. After establishing the criteria, the image is searched for any pixels that fulfill
the requirements. Whenever such a pixel is encountered, its neighbors are checked, and if
any of the neighbors also match the criteria, both the pixels are considered as belonging to
the same region. Morel et al. [18] used a partial differential equation (PDE) based technique,
and neural networks and fuzzy logic have also been adopted for segmentation into individual characters.
2.4. Feature Extraction
The purpose of feature extraction is the measurement of those attributes of patterns that
are most pertinent to a given classification task. The task of the human expert is to select
or invent features that allow effective and efficient recognition of patterns. All images
are down-sampled before being used; this prevents the neural network from being
confused by size and position. Because every image is down-sampled to a consistent
size, it does not matter how large the letter is. Down-sampling takes the image from a
larger resolution to a 24*24 resolution. To see how an image is reduced to 24*24,
imagine a grid drawn over the high-resolution image, dividing it into regions, 24 across
and 24 down. If any pixel in a region is filled, the corresponding pixel in the 24*24
down-sampled image is also filled. The skeleton of the image may contain any number of
pixels, so the image is down-sampled to the 576 pixels of the 24*24 grid; these 576
pixel values represent the pattern of the image.
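The down-sampling rule described above can be sketched as follows (illustrative Python, not the project's actual code; the function name is ours, while the any-pixel-filled rule and the 24-cell grid come from the text):

```python
def downsample(image, grid=24):
    """Down-sample a binary image (list of rows of 0/1) to grid x grid:
    an output cell is filled if any source pixel in the corresponding
    region of the imaginary grid is filled."""
    h, w = len(image), len(image[0])
    out = [[0] * grid for _ in range(grid)]
    for gy in range(grid):
        for gx in range(grid):
            # Source region covered by this output cell (at least one pixel).
            y0, y1 = gy * h // grid, max((gy + 1) * h // grid, gy * h // grid + 1)
            x0, x1 = gx * w // grid, max((gx + 1) * w // grid, gx * w // grid + 1)
            if any(image[y][x]
                   for y in range(y0, min(y1, h))
                   for x in range(x0, min(x1, w))):
                out[gy][gx] = 1
    return out
```

Flattening the returned 24*24 grid row by row yields the 576-value pattern fed to the network.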
In the literature of this section, a lot of work has been done for feature extraction of the
segmented image of the number plate. We can have many different features that can be
extracted from segmented character images [19, 20, 21, 22].
There are three major categories of feature extraction techniques:
o Geometrical and Topological Features: Extracting and Counting Topological
Structures, Geometrical Properties, Coding, Graphs, Trees, Strokes, Chain Codes etc.
o Statistical Features: Zoning, Crossing and Distances, Projections, Distribution
measures, etc.
o Global Transformation and Series Expansion Features: Fourier Transform, Cosine
Transform, Wavelets, Moments, Karhunen-Loeve Expansion, etc.
2.5. Neural Network
Computers can perform many operations considerably faster than a human being. Yet
there are many tasks where the computer falls considerably short of its human
counterpart. There are numerous examples of this. Given two pictures, a preschool child
could easily tell the difference between a cat and a dog, yet this same simple task would
confound today's computers.
Artificial intelligence (AI) is the field of computer science that attempts to give
computers human-like abilities. One of the primary means by which computers are
endowed with such abilities is the neural network. The human brain is the ultimate
example of a neural network: it consists of a network of over a hundred billion
interconnected neurons, individual cells that each process a small amount of information
and then activate other neurons to continue the process.
Work on Devanagari optical character recognition by Anil et al. [23] shows the use of
feed-forward networks for training on the data and back-propagation for the recognition
of characters. The paper by Dong Xiao Ni [24] starts from the basic biological neuron
and the artificial computation model, outlines network architectures and learning
processes, and uses a multilayer feed-forward network for optical character recognition,
treating the neural network as a powerful data-modeling tool able to capture and
represent complex input/output relationships. The paper by Anne et al. [25] uses MLF
neural networks trained with a back-propagation learning algorithm, the most popular
kind of neural network, which has been applied to a wide variety of problems. Bishnu
Chaulagain et al. use a Hidden Markov Model (HMM) [26, 27]; that paper also shows the
use of the Tesseract engine for character recognition in Devanagari script.
It is not possible to find weights that enable a single-layer perceptron to deal with
non-linearly separable problems such as XOR. Multi-layer perceptrons (MLPs), however,
are able to cope with non-linearly separable problems. Minsky & Papert (1969) offered a
solution to the XOR problem by combining perceptron unit responses using a second
layer of units [28].
The most common neural network model is known as a supervised network because it
requires a desired output in order to learn [24]. The binary data is then fed into a neural
network that has been trained to make the association between the character image data
and a numeric value that corresponds to the character. The output from the neural
network is then translated into ASCII text and saved as a file [24].
3. METHODOLOGY
3.1. Generic Description
The general overview of the VLPD-R system is shown in the Figure 3-1 and Figure 3-2.
The system includes two cases:
i) Localization of the license plate
ii) Recognition of the characters on the plate
These are the main goals of the system. In the first case, the user provides the images and
the system processes them, with the segmented number plate from the provided image as
output. The second case concerns neural network learning and training. Teach pattern is
another sub use case, included as a dependency of the train neural network use case; that
is, teaching a neural network must include some kind of pattern teaching. Then comes the
character recognition use case, whose goal is accomplished by a sub-goal that involves
some kind of pattern recall. This is how the character recognition system works. The
neural network system is a secondary actor whose main purpose is to interact with the
overall system.
The main program is represented by Figure 3-3. The main goal is to obtain the characters
from the segmented part of the image. It has three main sub-cases, namely image
processing, localization and optical character recognition. The final output of the system
is user-readable text.
Figure 3-1: Use Case Diagram for VLP Segmentation.
Figure 3-2: Use Case Diagram of Optical Character Recognition
Figure 3-3: Use Case Diagram for Main Program
3.2. System Design
The data flow diagram (DFD) – level 0 of the system is shown in Figure 3-4. The DFD is
further expanded to level 1 for the preprocessing, training and recognition processes, as
shown in Figure 3-5, Figure 3-6 and Figure 3-7 respectively.
[Diagram: the user's scanned digital image flows through (1) pre-processing (image
enhancement), (2) VLP localization and extraction of the VLP, and (3) character
extraction; the normalized image data then feeds (4) training, which produces (5) trained
data (weights), and (6) recognition, whose result reaches the user as (7) readable text
output.]
Figure 3-4: Level 0 DFD of the System
[Diagram: the digital image undergoes (1.1) grey scaling, (1.2) histogram equalization for
a uniform pixel-density distribution, (1.3) horizontal and vertical Sobel filtering to obtain
the edges, (1.4) binarization of the image into 0s and 1s, (1.5) horizontal and vertical
projection, and (1.6) segmentation, which obtains the VLP from the image.]
Figure 3-5: Level 1 DFD for Process "Preprocessing"
[Diagram: (4.1) the normalized image is read in matrix form; (4.2) the weights are
initialized on the first pass; (4.3) the output is calculated by propagating the image
through the network; (4.4) the calculated output is compared with the teacher output
supplied by the user; (4.5) the error is back-propagated; (4.6) the weights are adjusted,
yielding the trained data (weights).]
Figure 3-6: Level 1 DFD of process "Training"
[Diagram: (6.1) the normalized image is read in matrix form; (6.2) the output is
calculated from the trained data (weight matrix) by propagation through the network;
(6.3) a decision is made to produce the result.]
Figure 3-7: Level 1 DFD for Process "Recognition"
3.3. Technical Description
The technical description of the system is shown in Figure 3-8.
[Diagram: the scanned image passes through pre-processing, segmentation and feature
extraction into the neural network, which operates in training and testing modes; the
output is the recognized text.]
Figure 3-8: Technical Description of the System
3.3.1. Pre-processing and VLP Localization
The image taken from the camera is processed by the preprocessing module, whose
purpose is to enrich the edge features. Because our detection method is based on
boundary features, this improves the success rate of the VLP localization module. The
algorithms used in this module are, in sequence, graying, normalization and histogram
equalization. After obtaining a gray-scaled image, we use Sobel filters to extract an edge
image and threshold it to a binary one, using a local adaptive threshold algorithm for
binarization. The resulting image is used as input to the VLP localization module [21, 29].
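The histogram-equalization step of this pipeline can be sketched as follows (illustrative Python for a single grayscale channel; the function name is ours, and the project's actual implementation may differ):

```python
def equalize_histogram(image, levels=256):
    """Histogram equalization: map each grey level through the
    normalized cumulative histogram so intensities spread over the
    full range."""
    h, w = len(image), len(image[0])
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    # Cumulative histogram.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    n = h * w
    # Look-up table: scale the CDF to the full intensity range.
    lut = [round((c / n) * (levels - 1)) for c in cdf]
    return [[lut[v] for v in row] for row in image]
```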
In the VLP localization step, detection of the number plate area is carried out. This
problem requires algorithms that can detect a rectangular number-plate area in the
original image. Humans define a number plate in natural language as a "small plastic or
metal plate attached to a vehicle for official identification purposes", but machines do
not understand this definition, just as they do not understand what a "vehicle" or a
"road" is. Because of this, there is a need for an alternative definition of a number plate
based on descriptors that are comprehensible to machines.
Let us define the number plate as a “rectangular area with increased occurrence of
horizontal and vertical edges”. The high density of horizontal and vertical edges on a
small area is in many cases caused by contrast characters of a number plate, but not in
every case. This process can sometimes detect a wrong area that does not correspond to a
number plate. Because of this, we often detect several candidates for the plate by this
algorithm, and then we choose the best one by a further heuristic analysis [2, 21, 29].
3.3.1.1. Edge Detection
We can use a periodic convolution of the function f(x, y), where x and y are spatial
coordinates and f(x, y) is the intensity of light at that point of the image, with specific
types of matrices m to detect various types of edges in the image:
f′(x, y) = f(x, y) ∗ m[x, y] = Σ_{i=0}^{w−1} Σ_{j=0}^{h−1} f(i, j) · m[mod_w(x − i), mod_h(y − j)] ………………….. (3.1)
where w and h are the dimensions of the image represented by the function f, and m[x, y]
represents the element in the xth row and yth column of the matrix m.
Horizontal and vertical edge detection
To detect horizontal and vertical edges, we convolve the source image with the matrices
mhe and mve. The convolution matrices are usually much smaller than the actual image;
bigger matrices can be used to detect rougher edges.
        | −1 −1 −1 |           | −1  0  1 |
mhe =   |  0  0  0 |    mve =  | −1  0  1 |
        |  1  1  1 |           | −1  0  1 |
Sobel edge detector
The Sobel edge detector uses a pair of 3x3 convolution matrices. The first is dedicated
for evaluation of vertical edges, and the second for evaluation of horizontal edges.
       | −1 −2 −1 |          | −1  0  1 |
Gx =   |  0  0  0 |    Gy =  | −2  0  2 |
       |  1  2  1 |          | −1  0  1 |
The magnitude of the affected pixel is then calculated as G = √(Gx² + Gy²). In practice, it
is faster to calculate only the approximate magnitude G = |Gx| + |Gy|.
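The Sobel operator with the fast magnitude approximation can be sketched as follows (illustrative Python using the two matrices above; leaving border pixels at 0 is a simplifying assumption of this sketch):

```python
def sobel_magnitude(image):
    """Approximate Sobel gradient magnitude |Gx| + |Gy| for a grayscale
    image given as a list of rows; border pixels are left at 0."""
    gx_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    gy_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            # Fast approximation of sqrt(gx^2 + gy^2).
            out[y][x] = abs(gx) + abs(gy)
    return out
```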
Horizontal and Vertical projection
The vertical projection of the image is a graph representing the overall magnitude of the
image along the y axis; the vertical projection of the edge image can be used for vertical
localization of the number plate. The horizontal projection represents the overall
magnitude of the image mapped onto the x axis.
We can mathematically define the horizontal and vertical projection as:
p_x(x) = Σ_{j=0}^{h−1} f(x, j) ……………………………………………………………… (3.2)
p_y(y) = Σ_{i=0}^{w−1} f(i, y) ……………………………………………………………... (3.3)
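Equations 3.2 and 3.3 translate directly to code; a sketch assuming the image is a list of rows with f(x, y) = image[y][x] (function names are ours):

```python
def horizontal_projection(image):
    """p_x(x): sum of intensities down each column (eq. 3.2)."""
    return [sum(row[x] for row in image) for x in range(len(image[0]))]

def vertical_projection(image):
    """p_y(y): sum of intensities across each row (eq. 3.3)."""
    return [sum(row) for row in image]
```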
3.3.1.2. Double-phase statistical image analysis
The statistical image analysis consists of two phases. The first phase detects a wider area
of the number plate; the second refines it, so that the output of the double-phase analysis
is the exact area of the number plate.
The detection of the number plate area consists of a “band clipping” and a “plate
clipping”. The band clipping is an operation, which is used to detect and clip the vertical
area of the number plate (so-called band) by analysis of the vertical projection of the
snapshot. The plate clipping is a consequent operation, which is used to detect and clip
the plate from the band (not from the whole snapshot) by a horizontal analysis of such
band.
Snapshot
Assume the snapshot is represented by a function f(x, y), where x0 ≤ x ≤ x1 and
y0 ≤ y ≤ y1. The point [x0, y0] represents the upper left corner of the snapshot, and
[x1, y1] the bottom right corner. If w and h are the dimensions of the snapshot, then
x0 = 0, y0 = 0, x1 = w − 1 and y1 = h − 1.
Band
The band b in the snapshot f is an arbitrary rectangle b = (x_b0, y_b0, x_b1, y_b1) such that:
(x_b0 = x_min) and (x_b1 = x_max) and (y_min ≤ y_b0 ≤ y_b1 ≤ y_max)
Plate
Similarly, the plate p in the band b is an arbitrary rectangle p = (x_p0, y_p0, x_p1, y_p1)
such that:
(y_p0 = y_b0) and (y_p1 = y_b1) and (x_b0 ≤ x_p0 ≤ x_p1 ≤ x_b1)
Band clipping
The band clipping is a vertical selection of the snapshot according to the analysis of the
graph of the vertical projection. If h is the height of the analyzed image, the
corresponding vertical projection p_y(y) contains h values, for y ∈ ⟨0; h−1⟩.
The fundamental problem of analysis is to compute peaks in the graph of vertical
projection. The peaks correspond to the bands with possible candidates for number
plates. The maximum value of p_y(y), corresponding to the axis of the band, can be
computed as:
y_bm = arg max_{y0 ≤ y ≤ y1} { p_y(y) } ………………………………... (3.4)
The coordinates y_b0 and y_b1 of the band can then be detected as:
y_b0 = max_{y0 ≤ y ≤ y_bm} { y | p_y(y) ≤ c_y · p_y(y_bm) } ………….. ….. (3.5)
y_b1 = min_{y_bm ≤ y ≤ y1} { y | p_y(y) ≤ c_y · p_y(y_bm) } ……………..... (3.6)
where c_y is a constant used to determine the foot of the peak y_bm. In practice, the
constant is calibrated to c_1 = 0.55 for the first phase of detection and c_2 = 0.42 for the
second phase.
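A sketch of the band detection of equations 3.4-3.6, assuming the projection is a list of values (walking outward from the peak is one way to realize the max/min conditions; the function name is ours):

```python
def clip_band(p_y, c_y=0.55):
    """Locate one band from the vertical projection p_y: find the peak
    y_bm (eq. 3.4), then walk outward until the projection drops to or
    below c_y * p_y(y_bm) (eqs. 3.5, 3.6). Returns (y_b0, y_b1)."""
    y_bm = max(range(len(p_y)), key=lambda y: p_y[y])
    threshold = c_y * p_y[y_bm]
    y_b0 = y_bm
    while y_b0 > 0 and p_y[y_b0 - 1] > threshold:
        y_b0 -= 1
    y_b1 = y_bm
    while y_b1 < len(p_y) - 1 and p_y[y_b1 + 1] > threshold:
        y_b1 += 1
    return y_b0, y_b1
```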
3.3.1.3. Segmentation
Segmentation separates each character and line from the given image with a proper
definition of boundaries. In VLPD, segmentation is one of the most important steps in
automatic number plate recognition, because the further processing of the image depends
on it. In segmentation, boundaries are defined and each line and character is separated so
that character-wise manipulation of the license plate can be done.
Horizontal Segmentation:
This is the method of horizontally splitting the image to separate the lines of the license
plate. Rear plates normally have two rows: the upper row containing the development
region, the vehicle number and its type, and the lower row containing the numeric value.
The image is projected horizontally so that bands of minima are obtained, describing the
regions at which to split.
Figure 3-9: Horizontal Projection of VLPD
The minima obtained after projection are the points required for separating the rows.
Horizontal segmentation separates the upper row and the lower row as shown in
Figure 3-9.
Vertical segmentation:
This is the method of determining the characters of the image from a line of words. The
vertical projection of the image, as shown in Figure 3-10, provides the lowest points for
determining the separation of the words. Since the density value of a pitch-black pixel is
0 and of a white pixel is 255, the vertical projection calculates the column-wise pixel
density, and multiple minimum points for segmentation are obtained. The region between
two minimum points is the region required for segmenting the image. The process is the
same as horizontal segmentation, with the difference that here a character is segmented
and made ready as input for the neural network.
Figure 3-10: Vertical Projection of the Image
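The character separation described above can be sketched as follows (illustrative Python; instead of explicitly locating projection minima, this sketch uses the equivalent criterion that an all-background column separates two characters — an assumption of ours, and the function name is hypothetical):

```python
def segment_characters(image, white=255):
    """Split a text line into character column spans: columns whose
    pixels are all background (white) separate consecutive characters.
    image is a list of rows of grayscale values (ink dark, background
    white). Returns a list of (start, end) column ranges."""
    h, w = len(image), len(image[0])
    has_ink = [any(image[y][x] < white for y in range(h)) for x in range(w)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                     # a character begins here
        elif not ink and start is not None:
            spans.append((start, x))      # a character just ended
            start = None
    if start is not None:
        spans.append((start, w))
    return spans
```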
3.3.2. Thinning
Thinning is the process of peeling off from a pattern as many pixels as possible without
affecting the general shape of the pattern; in other words, after pixels have been peeled
off, the pattern can still be recognized.
Zhang-Suen Thinning Algorithm
This algorithm has two steps, which are applied to the image successively [30]. In each
step, contour points of the region that can be deleted are identified. Contour points are
defined as points that have value "1" and have at least one 8-neighbor pixel with
value "0."
Step 1: Pixel I(i,j) is marked for deletion if ALL of the following four conditions are true:
1. Its connectivity = 1
2. It has at least 2 object neighbors and not more than 6
3. At least one of P2, P4, P6 is background
4. At least one of P4, P6, P8 is background
The marked pixels are then deleted.
Step 2: Same as Step 1, except rules 3 and 4 are changed to:
3. At least one of P2, P4, P8 is background
4. At least one of P2, P6, P8 is background
Again, the marked pixels are deleted.
If at the end of either sub-iteration there are no pixels to be deleted, the skeleton is
complete. The first sub-iteration, as shown in Figure 3-11, removes south or east
boundary pixels and north-west corner pixels.
Figure 3-11: First Sub-iteration
The second sub-iteration, as shown in Figure 3-12, removes north or west boundary
pixels and south-east corner pixels.
Figure 3-12: Second Sub-iteration
Each of the above steps reduces the amount of the information to be processed by a
feature extraction. Figure 3-13 shows the result of Zhang-Suen thinning algorithm.
Figure 3-13: Original pattern and skeleton as a result of the Zhang-Suen thinning algorithm
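The two sub-iterations can be sketched as follows (illustrative Python, not the project's code; P2..P9 follow the usual clockwise numbering starting from the pixel above, and the 0/1 image convention with an untouched one-pixel border is an assumption of this sketch):

```python
def zhang_suen_thin(image):
    """Zhang-Suen thinning of a binary image (1 = object, 0 = background),
    repeating both sub-iterations until no pixel changes."""
    img = [row[:] for row in image]
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9, clockwise from the pixel directly above.
        return [img[y-1][x], img[y-1][x+1], img[y][x+1], img[y+1][x+1],
                img[y+1][x], img[y+1][x-1], img[y][x-1], img[y-1][x-1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            marked = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if img[y][x] != 1:
                        continue
                    p = neighbours(y, x)
                    b = sum(p)  # number of object neighbours
                    # Connectivity: 0->1 transitions in P2,P3,...,P9,P2.
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if step == 0:  # rules 3 and 4 of Step 1
                        cond = p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0
                    else:          # rules 3 and 4 of Step 2
                        cond = p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0
                    if a == 1 and 2 <= b <= 6 and cond:
                        marked.append((y, x))
            for y, x in marked:  # delete only after the full scan
                img[y][x] = 0
                changed = True
    return img
```

A one-pixel-wide line is its own skeleton and passes through unchanged, while thicker regions are eroded from alternating sides.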
3.3.3. Feature Extraction
Information contained in a bitmap representation of an image is not suitable for
processing by computers, so there is a need to describe a character in another way. The
description of the character should be invariant with respect to the font used and to
deformations caused by skew. In addition, all instances of the same character should have
a similar description. A description of a character is a vector of numeric values, so-called
"descriptors" or "patterns".
Generally, the description of an image region is based on its internal and external
representation. The internal representation of an image is based on its regional
properties, such as color or texture. The external representation is chosen when the
primary focus is on shape characteristics. The description of normalized characters is
based on their external characteristics, because we deal only with properties such as
character shape. The vector of descriptors then includes characteristics such as the
number of lines, bays and lakes, the amount of horizontal, vertical and diagonal edges,
etc. Feature extraction is the process of transforming data from a bitmap representation
into a form of descriptors that is more suitable for computers.
3.3.3.1. Fast Fourier Transform (FFT)
The FFT is an efficient method for extracting the signature of an image. It decomposes
an image into its real and imaginary components, a representation of the image in the
frequency domain. If the input signal is an image, the number of frequencies in the
frequency domain equals the number of pixels in the image (spatial) domain. The inverse
transform maps the frequencies back to the image in the spatial domain. The underlying
1-D discrete transform is given by the following equation:
F(x) = Σ_{n=0}^{N−1} f(n) e^{−i 2π x n / N} ……………………………………………… (3.6)
The Fourier transform of an image is a simple extension of the 1-D Fourier transform
into two dimensions, and is achieved by simply applying the 1-D transform to each row
of an image, and then transforming each column of the resulting image. It produces
essentially the same thing. A picture of smooth water waves travelling in a diagonal
direction will transform to a series of spikes along that same diagonal.
The Fourier transform is defined over continuous functions. The FFT is a technique for
efficiently evaluating the Fourier transform over discrete sets of data.
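A direct (unoptimized) evaluation of the transform, plus the row-then-column 2-D scheme described above, can be sketched as follows (a true FFT would factor this computation for speed; this sketch, with hypothetical function names, only illustrates the definitions):

```python
import cmath

def dft(signal):
    """Direct evaluation of F(x) = sum_n f(n) e^{-i 2 pi x n / N}."""
    N = len(signal)
    return [sum(signal[n] * cmath.exp(-2j * cmath.pi * x * n / N)
                for n in range(N)) for x in range(N)]

def dft2(image):
    """2-D transform as described above: a 1-D DFT of every row,
    then of every column of the result."""
    rows = [dft(row) for row in image]
    cols = [dft(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```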
3.3.3.2. Density over Different Zones
The density of pixels over various regions defines the structure of the image. The image
is split into different zones according to the accuracy required. The system here splits the
24*24 image into a 3*3 grid of regions and calculates the density of each region. This
provides, for each region, a value that is assumed to be near or equivalent to the target
value.
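A sketch of the zonal density computation, assuming a 3*3 grid (nine zones) over a square binary image (illustrative Python; the function name is ours):

```python
def zone_densities(image, zones=3):
    """Split a square binary image into zones x zones equal regions and
    return the fraction of filled pixels in each region (row-major)."""
    n = len(image)
    step = n // zones
    densities = []
    for zy in range(zones):
        for zx in range(zones):
            filled = sum(image[y][x]
                         for y in range(zy * step, (zy + 1) * step)
                         for x in range(zx * step, (zx + 1) * step))
            densities.append(filled / (step * step))
    return densities
```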
3.3.3.3. Area of Image
The area of the image is the count of pixels for which f(x, y) = 1. For images of constant
size the area is almost invariant, which makes it a useful property for describing the
image [8].
3.3.3.4. Moment Invariants
Moment invariants are important tools in object recognition problems. These techniques
capture properties of the image intensity function. Moment invariants were first
introduced to the pattern recognition community in 1962 by Hu [29], who employed
results from the theory of algebraic invariants and derived his seven famous invariants to
rotation of 2-D objects. The moment invariants used in this work for extracting statistical
patterns of character images are given in [10]. Moment invariants are pure statistical
measures of the pixel distribution around the center of gravity of the character and
capture global character-shape information.
The standard moment m_pq of order (p + q) of an image intensity function f(x, y) is
given by
m_pq = ∫∫ x^p y^q f(x, y) dx dy …………………... (3.7)
A uniqueness theorem states that if f(x, y) is piecewise continuous and has non-zero
values only in a finite part of the x-y plane, then moments of all orders exist and the
moment sequence (m_pq) is uniquely determined by f(x, y); conversely, (m_pq)
uniquely determines f(x, y).
For the discrete domain, the 2-D moment of order (p + q) for a digital image f(x, y) of
size M × N is given by
m_pq = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} x^p y^q f(x, y) ……………….. (3.8)
The corresponding central moment of order (p + q) is defined as
μ_pq = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} (x − x̄)^p (y − ȳ)^q f(x, y) ………... (3.9)
where
x̄ = m10 / m00 and ȳ = m01 / m00 ………………………………….. (3.10)
The normalized central moments, denoted by η_pq, are defined as
η_pq = μ_pq / μ00^γ ………………………………………………… (3.11)
where
γ = (p + q)/2 + 1, for p + q = 2, 3, … ……………………………. (3.12)
A set of seven invariant moments, invariant to translation, scale change, mirroring and
rotation, can be derived from the second- and third-order moments [23]:
φ1 = η20 + η02 …………………………………………………….. (3.13a)
φ2 = (η20 − η02)² + 4η11² ………………………………………… (3.13b)
φ3 = (η30 − 3η12)² + (3η21 − η03)² ………………………………... (3.13c)
φ4 = (η30 + η12)² + (η21 + η03)² ……………………………… (3.13d)
φ5 = (η30 − 3η12)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] + (3η21 − η03)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²] …… (3.13e)
φ6 = (η20 − η02)[(η30 + η12)² − (η21 + η03)²] + 4η11(η30 + η12)(η21 + η03) ……………………. (3.13f)
φ7 = (3η21 − η03)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] − (η30 − 3η12)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²] ………. (3.13g)
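As an illustration, the first two of Hu's invariants, φ1 = η20 + η02 and φ2 = (η20 − η02)² + 4η11², can be computed from the normalized central moments (a sketch for binary images, where μ00 = m00; the function name is ours, not the project's code):

```python
def hu_moments_first_two(image):
    """First two Hu invariants of a binary image (list of rows of 0/1)."""
    def m(p, q):  # raw moment m_pq
        return sum((x ** p) * (y ** q) * image[y][x]
                   for y in range(len(image)) for x in range(len(image[0])))
    m00 = m(0, 0)
    xc, yc = m(1, 0) / m00, m(0, 1) / m00  # center of gravity

    def mu(p, q):  # central moment mu_pq
        return sum(((x - xc) ** p) * ((y - yc) ** q) * image[y][x]
                   for y in range(len(image)) for x in range(len(image[0])))

    def eta(p, q):  # normalized central moment (mu00 = m00 for binary images)
        return mu(p, q) / (m00 ** (1 + (p + q) / 2))

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2
```

Because central moments are taken about the center of gravity, the same shape at two positions yields identical values, illustrating the translation invariance claimed above.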
3.4. Artificial Neural Network (ANN)
An ANN is a non-linear, parallel, distributed, highly connected network with the capacity
for adaptivity, self-organization, fault tolerance and evidential response, closely
resembling the physical nervous system. The physical nervous system is a highly
parallel, distributed information-processing system with a high degree of connectivity
and the capability of self-learning.
[Diagram: inputs X1 and X2 feed the first-layer units f1(e)-f4(e), which in turn feed the
output units f5(e) and f6(e).]
Figure 3-14: Feed Forward Multilayer Perceptron
3.4.1. Multilayer Perceptron
It consists of multiple layers of computational units, usually interconnected in a
feed-forward way: each neuron in one layer has directed connections to the neurons of
the subsequent layer. There is a layer of input units, one or more layers of hidden units,
and one layer of output units, as shown in Figures 3-14 and 3-15. The output of each
layer is a weighted linear summation of its input vector plus a bias term, passed through
an activation function. The network weights are adjusted by back-propagating the error
of the network.
[Diagram: inputs U1, U2 and U3 are multiplied by weights W1, W2 and W3, summed by
the adder Σ, and passed through the activation function Φ to produce the output Y.]
Figure 3-15: Operation on Layer's Node
3.4.2. Feedforward Back Propagation Neural Network
A feedforward neural network (FFNet) is a biologically inspired classification algorithm.
It consists of a (possibly large) number of simple neuron-like processing units organized
in layers. Every unit in a layer is connected to all the units in the previous layer. These
connections are not all equal: each connection may have a different strength or weight,
and the weights on these connections encode the knowledge of the network. The units in
a neural network are often also called nodes.
Data enters at the inputs and passes through the network, layer by layer, until it arrives at
the outputs. During normal operation, that is, when it acts as a classifier, there is no
feedback between layers; this is why such networks are called feedforward.
3.4.3. Backpropagation
Backpropagation, an abbreviation for "backward propagation of errors", is a common
method of training artificial neural networks. From a desired output, the network learns
from many inputs, similar to the way a child learns to identify a dog from examples of
dogs.
It is a supervised learning method, and is a generalization of the delta rule. It requires a
dataset of the desired output for many inputs, making up the training set. It is most useful
for feed-forward networks (networks that have no feedback, or simply, that have no
connections that loop). Backpropagation requires that the activation function used by
the artificial neurons (or "nodes") be differentiable.
For better understanding, the Backpropagation learning algorithm can be divided into
two phases: propagation and weight update.
Phase 1: Propagation
Each propagation involves the following steps:
1. Forward propagation of a training pattern's input through the neural network in order
to generate the propagation's output activations.
2. Backward propagation of the propagation's output activations through the neural
network using the training pattern target in order to generate the deltas of all output
and hidden neurons.
Phase 2: Weight update
For each weight synapse, follow these steps:
1. Multiply its output delta and input activation to obtain the gradient of the weight.
2. Move the weight in the opposite direction of the gradient by subtracting a fraction of
the gradient from the weight.
This fraction influences the speed and quality of learning and is called the learning rate.
The sign of the gradient of a weight indicates where the error is increasing; this is why
the weight must be updated in the opposite direction.
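The two phases can be sketched for a tiny 2-2-1 sigmoid network (illustrative Python, not the project's code; omitting biases and using a single training pattern are simplifying assumptions of this sketch):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w_hidden, w_out, inputs, target, rate=0.5):
    """One propagation + weight-update cycle for a 2-2-1 network."""
    # Phase 1a: forward propagation of the input pattern.
    hidden = [sigmoid(sum(w * i for w, i in zip(ws, inputs))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    # Phase 1b: backward propagation of the deltas.
    delta_out = (output - target) * output * (1 - output)
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                    for j in range(len(hidden))]
    # Phase 2: move each weight against its gradient (gradient = delta * input).
    w_out = [w - rate * delta_out * hidden[j] for j, w in enumerate(w_out)]
    w_hidden = [[w - rate * delta_hidden[j] * inputs[i] for i, w in enumerate(ws)]
                for j, ws in enumerate(w_hidden)]
    return w_hidden, w_out, output
```

Repeating the step on the same pattern drives the output toward the target, which is the learning behavior the text describes.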
3.4.4. Training FFNet
The FFNet uses a supervised learning algorithm: besides the input pattern, the neural net
also needs to know to what category the pattern belongs. Learning proceeds as follows: a
pattern is presented at the inputs. The pattern will be transformed in its passage through
the layers of the network until it reaches the output layer. The units in the output layer all
belong to different categories, one unit per category. The outputs of the network are
compared with the outputs as they would ideally be if the pattern were correctly
classified: in that case the unit for the correct category would have the largest output
value, and the output values of the other output units would be very small. On the basis
of this comparison, all the connection weights are modified slightly so that, the next time
this same pattern is presented at the inputs, the value of the output unit that corresponds
to the correct category is a little higher than it is now and, at the same time, the output
values of all the other, incorrect outputs are a little lower. The differences between the
actual outputs and the idealized outputs are propagated back from the top layer to the
lower layers, where they are used to modify the connection weights; this is why the term
backpropagation network is also often used to describe this type of neural network.
3.5. Training Neural Network
The individual neurons that make up a neural network are interconnected through the
synapses. These connections allow the neurons to signal each other as information is
processed. Not all connections are equal. Each connection is assigned a connection
weight. If there is no connection between two neurons, then their connection weight is
zero. These weights are what determine the output of the neural network. Therefore, it
can be said that the connection weights form the memory of the neural network.
Training is the process by which these connection weights are assigned. Most training algorithms begin by assigning random numbers to the weight matrix. Then the validity of the neural network is examined. Next, the weights are adjusted based on how well the neural network performed. This process is repeated until the validation error is within an acceptable limit. The type of learning in a neural network is determined by the manner in which the parameters change. There are many ways to train neural networks.
3.5.1. Supervised Learning
In a supervised learning process, the adjustment of weights is done under the supervision
of a teacher or ideal output, that is, precise information about the desired or correct
network output is available from a teacher when given a specific input pattern.
3.5.2. Error Correction Learning
The goal is to minimize the cost function by correcting the errors. This leads to the well-known delta rule (or Widrow-Hoff rule), which states that the adjustment made to a synaptic weight of a neuron is proportional to the product of the error signal and the input signal of the synapse in question.
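The delta rule can be sketched as follows for a single linear unit; the learning rate, inputs and target below are illustrative values, not from the report.

```python
# Delta (Widrow-Hoff) rule sketch: the weight change is proportional to
# the error signal times the input signal of the synapse in question.
def delta_rule_update(w, x, target, learning_rate=0.5):
    y = sum(wi * xi for wi, xi in zip(w, x))  # linear unit output
    error = target - y                        # error signal
    # delta_w_i = eta * error * x_i
    return [wi + learning_rate * error * xi for wi, xi in zip(w, x)]

# One illustrative update step from zero weights.
w = delta_rule_update([0.0, 0.0], x=[1.0, 0.5], target=1.0)
```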
3.6. Validating Neural Networks
Once a neural network has been trained, it must be evaluated to see if it is ready for actual use. This is important so that it can be determined whether additional training is required. To correctly validate a neural network, validation data must be set aside that is completely separate from the training data.
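A hold-out split of this kind can be sketched as follows; the 20% validation fraction is an illustrative choice, not one stated in the report.

```python
import random

# Illustrative hold-out split: validation samples are set aside and never
# used for weight updates, so validation error measures generalization.
def train_validation_split(samples, validation_fraction=0.2, seed=0):
    items = list(samples)
    random.Random(seed).shuffle(items)        # shuffle reproducibly
    n_val = int(len(items) * validation_fraction)
    return items[n_val:], items[:n_val]       # (training set, validation set)

train, val = train_validation_split(range(100))
```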
4. IMPLEMENTATION
4.1. Global Thresholding
The basic global threshold T is calculated as follows:
1. Select an initial estimate for T (typically the average grey level in the image).
2. Segment the image using T to produce two groups of pixels: G1, consisting of pixels with grey levels > T, and G2, consisting of pixels with grey levels ≤ T.
3. Compute the average grey levels of the pixels in G1 and G2 to give µ1 and µ2 respectively.
4. Compute a new threshold value:
T = (µ1 + µ2) / 2 ………………………………………………………………… (4.1)
5. Repeat steps 2 to 4 until the difference in T in successive iterations is less than a predefined limit ΔT.
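The iteration above can be sketched with NumPy; the sample image and convergence limit are illustrative.

```python
import numpy as np

# Iterative global thresholding, following the steps described above.
def global_threshold(image, limit=0.5):
    t = image.mean()                       # step 1: initial estimate
    while True:
        g1 = image[image > t]              # step 2: brighter group G1
        g2 = image[image <= t]             # step 2: darker group G2
        mu1 = g1.mean() if g1.size else t  # step 3: group means
        mu2 = g2.mean() if g2.size else t
        t_new = (mu1 + mu2) / 2.0          # step 4: new threshold (Eq. 4.1)
        if abs(t_new - t) < limit:         # step 5: convergence check
            return t_new
        t = t_new

# Tiny illustrative "image" with a dark and a bright region.
img = np.array([[10, 12, 11], [200, 210, 205]], dtype=float)
t = global_threshold(img)
```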
4.2. Region Based Segmentation (Horizontal and Vertical)
The goal of the segmentation algorithm is to find peaks, which correspond to the spaces
between characters. At first, there is a need to define several important values in a graph
of the horizontal projection px(X):
• Vm – the maximum value of the horizontal/vertical projection px(x).
• Va – the average value of the horizontal projection px(x).
The algorithm of segmentation iteratively finds the maximum peak in the graph of
vertical/ horizontal projection. The peak is treated as a space between characters, if it
meets some additional conditions, such as height of peak. The algorithm then zeroizes
the peak and iteratively repeats this process until no further space is found. This principle
can be illustrated by the following steps:
1. Determine the index of the maximum value of the horizontal projection:
xm = arg max {px(x)}, 0 ≤ x ≤ w
2. Detect the left foot and the right foot of the peak as:
xl = max {x ≤ xm | px(x) ≤ cx · px(xm)}
xr = min {x ≥ xm | px(x) ≤ cx · px(xm)}
3. Zeroize the horizontal projection px(x) on the interval (xl, xr).
4. If px(xm) < cw · Vm, go to step 7.
5. Divide the plate horizontally in the point xm.
6. Go to step 1.
7. End.
Two different constants have been used in the algorithm above. The constant cx is used to determine the feet of the peak xm; the optimal value of cx is 0.7. The constant cw determines the minimum height of a peak relative to the maximum value of the projection (Vm). If the height of the peak is below this minimum, the peak will not be considered as a space between characters. It is important to choose the value of the constant cw carefully. An inadequately small value causes too many peaks to be treated as spaces, and characters will be improperly divided; too big a value of cw causes not all regular peaks to be treated as spaces, and characters will be improperly merged together. The optimal value of cw is 0.86. To ensure proper behavior of the algorithm, the constants cx and cw should meet a consistency condition over P, the set of all detected peaks xm with corresponding feet xl and xr.
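The iterative peak-finding loop above can be sketched as follows, using the constants cx = 0.7 and cw = 0.86 quoted in the text; the projection values are hypothetical.

```python
import numpy as np

# Sketch of the iterative segmentation: find the maximum peak, locate its
# feet where the projection drops below cx * peak, record the cut, zeroize
# the peak, and stop once the remaining maximum falls below cw * Vm.
def find_spaces(projection, cx=0.7, cw=0.86):
    px = np.asarray(projection, dtype=float).copy()
    vm = px.max()                      # Vm: maximum of the projection
    cuts = []
    while True:
        xm = int(px.argmax())          # step 1: index of the maximum peak
        if px[xm] < cw * vm:           # step 4: peak too low, not a space
            break
        foot = cx * px[xm]             # step 2: feet of the peak
        xl = xm
        while xl > 0 and px[xl] > foot:
            xl -= 1
        xr = xm
        while xr < len(px) - 1 and px[xr] > foot:
            xr += 1
        cuts.append(xm)                # step 5: divide the plate at xm
        px[xl:xr + 1] = 0.0            # step 3: zeroize the peak
    return sorted(cuts)

# Hypothetical projection with two clear peaks (candidate spaces).
cuts = find_spaces([1, 2, 10, 2, 1, 1, 9, 1, 1])
```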
4.3. Back Propagation
It is a supervised learning method, and is a generalization of the delta rule. It requires a
dataset of the desired output for many inputs, making up the training set. It is most useful
for feed-forward networks (networks that have no feedback, or simply, that have no
connections that loop). Backpropagation requires that the activation function used by
the artificial neurons (or "nodes") be differentiable.
1. Initialize the weights to small random values.
2. Feed the training sample through the network and determine the final output.
3. Compute the error for each output unit; for unit k it is:
δk = (tk − yk) f′(y_ink)
where tk is the required output, yk the actual output, and f′(y_ink) the derivative of the activation function f at the net input y_ink.
4. Calculate the weight correction term for each output unit; for unit k it is:
Δwjk = α δk zj
where α is a small constant (the learning rate) and zj is the hidden layer signal.
5. Propagate the delta terms (errors) back through the weights of the hidden units. The delta input for the j-th hidden unit is:
δ_inj = Σk=1..m δk wjk
and the delta term for the j-th hidden unit is:
δj = δ_inj f′(z_inj)
6. Calculate the weight correction term for the hidden units: Δwij = α δj xi.
7. Update the weights: wjk(new) = wjk(old) + Δwjk.
8. Test for stopping (maximum cycles, small changes, etc.).
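Steps 1 to 8 above can be sketched for a one-hidden-layer network as follows; the sigmoid activation, layer sizes, learning rate and training sample are all illustrative choices, not taken from the report.

```python
import numpy as np

# One backpropagation training step for a single hidden layer, using a
# sigmoid activation, whose derivative is f'(x) = f(x) * (1 - f(x)).
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, t, V, W, alpha=0.5):
    z = sigmoid(x @ V)                     # step 2: hidden signals z_j
    y = sigmoid(z @ W)                     # step 2: outputs y_k
    delta_k = (t - y) * y * (1 - y)        # step 3: output deltas
    delta_j = (delta_k @ W.T) * z * (1 - z)  # step 5: hidden deltas
    W = W + alpha * np.outer(z, delta_k)   # steps 4, 7: output weights
    V = V + alpha * np.outer(x, delta_j)   # steps 6, 7: hidden weights
    return V, W

rng = np.random.default_rng(0)
V = rng.normal(0.0, 0.1, (3, 4))           # step 1: small random weights
W = rng.normal(0.0, 0.1, (4, 2))
x = np.array([1.0, 0.5, -0.5])             # illustrative input pattern
t = np.array([1.0, 0.0])                   # one-hot target: first class
for _ in range(200):                       # step 8: fixed number of cycles
    V, W = train_step(x, t, V, W)
```

After training, the output unit for the correct class should carry the larger value, matching the description of FFNet training above.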
5. RESULT AND DISCUSSION
5.1. Result
The experiments conducted were targeted at validating the model for deployment. The results were analyzed at various stages of the system. The first stage of the experiment was carried out for license plate localization, followed by horizontal segmentation and vertical segmentation. After that, the experiment for character recognition was carried out. The experiments for the recognition of the individual characters of the license plate were carried out in two different modes.
Training the system: in this mode, the errors in each iteration were analyzed. The main issue of this experiment was to test whether or not the system reaches a stable state.
Recognizing the characters: this mode of experiment was conducted to test whether or not the trained model is capable of predicting the correct values.
5.2. Observation and Discussions
Table 5-1 shows the accuracy of the system at different stages. Plate localization, horizontal segmentation of the localized license plate, vertical segmentation and finally character recognition are considered as the different stages of the system. Figure 5-1 shows the accuracy of the system at each of these stages.
Table 5-1: Accuracy rate corresponding to different stages

Stage/Part                 Accuracy
Plate Localization         67%
Horizontal Segmentation    98%
Vertical Segmentation      90%
Character Recognition      92%
[Bar chart: accuracy rate (0–100%) for each stage of the system]
Figure 5-1: Accuracy Rate of different stages of the system
Table 5-2 shows the recognition rate of each character of the vehicle license plates presented to the system. Figure 5-2 shows the graph of the recognition accuracy rate of the system for each individual character.
Table 5-2: Recognition Result of individual character

Class    Accuracy Rate    Class    Accuracy Rate
0        97%              8        88%
1        85%              9        82%
2        82%              k        87.5%
3        97%              r        86.6%
4        94%              af       90.7%
5        84%              h        92.3%
6        86%              u        85.4%
7        85%              s        94.1%
[Bar chart: recognition accuracy (70–100%) for each character class: 0–9, k, r, af, h, u, s]
Figure 5-2: Recognition accuracy of individual character
5.3. Output
Figure 5-3: Input Image for VLPR System
Figure 5-4: VLP Localization
Figure 5-3 above shows the image taken from the camera for the VLPR system. The result shown in Figure 5-4 is the VLP localization: the area surrounded by the blue rectangle indicates the position of the VLP in the image. The segmentation results are shown below.
Figure 5-5: VLP Vertical Segmentation
Figure 5-6: VLP Horizontal Segmentation
6. CONCLUSION AND FUTURE ENHANCEMENT
6.1. Conclusion
The project is an attempt to address the detection and recognition of Nepali vehicle license plates in the simplest possible manner, using image processing techniques and a neural network to determine the best possible output.
The system developed has the capability to locate the license plate in the supplied image and determine the characters present in it. Various image processing techniques such as filtering, thinning, binarization and cropping help in extracting the plate and the individual characters' images, whose features are used as input to the neural network, which recognizes the characters.
6.2. Future Enhancement
VLPD-R is a vast area of research. A project like VLPD-R needs deep research and knowledge to address the different problems associated with it under different conditions. The system is optimized to work with a straight image of the license plate in proper lighting conditions; skewness and low contrast in the image reduce the accuracy rate of plate localization. Moreover, a similar color of the vehicle and the license plate creates problems in the detection of the license plate. The present system also requires a series of manual inputs, which is not acceptable in a real-life deployment.
Hence, further work to improve the present system is needed. The system should be made to work with any type of image in any lighting condition. Moreover, the system should be made fully automatic so that it can easily be deployed in real life for various purposes. The system currently works only with still images; it should be extended to operate on video too.
7. EPILOGUE
7.1. Glossary
Activation function: A mathematical function used in a neural network to map input values to a closed range of values, such as −1 to 1.
Back propagation training: A sound and systematic means of training a multilayer network.
Digital image processing: Manipulation, improvement and analysis of pictorial information that is digitally represented.
Epoch: One complete cycle through the training set.
Feature extraction: The process of extracting the essential characteristics of an input.
Momentum: A method used to accelerate the training process of a back propagation neural network.
Neurons: Interconnected nerve cells that make up most of the brain tissue in a living organism.
Neural network: A computer model that simulates the working of biological neurons.
Output vector: A vector that holds the output values generated by a trained network from an input vector during the process of knowledge retrieval.
Pixel: The smallest unit of an image. All images are composed of a 2-D array of pixels.
Segmentation: Differentiation of the object of interest from the background.
Thinning: The process of extracting the shape/skeleton of an image.