PSZ 19:16 (Pind. 1/97)

UNIVERSITI TEKNOLOGI MALAYSIA

THESIS STATUS DECLARATION FORM ♦

TITLE: THE APPLICATION OF HOUGH TRANSFORM FOR CORNER DETECTION

ACADEMIC SESSION: SEMESTER II 2005/2006

I, MOHAD FUAD BIN JAMALUDIN (CAPITAL LETTERS), agree that this thesis (PSM/Master's/Doctor of Philosophy)* be kept at the Library of Universiti Teknologi Malaysia under the following terms of use:

1. The thesis is the property of Universiti Teknologi Malaysia.
2. The Library of Universiti Teknologi Malaysia is permitted to make copies for study purposes only.
3. The Library is permitted to make copies of this thesis as exchange material between institutions of higher learning.
4. **Please tick (√):

( )  SULIT (CLASSIFIED: contains information of security classification or of national interest to Malaysia as set out in the OFFICIAL SECRETS ACT 1972)

( )  TERHAD (RESTRICTED: contains RESTRICTED information as determined by the organisation/body where the research was carried out)

(√)  TIDAK TERHAD (UNRESTRICTED)

Certified by:

(SIGNATURE OF AUTHOR)                         (SIGNATURE OF SUPERVISOR)

Permanent address:                            ASSOC. PROF. DR. SHAHARUDDIN SALLEH
B-64 FELDA SUNGAI SAYONG,                     Name of Supervisor
81000 KULAI, JOHOR

Date: 8 May 2006                              Date: 8 May 2006

NOTES:
*   Delete whichever is not applicable.
**  If the thesis is CLASSIFIED or RESTRICTED, please attach a letter from the relevant authority/organisation stating the reasons and the period for which the thesis must be classified as CLASSIFIED or RESTRICTED.
♦   "Thesis" means a thesis for the degree of Doctor of Philosophy or a Master's degree by research, or a dissertation for study by coursework and research, or an Undergraduate Project Report (PSM).
"I hereby declare that I have read this thesis and in my opinion this thesis is sufficient in terms of scope and quality for the award of the degree of Master of Science (Mathematics)"

Signature          : ..........................................................
Name of Supervisor : Assoc. Prof. Dr. Shaharuddin Salleh
Date               : 8 May 2006
THE APPLICATION OF HOUGH TRANSFORM FOR CORNER DETECTION
MOHAD FUAD BIN JAMALUDIN
A dissertation submitted in fulfilment of the
requirements for the award of the degree of
Master of Science (Mathematics)
Faculty of Science
Universiti Teknologi Malaysia
APRIL, 2006
I declare that this thesis entitled THE APPLICATION OF HOUGH TRANSFORM FOR CORNER DETECTION is the result of my own research except as cited in the references. This thesis has not been accepted for any degree and is not concurrently submitted in candidature for any other degree.

Signature      : .............................................
Name of Author : Mohad Fuad Bin Jamaludin
Date           : 8 May 2006
To my loving parents, Jamludin Bidin and Siti Lilah Juraimi, and my lecturers Assoc. Prof. Dr. Shaharuddin Salleh and Dr. Habibullah Haron. Thank you for your support, sacrifices and blessings.
ACKNOWLEDGMENT

Firstly, I would like to thank Allah S.W.T. for giving me the health, strength and patience to complete this dissertation. My deepest gratitude and appreciation go to my supervisor, Assoc. Prof. Dr. Shaharuddin Salleh, and to Dr. Zaitul Marlizawati Zainuddin for their invaluable and insightful guidance and the continuous support offered throughout this study. Special thanks also to Dr. Habibullah Haron for his helpful suggestions and ideas.

My sincerest appreciation goes to all my friends and course mates of the MSc. Mathematics programme in the Faculty of Science, who supported and helped me during my study at UTM. I am also grateful to my family for their endless blessings, support and understanding.
ABSTRACT

Corners are very attractive features for many applications in human perception and computer vision. The problem of detecting corners is simplified into detecting simple lines in a local coordinate system. As simple lines can be expressed using only one parameter, the Hough Transform can detect them quickly. Other edge detectors have problems when marking edge points around corners. Based on both a gradient magnitude threshold and grey-level analysis, the edge points can be detected. Using the Sobel and Kirsch operators, an image can be extracted to provide the edge points for the Hough Transform. Meanwhile, the Zhang-Suen Thinning Method can be used to reduce the edge points to minima points to speed up the algorithm. To illustrate the problem, an interface program is developed using Microsoft Visual C++ 6.0.
ABSTRAK

A corner is one of the important object features in human and computer visual perception. The problem of detecting corners is simplified into detecting line features in a grid or coordinate system. Typically, line features can be represented more quickly in the parameters of the Hough Transform. Some edge detectors have problems in marking edge points around corners. Based on a gradient magnitude threshold and grey-level analysis, the edge points can be detected. The Kirsch and Sobel operators are used to extract the image and form the edges before the subsequent processing using the Hough Transform technique. Meanwhile, a thinning technique, the Zhang-Suen method, is used to reduce the thickness of the edge lines into thinner (minima) edges; this is done to speed up the corner detection process in the Hough Transform. To illustrate the corner detection process, an interface program was designed and developed using Microsoft Visual C++ 6.0.
TABLE OF CONTENTS

CHAPTER    TITLE    PAGE

DECLARATION    ii
DEDICATION    iii
ACKNOWLEDGMENT    iv
ABSTRACT    v
ABSTRAK    vi
CONTENTS    vii
LIST OF FIGURES    x
LIST OF TABLES    xii
LIST OF APPENDICES    xiii

I    RESEARCH FRAMEWORK    1
    1.1  Introduction    1
    1.2  Problem Statement    2
    1.3  Objectives of Study    3
    1.4  Scopes of Study    3
    1.5  Dissertation Organization    4

II    EDGE AND CORNER DETECTION PROBLEM    5
    2.1  Introduction    5
    2.2  Project Framework    6
    2.3  Edge Detection Problems    7
        2.3.1  Roberts Cross-Gradient Operator    12
        2.3.2  Prewitt edge detector and Sobel edge detector    13
        2.3.3  Kirsch edge detector    15
        2.3.4  Laplacian Filtering Method    16
        2.3.5  Gaussian Smoothing Filter    19
    2.4  Thresholding    20
    2.5  Thinning Problems    23
        2.5.1  Simple Thinning Method    24
        2.5.2  Zhang-Suen Skeletonizing    26
        2.5.3  Medial Axis Transform    28
    2.6  Corner Detection Problems    30
        2.6.1  Hough Transform    30
        2.6.2  Wavelet Transform    31
        2.6.3  Chain Code Representation    32
    2.7  Summary    34

III    THE HOUGH TRANSFORM    35
    3.1  Background    35
    3.2  Method of Line Detection    36
    3.3  Method of Corner Detection    40
    3.4  Algorithm for Line and Corner Detection    43
    3.5  Summary    48

IV    CASE STUDY IN CORNER DETECTION    49
    4.1  Introduction    49
    4.2  Edge Image    50
    4.3  Local Minima Image    58
    4.4  Locating Corners    60

V    SIMULATION MODEL    63
    5.1  Introduction    63
    5.2  Model Description    64
    5.3  Simulation Model    65
    5.4  Summary    70

VI    SUMMARY AND CONCLUSIONS    71
    6.1  Observation    71
    6.2  Discussions    73
    6.3  Suggestions    75

REFERENCES    76
APPENDIX A    79
APPENDIX B    81
APPENDIX C    84
APPENDIX D    94
LIST OF FIGURES

FIGURE NO.    TITLE    PAGE

2.1    The main stages of detecting corners    6
2.2    Detection of edges associated with grey-level profiles and the corresponding first and second derivatives (Seul et al., 2000)    8
2.3    Pixel update at (x, y) for a mask of size 3x3 in a window template    11
2.4    The Gaussian matrix of 7x7 size    19
2.5    Classification of common thinning algorithms    24
2.6(a)    Pixels that must not be deleted because removal may cause discontinuities    25
2.6(b)    The pixel whose removal will shorten an object limb    25
2.7    The eight-neighbourhood of pixel P    25
2.8    Example of computing the distance transform of a binary image: (a) a full-region binary image, (b) the distance transform obtained    28
2.9    Medial Axis matrix of the binary image shown in Figure 2.8(a) using the 8-connected boundary    29
2.10    Direction of boundary segments of a chain code for (a) a 4-connected chain, (b) an 8-connected chain    33
3.1    The line passing two points    36
3.2    The intersection of two lines in the mc-plane corresponds to a line in the xy-plane    37
3.3    The illustration of r and θ in the line    38
3.4    An accumulator array    39
3.5    The algorithm of the Hough Line Detector    44
3.6    The algorithm of the Hough Corner Detector    44
3.7    The points in the xy-plane lying on a line    45
3.8    Sinusoidal functions are associated with points on a line    46
3.9    The two lines crossing at point (3,5) in Cartesian space    46
3.10    The sinusoidal function of point (3,5) lies on two points that correspond to the crossing lines    47
4.1    Kirsch and Sobel algorithm for computing the edges of an image    50
4.2    The original pixel values of an image of size 10x10    51
4.3    A pixel represented as a 24-bit string (Salleh et al., 2005)    52
4.4    The first 3x3 matrix of the image in Figure 4.2    55
4.5    The new values of the image in an 8x8 matrix after filtering with Figure 4.1    57
4.6    The binary edge image of Figure 4.5    57
4.7    The Zhang-Suen Thinning Algorithm    58
4.8    A 3x3 matrix obtained from Figure 4.6    59
4.9    The minima image after the thinning process with the Zhang-Suen Algorithm and Hough Line Detector    59
4.10    The corner candidates using the Hough Corner Detector    62
5.1    The sample digital image    65
5.2    The result from the edge detector (the Edge Image)    66
5.3    The result of the thinning process (the Minima Image)    67
5.4    The corner candidates produced from the thinned image of Figure 5.3    68
5.5    The accurate corners obtained from Figure 5.4    69
LIST OF TABLES

TABLE NO.    TITLE    PAGE

4.1    The accumulator of the Hough Transform    60
4.2    The array of rho[i] and theta[i]    61
LIST OF APPENDICES

APPENDIX    TITLE    PAGE

A    The sinusoidal graph in Hough space    79
B    The Analysis of the Rate of the Maximum Thresholding    81
C    The Result of Corner Image using different Thresholding    84
D    The Related Source Code    94
CHAPTER I
RESEARCH FRAMEWORK
1.1  Introduction
Feature detection such as edge and corner detection is an essential step in human
perception of shapes. Such corner points contain important information about the shape
of objects. The extraction of the corner points is useful for computer applications such
as image recognition, segmentation, description, matching and data compression.
In human perception of shapes, a corner arises not only from the mathematical concept of a corner but also from the neighbouring patterns of local features. Humans recognize corners that are consistent with each other in the description of shapes, and they develop this imperfect knowledge empirically.
In industrial applications this can be essential for safety or for quality control, both of which require an efficient algorithm for locating specific objects in images with the aims of counting, checking and measuring. Sharp features and well-defined shapes such as corners help considerably with object location. If the corners can be detected computationally, jobs such as counting, checking and measuring can be done efficiently.
The corner detection problem is commonly divided into two categories, namely boundary-based and region-based approaches (Chen et al., 1995). Boundary-based approaches use chain codes to detect corners on the boundary of an object, while region-based approaches identify corners directly on the grey-level image using gradient and curvature measurements.
Conventionally, a corner is defined as the junction point of two or more straight
lines. Based on this definition, two characteristics of corners can be generalized (Shen
and Wang, 2001). Firstly, a corner point is also an edge point. Secondly, at least two
lines pass through the corner point.
1.2  Problem Statement
Given an image in digital form, the problem is to detect the corners in that image. A corner is identified at a point in the image where the first derivative does not exist. We apply the Hough Transform method to detect the corners in an image. The Hough Transform is applied to the boundary of the object in the binary image, which can be a line or a curve. To produce the binary image, also known as an edge image, we apply the Sobel and Kirsch Edge Detector and the Zhang-Suen Thinning Method to the digital image.
1.3  Objectives of Study
The following are the objectives of this dissertation:

- To design and develop a simulation model for detecting corners in a digital image.
- To apply the procedure to a real problem.
- To evaluate the performance of the Hough Transform by using different threshold values.
1.4  Scopes of Study
The simulation program for corner detection has been developed using Microsoft Visual C++. It involves an image of size 640 x 480 pixels. The simulation program uses the bitmap format to extract and analyse every pixel in a two-dimensional array. It uses the RGB colour values to extract the image and detect edge lines, and the edge lines are then used to determine the corners. In this study, the Kirsch and Sobel Edge Detector is used to obtain the edge values in a binary image, and the Zhang-Suen Thinning Method is used to reduce the thickness of the edges. All these methods are used in the pre-processing stage. In the post-processing stage, the Hough Transform is used to detect the corners from the edge lines obtained.
1.5  Dissertation Organization
This dissertation consists of six chapters. Chapter I is the research framework; it includes an introduction to the study, the problem statement, and the objectives and scope of the study. This is followed by Chapter II, which briefly describes edge and corner detection problems. Chapter III discusses in detail the Hough Transform for detecting lines and corners. In Chapter IV, a case study of corner detection using the Hough Transform is discussed. The experimental results and analyses are described in Chapter V. Lastly, Chapter VI presents the conclusions and future work.
CHAPTER II
EDGE AND CORNER DETECTION PROBLEM
2.1  Introduction
This study involves the pre-processing and post-processing stages of corner detection. The pre-processing includes the edge-image process and the thinning process. Simply put, the digital image must first be processed with an edge detector to obtain a binary image; edge detectors such as the Roberts, Sobel and Kirsch operators can be used to detect the edges in an image. In the edge detection process, the threshold value must be considered when extracting the binary image from the digital image. The next process is to reduce noise in the binary image and to detect the minima points (the thinned image). In the post-processing stage, the corner detection approach is applied to obtain the corners.
2.2  Project Framework
Image processing is an area of study concerned with modifying, manipulating and extracting images to improve the quality of an image. Its applications involve the processes of formation, representation, operation and display. This study is part of image processing, involving the extraction and re-formation processes.

Chen et al. (1995) proposed the processing stages for detecting corners directly from the grey-level image by using the Wavelet Transform. In this study, the same processing stages will be used, but the Hough Transform will be applied to detect the corners instead of the Wavelet Transform. The stages are shown in Figure 2.1 below.
Digital Image (Input) → [Acquiring the edge image] → Binary Image → [Acquiring the minima points] → Thinning Image → [Locating corners] → Corner Candidates → [Eliminating the edge points] → Real Corner (Output)

Figure 2.1    The main stages of detecting corners
First, a two-dimensional digital image is used to obtain the binary image. Every pixel of the image has a value that represents its colour, called the intensity. The intensity is stored in a two-dimensional array for extraction. After this stage, the image is filtered using an edge detection method; the Kirsch and Sobel edge detector is used to acquire the edge image. The edge image is represented by binary values that are determined using threshold values.

At this stage, the edges are likely to be thick. The next process is to thin the edges so that the corners can be detected easily. A thinning method is used to acquire the minima points. Then the Hough Line Detector is used to reduce the noise and represent the edges as lines. The lines are obtained using an accumulator that involves some parameters, and these parameters are used in the next stage to detect corners.

The next stage is to check every edge point to obtain the corner candidate points. The validation covers all the corner candidate points using the accumulator; the validation process uses the Hough Corner Detector, which relies on some characteristics of lines.
2.3  Edge Detection Problems
Edges carry useful information about object boundaries that can be used for image filtering applications, image analysis and object identification. Basically, edges can be detected in an image by looking at the separation of regions of image intensity. The separation between high and low intensity pixels is verified using a threshold value. Figure 2.2 shows the separation of high and low intensity, more closely modelled as having a 'ramp-like' profile. The figure also shows the corresponding first and second derivatives, illustrating the fact that the sign information and location of an edge may be extracted using different operators, since some operators use the first derivative and some use the second derivative.
[Figure: an image and the 'ramp-like' profile of a horizontal line f(x), with the first derivative ∇f(x, y) and the second derivative ∇²f(x, y) of the profile]

Figure 2.2    Detection of edges associated with grey-level profiles and the corresponding first and second derivatives (Seul et al., 2000)
The most direct approach to detecting edges relies on the evaluation of the local spatial variation of f(x, y). The first derivative of f(x, y) has a local maximum at an edge; therefore, the local maximum can be used to produce a gradient image for detecting edges. Typically, the gradient at location (x, y), where x and y are the row and column coordinates respectively, consists of two directional derivatives that can be represented as a gradient magnitude and a gradient orientation (Chanda and Majumder, 2000). The gradient magnitude relates to edge strength, while the gradient orientation corresponds to the angle (direction) of the gradient.
The image gradient is defined as the vector that refers to the 'ramp-like' profile:

$$\nabla f(x,y) = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \dfrac{\partial f}{\partial x} \\[4pt] \dfrac{\partial f}{\partial y} \end{bmatrix} \qquad (2.1)$$

Most important from vector analysis is the magnitude, commonly approximated as follows:

$$\left|\nabla f(x,y)\right| = \sqrt{G_x^2 + G_y^2} \quad \text{or} \quad \left|\nabla f(x,y)\right| = |G_x| + |G_y| \qquad (2.2)$$
This quantity gives the maximum rate of increase of f(x, y) per unit distance in the direction of ∇f(x, y). The gradient orientation can be described by the direction angle

$$\alpha(x,y) = \tan^{-1}\left(\frac{G_y}{G_x}\right) \qquad (2.3)$$
where the angle is measured with respect to the x-axis. The direction of an edge at point (x, y) is perpendicular to the gradient orientation. Another approach is to calculate the gradient using the forward-difference method:

$$\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x+h,\,y) - f(x,\,y)}{h}, \qquad \frac{\partial f}{\partial y} = \lim_{k \to 0} \frac{f(x,\,y+k) - f(x,\,y)}{k} \qquad (2.4)$$

Letting h and k tend to 1 yields

$$\frac{\partial f}{\partial x} = f(x+1,\,y) - f(x,\,y), \qquad \frac{\partial f}{\partial y} = f(x,\,y+1) - f(x,\,y) \qquad (2.5)$$
Usually in programming, [i][j]-notation is used instead of (x, y)-notation, where the column j corresponds to the x-axis and the row i corresponds to the y-axis. Generally, the edge detection process is a transformation of a pixel f(x, y) into h(x, y), or f[i][j] → h[i][j] in the program.

The transformation process involves a derivative of f(x, y) and thresholding. The derivative process uses a convolution with a kernel G, k(x, y) = G ∗ f(x, y), where G corresponds to the operator of the edge detection method. The kernel G is known as the weights or coefficients of the elements in a mask of size 3x3. The mask is shown in Figure 2.3 below.
$$\begin{bmatrix} f(x-1,\,y-1) & f(x,\,y-1) & f(x+1,\,y-1) \\ f(x-1,\,y) & f(x,\,y) & f(x+1,\,y) \\ f(x-1,\,y+1) & f(x,\,y+1) & f(x+1,\,y+1) \end{bmatrix}$$

Figure 2.3    Pixel update at (x, y) for a mask of size 3x3 in a window template
By using the masks of size 3x3, the transformation equation is approximated as follows:

$$k(x,y) = \sum_{m=1}^{3}\sum_{n=1}^{3} G(m,n)\, f(x+m-2,\, y+n-2) \qquad (2.6)$$
Roberts, Prewitt, Sobel and Kirsch are some examples of edge detection operators that apply the convolution method to the first derivative. Equation 2.6 is used to obtain the value k(x, y) for every pixel in an image. The value k(x, y) is then compared to a threshold value: if k(x, y) is greater than the threshold, then 1 (or 0) is assigned to the pixel; otherwise, the pixel is assigned 0 (or 1). The result of the transformation is in binary form, held in the output image h(x, y).
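As a concrete illustration of Equation 2.6 and the thresholding step, a minimal C++ sketch might look as follows. This is an illustration written in portable standard C++, not the author's Visual C++ source code (which is listed in Appendix D); the image type and the skipping of border pixels are simplifying assumptions.

```cpp
#include <vector>
#include <cstdlib>

// Apply a 3x3 convolution kernel G to image f (Equation 2.6), then
// threshold the response |k(x,y)| against T to produce a binary image h.
// A minimal sketch: border pixels are skipped and left as 0.
std::vector<std::vector<int>> convolveAndThreshold(
    const std::vector<std::vector<int>>& f,  // grey-level image f[i][j]
    const int G[3][3],                       // 3x3 kernel (operator weights)
    int T)                                   // threshold value
{
    int rows = f.size(), cols = f[0].size();
    std::vector<std::vector<int>> h(rows, std::vector<int>(cols, 0));
    for (int i = 1; i < rows - 1; ++i) {
        for (int j = 1; j < cols - 1; ++j) {
            int k = 0;
            for (int m = 0; m < 3; ++m)      // offsets m-1, n-1 in -1..1,
                for (int n = 0; n < 3; ++n)  // matching m-2, n-2 of Eq. 2.6
                    k += G[m][n] * f[i + m - 1][j + n - 1];
            h[i][j] = (std::abs(k) >= T) ? 1 : 0;  // binary edge decision
        }
    }
    return h;
}
```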
2.3.1  Roberts Cross-Gradient Operator
Edge detection operators examine each pixel neighbourhood by quantifying the slope and the direction as well as the grey level. Using the mask in Figure 2.3 to detect edges, Roberts defined the first-order derivatives along the column and the row of the mask as

$$K_x = f(x+1,\,y+1) - f(x,\,y) \quad \text{and} \quad K_y = f(x,\,y+1) - f(x+1,\,y) \qquad (2.7)$$

where $K_x$ and $K_y$ correspond to the derivatives of the column and the row of the mask. The above partial derivatives can be represented by masks of size 3x3. The masks of the Roberts operator are defined as

$$G_x = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad G_y = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}$$
This operator is one of the oldest edge detectors in digital image processing (Gonzalez et al., 2004). It is also the simplest and has the shortest support, so the positions of the edges are more accurate. The problem with such short support, however, is that it is noisier, and it produces a weak response to a real edge unless the image is very sharp.
2.3.2  Prewitt edge detector and Sobel edge detector
The Prewitt and Sobel edge detection algorithms also use masks to represent the horizontal, K_y, and vertical, K_x, edge components. The methods use a one-dimensional approximation to the gradient to reduce noise. The idea behind the popular Prewitt and Sobel edge detectors is the combination of differentiation with the central differences of the adjacent horizontal and vertical edge components. The difference lies in the (x, y)-notation, where x is the column of the array and y is the row of the array.
$$\frac{\partial f}{\partial x} = \big(f(x+1,y+1) - f(x-1,y+1)\big) + \big(f(x+1,y) - f(x-1,y)\big) + \big(f(x+1,y-1) - f(x-1,y-1)\big) \qquad (2.8)$$

$$\frac{\partial f}{\partial y} = \big(f(x+1,y+1) - f(x+1,y-1)\big) + \big(f(x,y+1) - f(x,y-1)\big) + \big(f(x-1,y+1) - f(x-1,y-1)\big) \qquad (2.9)$$
The Prewitt edge detector deals better with the effect of noise. Rearranging Equations 2.8 and 2.9, the partial derivatives of the Prewitt operator are given as

$$K_x = \big(f(x+1,y+1) + f(x+1,y) + f(x+1,y-1)\big) - \big(f(x-1,y+1) + f(x-1,y) + f(x-1,y-1)\big) \qquad (2.10)$$

$$K_y = \big(f(x+1,y+1) + f(x,y+1) + f(x-1,y+1)\big) - \big(f(x+1,y-1) + f(x,y-1) + f(x-1,y-1)\big) \qquad (2.11)$$
Equations 2.10 and 2.11 correspond to the vertical and horizontal masks for detecting edges. Therefore, the masks of the Prewitt operator can be written as
$$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix} \quad \text{and} \quad G_y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$
These masks have longer support. If the masks are used separately, they will only detect the edges in one direction; however, if they are combined, the edges in both directions can be detected, which makes the edge detector less noisy.
The Sobel edge detector provides good performance and is relatively insensitive to noise (Pitas, 1993). The partial derivatives of the Sobel edge detector are similar to the Prewitt partial derivatives; the only difference is that the Sobel edge detector has a weight of two in the centre coefficient. A weight of two is applied to achieve some smoothing by giving more importance to the centre point (Gonzalez et al., 2004). Arranging Equations 2.10 and 2.11 and adding the coefficient of two at the centre, the partial derivatives of the Sobel operator yield
$$K_x = \big(f(x+1,y+1) + 2f(x+1,y) + f(x+1,y-1)\big) - \big(f(x-1,y+1) + 2f(x-1,y) + f(x-1,y-1)\big) \qquad (2.12)$$

$$K_y = \big(f(x+1,y+1) + 2f(x,y+1) + f(x-1,y+1)\big) - \big(f(x+1,y-1) + 2f(x,y-1) + f(x-1,y-1)\big) \qquad (2.13)$$

Therefore the masks of the Sobel operator are

$$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \quad \text{and} \quad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$
Most applications that compute digital gradients use the Prewitt and Sobel operators. The Prewitt operator is simpler to implement than the Sobel operator, but the Sobel operator has better noise-suppression characteristics when dealing with derivatives. Both of these operators handle gradual transitions and noisier images better (Gonzalez et al., 2004).
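To make the Sobel operator concrete, a small C++ sketch using the masks above and the magnitude approximation |Gx| + |Gy| of Equation 2.2 might be written as follows; this is an illustrative sketch under the same assumptions as the earlier one, not the program of Appendix D.

```cpp
#include <vector>
#include <cstdlib>

// Sobel gradient magnitude (Equations 2.12, 2.13 and 2.2).
// A sketch only: border pixels are skipped and reported as 0.
std::vector<std::vector<int>> sobelMagnitude(
    const std::vector<std::vector<int>>& f)   // grey-level image f[i][j]
{
    static const int Gx[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
    static const int Gy[3][3] = { {-1,-2,-1}, { 0, 0, 0}, { 1, 2, 1} };
    int rows = f.size(), cols = f[0].size();
    std::vector<std::vector<int>> mag(rows, std::vector<int>(cols, 0));
    for (int i = 1; i < rows - 1; ++i)
        for (int j = 1; j < cols - 1; ++j) {
            int kx = 0, ky = 0;
            for (int m = 0; m < 3; ++m)
                for (int n = 0; n < 3; ++n) {
                    kx += Gx[m][n] * f[i + m - 1][j + n - 1];
                    ky += Gy[m][n] * f[i + m - 1][j + n - 1];
                }
            mag[i][j] = std::abs(kx) + std::abs(ky);  // |Gx| + |Gy| (Eq. 2.2)
        }
    return mag;
}
```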
2.3.3  Kirsch edge detector
The Kirsch edge detector uses eight masks at each point of an image. The masks are used to determine an edge gradient magnitude and direction. Each mask responds to an edge oriented in a particular general direction, such as horizontal (0 degrees), vertical (90 degrees) or diagonal (45 or 135 degrees).

The masks of the Kirsch operator give positive weights to three consecutive neighbours and negative weights to the remaining five consecutive neighbours. The Kirsch edge detector is a first-derivative approach that sets the central point of each mask to zero. The masks of the Kirsch operator are defined as follows:
$$G^{(0)} = \begin{bmatrix} -3 & 5 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & -3 \end{bmatrix},\quad G^{(1)} = \begin{bmatrix} -3 & -3 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & 5 \end{bmatrix},\quad G^{(2)} = \begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & 5 \\ -3 & 5 & 5 \end{bmatrix},\quad G^{(3)} = \begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & -3 \\ 5 & 5 & 5 \end{bmatrix},$$

$$G^{(4)} = \begin{bmatrix} -3 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & 5 & -3 \end{bmatrix},\quad G^{(5)} = \begin{bmatrix} 5 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & -3 & -3 \end{bmatrix},\quad G^{(6)} = \begin{bmatrix} 5 & 5 & -3 \\ 5 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix},\quad G^{(7)} = \begin{bmatrix} 5 & 5 & 5 \\ -3 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix}$$
Consequently, the Kirsch operator consists of eight directional masks. Effectively, the masks measure the edge magnitude in each direction. The maximum magnitude of the derivatives is computed on a pixel-by-pixel basis to detect the edge points, described as

$$K = \max_{i=0,1,\ldots,7} \left\{ K^{(i)} \right\} \qquad (2.14)$$
After edge extraction, additional processing may be required. To reduce the amount of data even further, a thinning process can be applied to reduce the detected lines to a single-pixel width. Note that the Prewitt and Sobel operators can be extended to eight masks, like the Kirsch operator, to obtain edge orientation images (Castleman, 1996).
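A sketch of the Kirsch response in C++ might look like the following; it convolves all eight directional masks at each pixel and keeps the maximum magnitude (Equation 2.14). The mask set is the one reconstructed above, and the border handling is a simplifying assumption.

```cpp
#include <vector>
#include <algorithm>
#include <cstdlib>

// Kirsch edge response: maximum magnitude over eight directional masks
// (Equation 2.14). A sketch only; border pixels are skipped.
std::vector<std::vector<int>> kirschResponse(
    const std::vector<std::vector<int>>& f)
{
    static const int G[8][3][3] = {
        { {-3, 5, 5}, {-3, 0, 5}, {-3,-3,-3} },   // G(0)
        { {-3,-3, 5}, {-3, 0, 5}, {-3,-3, 5} },   // G(1)
        { {-3,-3,-3}, {-3, 0, 5}, {-3, 5, 5} },   // G(2)
        { {-3,-3,-3}, {-3, 0,-3}, { 5, 5, 5} },   // G(3)
        { {-3,-3,-3}, { 5, 0,-3}, { 5, 5,-3} },   // G(4)
        { { 5,-3,-3}, { 5, 0,-3}, { 5,-3,-3} },   // G(5)
        { { 5, 5,-3}, { 5, 0,-3}, {-3,-3,-3} },   // G(6)
        { { 5, 5, 5}, {-3, 0,-3}, {-3,-3,-3} }    // G(7)
    };
    int rows = f.size(), cols = f[0].size();
    std::vector<std::vector<int>> K(rows, std::vector<int>(cols, 0));
    for (int i = 1; i < rows - 1; ++i)
        for (int j = 1; j < cols - 1; ++j) {
            int best = 0;
            for (int d = 0; d < 8; ++d) {         // try each direction
                int k = 0;
                for (int m = 0; m < 3; ++m)
                    for (int n = 0; n < 3; ++n)
                        k += G[d][m][n] * f[i + m - 1][j + n - 1];
                best = std::max(best, std::abs(k));
            }
            K[i][j] = best;                       // K = max over K(i)
        }
    return K;
}
```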
2.3.4  Laplacian Filtering Method
Another approach to detecting edges is the Laplace equation. This method is defined in terms of a second-order differential equation, as follows:

$$\nabla^2 f(x,y) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \qquad (2.15)$$
The first-order derivatives have local maxima or minima at image edges due to large local intensity changes; therefore, the second-order derivatives have zero-crossings at edge locations. Thus, an approach to detecting edge positions is to estimate the Laplace equation and find the zero-crossing positions (Pitas, 1993).
An estimator of the Laplace equation uses the central-difference approximations of ∂²f/∂x² and ∂²f/∂y², given by

$$\frac{\partial^2 f}{\partial x^2} = \lim_{h \to 0} \frac{f(x+h,\,y) - 2f(x,\,y) + f(x-h,\,y)}{h^2} \qquad (2.16)$$

$$\frac{\partial^2 f}{\partial y^2} = \lim_{k \to 0} \frac{f(x,\,y+k) - 2f(x,\,y) + f(x,\,y-k)}{k^2} \qquad (2.17)$$

where h and k are constants. Substituting Equations 2.16 and 2.17 into Equation 2.15 and setting h = k = 1 yields
$$\nabla^2 f(x,y) = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4f(x,y) \qquad (2.18)$$

Equation 2.18 can also be written as the mask

$$L = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
The Laplace equation is isotropic and symmetric for rotation increments of 45 and 90 degrees respectively (Salleh et al., 2005). Another approximation to the Laplace equation includes the diagonal neighbours, as follows:

$$\nabla^2 f(x,y) = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) + f(x+1,y+1) + f(x+1,y-1) + f(x-1,y+1) + f(x-1,y-1) - 8f(x,y) \qquad (2.19)$$

and the mask of size 3x3 is given by

$$L = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
Being a second-order differential operator, the Laplace method tends to enhance image noise, which makes it create several false edges. Normally, the method is used together with the Laplacian of Gaussian (LoG), which handles noise in a more effective way (Salleh et al., 2005). Laplacian of Gaussian edge detection mainly involves three steps:

i.   Filtering
ii.  Enhancement
iii. Detection
A Gaussian filter is used for smoothing, and the second derivative is used in the enhancement step. The detection criterion is the presence of a zero-crossing in the second derivative together with a corresponding large peak in the first derivative (Castleman, 1996).

In this approach, the noise can be reduced by convolving the image with a Gaussian filter. Isolated noise points and small structures are filtered out from the edges, and therefore the edges will be smoother. Most pixels with a locally maximal gradient, which corresponds to an edge, occur around a zero-crossing of the second derivative. In other words, an edge is said to be present if there exists a zero-crossing in the second derivative and the magnitude of the first derivative exceeds a threshold at the same location (Chanda and Majumder, 2000).
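As an illustration of this detection criterion, a minimal C++ sketch might flag a pixel as an edge where the Laplacian response changes sign against a horizontal or vertical neighbour and the gradient magnitude exceeds a threshold. This is a deliberate simplification of a full LoG detector; the inputs are assumed to be precomputed Laplacian and gradient-magnitude images.

```cpp
#include <vector>

// Zero-crossing test on the Laplacian response combined with a
// first-derivative threshold. A sketch: lap is the image convolved with
// the 3x3 Laplacian mask L, mag is a gradient-magnitude image (e.g. from
// the Sobel sketch above), and T is the gradient threshold.
std::vector<std::vector<int>> zeroCrossingEdges(
    const std::vector<std::vector<int>>& lap,
    const std::vector<std::vector<int>>& mag,
    int T)
{
    int rows = lap.size(), cols = lap[0].size();
    std::vector<std::vector<int>> h(rows, std::vector<int>(cols, 0));
    for (int i = 1; i < rows - 1; ++i)
        for (int j = 1; j < cols - 1; ++j) {
            bool zc = (lap[i][j] * lap[i][j + 1] < 0) ||   // sign change right
                      (lap[i][j] * lap[i + 1][j] < 0);     // sign change down
            h[i][j] = (zc && mag[i][j] >= T) ? 1 : 0;
        }
    return h;
}
```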
2.3.5  Gaussian Smoothing Filter
It is sometimes useful to apply a Gaussian Smoothing Filter to an image before performing edge detection. The filter can be used to soften edges and to filter out spurious points (noise) in an image. However, there will also be many unwanted edge fragments appearing due to noise and fine texture. The figure below shows an example of a Gaussian matrix of size 7x7.
$$G = \begin{bmatrix}
1 & 1 & 2 & 2 & 2 & 1 & 1 \\
1 & 2 & 2 & 4 & 2 & 2 & 1 \\
2 & 2 & 4 & 8 & 4 & 2 & 2 \\
2 & 4 & 8 & 16 & 8 & 4 & 2 \\
2 & 2 & 4 & 8 & 4 & 2 & 2 \\
1 & 2 & 2 & 4 & 2 & 2 & 1 \\
1 & 1 & 2 & 2 & 2 & 1 & 1
\end{bmatrix}$$

Figure 2.4    The Gaussian matrix of 7x7 size
The filter is applied in the following way:

$$k(i,j) = \frac{1}{s}\left(\sum_{i=1}^{n}\sum_{j=1}^{n} G(i,j)\, f(i,j)\right) \qquad (2.20)$$

where k is the filtered value of the target pixel, f is a pixel in the 2D grid, G is a coefficient in the 2D Gaussian matrix, n is the vertical/horizontal dimension of the matrix (e.g., n is set to 7), and s is the sum of all values in the Gaussian matrix.
The input of the Gaussian filter is a mask of the image as shown in Figure 2.4, and the output is the centre pixel value of the mask. In order to reduce the blurring of edges, an edge-preserving Gaussian smoothing can be used, in which the original value is replaced by the newly computed value only if the difference between the two is greater than a given threshold (Manohar, 2004). Gaussian smoothing also makes the resulting image darker; this is mainly because the range of pixel values in the result is small compared to the actual range that can be represented in grey levels. The effect can be alleviated by applying contrast clipping to the final image.
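A minimal C++ sketch of Equation 2.20 with the 7x7 matrix of Figure 2.4 might look as follows; leaving the three-pixel border unfiltered is a simplifying assumption.

```cpp
#include <vector>

// Gaussian smoothing with the 7x7 integer matrix of Figure 2.4 (Eq. 2.20).
// A sketch: the 3-pixel border is left unfiltered.
std::vector<std::vector<int>> gaussianSmooth(
    const std::vector<std::vector<int>>& f)
{
    static const int G[7][7] = {
        {1,1,2, 2,2,1,1}, {1,2,2, 4,2,2,1}, {2,2,4, 8,4,2,2},
        {2,4,8,16,8,4,2}, {2,2,4, 8,4,2,2}, {1,2,2, 4,2,2,1},
        {1,1,2, 2,2,1,1} };
    int s = 0;                                    // s: sum of all weights
    for (int m = 0; m < 7; ++m)
        for (int n = 0; n < 7; ++n) s += G[m][n];
    int rows = f.size(), cols = f[0].size();
    std::vector<std::vector<int>> k = f;          // copy; borders unchanged
    for (int i = 3; i < rows - 3; ++i)
        for (int j = 3; j < cols - 3; ++j) {
            int acc = 0;
            for (int m = 0; m < 7; ++m)
                for (int n = 0; n < 7; ++n)
                    acc += G[m][n] * f[i + m - 3][j + n - 3];
            k[i][j] = acc / s;                    // normalise by s (Eq. 2.20)
        }
    return k;
}
```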
2.4  Thresholding
Binary images are simpler to analyse than grey-level images, but digital images often cannot be transformed directly to binary without some pre-processing. Usually, thresholding involves selecting an intensity for an image f(x, y) composed of light objects on a dark background, in such a way that the pixels are grouped into two regions (Gonzalez, 2004).
An obvious way to extract the objects from the background is to select a threshold T that separates these regions. An object point is detected where the image satisfies f(x, y) ≥ T; otherwise, the point is called a background point. The mathematical definition of the standard binary threshold is

$$h(x,y) = \begin{cases} 1 & \text{if } f(x,y) \ge T \\ 0 & \text{if } f(x,y) < T \end{cases} \qquad (2.21)$$
Selection of the threshold value T is a critical issue. The threshold can be selected either manually or automatically. In the manual case, the intensity must be observed in order to find the ideal threshold value; but a digital image usually has varying intensities, so finding the ideal threshold by observation is time-consuming. In this case, an automatic, mathematical approach is more suitable for obtaining the ideal threshold value.
One way to choose a threshold value is to use the intensity histogram of the image. Gonzalez et al. (2004) describe an automatic threshold as the mid-point between the minimum, T1, and maximum, T2, intensity values in the image. The minimum and maximum intensities can be described as in Equation 2.22:

$$T_1 = \min_{\substack{x=0,1,\ldots,m \\ y=0,1,\ldots,n}} \{ f(x,y) \} \quad \text{and} \quad T_2 = \max_{\substack{x=0,1,\ldots,m \\ y=0,1,\ldots,n}} \{ f(x,y) \} \qquad (2.22)$$
where m and n are the image dimensions, and x and y correspond to the column and row of the image array. The threshold can then be calculated as

$$T = \frac{1}{2}(T_1 + T_2) \qquad (2.23)$$
Another way to obtain a threshold value is to use the average intensity. Salleh et al. (2005) used this approach to separate the image in their example works: the sum of the intensity values is divided by the size of the image, as described below.

$$T = \frac{1}{mn} \sum_{x=0}^{m} \sum_{y=0}^{n} f(x,y) \qquad (2.24)$$
Pitas (1993) also proposed a heuristic adaptive technique to determine the threshold value T(k, l). The method calculates the local arithmetic mean of the image intensity, described by

$$T_3(k,l) = \frac{1}{mn+1} \sum_{x=k-m}^{k+m} \sum_{y=l-n}^{l+n} f(x,y) \qquad (2.25)$$
and this value is used to calculate the threshold given by Equation 2.26:

$$T(k,l) = (1+\beta)\, T_3(k,l) \qquad (2.26)$$

where β, 0 < β ≤ 1, is a percentage indicating the level of the threshold above the local arithmetic mean.
As mentioned, one of the objectives of this study is to evaluate the Hough Transform using different threshold values. Based on the thresholding proposed by Pitas (1993), this study uses a rate of the maximum thresholding to determine the threshold value. The threshold calculation is

$$T = \alpha\, T_2 \qquad (2.27)$$

where α, 0 < α ≤ 1, is the rate of the maximum threshold T2 given in Equation 2.22.
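A small C++ sketch of Equations 2.22 and 2.27 might be the following; the same routine could serve all three rates α1, α2 and α3 discussed below, applied to the image, the accumulator and the corner-candidate array respectively.

```cpp
#include <vector>
#include <algorithm>

// Rate-of-maximum thresholding (Equations 2.22 and 2.27):
// T = alpha * T2, where T2 is the maximum value in the array.
int rateOfMaximumThreshold(const std::vector<std::vector<int>>& f,
                           double alpha)        // 0 < alpha <= 1
{
    int T2 = 0;
    for (const auto& row : f)
        T2 = std::max(T2, *std::max_element(row.begin(), row.end()));
    return static_cast<int>(alpha * T2);
}
```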
In this study, thresholding is used for different functions, specifically in the Kirsch and Sobel Edge Detector and in the Hough Line and Corner Detectors. The thresholding in the Kirsch and Sobel Edge Detector is used to separate the image into two regions, black and white pixels, i.e. values 1 and 0 in the binary image. Meanwhile, the thresholding in the Hough Line Detector is applied to the accumulator to detect the significant lines, and the thresholding in the Hough Corner Detector is used to determine the real corners among the corner candidates.

We use α1, α2 and α3 as parameters to evaluate the performance of the Hough Transform under different values of α, where α1 corresponds to the rate of thresholding in the Kirsch and Sobel Edge Detector, and α2 and α3 correspond to the Hough Line and Corner Detectors respectively.
2.5  Thinning Problems
Thinning can be defined heuristically as a set of successive erosions of the outermost layers of a shape, until a connected, unit-width set of lines (a skeleton) is obtained (Pitas, 1993). Thinning algorithms examine the neighbourhood of each contour pixel and identify which pixels are skeletal and which can be deleted. These algorithms reduce the thickness of the object in a binary image to produce a thinner representation called a skeleton.

Thinning algorithms have been used in many applications such as pattern recognition, medical image analysis, text and handwriting recognition, fingerprint classification, printed circuit board design, robot vision and automatic visual analysis of industrial parts.
Thinning algorithms can be classified into two categories, iterative and non-iterative (Engkamat, 2005). Figure 2.5 shows the classification of common thinning algorithms. Iterative thinning algorithms use a pixel-based scheme built on the eight-neighbourhood, while non-iterative thinning algorithms consider only the centre line of the pattern, without examining every individual pixel.
[Figure: classification tree. Thinning Algorithms: Iterative (pixel based) with Parallel and Sequential; Non-iterative (non-pixel based) with Line Following, Medial Axis Transform and Fourier Transform]

Figure 2.5    Classification of common thinning algorithms (Engkamat, 2005)

2.5.1  Simple Thinning Method
Simple Thinning is an iterative algorithm that converts border pixels from 1 to 0. Connectivity is an essential property that must be preserved in the thinned object; therefore, border pixels are deleted in such a way that object connectivity remains stable.
This Simple Thinning algorithm satisfies the following two constraints (Pitas, 1993):

1. Maintain connectivity at each iteration. Border pixels whose removal may cause discontinuities are shown in Figure 2.6(a).
2. Do not shorten the ends of thinned shape limbs. Border pixels whose removal would shorten the end of a thinned shape limb are shown in Figure 2.6(b).

[Figure: two 3x3 neighbourhood examples]
Figure 2.6(a)    Pixels that must not be deleted because removal may cause discontinuities
Figure 2.6(b)    The pixel whose removal will shorten an object limb
The algorithm operates on an eight-neighbourhood, labelled as follows:

P8  P1  P2
P7  P   P3
P6  P5  P4

Figure 2.7    The eight-neighbourhood of pixel P
The basic disadvantage of this algorithm is that it does not thin the image object symmetrically (Pitas, 1993), and it does not perform satisfactorily when the binary image objects are relatively large and convex. A more elaborate thinning algorithm that thins the edges symmetrically is the Zhang-Suen algorithm, described next.
2.5.2  Zhang-Suen Skeletonizing
The Zhang-Suen thinning algorithm is a parallel method: the new value of any pixel can be computed using only the values known from the previous iteration (Ritter and Wilson, 2001). It uses the eight-neighbourhood to verify boundary points when validating skeletal pixels.

The algorithm uses a 3x3 window in which the contour pixel P at the centre is connected to the other eight neighbouring pixels. Let P1, P2, ..., P8 be the eight neighbours of P (as shown in Figure 2.7), where each Pi ∈ {0,1} is taken from the source (binary) image. Selected contour pixels are removed iteratively until only skeletal pixels remain. A contour pixel is a point with pixel value 1 whose eight neighbours have value 0 or 1.

Let N(P) be the number of nonzero pixels among the eight neighbours of P, and let N(T) denote the number of 'zero to one' patterns in the ordered sequence P1, P2, ..., P8, P1. The 'zero to one' pattern is the transition from 0 to 1 in this ordered sequence, used to determine the number of connected components.
If the number of 0 → 1 transitions in this sequence is 1, only one connected component is present in the eight-neighbourhood of pixel P (Pitas, 1993). In this case, the contour pixel may be verified as the centre of the eight-neighbourhood. This verification involves conditions that check the boundary position and the corner direction of the eight-neighbourhood of pixel P.
The conditions consist of two sub-iterations. The first sub-iteration removes the southeast boundary points and northwest corner points; the second sub-iteration deletes the northwest boundary points and southeast corner points (Ritter and Wilson, 2001). In the first sub-iteration, the conditions that must be satisfied are:

i.   2 ≤ N(P) ≤ 6
ii.  N(T) = 1
iii. P1 · P3 · P5 = 0
iv.  P3 · P5 · P7 = 0

In the second sub-iteration, conditions (i) and (ii) are the same as in the first sub-iteration, but conditions (iii) and (iv) change to:

iii. P7 · P1 · P3 = 0
iv.  P1 · P7 · P5 = 0
When N(T) > 1, the contour pixel P cannot be removed: there is more than one 'zero to one' pattern around P, so its removal would split the region. The algorithm stops when there are no more pixels to remove, producing a thin image with minimal binary values.
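Putting the conditions together, a C++ sketch of the Zhang-Suen algorithm might be written as follows, using the neighbour labelling of Figure 2.7 (P1 to the north, then clockwise to P8); this is an illustrative sketch rather than the code of Appendix D, and the image border is assumed to be 0.

```cpp
#include <vector>
#include <utility>

// Zhang-Suen thinning: two sub-iterations per round, deleting pixels that
// satisfy conditions (i)-(iv), until nothing changes. g holds 0/1 values.
void zhangSuenThin(std::vector<std::vector<int>>& g)
{
    int rows = g.size(), cols = g[0].size();
    bool changed = true;
    while (changed) {
        changed = false;
        for (int pass = 0; pass < 2; ++pass) {           // two sub-iterations
            std::vector<std::pair<int,int>> kill;
            for (int i = 1; i < rows - 1; ++i)
                for (int j = 1; j < cols - 1; ++j) {
                    if (g[i][j] != 1) continue;
                    // neighbours P1..P8, clockwise from north (Figure 2.7)
                    int P[9] = { 0,
                        g[i-1][j], g[i-1][j+1], g[i][j+1], g[i+1][j+1],
                        g[i+1][j], g[i+1][j-1], g[i][j-1], g[i-1][j-1] };
                    int NP = 0, NT = 0;
                    for (int k = 1; k <= 8; ++k) {
                        NP += P[k];                       // N(P): nonzero count
                        int next = (k == 8) ? P[1] : P[k + 1];
                        if (P[k] == 0 && next == 1) ++NT; // N(T): 0->1 patterns
                    }
                    bool c3, c4;
                    if (pass == 0) { c3 = P[1]*P[3]*P[5] == 0;   // (iii)
                                     c4 = P[3]*P[5]*P[7] == 0; } // (iv)
                    else           { c3 = P[7]*P[1]*P[3] == 0;
                                     c4 = P[1]*P[7]*P[5] == 0; }
                    if (NP >= 2 && NP <= 6 && NT == 1 && c3 && c4)
                        kill.push_back({i, j});
                }
            for (auto& p : kill) g[p.first][p.second] = 0;
            if (!kill.empty()) changed = true;
        }
    }
}
```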
2.5.3  Medial Axis Transform
The Medial Axis is the locus of the corners and forms the skeleton of the object (Parker, 1997). The Medial Axis function treats all boundary pixels as point sources of a wave front. The wave (line) passes through each point only once, so where two waves meet, a corner is produced. Usually the Medial Axis is used to obtain the skeleton of an object with a full binary region.

One way to find the medial axis is to use the boundary of the object. First, compute the distance from each object pixel to the nearest boundary pixel. Next, the Laplacian of the distance image is calculated, and pixels having large values belong to the medial axis. Figure 2.8 shows an example of computing the distance transform.
[Figure: (a) a binary pixel grid of a full-region object; (b) the corresponding grid of distance values, increasing from 1 at the boundary toward the interior]

Figure 2.8    Example of computing the distance transform of a binary image: (a) a full-region binary image, (b) the distance transform obtained
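As an illustration of the first step, the distance image can be computed with the classic two-pass chamfer scan, sketched below in C++; the chessboard (8-connected) metric used here is an assumption for illustration, since the text notes that several metrics are possible.

```cpp
#include <vector>
#include <algorithm>

// Distance transform of a binary image (object pixels = 1): distance of
// each object pixel to the nearest background pixel, via two chamfer scans.
std::vector<std::vector<int>> distanceTransform(
    const std::vector<std::vector<int>>& g)
{
    int rows = g.size(), cols = g[0].size();
    const int INF = rows + cols;                 // larger than any distance
    std::vector<std::vector<int>> d(rows, std::vector<int>(cols, 0));
    // forward pass: propagate from the top-left (N, W, NW, NE neighbours)
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j) {
            if (g[i][j] == 0) { d[i][j] = 0; continue; }
            d[i][j] = INF;
            if (i > 0)              d[i][j] = std::min(d[i][j], d[i-1][j] + 1);
            if (j > 0)              d[i][j] = std::min(d[i][j], d[i][j-1] + 1);
            if (i > 0 && j > 0)     d[i][j] = std::min(d[i][j], d[i-1][j-1] + 1);
            if (i > 0 && j+1 < cols) d[i][j] = std::min(d[i][j], d[i-1][j+1] + 1);
        }
    // backward pass: propagate from the bottom-right (S, E, SE, SW)
    for (int i = rows - 1; i >= 0; --i)
        for (int j = cols - 1; j >= 0; --j) {
            if (g[i][j] == 0) continue;
            if (i+1 < rows)             d[i][j] = std::min(d[i][j], d[i+1][j] + 1);
            if (j+1 < cols)             d[i][j] = std::min(d[i][j], d[i][j+1] + 1);
            if (i+1 < rows && j+1 < cols) d[i][j] = std::min(d[i][j], d[i+1][j+1] + 1);
            if (i+1 < rows && j > 0)    d[i][j] = std::min(d[i][j], d[i+1][j-1] + 1);
        }
    return d;
}
```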
Pixels that belong to the medial axis of a connected component or object may not, in general, be connected (Chanda and Majumder, 2000). Thus it apparently loses some important shape information that is very useful for image analysis. Most researchers would agree that the Medial Axis Transform is an ideal skeleton, but it requires long computation (Parker, 1997). However, the Medial Axis Transform illustrates the basic concept of the thinning method.

The Medial Axis can be found for the skeleton of an object using the Euclidean distance, the 4-connected boundary or the 8-connected boundary, and there are clear differences in the Medial Axis depending on how the distance is calculated. Figure 2.9 shows a skeleton image obtained using the 8-connected boundary.
[Figure: a grid that is zero everywhere except along the skeleton, where the distance-transform values of the medial-axis pixels are retained]

Figure 2.9    Medial Axis matrix of the binary image shown in Figure 2.8(a) using the 8-connected boundary
2.6  Corner Detection Problems
Generally, corners can be described as the meeting of two or more lines or sides; in other words, a corner is produced when two or more lines intersect each other. A corner provides a shape feature that is used in object matching and registration, so corner detection is an essential step in many image analysis and object recognition procedures.

Corner detection can be divided into two approaches (Chen et al., 1995). The common method of the first approach is to initially segment the image into regions and represent the object boundary with a chain code. The second approach works directly on the grey level and mainly uses gradient and curvature measurements. Some corner detection methods are described in the following subsections.
2.6.1  Hough Transform
The Hough Transform was developed by Paul Hough in 1962 and patented by IBM (Ritter and Wilson, 2001). Over the last decade it has become a standard tool in the domain of artificial vision for finding straight lines hidden in larger amounts of other data, and it is an important technique in image processing. For detecting lines in images, the image is first binarised using some form of thresholding, and the positive instances are then catalogued in an examples dataset.

The idea of the Hough Transform is that each of these points votes for several combinations of parameters; the parameters that win a majority of votes are declared the best lines, circles or ellipses given by these points.
Davies (1988) applied the generalised Hough Transform to corner detection in line mode, as square-object detection, looking for peaks in parameter space. The approach has the advantage that it can be used when objects have curved sides or blunt corners, and it can be tuned for varying degrees of corner bluntness.

Shen and Wang (2002) simplified the corner detection problem into detecting simple lines using the Hough Transform. They presented a dynamic edge detection scheme based on both a gradient magnitude threshold and grey-level analysis. The Hough Transform is very time-consuming if the dimension of its parameter space is high. The Hough Transform also depends on the performance of the edge detector: a general edge detector cannot localise edge points well around corners, leading to errors in the reported corners.
2.6.2  Wavelet Transform
The fundamental idea behind wavelets is to analyse according to scale; indeed, some researchers in the wavelet field have adopted a whole new perspective on processing data. Wavelets are functions that satisfy certain mathematical requirements and are used to represent data or other functions.

This idea is not new. In wavelet analysis, the scale used to analyse the data plays a special role, and wavelet algorithms process data at different scales or resolutions. Lee et al. (1995) proposed a multiscale corner detection algorithm based on the Wavelet Transform of contour orientation. They utilise both the local extrema and the modulus of the transform results to detect corners and arcs effectively. The ramp width of the contour orientation profile, which can be computed using the transformed modulus at two scales, is used in the determination of the corner points.

Chen et al. (1995) also proposed a new edge detector based on the ratio of the inter-scale wavelet transform modulus. This edge detector can correctly detect edges at corner positions, making accurate corner detection possible. They apply a non-minima suppression scheme to the edge image and extract the minima image; the non-corner edge points are then eliminated by the orientation variance. In order to locate the corner points, they propose a corner indicator based on the scale-invariant property of the corner orientations.
2.6.3  Chain Code Representation
A binary image consists of binary values 0 and 1, and every pixel of value 1 has a connected path to represent an object. Freeman (1961) introduced a representation consisting of codes for directions along 4-connected and 8-connected paths. Each segment possesses a specific direction if the boundary is traversed in a clockwise manner. The chain directions can be coded in the way shown in Figure 2.10 for 4-connected and 8-connected chains.
[Figure: direction codes 0-3 for the 4-connected chain and 0-7 for the 8-connected chain]

Figure 2.10    Direction of boundary segments of a chain code for (a) a 4-connected chain, (b) an 8-connected chain
The boundary of an object is represented by a connected path of value 1 in a binary image. The path, whether of 4-connected or 8-connected chains, can be considered to consist of line segments connecting adjacent pixels. Each segment possesses a specific direction if the boundary is traversed in a clockwise manner, and the codes of the successive boundary segments form the chain code. The chain code depends on the starting pixel of the boundary. An advantage of the chain code is that it is translation invariant (Pitas, 1993). Scale invariance can be obtained by changing the size of the sampling grid; however, such resampling seldom produces exactly the same chain code.
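As an illustration, a minimal C++ sketch of Freeman chain coding for an already-traced boundary might map each step between consecutive boundary pixels to a code 0-7; the code-to-offset convention below is an illustrative assumption, since conventions differ between sources.

```cpp
#include <vector>
#include <utility>

// Freeman 8-connected chain code for an ordered boundary path, a sketch.
// Codes 0..7 start at east and proceed counter-clockwise (an assumed
// convention); each consecutive pixel pair yields one code.
std::vector<int> chainCode(const std::vector<std::pair<int,int>>& path)
{
    static const int di[8] = { 0,-1,-1,-1, 0, 1, 1, 1 };  // row offsets
    static const int dj[8] = { 1, 1, 0,-1,-1,-1, 0, 1 };  // column offsets
    std::vector<int> codes;
    for (size_t k = 0; k + 1 < path.size(); ++k) {
        int ddi = path[k+1].first  - path[k].first;
        int ddj = path[k+1].second - path[k].second;
        for (int c = 0; c < 8; ++c)
            if (di[c] == ddi && dj[c] == ddj) { codes.push_back(c); break; }
    }
    return codes;
}
```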
Haron et al. (2005) proposed a corner detection algorithm for thinned binary images using an 8-connected chain code. The algorithm derives, from the chain-coded image, the slope between each code; the series is then analysed and checked for the existence of a corner at the current pixel location. They also identified the length of the segment and the threshold value as significant factors in determining the accuracy of corner detection.
2.7  Summary
This chapter has explained all the stages in the project framework of this study. Three stages must be understood in order to follow the overall process: the edge image, the minima image and locating corners. In the edge image stage, the Kirsch and Sobel edge detector will be used in the case study, while in the minima stage, Zhang-Suen Skeletonizing will be used to build the thin image in binary form. This study emphasizes the application of the Hough Transform to the corner detection problem.
CHAPTER III
THE HOUGH TRANSFORM
3.1  Background
The Hough Transform is a feature extraction technique which can be used to isolate features of a particular shape within an image. It was developed by Paul Hough in 1962 and patented by IBM under the name "Method and Means for Recognizing Complex Patterns" (Wikipedia). The Hough Transform is used in a variety of different algorithms for pattern recognition.

The transform originally consisted of a slope-intercept parameterisation (mc-space) for straight lines, which awkwardly leads to an unbounded transform space, since the slope can go to infinity. The rho-theta parameterisation (rθ-space) is therefore in popular use today.
3.2  Method of Line Detection
The underlying principle of the Hough Transform is that there are an infinite number of potential lines that pass through any point, each at a different orientation. The idea of the transform is that each point votes for several combinations of parameters; the parameters that win a majority of votes are declared the best line given by these points.

Consider a point (xi, yi) in the image. The Hough Transform describes a line as y = mx + c, where x and y are the observed points and m and c are the parameters. There are infinitely many lines that pass through this point, but they all satisfy the condition

$$y_i = m x_i + c \qquad (3.1)$$

for varying values of m and c, as shown in Figure 3.1.

[Figure]
Figure 3.1    The line passing two points
The idea of the Hough Transform is to write the line equation in a new form. Equation 3.1 can also be written as

$$c = -m x_i + y_i \qquad (3.2)$$

where x and y are taken to be constants and m and c are the varying parameters. As the values of m and c change, the line of Figure 3.1 is transformed as shown in Figure 3.2.

[Figure]
Figure 3.2    The intersection of two lines in the mc-plane corresponds to a line in the xy-plane
Each point in the xy-plane corresponds to a line in the mc-space. If the xy-plane has 10 points on a line, then the mc-space will have 10 lines; the common intersection of the lines in the mc-space shows that the points lie on the same line. However, Equation 3.2 has a problem representing vertical lines: in that case m = ∞ and the parameter space is unbounded. To solve this, Hough presented an alternative way to describe lines as

$$r = x_i \cos\theta + y_i \sin\theta \qquad (3.3)$$

where r is the perpendicular distance from the origin to the line and θ is the angle between r and the x-axis. The parameter space is now in r and θ, where 0° ≤ θ ≤ 180° and r is limited by the size of the image:

$$-\tfrac{1}{2}\sqrt{x^2 + y^2} \;\le\; r \;\le\; \tfrac{1}{2}\sqrt{x^2 + y^2}$$
Equation 3.3 can be obtained from the Cartesian coordinates using the trigonometric identities and the Pythagorean theorem. Figure 3.3 shows the relationship between r, θ, x and y.

[Figure: a line with perpendicular distance r from the origin at angle θ, satisfying r = x cos θ + y sin θ]
Figure 3.3    The illustration of r and θ in the line
From Figure 3.3, it can be seen that

$$x = r\cos\theta \quad \text{and} \quad y = r\sin\theta \qquad (3.4)$$

From the Pythagorean theorem, we know that

$$r^2 = x^2 + y^2 \qquad (3.5)$$

Substituting 3.4 into 3.5 gives us

$$r^2 = (r\cos\theta)^2 + (r\sin\theta)^2 \qquad \text{from 3.4}$$
$$r = r\cos^2\theta + r\sin^2\theta$$
$$r = (r\cos\theta)\cos\theta + (r\sin\theta)\sin\theta$$
$$\therefore\; r = x\cos\theta + y\sin\theta$$
To implement the method in the discrete domain, the rθ-plane is approximated by a two-dimensional array called an accumulator, A, as shown in Figure 3.4 (Chanda and Majumder, 2000). The elements A(θ, r) corresponding to the points of intersection of curves in the rθ-plane are incremented more rapidly than any other cell. Hence, a local maximum in the accumulator A gives the parameters of a line in the rθ-plane on which a relatively large number of points lie.

[Figure: a two-dimensional θ-r grid of accumulator cells]
Figure 3.4    An accumulator array
3.3  Method of Corner Detection
Once the Hough Transform detects the lines, we can apply the straight-line equation to find the corner points; we must verify whether the lines are parallel or intersect. The following theorems show the relationship between the slope, the angle and the type of lines (Raji et al., 2002).
Theorem 3.1 (the relationship between the slope and the angle of a line): if a line with slope m makes an angle Q towards the positive x-axis, then m = tan Q.

Theorem 3.2 (the relationship between the slopes of two perpendicular lines): two lines with slopes m1 and m2 are perpendicular if m1·m2 = -1.

The angle Q between two intersecting lines with known slopes is given by

$$Q = \tan^{-1}\left(\frac{m_2 - m_1}{1 + m_2 m_1}\right), \qquad m_2 m_1 \ne -1 \qquad (3.6)$$

From this formula, we can conclude that:

a) when Q = 0°, the two lines are parallel (m1 = m2);
b) when Q = 90°, the two lines are perpendicular, from Theorem 3.2 (m1·m2 = -1);
c) the two lines are intersecting when 0° < Q < 180°.
From Equation 3.6, if m1 and m2 are the same, then the lines are parallel; we can therefore assume that intersecting lines are not parallel, i.e. m2 ≠ m1. We need to compute the slopes of the lines to verify their characteristics. Theorem 3.1 gives mi = tan θi, where θi can be determined from the peak of the accumulator in the Hough Line Detector.
When a corner is formed by non-parallel lines, the corner point can be computed using the linear equations. Suppose we have two different pairs of r and θ:

$$r_1 = x\cos\theta_1 + y\sin\theta_1 \qquad (3.7)$$
$$r_2 = x\cos\theta_2 + y\sin\theta_2 \qquad (3.8)$$

Equations 3.7 and 3.8 can be written in matrix form as

$$\begin{bmatrix} r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} \cos\theta_1 & \sin\theta_1 \\ \cos\theta_2 & \sin\theta_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (3.9)$$

This can be rewritten as

$$r_i = A_i t \qquad (3.10)$$

where $r_i = \begin{bmatrix} r_1 \\ r_2 \end{bmatrix}$, $t = \begin{bmatrix} x \\ y \end{bmatrix}$ and $A_i = \begin{bmatrix} \cos\theta_1 & \sin\theta_1 \\ \cos\theta_2 & \sin\theta_2 \end{bmatrix}$. Therefore, the intersection point (x, y) is given as

$$t = A_i^{-1} r_i \qquad (3.11)$$
We can compute the inverse of $A_i$ as

$$A_i^{-1} = \frac{1}{\sin\theta_2\cos\theta_1 - \cos\theta_2\sin\theta_1} \begin{bmatrix} \sin\theta_2 & -\sin\theta_1 \\ -\cos\theta_2 & \cos\theta_1 \end{bmatrix} = \frac{1}{\sin(\theta_2 - \theta_1)} \begin{bmatrix} \sin\theta_2 & -\sin\theta_1 \\ -\cos\theta_2 & \cos\theta_1 \end{bmatrix}$$

Substituting into 3.11, the corner can be detected as

$$\begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{\sin(\theta_2 - \theta_1)} \begin{bmatrix} \sin\theta_2 & -\sin\theta_1 \\ -\cos\theta_2 & \cos\theta_1 \end{bmatrix} \begin{bmatrix} r_1 \\ r_2 \end{bmatrix} = \frac{1}{\sin(\theta_2 - \theta_1)} \begin{bmatrix} r_1\sin\theta_2 - r_2\sin\theta_1 \\ r_2\cos\theta_1 - r_1\cos\theta_2 \end{bmatrix}$$

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} R_1\sin\theta_2 - R_2\sin\theta_1 \\ R_2\cos\theta_1 - R_1\cos\theta_2 \end{bmatrix} \qquad (3.12)$$

where

$$R_1 = \frac{r_1}{\sin(\theta_2 - \theta_1)} \quad \text{and} \quad R_2 = \frac{r_2}{\sin(\theta_2 - \theta_1)}$$
We can apply Equation 3.12 in the algorithm to detect the corner. Before that, we need to determine the parameters ri and θi, which are obtained from the accumulator of the Hough Line Detector described in Section 3.2. In real-time processing, many lines are found, which implies many corner candidates; therefore we must find a method to minimise the corners in order to detect them accurately.

The approach that we use to detect the real corners is thresholding on the corner candidates. We assume that real corners have many lines intersecting at one point, so we determine the highest number of lines intersecting at one point as the threshold value for separating the corner candidates.
3.4  Algorithm for Line and Corner Detection
After filtering the digital image, the thinned image is built. The thinned image contains the boundary of the object as connected pixels of value 1, and we use these connected pixels to detect a line. To apply this process, a two-dimensional accumulator array A must be built, with every element A(t, r) initialised to zero. Note that the parameter t in the accumulator A refers to the parameter θ in the rθ-plane. The accumulator A has r = √(x² + y²) rows and t = 360 columns, where x and y correspond to the image size.

In the program, the element A(t, r) is written as A[r][t] and the image pixel f(x, y) is written as f[i][j], where x corresponds to column j and y corresponds to row i. For each pixel of the image array f[i][j], Equation 3.3 is used to compute r for 0 ≤ t ≤ 360, and each (r, t) pair obtained by Equation 3.3 is accumulated in the element A[r][t].

In this way, the element A[r][t] is increased by 1 for each pixel lying on a line with the same parameters. Large values in the accumulator A can be assumed to be lines; but in this case, we use thresholding to detect the significant lines. The significant lines can be detected according to the algorithm shown in Figure 3.5.
The Hough Transform can also be used to detect corners. The parameters r and t determined from the accumulator A are stored in the arrays rho[i] and theta[i]. These parameters are then used to check whether the lines are parallel. When the lines are not parallel, the corner points can be computed using Equation 3.12. The algorithm to detect the corner points is described in Figure 3.6. This algorithm also uses thresholding to separate the significant corners from the corner candidates.
Step 1: Initialise all elements of A[r][t] to zero.
Step 2: For each pixel f[i][j] lying on an edge in the image, add 1 to every element A[r][t] whose indices r and t satisfy Equation 3.3: A[r][t] = A[r][t] + 1.
Step 3: Determine the threshold value T from the accumulator A, taking the largest element A[r][t] as the maximum value. The threshold value T is then calculated using Equation 2.27.
Step 4: Compare all elements A[r][t] with the threshold value. The pairs r and t with A[r][t] ≥ T correspond to lines; store them in the parameters rho[i] and theta[i].
Step 5: From the parameters rho[i] and theta[i], construct the lines as connected paths of pixels with value 1.

Figure 3.5  The algorithm of the Hough Line Detector
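As a concrete illustration of Steps 1, 2 and 4, the following is a simplified sketch of the voting and thresholding loops (condensed from Code 5 in Appendix F; the parameter names and the omission of the coordinate centring used there are our simplifications):

#include <cmath>

const double RAD = 0.0174532925;   // degrees to radians

// Steps 1-2: every edge pixel of h votes for all (r, t) passing through it.
void HoughVote(int **h, int ySize, int xSize, int **A, int rMax)
{
    for (int r = 0; r <= rMax; r++)            // Step 1: clear the accumulator
        for (int t = 0; t <= 360; t++)
            A[r][t] = 0;
    for (int i = 1; i < ySize; i++)
        for (int j = 1; j < xSize; j++)
            if (h[i][j] == 1)
                for (int t = 0; t <= 360; t++)
                {
                    double theta = t*RAD;
                    int r = (int)(0.5 + j*cos(theta) + i*sin(theta));
                    if (r >= 0 && r <= rMax)
                        A[r][t] += 1;          // A[r][t] = A[r][t] + 1
                }
}

// Steps 3-4: threshold the accumulator at T = alpha * max (Equation 2.27);
// cells with A[r][t] >= T correspond to significant lines.
int HoughThreshold(int **A, int rMax, double alpha)
{
    int maxA = 0;
    for (int r = 0; r <= rMax; r++)
        for (int t = 0; t <= 360; t++)
            if (A[r][t] > maxA) maxA = A[r][t];
    return (int)(alpha*maxA);
}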
Step 1: Initialise the array image f[i][j] to zero.
Step 2: Compute the slopes m1 and m2 for each pair of lines: m1 = tan(theta[i]) and m2 = tan(theta[i+1]).
Step 3: If m1 ≠ m2, compute the corner candidate using Equation 3.12 and store it into f[i][j].
Step 4: Set the threshold value T from the array image f[i][j]; the maximum value is the highest number of intersections at a pixel of f[i][j].
Step 5: Determine the real corners using the thresholding. If a real corner produces nearby pixels, find the pixel to which they converge.

Figure 3.6  The algorithm of the Hough Corner Detector
For example (Example 3.1), assume the points (1,7), (3,5), (5,3) and (6,2) are on the same line, as shown in Figure 3.7.

Figure 3.7  The points in the xy-plane lying on a line
The Hough Transform is generated and the indicated points are mapped to sinusoidal functions as follows:

$$r_1 = \cos\theta_1 + 7\sin\theta_1$$
$$r_2 = 3\cos\theta_2 + 5\sin\theta_2$$
$$r_3 = 5\cos\theta_3 + 3\sin\theta_3$$
$$r_4 = 6\cos\theta_4 + 2\sin\theta_4$$

Each point on the line in Figure 3.7 corresponds to a sinusoidal function, and these functions intersect at θ = 45 and r = 5.66, as shown in Figure 3.8. Therefore, the line in Figure 3.7 corresponds to θ = 45 and r = 5.66. From this example we can conclude that every line in the xy-plane has its own values of r and θ. The algorithm of the Hough Line Detector uses the parameters rho[i] and theta[i] to store every r and θ so determined.
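A quick numerical check (our own sketch, not part of the thesis code) confirms that all four points map to the same accumulator cell:

#include <cmath>
#include <cstdio>

// Evaluate r = x cos(theta) + y sin(theta) at theta = 45 degrees for the
// four points of Example 3.1; each evaluation yields r = 5.66.
int main()
{
    const double RAD = 0.0174532925;
    double pts[4][2] = { {1,7}, {3,5}, {5,3}, {6,2} };
    double theta = 45.0*RAD;
    for (int k = 0; k < 4; k++)
    {
        double r = pts[k][0]*cos(theta) + pts[k][1]*sin(theta);
        printf("(%g,%g) -> r = %.2f\n", pts[k][0], pts[k][1], r);
    }
    return 0;
}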
Figure 3.8  Sinusoidal functions associated with the points on the line; the curves for (1,7), (3,5), (5,3) and (6,2) intersect at (45, 5.66) in the Hough space
A corner is defined as the intersection of two or more straight lines. Based on this definition, the Hough Transform can detect a corner using Equation 3.12. Suppose we use the same example as in Figure 3.7 and add a line that intersects the first at the point (3,5). The lines are shown in Figure 3.9.
Figure 3.9  The two lines (Example 3.2) crossing at the point (3,5) in Cartesian space
The lines in Figure 3.9 are transformed into the Hough space as in Figure 3.10. The figure shows two points where the crossing of the sinusoidal functions is highest. We obtain the points (45, 5.66) and (135, 1.41) as the parameters of the two lines in Figure 3.10. Before computing the corner, we must compute m1 and m2 to verify that the lines are not parallel. We have r1 = 5.66, θ1 = 45° and r2 = 1.41, θ2 = 135°; computing mi = tan θi gives m1 = 1 and m2 = −1. Since m2 ≠ m1, the lines are not parallel. We then compute the intersection point, or corner, using Equation 3.12:

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3.0052 \\ 4.9992 \end{bmatrix} \approx \begin{bmatrix} 3 \\ 5 \end{bmatrix}$$

as shown in Figure 3.9.
Figure 3.10  The sinusoidal function of the point (3,5) passes through the two points of highest curve crossing, (45, 5.66) and (135, 1.41), in the Hough space
3.5 Summary
The lines produced by the Hough Transform are probably not the lines a human viewer would select. An arbitrary line can be detected using the values in the accumulator; the criterion for an exact line, in the Hough Transform sense, is a large number of collinear points. The Hough Transform uses the accumulator to score the points on arbitrary lines. Each element in the accumulator represents a line in the xy-plane, which means the accumulator size is based on the size of the image. The Hough Transform needs to construct a big array to store the accumulator elements corresponding to lines. Therefore, a large memory space is needed for the operation of line detection.
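To make the memory requirement concrete (our own arithmetic, based on the 640 × 480 image size used in Chapter V): r ranges up to $\sqrt{640^2 + 480^2} = 800$, so with one column per degree the accumulator needs on the order of $800 \times 360 = 288{,}000$ integer cells, before any image buffers are counted.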
CHAPTER IV
CASE STUDY IN CORNER DETECTION
4.1 Introduction
In this chapter, a simple model is developed in Microsoft Visual C++ 6.0 to implement the algorithms and methods discussed in Chapters II and III. The model is used to show the concepts and their calculations. The main tasks are to get the local minima image by using edge detection and then thinning the edges. For the edge detection, the Sobel and Kirsch operators are used to detect edges, and the Zhang-Suen thinning algorithm is used to solve the thinning problem. The last stage is to detect corners using the local minima in the thinning image. This study uses the Hough Transform to search for the corners that describe the object. The final result is an image marked with corner points, together with a binary image whose entries refer to the corners.
4.2 Edge Image
The first stage is to get the edge image from the digital image in bitmap format. The Kirsch and Sobel operator is used to compute the magnitude (derivative) from which the edges are extracted. Before extracting, the threshold value must be determined from the image. The threshold value uses Equation 2.27 to separate the object from the background. Then, condition 2.21 is used to convert the intensity values in the array to binary values.

The program uses [i][j]-notation instead of (x,y)-notation, where row i and column j correspond to y and x in the xy-plane. Therefore, to apply the equations described in Chapters II and III, we need to change the (x,y)-notation into [i][j]-notation. The edge detection algorithm uses the array f[i][j] for the input image and the array h[i][j] for the output of the process. The algorithm is described in Figure 4.1.
Step 1: Set the image size (xSize, ySize).
Step 2: Initialise the image into the array f[i][j]. Determine the grey value of the image.
Step 3: Compute the threshold value T by using Equation 2.27.
Step 4: Compute the derivatives of the Kirsch and Sobel operator and determine the maximum magnitude of the derivatives K[i]:
    max = K[0]
    for c = 1 to 7
        if (max < K[c]) max = K[c]
Step 5: Compare max for each f[i][j] with T and save the result into h[i][j], which corresponds to the binary edge image:
    if (max >= T) then h[i][j] = 1;
    else h[i][j] = 0;

Figure 4.1  The Kirsch and Sobel algorithm for computing the edges of an image
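Steps 4 and 5 condense into a small per-pixel routine. The sketch below (a simplification of Code 3 in Appendix F, with illustrative names) assumes the eight directional responses K[0..7] have already been computed as in Step 4:

// Steps 4-5 for one pixel: take the maximum directional magnitude and
// threshold it into a binary edge value (simplified from Code 3, Appendix F).
int EdgePixel(const int K[8], int T)
{
    int max = K[0];
    for (int c = 1; c <= 7; c++)
        if (max < K[c]) max = K[c];
    return (max >= T) ? 1 : 0;
}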
As an example, consider a generated 10×10-pixel image in bitmap format. Let f[i][j] correspond to the image pixels; each pixel contains eight bits of data, i.e. a value between 0 and 255 in a grey-level image. Figure 4.2 shows the extracted image declared in grey values.
       1    2    3    4    5    6    7    8    9   10
 1   144  152  144  152  152  152  152  152  152  152
 2   144  152  152  152  152  152  152  152  160  160
 3   144  160  152  152  160  160  160  160  160  160
 4   144  152  152  160  168  168  176  176  176  176
 5   144  152  160  192  208  224  232  232  232  232
 6   144  152  168  224  248  255  255  255  255  255
 7   144  152  176  240  255  255  255  248  255  255
 8   152  160  176  240  255  255  255  255  255  255
 9   152  160  168  240  255  255  255  248  255  255
10   152  160  168  240  255  255  255  255  255  255

Figure 4.2  The original pixel values of an image of size 10×10 (rows i, columns j)
We want to implement the edge detection using the Kirsch and Sobel operator. We apply the algorithm described in Figure 4.1 to extract the edges of the example image in Figure 4.2. The algorithm proceeds in the following steps.

Step 1: Set the image size

The image is a matrix of size 10×10. Therefore, the image has xSize = 10 and ySize = 10.
Step 2: Initialise the array image f[i][j]

In Windows, a bitmap image is represented by 24 bits of data, giving a total of 16,777,216 colours (Salleh et al., 2005). This causes the computation to involve large values. To initialise the values, every pixel is held in the array f[i][j]. In the next process, the image needs to be transformed into a grey-level image. The transformation involves a bit-shifting function to move the bit values that refer to each colour.

The bitmap uses an RGB value for every colour pixel. RGB are the red, green and blue colours, with eight bits of data for each colour. The transformation shifts the colour bits to produce a grey value.
Figure 4.3  A pixel represented as a 24-bit string (Salleh et al., 2005): the pixel RGB(174, 55, 171) is stored as the three bytes blue, green and red, i.e. the hexadecimal value AB37AE
For example, we have a pixel f[1][2] = RGB(174, 55, 171) as shown in Figure 4.3. In Windows the pixels are represented in binary or hexadecimal values. The pixels need to be converted to a grey-level image by shifting the bit values of every colour involved.

First, the red colour value is obtained by shifting the pixel 16 bits to the right. Then, shifting 8 bits to the left gives the green colour position, and shifting 8 bits to the left again gives the blue. All colour bytes are added to obtain the grey-level pixel. The process can be described as follows using Microsoft Visual C++:
f[0][0] = RGB(174,55,171) = AB37AE
R = f[0][0] >> 16 = 0000AB      // shift 16 bits to the right
G = R << 8        = 00AB00      // shift 8 bits to the left
B = G << 8        = AB0000      // shift 8 bits to the left again
f[0][0] = R + G + B = ABABAB = 11250603
In this case study, we use 8-bit data with values from 0 to 255 to keep the calculation simple. Therefore, no shifting is needed here, and the initial values of f[i][j] are as given in Figure 4.2.
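For completeness, here is a minimal runnable sketch of the shifting scheme above (our own illustration; the function name GreyPixel and the unsigned types are assumptions). Note that the scheme replicates one colour byte into all three positions, as in the worked example, rather than computing a weighted luminance:

#include <cstdio>

// Replicate the high colour byte of a 24-bit pixel into all three byte
// positions, as in the worked example above (AB37AE -> ABABAB).
unsigned int GreyPixel(unsigned int pixel24)
{
    unsigned int R = pixel24 >> 16;   // high byte
    unsigned int G = R << 8;
    unsigned int B = G << 8;
    return R + G + B;
}

int main()
{
    unsigned int g = GreyPixel(0xAB37AE);
    printf("AB37AE -> %X (%u)\n", g, g);   // ABABAB (11250603)
    return 0;
}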
Step 3: Compute the threshold value T

From the image in Figure 4.2 the maximum intensity value is 255. Equation 2.27 is used, with α arbitrarily set to 0.7, so the threshold value is T = 178.5. In this case the image is in integer form; after the integer function int() in Microsoft Visual C++ has been applied, the threshold value is T = 178. This threshold value is then used to separate the image intensities into two regions. The edge image is obtained after this separation.
Step 4: Compute the Kirsch and Sobel operator

The Kirsch and Sobel operator is a combination of the Kirsch operator and the Sobel operator. The Sobel coefficients from Equations 2.12 and 2.13 are implemented in the Kirsch masks. Note that the Sobel operator has two directional orientations (horizontal and vertical), while the Kirsch masks have an eight-neighbourhood orientation. The combination of the Kirsch and Sobel operators makes the algorithm more efficient.

The combined Kirsch and Sobel operator consists of eight neighbourhood masks with the coefficients of the Sobel operator. Equation 2.14 is used to take the maximum magnitude, which determines the directional orientation. The masks of the Kirsch and Sobel operator are as below:
$$G(0) = \begin{bmatrix} -2 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 2 \end{bmatrix},\quad G(1) = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix},\quad G(2) = \begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{bmatrix},\quad G(3) = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}$$

$$G(4) = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & -1 & -2 \end{bmatrix},\quad G(5) = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix},\quad G(6) = \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & 1 \\ -2 & -1 & 0 \end{bmatrix},\quad G(7) = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$$

Each mask G(k) corresponds to the derivative K[k] below; G(4) to G(7) are the negatives of G(0) to G(3), i.e. the opposite gradient directions.
The kernels G of the Kirsch and Sobel masks are used to obtain the derivatives in [i][j]-notation:

K[0] = f[i][j+1]+2*f[i+1][j+1]+f[i+1][j]
      -f[i][j-1]-2*f[i-1][j-1]-f[i-1][j]
K[1] = f[i+1][j+1]+2*f[i+1][j]+f[i+1][j-1]
      -f[i-1][j-1]-2*f[i-1][j]-f[i-1][j+1]
K[2] = f[i+1][j]+2*f[i+1][j-1]+f[i][j-1]
      -f[i-1][j]-2*f[i-1][j+1]-f[i][j+1]
K[3] = f[i+1][j-1]+2*f[i][j-1]+f[i-1][j-1]
      -f[i-1][j+1]-2*f[i][j+1]-f[i+1][j+1]
K[4] = f[i][j-1]+2*f[i-1][j-1]+f[i-1][j]
-f[i][j+1]-2*f[i+1][j+1]-f[i+1][j]
K[5] = f[i-1][j-1]+2*f[i-1][j]+f[i-1][j+1]
-f[i+1][j+1]-2*f[i+1][j]-f[i+1][j-1]
K[6] = f[i-1][j]+2*f[i-1][j+1]+f[i][j+1]
-f[i+1][j]-2*f[i+1][j-1]-f[i][j-1]
K[7] = f[i-1][j+1]+2*f[i][j+1]+f[i+1][j+1]
-f[i+1][j-1]-2*f[i][j-1]-f[i-1][j-1]
For example, we compute the first 3×3 matrix from the generated 10×10-pixel image in Figure 4.2. It is shown below.

144  152  144
144  152  152
144  160  152

Figure 4.4  The first 3×3 matrix of the image in Figure 4.2
The centre pixel of the first 3×3 matrix is 152. The Kirsch and Sobel derivatives transform the centre pixel into a new value:

K[0] = f[i][j+1] + 2f[i+1][j+1] + f[i+1][j] − f[i][j−1] − 2f[i−1][j−1] − f[i−1][j]
     = 152 + 2(152) + 160 − 144 − 2(144) − 152
     = 32

K[1] = f[i+1][j+1] + 2f[i+1][j] + f[i+1][j−1] − f[i−1][j−1] − 2f[i−1][j] − f[i−1][j+1]
     = 152 + 2(160) + 144 − 144 − 2(152) − 144
     = 24
K[2] = f[i+1][j] + 2f[i+1][j−1] + f[i][j−1] − f[i−1][j] − 2f[i−1][j+1] − f[i][j+1]
     = 160 + 2(144) + 144 − 152 − 2(144) − 152
     = 0

K[3] = f[i+1][j−1] + 2f[i][j−1] + f[i−1][j−1] − f[i−1][j+1] − 2f[i][j+1] − f[i+1][j+1]
     = 144 + 2(144) + 144 − 144 − 2(152) − 152
     = −24

K[4] = f[i][j−1] + 2f[i−1][j−1] + f[i−1][j] − f[i][j+1] − 2f[i+1][j+1] − f[i+1][j]
     = 144 + 2(144) + 152 − 152 − 2(152) − 160
     = −32

K[5] = f[i−1][j−1] + 2f[i−1][j] + f[i−1][j+1] − f[i+1][j+1] − 2f[i+1][j] − f[i+1][j−1]
     = 144 + 2(152) + 144 − 152 − 2(160) − 144
     = −24

K[6] = f[i−1][j] + 2f[i−1][j+1] + f[i][j+1] − f[i+1][j] − 2f[i+1][j−1] − f[i][j−1]
     = 152 + 2(144) + 152 − 160 − 2(144) − 144
     = 0

K[7] = f[i−1][j+1] + 2f[i][j+1] + f[i+1][j+1] − f[i+1][j−1] − 2f[i][j−1] − f[i−1][j−1]
     = 144 + 2(152) + 152 − 144 − 2(144) − 144
     = 24
From the calculation, we determine the maximum magnitude of K[i], i ∈ {0, 1, …, 7}, to be max = 32. Therefore, the centre pixel of the first 3×3 matrix is changed from 152 to 32. The calculation is repeated until every centre pixel value has been changed. The new values of the centre pixels are shown in Figure 4.5.
       2    3    4    5    6    7    8    9
 2    32   24   32   24   32   32   32   32
 3    32    8   48   56   72   88   88   72
 4    40   80  168  200  248  280  288  288
 5    64  200  304  311  333  324  316  316
 6    96  272  318  204  132   93   78   85
 7   112  328  317   84   14   14    0   14
 8    96  328  324   60    0   14    0   14
 9    72  320  340   60    0   14    0   14

Figure 4.5  The new values of the image in an 8×8 matrix after filtering the image in Figure 4.2
Step 5: Compare the filtered image f[i][j] with the threshold value T

The last step is to compare the new values with the threshold value T = 178. The process produces two regions with values 1 and 0. After all the pixel values in Figure 4.5 have been compared, the binary image shown in Figure 4.6 is produced.
     2  3  4  5  6  7  8  9
 2   0  0  0  0  0  0  0  0
 3   0  0  0  0  0  0  0  0
 4   0  0  0  1  1  1  1  1
 5   0  1  1  1  1  1  1  1
 6   0  1  1  1  0  0  0  0
 7   0  1  1  0  0  0  0  0
 8   0  1  1  0  0  0  0  0
 9   0  1  1  0  0  0  0  0

Figure 4.6  The binary edge image of Figure 4.5
4.3 Local Minima Image
In this stage, we want to get the minima pixels to smooth the edges, since a thick edge makes it difficult to acquire the real corners. Therefore, we apply the Zhang-Suen thinning algorithm to the edges; it detects the minima pixels by suppressing all other non-minimum pixels. The resulting image is called the thinning image and the extracted pixels are called minima pixels. By applying this step, the number of points is reduced significantly. The Zhang-Suen thinning algorithm is described in Figure 4.7.
Step 1: Count the number of object pixels with value 1, N(P), among the eight neighbours shown in Figure 2.7.
Step 2: If 2 ≤ N(P) ≤ 8 then
  2.1 Generate the pixel sequence P1, P2, …, P8, P1.
  2.2 Count the number of 'zero to one' patterns, N(T).
  2.3 If N(T) = 1 then
    2.3.1 If (P1 · P3 · P5 = 0) and (P3 · P5 · P7 = 0), the central pixel can be deleted.
    2.3.2 If (P7 · P1 · P3 = 0) and (P1 · P7 · P5 = 0), the central pixel can be deleted.
Step 3: Stop the process when there are no more pixels to be removed.

Figure 4.7  The Zhang-Suen thinning algorithm
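The counting in Steps 1 and 2.2 is the core of the algorithm. The following is a minimal helper (our own sketch; the thesis code in Code 4 of Appendix F inlines these counts) that computes N(P) and N(T) for one pixel's eight neighbours given in circular order:

// Compute N(P), the number of neighbours with value 1, and N(T), the
// number of 'zero to one' transitions, for neighbours P[0..7] listed in
// circular order (a sketch of Steps 1 and 2.2 above).
void ZhangSuenCounts(const int P[8], int &NP, int &NT)
{
    NP = 0;
    NT = 0;
    for (int k = 0; k < 8; k++)
    {
        NP += P[k];
        if (P[k] == 0 && P[(k + 1) % 8] == 1)
            NT++;   // a 'zero to one' pattern
    }
}

For the neighbourhood of f[4][5] discussed below, this helper returns N(P) = 4 and N(T) = 1.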
After the thinning process, the probable lines in the edge image are computed to reduce noise: the Hough Line Detector algorithm is used to get the probable line edges. As an example, consider the 3×3 matrix from Figure 4.6 shown in Figure 4.8.
P8  P1  P2          0  0  0
P7   P  P3          0  1  1
P6  P5  P4          1  1  1

Figure 4.8  A 3×3 matrix obtained from Figure 4.6, with the eight-neighbour labels P1 to P8
The centre pixel in Figure 4.8 corresponds to the array element f[4][5] in Figure 4.6. The number of object pixels with value 1 is N(P) = 4, namely P3, P4, P5 and P6. Then, the pixel sequence P1, P2, …, P8, P1 is generated using the template in Figure 2.7; the sequence obtained is 001111000. The next step is to count the number of 'zero to one' patterns, giving N(T) = 1. Since (P1 · P3 · P5 = 0) and (P3 · P5 · P7 = 0), the pixel at f[4][5] can be deleted, so f[4][5] is set to 0. The process is repeated for all pixels, and the Hough Line Detector is then used to produce the lines of minima pixels illustrated in Figure 4.9.
     2  3  4  5  6  7  8  9
 2   0  0  0  0  0  0  0  0
 3   0  0  0  0  0  0  0  0
 4   0  0  0  0  0  0  0  0
 5   0  0  1  1  1  1  1  1
 6   0  0  1  0  0  0  0  0
 7   0  0  1  0  0  0  0  0
 8   0  0  1  0  0  0  0  0
 9   0  0  1  0  0  0  0  0

Figure 4.9  The minima image after the thinning process with the Zhang-Suen algorithm and the Hough Line Detector
4.4 Locating Corners
The peaks of the accumulator array mark the equations of significant lines. The Hough Transform does not require prior grouping or linking of the edge points, and the edge points lying along the curve of interest may constitute only a small fraction of the edges in the image. Moreover, if a peak in the parameter space covers more than one accumulator cell, then the centre of the region containing the peak provides an estimate of the parameters.

The peaks of the accumulator can be defined by setting a threshold value. From these peaks, we can detect the arbitrary lines in the image. Table 4.1 shows the accumulator array produced from Figure 4.9; the corresponding sinusoidal curves are illustrated in Appendix A.
Table 4.1  The accumulator of the Hough Transform

 r\θ    0   15   30   45   60   75   90  105  120  135  150  165  180
  0     0    0    0    0    0    0    0    0    2    1    2    0    0
  1     0    0    0    0    0    0    0    0    2    2    2    0    0
  2     0    0    0    0    0    0    0    0    2    1    0    0    0
  3     0    0    0    0    0    0    0    4    1    1    0    0    0
  4     5    0    0    0    0    0    0    2    1    1    0    0    0
  5     1    2    0    0    0    0    6    1    1    0    0    0    0
  6     1    4    2    1    1    3    1    1    1    0    0    0    0
  7     1    1    3    2    3    4    1    1    0    0    0    0    0
  8     1    1    2    4    3    1    1    1    0    0    0    0    0
  9     1    1    2    2    2    1    1    0    0    0    0    0    0
 10     0    1    1    1    1    1    0    0    0    0    0    0    0
 11     0    0    0    0    0    0    0    0    0    0    0    0    0
 12     0    0    0    0    0    0    0    0    0    0    0    0    0
 13     0    0    0    0    0    0    0    0    0    0    0    0    0
 14     0    0    0    0    0    0    0    0    0    0    0    0    0
From Figure 4.9, we can clearly see the two lines constructed by pixels of value 1. The Hough Line Detector algorithm is applied to detect the two lines using the parameters r and t. Equation 2.27 uses the maximum value and the chosen rate α to determine the threshold value.

The maximum value in the accumulator is 6. Suppose we set α to 0.8; the threshold value is then T = 4.8, and since we use the int() function in the program, it becomes T = 4. Every element in the accumulator (Table 4.1) is compared with the threshold value to detect lines. However, we obtain 6 line candidates from thresholding the accumulator. Table 4.2 shows the pairs of r and t held by the parameters rho[i] and theta[i].
Table 4.2  The array of rho[i] and theta[i]

 i          1    2    3    4    5    6
 rho[i]     4    6    8    7    5    3
 theta[i]   0   15   45   75   90  105
Since the detected lines are not the same as the two actual lines, we reset α to 0.9, giving the threshold value T = 5. In this case we have rho[1] = 4, theta[1] = 0 and rho[2] = 5, theta[2] = 90, corresponding to the two lines. From these values, the two lines can be constructed from the connected pixels of value 1 as in Figure 4.9.

We can now apply the values of rho[i] and theta[i] to detect a corner. Using the Hough Corner Detector algorithm, we first reset the image array f[i][j] to 0. Then, we use the values of rho[i] and theta[i] to compute the slopes, giving m1 = 0 and m2 = ∞. The slopes show that the lines are not parallel, since m2 ≠ m1.
Therefore, we use Equation 3.12 to detect the corner, obtaining R1 = 4 and R2 = 5. We also have sin(0) = 0, sin(90) = 1, cos(0) = 1 and cos(90) = 0. The corner pixel is then

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}$$

so the image array element f[5][4] is set to 1. The corner pixel is illustrated in the figure below.
     2  3  4  5  6  7  8  9
 2   0  0  0  0  0  0  0  0
 3   0  0  0  0  0  0  0  0
 4   0  0  0  0  0  0  0  0
 5   0  0  1  0  0  0  0  0
 6   0  0  0  0  0  0  0  0
 7   0  0  0  0  0  0  0  0
 8   0  0  0  0  0  0  0  0
 9   0  0  0  0  0  0  0  0

Figure 4.10  The corner candidate using the Hough Corner Detector
CHAPTER V
SIMULATION MODEL
5.1 Introduction
A simulation model is developed to produce results for several sample images in order to study the performance of the Hough Transform. In this implementation, images in bitmap format are used and the rate for the maximum threshold value is selected as given by Equation 2.27. The simulation is based on the case study described in Chapter IV.

The simulation uses MFC programming with Microsoft Visual C++. MFC programming provides the tools to create windows that display the result images properly. Variables, arrays and functions are constructed to represent the parameters of each stage of the corner detection process. This simulation interface is created to help analyse the number of corners that should be detected.
5.2 Model Description
The simulation processes a digital image in bitmap format. The image consists of 640 × 480 pixels and uses RGB colours. The simulation takes an object-oriented approach, which makes it easy to construct windows and other objects. An objective of this study is to design a simulation model to detect the corners; therefore, some functions and procedures must be studied to develop the simulation program. Some member functions of the simulation are:
CMainFrame(); - the constructor: a class member function allocating memory storage and initialising class variables.
~CMainFrame(); - the destructor: releases the memory allocated from the free store.
OnPaint(); - activates the paint area.
OnLButtonDown(UINT, CPoint); - activates the left mouse button.
void ImageEdgeDraw(); - illustrates the edge image in the paint area.
void ConvertFileBinary(CString); - converts the magnitudes into binary form.
void GreyLevelImage(); - converts the image into grey colour.
void KSobelEdgeDetector(int); - filters the image.
void ZhangSuenThinning(); - smooths the edge image.
void HoughAccumulator(); - determines the significant lines referring to the parameters r and θ.
void HoughLineDetector(); - detects the lines.
void HoughCornerCandidate(); - computes the corner candidates.
void HoughAccurateCorner(); - detects the real corners.
5.3 Simulation Model
The order of the processes in detecting corners is given by the stages in Figure 2.10. In the first stage, the digital image is analysed and the intensity is extracted. The result of this process is a black-and-white edge image holding binary values: the value 1 corresponds to a black pixel and 0 to a white pixel. Figure 5.1 is a sample digital image and Figure 5.2 is the result of the edge detector using the Sobel and Kirsch operator.

Figure 5.1  The sample digital image
Figure 5.2  The result from the edge detector (the edge image)

After the filtering process with the Sobel and Kirsch edge detector, the thresholding process is carried out. Note that the image is in RGB format, containing 16,777,216 possible colour values per pixel. Equation 2.27 is used, and the thresholding rate of the Kirsch and Sobel operator is set to α1 = 0.5. Multiplying this rate by the maximum magnitude value gives the threshold value, which separates the image into the two regions shown in Figure 5.2 above.
The next step is to smooth the edges in order to detect the lines. The Hough Transform detects the lines using the accumulator built from the thinned image: the Zhang-Suen thinning algorithm removes the non-minima points from the edges until they are smoothed out, and the accumulator is then used to find the r and θ that correspond to the lines.
More lines can be generated using the r and θ obtained from the accumulator. Thresholding is used to find the lines that best represent the object. The rate for the largest value in the accumulator is set to α2 = 0.3. Figure 5.3 shows the thin-line image produced by applying the thinning process to the image in Figure 5.2.

Figure 5.3  The result of the thinning process (the minima image)
The next process is to detect the corner candidates. The corner candidates are produced from the linear Equation 3.12. We have detected the lines, each with its own r and θ values; using these values, the corner candidates corresponding to the intersection points of the lines in Figure 5.3 are calculated. The result is shown in Figure 5.4, where the blue dots correspond to the corner candidates.
Figure 5.4  The corner candidates produced from the thinning image of Figure 5.3

To detect the accurate corners, the candidates must be verified. In this study, the accurate corners are assumed to have the highest number of intersecting lines. Thresholding is then applied to determine the accurate corners; the rate for the largest number of intersections among the corner candidates is set to α3 = 0.4. Figure 5.5 shows the accurate corners produced by separating the corner candidates of Figure 5.4. The red dots illustrate the accurate corners in the image.
Figure 5.5  The accurate corners obtained from Figure 5.4
Not all of the accurate corners can be detected using the Hough Transform: the algorithm detected only 10 of the 20 corner points in the sample digital image. We may conclude that the Hough Transform is an important process for detecting lines; however, the number of lines detected depends on the performance of the edge detector, and the selection of the threshold in the edge detection plays an important role in producing the accurate corners.

In this study, we describe the importance of thresholding in detecting the accurate corners in the image. The parameters are set manually to produce the accurate corners. The parameters are the rates of the maximum values, referred to as α1, α2 and α3; in the program they are written a1, a2 and a3.
The effect of the threshold values on the number of detected corners is given in Appendix B. The analysis shows the accuracy of the selected rates in detecting the real corners; even rates that are not accurate still produce corner candidates. We can see that a1 = 0.9, a2 = 0.1 and a3 = 0.4 are the best rates of maximum value for this image: they produce 14 corner points out of the 20. Most of the detected corners are in the middle region of the image. The middle region is brighter than the other regions, which makes the corners there easier to detect.
5.4 Summary
The Hough Transform can detect simple corners based on the detected lines. We can see that the Hough Transform is sensitive to the thresholding used in the edge detector. The three parameters (a1, a2 and a3), which refer to the rates of the maximum values, are important for detecting the significant corners.

Generally, a high value of parameter a1 makes the edges smoother with less noise; otherwise, the image produced has thick edges and more noise. Parameter a2 is the threshold rate for selecting lines and parameter a3 is for selecting corners; both depend on the selection of parameter a1. The corners detected using different parameter values are shown in Appendix C.
CHAPTER VI
SUMMARY AND CONCLUSIONS
6.1 Observation
The main focus of this study is to use the Hough Transform to detect corners in an image. The process of detecting corners involves four stages: edge detection, thinning, line detection and corner detection.

In the first stage, the Kirsch and Sobel operator is used to detect edges. The resulting edge values are in binary form. The binary values are then drawn in the client area of an MFC application using the functions of Microsoft Visual C++. The selection of the thresholding rate affects the detection of the edge image: the edges are thick and noisy when the rate is low, and thin and less noisy when the rate is high. This factor affects the process of detecting the accurate corners.
The Zhang-Suen thinning algorithm is used to thin the edges. The aim is to speed up the Hough Transform in obtaining the corners. The algorithm decides for each edge pixel whether or not to remove it. The main condition is 2 ≤ N(P) ≤ 6, where N(P) is the number of nonzero pixels among the eight neighbours of P.

With this condition, only the upper and left parts of the edges are removed, so the pixel removal is not symmetrical. By changing the condition to 2 ≤ N(P) ≤ 8, the removal of centre pixels becomes symmetrical and the edges are thinned smoothly, because the condition uses the entire eight-pixel neighbourhood to verify whether the centre pixel should be removed.
The third stage is to detect the lines. The Hough Line Detector is used to detect the probable lines in the thinned image from the second stage by means of the accumulator. This stage involves computing the parameters r and θ that indicate probable lines, which in turn are used to detect the corners.

The Hough Transform detects lines from the high values in the accumulator, which represent the parameters r and θ. The thresholding rate is used to select the actual lines. A lower thresholding rate makes the Hough Transform detect more lines, with the effect that it cannot detect the corners accurately.
The accumulator of the Hough Line Detector is also used to detect the corners. The parameters referring to the lines are used to detect the corner candidates via Equation 3.12. The corner candidates are the intersection points of two or more lines. The corner candidates so obtained are not the same as those seen in a real visualisation; therefore, a last stage is used to detect the real corners.
An assumption is made that points with a high number of intersections define the corners. Thresholding is used to select the real corners from the corner candidates. Overall, the real corners are not well defined by the Hough Transform; most of the real corners are detected easily only in the bright regions of an image.
6.2 Discussions
Corner detection is one of the most commonly used operations in image analysis, especially in providing information for object recognition. A corner is usually related to the edges in an image, which means that if the edges in an image can be accurately identified, the corners can be easily detected. The corner points can then be used to describe basic properties such as area, perimeter and shape. Since computer vision involves the identification and classification of objects in an image, corner detection is an essential tool.
Most good algorithms begin with a clear statement of the problem to be solved. The aim of this study is to find the corner points in an image. The problem is separated into three sub-problems: first, extract the edge image from the digital image; then, find the accurate lines by searching for the minima points; lastly, detect the corner points in the minima image.
This study consists of four main stages of implementation. The first stage is the problem of identifying the edge points. Several edge detectors, such as the Roberts, Prewitt, Sobel and Kirsch operators, were studied for the characteristics that determine their performance. In the simulation model, we can see that the brightness factor plays an important role in detecting the accurate edges. The fundamentals of thresholding are also essential in determining the separation between the edges and the background.
The threshold can be chosen by inspecting the edge magnitude histogram, from which only a small percentage of the pixels f(x,y) is retained. Thresholding 2.21 is global and is applied to the entire image. In general, the edge magnitude output has regions possessing different statistical properties; therefore, global thresholding may produce thick edges in one region and thin or broken edges in another.
The second-stage problem is to acquire the minima points from the edge image. The problem is solved by the thinning approach: the Zhang-Suen thinning algorithm is chosen to get the minima points of the edge image. By observation, the Zhang-Suen algorithm did not produce good results in reducing the noise in the edge image. Other factors contributing to this result are the brightness of the digital image and the selected edge detector. For future work, the behaviour of the edge detector must be examined in order to produce better results.
The last-stage problem is to define an approach that can detect the corners well. This study involves the application of the Hough Transform in image processing. The Hough Transform is well known as a line detection algorithm in the imaging area; the algorithm builds an accumulator to detect lines using the parameters r and θ.

The main objective, however, is to study the Hough Transform in the application of corner detection. The parameters in the accumulator are used to detect the corners using Equation 3.12 as well as possible. The Hough Transform is most suitable for images that contain lines, where the intersection points, known as corners, are easily detected.
6.3 Suggestions
The following are some suggestions for improving the results of this dissertation and for possible future research in this area:

- Use another edge detector that can reduce the noise, such as the Canny edge detector or the Laplacian of Gaussian. The approach must be understood very well to produce good results.
- Carry out further research on the thresholding that governs the performance of separating the edge image.
- Use other methods, such as the Wavelet Transform, Genetic Algorithms or the Mathematical Morphology approach, to improve the detection of the candidate corners.
- Determine other stages for acquiring the corners and study their performance.
REFERENCES

Anton, H., Bivens, I. and Davis, S. (2005). Calculus. 8th Edition. Singapore: John Wiley & Sons (Asia) Pte. Ltd.

Awcock, G. W. and Thomas, R. (1996). Applied Image Processing. New York: McGraw-Hill Inc.

Castleman, K. R. (1996). Digital Image Processing. New Jersey: Prentice Hall.

Chanda, B. and Majumder, D. D. (2000). Digital Image Processing and Analysis. New Delhi: Prentice Hall of India.

Chen, C. H., Lee, J. S. and Sun, Y. N. (1995). Wavelet Transformation for Gray-Level Corner Detection. Tainan University: Elsevier Science B.V.

Davies, E. R. (1988). Application of the Generalised Hough Transform to Corner Detection. IEE Proceedings, vol. 135.

Engkamat, A. (2005). Enhancement of Parallel Thinning Algorithm for Handwritten Characters using Neural Network. Universiti Teknologi Malaysia: Master Thesis.

Freeman, H. (1961). On the Encoding of Arbitrary Geometric Configurations. IRE Trans. Electron. Comp. EC-10, June 1961, 260-268.

Gonzalez, R. C. and Woods, R. E. (2004). Digital Image Processing using Matlab. New Jersey: Prentice Hall.

Haron, H., Shamsuddin, S. M. and Mohamed, D. (2005). A New Corner Detection Algorithm for Chain Code Representation. Loughborough University: Taylor & Francis Group.

Jones, R. M. (2000). Introduction to MFC Programming with Visual C++. New Jersey: Prentice Hall PTR.

Lee, J. S., Sun, Y. N. and Chen, C. H. (1995). Multiscale Corner Detection by using Wavelet Transform. IEEE Transactions on Image Processing, no. 1, vol. 4.

Manohar, V. (2004). Digital Image Processing - Basic Thresholding and Smoothing. University of South Florida, at http://marathon.csee.usf.edu/~yqiu2/Basic_Thresholding_and_Smoothing.doc.

Matteson, R. G. (1995). Introduction to Document Image Processing Techniques. Boston, London: Artech House Publishers.

Pappas, C. H. and Murray III, W. H. (1998). Visual C++ 6: The Complete Reference. Osborne: McGraw-Hill.

Parker, J. R. (1997). Algorithms for Image Processing and Computer Vision. Canada: Wiley Computer Publishing.

Pitas, I. (1993). Digital Image Processing Algorithms. UK: Prentice Hall International Ltd.

Raji, A. W. M. et al. (2002). Matematik Asas. Universiti Teknologi Malaysia: Skudai.

Ritter, G. X. and Wilson, J. N. (2001). Handbook of Computer Vision Algorithms in Image Algebra. Boca Raton: CRC Press.

Salleh, S. et al. (2005). Numerical Simulations and Case Studies using C++ .Net. USA: Tab Books.

Seul, M. et al. (2000). Practical Algorithms for Image Analysis: Description, Examples and Code. UK: Cambridge University Press.

Shen, F. and Wang, H. (2002). Corner Detection based on Modified Hough Transform. Nanyang Technological University: Elsevier Science B.V.

The Hough Transform, at http://en.wikipedia.org/wiki/Hough_transform.

The Hough Transform, at http://planetmath.org/encyclopedis/HoughTransform.html.

Yaakop, M. H. (2004). Sistem Bantuan Pengesanan Simpang bagi garisan tidak sekata dengan menggunakan Teknik Wavelet Transform. Universiti Teknologi Malaysia: Bachelor Thesis.

Zhang, T. Y. and Suen, C. Y. (1984). A Fast Parallel Algorithm for Thinning Digital Patterns. Comm. ACM, 27(3): 236-239.
APPENDIX A

The sinusoidal graph in Hough space

Figure: the sinusoidal functions in the Hough space produced from the accumulator of the minima image in Figure 4.9, plotted for 0 ≤ theta ≤ 180 with rho on the vertical axis. The curves correspond to the minima pixels (4,5), (5,5), (6,5), (7,5), (8,5), (9,5), (4,6), (4,7), (4,8) and (4,9).
APPENDIX B

The Analysis of the Rate of the Maximum Thresholding

a1   a2   a3   Accuracy  Number  Points (x, y)
1.0  0.1  0.5  Yes       10      (212,48) (318,124) (426,202) (243,231) (57,261) (532,281) (352,311) (164,341) (460,389) (274,423)
1.0  0.1  0.4  Yes       10      (212,48) (318,124) (426,202) (242,230) (57,260) (532,280) (351,310) (164,342) (459,391) (274,423)
1.0  0.1  0.3  No        12      (212,48) (318,124) (135,153) (425,201) (242,230) (57,260) (375,278) (532,279) (351,310) (164,342) (459,391) (274,423)
0.9  0.1  0.5  Yes       11      (212,48) (318,123) (136,151) (426,202) (242,230) (57,259) (532,281) (353,310) (164,342) (461,388) (274,423)
0.9  0.1  0.4  Yes       14      (212,48) (493,99) (318,123) (136,151) (596,175) (425,202) (242,230) (57,260) (532,281) (352,310) (164,342) (461,389) (274,423) (562,471)
0.9  0.1  0.3  No        16      (212,47) (494,99) (318,123) (136,151) (597,175) (425,201) (242,230) (56,260) (375,278) (532,280) (352,310) (164,342) (632,358) (460,389) (273,422) (562,470)
0.8  0.2  0.5  Yes        7      (212,49) (318,123) (136,152) (425,202) (532,280) (352,310) (274,423)
0.8  0.2  0.4  Yes       10      (213,48) (318,123) (136,152) (426,201) (242,230) (56,261) (532,279) (351,311) (461,390) (274,423)
0.8  0.2  0.3  No        11      (214,46) (318,123) (135,153) (426,201) (242,230) (56,261) (374,277) (532,279) (351,311) (461,390) (274,424)
0.7  0.3  0.6  Yes        5      (318,123) (426,202) (242,230) (352,311) (460,388)
0.7  0.3  0.5  Yes        5      (318,123) (426,202) (242,230) (352,311) (460,388)
0.7  0.3  0.4  No        12      (214,46) (318,123) (137,154) (297,153) (426,202) (264,200) (242,230) (468,233) (531,279) (353,311) (165,342) (460,389)
0.6  0.3  0.5  Yes        6      (318,123) (426,203) (242,231) (532,280) (352,311) (460,389)
0.6  0.3  0.4  Yes        7      (318,123) (136,153) (426,202) (242,231) (532,279) (352,311) (460,389)
0.6  0.3  0.3  No        14      (214,46) (318,123) (137,153) (297,153) (158,170) (426,202) (264,200) (205,203) (242,231) (468,233) (531,279) (352,312) (165,342) (459,391)
0.5  0.3  0.5  Yes        9      (389,21) (213,50) (317,124) (136,153) (426,204) (242,230) (531,280) (353,311) (460,389)
0.5  0.3  0.4  Yes       10      (389,21) (213,50) (317,124) (136,153) (426,203) (242,230) (531,281) (353,310) (460,389) (276,426)
0.5  0.3  0.3  No        21      (389,21) (214,48) (495,100) (318,124) (136,154) (297,153) (598,178) (264,200) (426,203) (242,230) (407,230) (467,233) (56,259) (514,268) (375,278) (531,280) (560,302) (353,311) (165,342) (460,389) (276,426)
APPENDIX C
The Result of Corner Image using different Thresholding
APPENDIX F
The Related Source Code
Code 1
File header: Project.h
#include <afxwin.h>
#include <fstream.h>
#include <math.h>
#include "resource.h"
#define Timer 1
#define RAD 0.0174532925
#define a1 0.5 //the thresholding rate in Kirsch-Sobel Edge Detector
#define a2 0.3 //the thresholding rate in Hough Line Detector
#define a3 0.4 //the thresholding rate in Hough Corner Detector
typedef struct
{
int i;
int *px,*py;
double *rho,*theta;
}SCorner;
class CMainFrame : public CFrameWnd
{
private:
int i,j,r,t,m,n,rA,click;
int **f,**h,**A;
int xHome,yHome,xSize,ySize;
SCorner cDet;
public:
CMainFrame();
~CMainFrame();
CBitmap mBitmap;
afx_msg void OnPaint();
afx_msg void OnKeyDown (UINT MyKey,UINT nRep,UINT nFlags);
void MainStageProcess();
void ResultListReport();
void LabelProcess(int,int);
void TheRawImage();
void GreyLevelImage();
void ImageEdgeDraw();
void ConvertFileBinary(CString);
void KSobelEdgeDetector();
void ZhangSuenThinning();
void HoughAccumulator();
void HoughThresholding();
void HoughLineDetector();
void HoughCornerCandidate();
void HoughAccurateCorner();
DECLARE_MESSAGE_MAP()
};
class CMyWinApp : public CWinApp
{
public:
BOOL InitInstance();
};
Code 2
Main programming in Project.cpp
#include "Project.h"
CMyWinApp MyApplication;
BOOL CMyWinApp::InitInstance()
{
m_pMainWnd = new CMainFrame();
m_pMainWnd->ShowWindow(m_nCmdShow);
m_pMainWnd->UpdateWindow();
return TRUE;
}
BEGIN_MESSAGE_MAP(CMainFrame, CFrameWnd)
ON_WM_PAINT()
ON_WM_KEYDOWN()
END_MESSAGE_MAP()
CMainFrame::CMainFrame()
{
click=0;
xHome=10; yHome=30,
xSize=640, ySize=480;
rA=(int)(sqrt(xSize*xSize+ySize*ySize)/2.0);
m=2*rA+1; n=360;
cDet.px=new int [xSize*ySize];
cDet.py=new int [xSize*ySize];
cDet.rho=new double [m*n];
cDet.theta=new double [m*n];
A=new int *[m+1];
for (r=0;r<=m;r++)
A[r]=new int [n+2];
f=new int *[ySize+1];
h=new int *[ySize+1];
for (i=0;i<=ySize;i++)
{
f[i]=new int [xSize+1];
h[i]=new int [xSize+1];
}
for (r=0;r<m;r++)
for (t=0;t<=n;t++)
A[r][t]=0;
mBitmap.LoadBitmap(IDB_IMEJB);
Create(NULL, "Corner Detection by Hough Transform",
WS_OVERLAPPEDWINDOW,CRect(CPoint(170,100),CSize(670,600)));
}
CMainFrame::~CMainFrame()
{
delete cDet.px;
delete cDet.py;
delete cDet.rho;
delete cDet.theta;
for (r=0;r<=m;r++)
delete A[r];
delete A;
for (i=0;i<=ySize;i++)
{
delete f[i];
delete h[i];
}
delete f;
delete h;
}
void CMainFrame::OnPaint()
{
if (click==0)
MainStageProcess();
}
void CMainFrame::OnKeyDown(UINT MyKey,UINT nRep,UINT nFlags)
{
switch(MyKey)
{
case VK_SPACE: click++;
MainStageProcess();
break;
case VK_ESCAPE: DestroyWindow();
break;
}
}
void CMainFrame::MainStageProcess()
{
    CClientDC dc(this);
    CString Label;
    switch (click)
    {
    case 0: dc.TextOut(xHome+220,yHome-25," The Digital Image ");
            TheRawImage();
            dc.TextOut(xHome+220,yHome+500,"<< Press Space Bar >>");
            break;
    case 1: dc.TextOut(xHome+240,yHome+500,"                      ");
            dc.TextOut(xHome+220,yHome-25," Grey-Level process ");
            GreyLevelImage();
            dc.TextOut(xHome+220,yHome-25," Stage [1] : The Edge Image ");
            KSobelEdgeDetector();
            ImageEdgeDraw();
            ConvertFileBinary("P1_EdgeImage.out");
            dc.TextOut(xHome+220,yHome-25," Accumulator Process ");
            ZhangSuenThinning();
            HoughAccumulator();
            dc.TextOut(xHome+220,yHome-25," Stage [2] : The Thinning Image ");
            HoughThresholding();
            HoughLineDetector();
            ImageEdgeDraw();
            ConvertFileBinary("P2_ThinImage.out");
            break;
    case 2: dc.TextOut(xHome+220,yHome-25," Stage [3] : The Corner Candidate ");
            HoughCornerCandidate();
            break;
    case 3: dc.TextOut(xHome+220,yHome-25," Stage [4] : The Corner Image ");
            HoughLineDetector();
            ImageEdgeDraw();
            HoughAccurateCorner();
            ResultListReport();
            ConvertFileBinary("P3_CornerImage.out");
            break;
    default: DestroyWindow();
             break;
    }
}
void CMainFrame::ResultListReport()
{
    CClientDC dc(this);
    CString Label;
    CRect Titik;
    Label.Format("Threshold 1 : %.2f",a1);
    dc.TextOut(xHome,yHome+485,Label);
    Label.Format("Threshold 2 : %.2f",a2);
    dc.TextOut(xHome,yHome+500,Label);
    Label.Format("Threshold 3 : %.2f",a3);
    dc.TextOut(xHome,yHome+515,Label);
    dc.TextOut(xHome+490,yHome+485,"The candidate corner ");
    dc.TextOut(xHome+490,yHome+505,"The real corner ");
    Titik=CRect(CPoint(xHome+470,yHome+485),CSize(15,15));
    dc.FillSolidRect(&Titik,RGB(0,0,255));      // blue: candidate corner
    Titik=CRect(CPoint(xHome+470,yHome+505),CSize(15,15));
    dc.FillSolidRect(&Titik,RGB(255,0,0));      // red: real corner
}
void CMainFrame::LabelProcess(int x,int xs)
{
    CClientDC dc(this);
    CString Label;
    int percentage;
    ::Sleep(Timer);
    percentage=100*x/xs;
    Label.Format("                    ");       // clear the previous text
    dc.TextOut(xHome+530,yHome+485,Label);
    Label.Format("Processing [%d]",percentage);
    dc.TextOut(xHome+530,yHome+485,Label);
}
void CMainFrame::TheRawImage()
{
CPaintDC dc(this);
CDC memDC;
memDC.CreateCompatibleDC(&dc);
memDC.SelectObject(&mBitmap);
dc.BitBlt(xHome,yHome,xSize,ySize,&memDC,0,0,SRCCOPY);
}
void CMainFrame::GreyLevelImage()
{
    CClientDC dc(this);
    int r,g,b,gs;
    for (i=0;i<=ySize;i++)
    {
        LabelProcess(i+1,ySize+1);
        for (j=0;j<=xSize;j++)
        {
            gs=dc.GetPixel(xHome+j,yHome+i);
            r= gs >> 16;    // red
            g= r << 8;      // green
            b= g << 8;      // blue
            gs=r+g+b;       // grayscale
            f[i][j]=gs;
            dc.SetPixel(xHome+j,yHome+i,f[i][j]);
        }
    }
}
void CMainFrame::ImageEdgeDraw()
{
CClientDC dc(this);
int Draw;
for (i=1;i<ySize-1;i++)
{
LabelProcess(i,ySize-2);
for (j=1;j<xSize-1;j++)
{
if(h[i][j]==1)
Draw=RGB(0,0,0);
else
Draw=RGB(255,255,255);
dc.SetPixel(xHome+j,yHome+i,Draw);
}
}
}
void CMainFrame::ConvertFileBinary(CString NameFile)
{
ofstream ofp(NameFile);
for (i=1;i<ySize;i++)
{
for (j=1;j<xSize;j++)
ofp << h[i][j];
ofp << endl;
}
ofp.close();
}
Code 3
Source Code of the Sobel and Kirsch Edge Detector
void CMainFrame::KSobelEdgeDetector()
{
int T,K[8];
//Compute the threshold value on Kirsch and Sobel Edge Detector
T=f[0][0];
for (i=0;i<=ySize;i++)
for (j=0;j<=xSize;j++)
if(T<f[i][j]) T=f[i][j]; //max
T= (int)(a1*T);
// Kirsch and Sobel masking for detecting the edges
for (i=1;i<ySize;i++)
for (j=1;j<xSize;j++)
{
K[0]=f[i][j+1]+2*f[i+1][j+1]+f[i+1][j]
-f[i][j-1]-2*f[i-1][j-1]-f[i-1][j];
K[1]=f[i+1][j+1]+2*f[i+1][j]+f[i+1][j-1]
-f[i-1][j-1]-2*f[i-1][j]-f[i-1][j+1];
K[2]=f[i+1][j]+2*f[i+1][j-1]+f[i][j-1]
-f[i-1][j]-2*f[i-1][j+1]-f[i][j+1];
K[3]=f[i+1][j-1]+2*f[i][j-1]+f[i-1][j-1]
-f[i-1][j+1]-2*f[i][j+1]-f[i+1][j+1];
K[4]=f[i][j-1]+2*f[i-1][j-1]+f[i-1][j]
-f[i][j+1]-2*f[i+1][j+1]-f[i+1][j];
K[5]=f[i-1][j-1]+2*f[i-1][j]+f[i-1][j+1]
-f[i+1][j+1]-2*f[i+1][j]-f[i+1][j-1];
K[6]=f[i-1][j]+2*f[i-1][j+1]+f[i][j+1]
-f[i+1][j]-2*f[i+1][j-1]-f[i][j-1];
K[7]=f[i-1][j+1]+2*f[i][j+1]+f[i+1][j+1]
-f[i+1][j-1]-2*f[i][j-1]-f[i-1][j-1];
//find the maximum of K[0..7]
h[i][j]=K[0];
for(int c=1;c<=7;c++)
if (h[i][j]<K[c]) h[i][j]=K[c];
if(h[i][j]>=abs(T)) h[i][j]=1;
else
h[i][j]=0;
}
}
Code 4
Source Code of the Zhang-Suen Thinning Method
void CMainFrame::ZhangSuenThinning()
{
int count,trans,P[9];
int check;
//Zhang-Suen Thinning process
do
{
check=1;
for (i=1;i<ySize;i++)
for (j=1;j<xSize;j++)
if(h[i][j]==1)
{
count=0;
for (r=-1;r<=1;r++)
for (t=-1;t<=1;t++)
count += h[i+r][j+t];
if((count>=2)&&(count<=8))
{
P[0]=h[i-1][j];
P[1]=h[i-1][j+1];
P[2]=h[i][j+1];
P[3]=h[i+1][j+1];
P[4]=h[i+1][j];
P[5]=h[i+1][j-1];
P[6]=h[i][j-1];
P[7]=h[i-1][j-1];
P[8]=h[i-1][j];
trans=0;
for (int t=0;t<=7;t++)
if((P[t]==0)&&(P[t+1]==1))trans++;
if(trans==1)
{
if((P[0]*P[2]*P[4]==0)&&(P[2]*P[4]*P[6]==0))
{
h[i][j]=0;
check=0;
}
else if((P[6]*P[2]*P[0]==0)&&(P[0]*P[6]*P[4]==0))
{
h[i][j]=0;
check=0;
}
}
}
}
}while(check==1);
}
Code 5
Source Code of the Hough Transform
//Computer the Accumulator of Hough Transform
void CMainFrame::HoughAccumulator()
{
double rho,theta;
int cx=xSize/2,cy=ySize/2;
for (i=1;i<ySize;i++)
{
LabelProcess(i,ySize-1);
for (j=1;j<xSize;j++)
if(h[i][j]==1)
{
for (t=0;t<=n;t++)
{
theta=t*RAD; // convert t in degrees to radians, 0 <= t <= 360
rho=(j-cx)*cos(theta)+(i-cy)*sin(theta);
rho+=0.5;
r=(int)rho;
A[rA+r][t]+=1;
}
}
}
}
//Determine the Threshold value on Hough Table
void CMainFrame::HoughThresholding()
{
int T=0;
for (r=0;r<m;r++)
for (t=0;t<=n;t++)
if((A[r][t]>0)&&(T<A[r][t]))
T=A[r][t]; //max
T=(int)(a2*T);
//Determine theta and rho using Acumulator
cDet.i=0;
for (r=0;r<m;r++)
for (t=0;t<=n;t++)
if(A[r][t]>=T)
{
cDet.i+=1;
cDet.rho[cDet.i]=r-rA;
cDet.theta[cDet.i]=t*RAD;
}
}
//Determine the line in the Image
void CMainFrame::HoughLineDetector()
{
    double y;
    int **g;
    int cx=xSize/2,cy=ySize/2;
    g=new int *[ySize+1];
    for (i=0;i<=ySize;i++)
        g[i]=new int [xSize+1];
    for (i=1;i<ySize;i++)
        for (j=1;j<xSize;j++) g[i][j]=0;
    for (r=1;r<=cDet.i;r++)
        for (j=1;j<xSize;j++)
        {
            if(sin(cDet.theta[r])==0.0)
            {
                //vertical line x = rho: y cannot be solved from the normal
                //form, so mark every edge pixel of this column instead
                if(j==cx+(int)(0.5+cDet.rho[r]))
                    for (i=1;i<ySize;i++)
                        if (h[i][j]==1) g[i][j]=1;
                continue;
            }
            y=cDet.rho[r]-(j-cx)*cos(cDet.theta[r]);
            y=0.5+(y/sin(cDet.theta[r]));
            i=cy+(int)y;
            if((i>0)&&(i<ySize))
                //draw line by binary input
                if (h[i][j]==1) g[i][j]=1;
        }
    for (i=1;i<ySize;i++)
        for (j=1;j<xSize;j++) h[i][j]=g[i][j];
    for (i=0;i<=ySize;i++)
        delete g[i];
    delete g;
}
//Compute the Candidate Corner
void CMainFrame::HoughCornerCandidate()
{
CClientDC dc(this);
int px,py,cx=xSize/2,cy=ySize/2;
double x,y,R1,R2;
for (i=1;i<ySize;i++)
for (j=1;j<xSize;j++) f[i][j]=0;
for (i=1;i<=cDet.i;i++)
for (j=2;j<=cDet.i;j++)
{
if (i>=j)
{
R2=tan(cDet.theta[j]);
R1=tan(cDet.theta[i]);
// proceed only when the two lines are not parallel (different slopes)
if(R1!=R2)
{
R1=cDet.rho[i]/sin(cDet.theta[j]-cDet.theta[i]);
R2=cDet.rho[j]/sin(cDet.theta[j]-cDet.theta[i]);
x=(R1*sin(cDet.theta[j])-R2*sin(cDet.theta[i]));
y=(R2*cos(cDet.theta[i])-R1*cos(cDet.theta[j]));
px=(int)(0.5+x+cx);
py=(int)(0.5+y+cy);
if (((px>7)&&(px<xSize-7))&&((py>7)&&(py<ySize-7)))
f[py][px]+=1;
}
}
}
//keep the candidate corner to the data structure
for (i=1;i<ySize;i++)
for (j=1;j<xSize;j++)
if(f[i][j]>=1)
{
CRect Titik(CPoint(j+xHome-5,i+yHome-5),
CPoint(j+xHome+5,i+yHome+5));
dc.FillSolidRect(&Titik,RGB(0,0,255));
}
}
//Compute the Accurate Corner
void CMainFrame::HoughAccurateCorner()
{
CClientDC dc(this);
CString Label;
int T=0;
//Eliminating the noise candidate corner
for (i=1;i<ySize;i++)
for (j=1;j<xSize;j++)
if((f[i][j]>0)&&(T<f[i][j]))
T=f[i][j]; //max
T=(int)(a3*T);
for (i=1;i<ySize;i++)
for (j=1;j<xSize;j++)
{
h[i][j]=0;
if(f[i][j]>=T) f[i][j]=1;
else
f[i][j]=0;
}
//Determine the real corner using the distance r=sqrt(x*x+y*y)
cDet.i=0;
for (i=1;i<ySize;i++)
for (j=1;j<xSize;j++)
if(f[i][j]==1)
{
cDet.i+=1;
cDet.px[cDet.i]=j;
cDet.py[cDet.i]=i;
cDet.rho[cDet.i]=sqrt((double)(j*j+i*i));
}
//Locating corner on paint
Label.Format("CP_%2.1f_%2.1f_%2.1f.txt",a1,a2,a3);
ofstream ofp(Label);
ofp << a1 << endl;
ofp << a2 << endl;
ofp << a3 << endl;
ofp << cDet.i << endl << endl;
if(cDet.i>=1)
{
for (i=1;i<=cDet.i;i++)
{
h[cDet.py[i]][cDet.px[i]]=1;
CRect Titik(CPoint(cDet.px[i]+xHome-9,cDet.py[i]+yHome-9),
CPoint(cDet.px[i]+xHome+9,cDet.py[i]+yHome+9));
dc.FillSolidRect(&Titik,RGB(255,0,0));
ofp << cDet.px[i] << " " << cDet.py[i] << endl;
}
}
ofp.close();
}