Genetic Algorithm for Shape Detection

advertisement
University of West Bohemia in Pilsen
Department of Computer Science and Engineering
Univerzitni 8
30614 Pilsen
Czech Republic
Genetic Algorithm for Shape Detection
Study Report
Tomáš Mainzer
Technical Report No. DCSE/TR-2002-06
March, 2002
Distribution: public
Technical Report No. DCSE/TR-2002-06
March 2002
Genetic Algorithm for Shape Detection
Tomáš Mainzer
Abstract
The localisation of graphical primitives is important task in image processing.
This paper discusses problems of automatic computer-based detection and
localisation of elementary shape in image. A genetic algorithm is used for this
task. The using of genetic algorithm allows efficiently reduce time needed to
scan a task state space. A modification of genetic algorithm for shape detection
is presented.
A review of basic algorithms for shape detection is given, followed by short
genetic algorithm review. The specifics of using a genetic algorithm for
presented task are presented. In addition, some experiment results are shown.
The results are demonstrated on circle, ellipse and oblong detection.
This paper is based upon work sponsored by the Ministry of Education of the
Czech Republic under research and development project LN00B084.
Copies of this report are available on
http://www.kiv.zcu.cz/publications/
or by surface mail on request sent to the following address:
University of West Bohemia in Pilsen
Department of Computer Science and Engineering
Univerzitni 8
30614 Pilsen
Czech Republic
Copyright © 2002 University of West Bohemia in Pilsen, Czech Republic
Shape Detection
Template matching
The template matching is approach to shape detection that can be used to locate known
object in an image to search for the specific patterns. The best match is based on some
optimality criterion, which depends on object properties and object relation.
Matched patterns can represent whole object of interest or some of it’s part. Because
exact copy of the pattern of interest cannot be expected in the processed image (image is
corrupted by noise, by geometric distortion etc.) some match criterion must be defined.
The matching criterion can be defined in many ways from simple correlation up to
complex approaches of graph matching.
The match based segmentation is computational very intensive, but the process can be
faster if some performance improvement is found. The most widely used improvements
are
- Test firstly image location with a high probability of math
- Detect mismatch before all corresponding pixels have been tested
- Match at lower resolution first
Matching criteria can be defined in many ways. The simplest one is based on
summation of pixel difference between a pattern and the searched image. The minimal
difference between these is searched.
Let f(x,y) be a processed image, and h(x,y) be a searched pattern. Some simple
matching optimality criteria describing a match between f and h located at a position
(x,y) are:
(6.1) g ( x, y ) = max f ( x + i, y + j ) − h(i, j )
i, j
(6.2)
g ( x , y ) = Σ Σ f ( x + i , y + j ) − h (i , j )
i
j
Another general matching criterion is correlation between a pattern and the searched
image data. When f is the processed image and h is a searched pattern, the correlation is
defined:
2
(6.3) g ( x, y ) = f ( x , y ) o h ( x, y ) = Σ Σ( f ( x + i, y + j ) − h (i, j ) )
i
j
The location where g(x,y) is minimal identifies a position of the shape.
A problem with using correlation is the enormous number of calculation that needs to
be performed. The one of the most important strategies for reducing execution time is
FFT convolution.
The convolution in the spatial domain corresponds to the multiplication in the frequency
domain. This is a way to reduce execution time, because a multiplication is simpler
operation than a convolution, and there is very efficient fast Fourier transform (FFT)
algorithm for transform spatial domain into frequency domain.
(6.4)
f ( x , y ) o h( x, y ) ⇔ F ( u, v ) ∗ H ( u, v )
While correlation is a powerful tool, it suffers from a significant limitation – the target
image must be in the same size and rotational orientation as the corresponding area in
the searched image.
Page 1
Hough Transform
The Hough transform is a technique used to find parameterised shapes in an image.
There are many variations of the Hough transform that are differed in efficiency and
robustness under certain conditions. In this chapter an introduction to Hough transform
for lines, circles and ellipses is given. A major variation of these algorithms is also
mentioned.
Lines
The standard equation for a line in x-y plane is
(6.5)
y = mx + b
Let us define a new two-dimensional space, which we will refer as the parameter
space. For each pixel (x,y) in the image the set of all possible lines that go through this
point have a parameter pairs (m,b) given by linear equation
(6.6) b = − xm + y
For other pixels in image plane, next lines in the parameter space would be defined. If
there are some collinear points in the image plane, so the corresponding lines in the
parameter space are intersected to give the parameters of the line in the image plane.
Figure 6.2 – Hough transform for lines - (a) image plane (b) corresponding parameter
plane
Localisation of the line in the image can be taken by finding the intersection of lines in
the parameter space. The most common way of doing this is by using a histogram. To
implement this approach, the parameter space is discretised – it has a form of twodimensional matrix, known as the accumulator. In figure 6.2 is shown an example of
image plane and corresponding parameter plane.
A problem can occur, when the parameter space (m,b) is discretised. The m-axis values
varies from minus to plus infinity (e.g. vertical lines) and they can’t be covered by the
histogram.
Now, a different parameterisation is required. A line can be also expressed by using the
equation:
(6.7) x cos(θ ) + y sin(θ ) = ρ
For a pixel (x,y) in the image plane the all possible lines that go though this point lie on
a sinusoid in the parameter space. Collinear points in the image generate sinusoids,
which also intersect in the parameter space, giving parameters of the line.
Page 2
Unfortunately, for a complex images the histogram is rather complicated and finding of
the maximums can be very difficult. Another problem is a fact, that maximum in the
parameter space don’t need correspond with a line in the image.
Circles
A circle can be described by the three parameters (a,b,c) by equation
(6.8) ( x − a ) 2 + ( y − b) 2 = c 2
In general, a three-dimensional parameter space is required. For each point in the image
plane, the every corresponding points in the parameter space must be set.
Whereas in the previous section, the task was reduced to finding the intersections of
curves in two-dimensional parameter space, for a circle location the intersection of
surfaces in a three dimensional parameter space must be found.
Unfortunately, this algorithm would be very computationally intensive and memory
demanding. There are two major disadvantages of this approach
- Three-dimensional accumulator requires a large amount of memory
- Increment the accumulator elements and test every element if it is a maximum is
computationally intensive
It may be noticed that the simplification of this task can be also based on a-priory
knowledge of some parameters of detected circles. If there is known circle radius, the
task is reduced and only two-dimensional accumulator can be used.
For general circle detection task, several variations on the Hough transform have
attempted to solve this problem. Some of these are described in following chapters.
One used solution is to split the circle detection into the following two tasks:
– Finding a potential circle centres in the image.
– Finding a radius corresponding to each centre.
Let it be considered any point on a circle. Note that the centre of the circle lies on the
normal to the tangent at that point. If the normal associated with point of circle is
founded, so this would intersect the centre of the circle. By recording the normals in an
accumulator that is defined over the image plane, the points of intersection can be found
by searching for maximum of the histogram. When a noisy image is processed, a
problem with normal determination can occur.
The position of the local maximum of the histogram is used as the potential circle
centre. The circle radiuses corresponding to potential circle centres can be found as a
maximum in one-dimensional histogram of distance from potential circle centre to
image points. Note that if more than one circle is centred on the same position, so the
several histogram maximums correspond to valid radiuses.
Ellipses
An ellipse can be uniquely defined by a five parameters – the ellipse centre (x,y),
maximum and minimum radiuses of the ellipse (a,b) and angle of the major axis θ.
So a five-dimension parameter space should be used. Direct apply of the Hough
transform is became very impractical – it has a large requirement to computation
perform and very unreal requirement to memory space. Analogous to circle detection
task, the memory and computational demands can be significantly reduced by
reorganising ellipse detection into two subtasks:
– Finding potential ellipse centres in the image
Page 3
– Finding the remaining three parameters associated with each ellipse centre
To find ellipse centre the geometry feature of the ellipses can be used. In figure 6.3 is
shown construction of centre intersection line. Let x1 and x2 be two points on an ellipse
boundary. Let t be an intersection of these two tangents. Define m to be the midpoint of
x1 and x2. The line tm is than passed through the centre of the ellipse.
Figure 6.3 – construction of line though centre of ellipse
The line tm is recorded in two dimensional histogram defined over the image plane by
incrementing each histogram elements that the line intersects. Note that it is sufficient to
record only mid-line originating from m and oriented away from t. An example of the
histogram is shown in figure 6.4.
Figure 6.4 – Hough transform – (a) Image plane (b) histogram
Not all local maximums of the histogram correspond to ellipse centres. This can be seen
in figure 6.5. There are six local maximums, but only three of them correspond to
ellipse centres. The others anomalous maximums are generated by pixel pairs laying on
different ellipses.
Figure 6.5 – Hough transform – (a) Image plane (b) histogram
Page 4
The second part of the task estimates the remaining three parameters. For each ellipse
centre the image is translated so that ellipse centre is located at the origin. The
remaining three parameters can be obtained by defining a three-dimensional parameter
space and using standard Hough transform. However, in this case only a global
maximum of the histogram is required. The adaptive Hough transform can be used to
reduce memory requirement and computational demands in this case.
After determination of the remaining three parameters associated with ellipse, it is
necessary to perform a final validation step. To determine whether an ellipse exist in
image, the number of pixels lying on (or near) the ellipse must be counted. If there is
enough pixels, the ellipse in image is located.
Adaptive Hough transform
The adaptive Hough transform is an iterative algorithm, which reduces computation
demands by evaluating the histogram at the high resolution near a global maximum and
at low resolution elsewhere. The histogram is initially evaluated at very coarse
resolution and the global maximum is found. The histogram is then limited to region
surrounding the global maximum and is re-evaluated at higher resolution. This process
is repeated and after a pre-set number of iterations is halted.
Note that the same accumulator memory can be re-used in each iteration. In this way,
memory requirement of the algorithm can be minimised.
Probabilistic Hough transform
To detect the object in an image it is not generally necessary to compute the Hough
transform of every pixel in image. Excellent results can be still obtained by computing
the Hough transform of only some, evenly distributed pixels in the image.
Pixels are selected according the uniform probability distribution, typically with
replacement (i.e. the same pixel may be chosen more than once). If sufficient pixels are
chosen, then the probabilistic Hough transform is similar to computing the Hough
transform of sub-sampled image.
The advantage of probabilistic Hough transform is computational saving. Computing
the Hough transform of only p-percent of the pixels requires only p-percent of the
computations.
The pixels proportion must be selected so as accuracy of standard Hough transform and
Probabilistic Hough transform was comparable. It is found that a pixel proportion can
be mostly in the range 5% ÷20%, depending on the specific application. There are also
some methods to determine pixel proportion analytically.
Randomised Hough transform
The randomised Hough transform uses a different mechanism to accumulate points in
the parameter space. This algorithm begins by choosing pixels in the image and finding
the unique object that passes through these pixels. For instance, the randomised Hough
transform for lines is based on choosing a pairs of pixels, for circles and ellipses it uses
a tripled of pixels and its tangents.
The parameters of this object are then recorded as a point in the parameter space. This is
difference from the standard Hough transform where several-dimensional plane must be
recorded for each pixel.
Page 5
Repeating this process by randomly choosing more pixels, we find that clusters related
to object appear in the parameter space. It is not necessary to choose all potential groups
of pixels. Clusters are formed from a small random sample of pixels group. For this
reason, the randomised Hough transform tends to be faster than standard Hough
transform.
As well as increased speed, another advantage of the randomised Hough transform is a
reduction in memory requirements. Because the randomised Hough transform generates
only individual points in the parameter space a more efficient data structure can be used
for parameter space storage.
Page 6
Genetic Algorithm
Genetic algorithm (GA) is an adaptive method that mimics the metaphor of natural
biological evolution. It is an optimisation technique that operates on population of
individual solutions. Each individual solution (also called string or chromosom)
represents a proposed solution of the solved problem. The theories of the natural
selection are applied to this population to find solution.
Next generation of the population is obtained by applying mutation, reproduction and
selection stochastic operators to the population.
With these operators, the population of solutions comes to (pseudo-)optimal solution of
the problem. The quality of every individual solution is assessed by objective (fitness)
function. This function is problem dependent and it is used to propagate good solutions
into next generation.
Generally, the genetic algorithm advantages are simple using, low memory demands,
using of simple computation algorithm and ability of parallelism. Disadvantage of the
genetic algorithm is non-deterministic work time and non-guarantee finding of best
solution.
Encoding
GA (as an optimisation method) must optimise some set of variables (string) to
minimise defined measure error. The least part of string (typically one bit) are called
gen. Note that a method of coding is related with an algorithm of genetic operator
algorithm. There are many potential ways to encode this set.
The most used one is the binary encoding - the values are represented together a one
binary string.
The next one - value encoding - is based on encode of variables directly to proper type,
e.g. as an integer, real number or enumerate. For this encoding it is often necessary to
have also problem specific mutation and crossover operators.
For some cases also permutation encoding can be chosen. It is useful for ordering
problems (e.g. traveling salesman problem).
For other special cases (e.g. evolving expressions or programs) the tree encoding
scheme can be used.
Of course, there are many other ways of encoding, which are depended mainly on the
solved problem.
Selection
Generally a selection operator determines strings for recombination and mutation.
Usually this selection is based on the objective value of strings. There are several
selection method.
Roulette wheel selection select chromosomes according to their fitness - the better
chromosomes (e.g. better solution) have a major probability to be selected.
This selection can cause problem when fitness are very differed each other in the
population. Rank selection selects chromosome according to their order.
In various selection methods, the idea of elitism can be used. Elitism prevents loosing
the best solution (by mutation or crossover) from the population.
Page 7
Recombination (reproduction, crossover)
Recombination (also called reproduction or crossover) recombines the parts of parent
strings. It is a way to combine two different results.
Recombination selects genes from parent strings and creates new strings.
The most used recombination methods (especially for binary strings) swap selected
genes between parent. This fashion use for example 1-point crossover, 2-point
crossover, uniform crossover etc. These methods are differed in the way of selection of
swapped genes.
There are further special types of recombination for various strings encoding arithmetic crossover for value encoding, tree crossover for tree encoding etc.
Mutation
Mutation operator randomly alters some elements of the individual string, which causes
enhancement of genetic information in population.
The genetic information
enhancement allows continuation of evolution regardless of limited set of individuals.
The most used mutation method is binary inversion - selected bits are inverted.
There are special types of recombination for various strings encoding also - number
adding for value encoding, order changing for permutation encoding, changing operator
for tree encoding etc.
Page 8
Genetic Algorithm for Shape Detection
Image Preprocessing
Such as most shape detection algorithms, we assume that input image was transformed
by using any edge detection algorithm. For transformed image E(x,y) we define:
(1a)
E ( x, y ) = 1
for edge point
E ( x, y ) = 0
otherwise,
Where x and y are horizontal and vertical coordinates in the transformed image.
Alternatively a value of E(x,y) don’t need to be binary, but it can express a probability
that point [x,y] belongs to edge:
(1b) E ( x, y ) ∈ 0;1
E ( x, y ) = 0
when point [x,y] doesn’t belongs to edge certainly
E ( x, y ) = 1
when point [x,y] belongs to edge certainly
String Encoding
Let [ai,bi] represents of horizontal and vertical co-ordinates of object shape. Example is
in figure 1, where the highlighted points can represent a circle.
Figure 1 - circle points
Let us define a general transform function T which transform the co-ordinates of the
searched shape [ai,bi] to the image plane co-ordinates [xi,yi].
(2)
[ xi , yi ] = T( ai , bi , p) ,
where p represents parameters of transformation. The transformation T can include for
example a shape movements, spin, stretch etc.
For our shape detection task we consider, that the values [ai,bi] are constant and some
instance of shape (given by transformation, e.g. by miscellaneous values of p) is
searched.
The transformation parameters is used the natural value encoding.
Fitness Function
Now let us define fitness (objective) function so that it gives maximal value when the
points [xi,yi] represent a shape in the image E. It is accomplished if we define function F
as follows in equation 3:
(3)
F ( p) =
N −1
∑ E (T (a , b , p))
i
i
i =0
Page 9
Obviously, the shape detection (localisation) task is equivalent of task of searching of
maxima of function F over all parameters p.
Although definition (3) is formally correct but there is a problem with the search for the
maximum. Since this fitness function has a spike of maximum only when the
transformed shape is accurately equal to shape in image E, no optimisation technique
(including GA) can’t be successfully used to search for this maximum.
This problem can be eliminated by redefine of fitness function F. Let us define a fitness
function F as follows:
(4)
F ( p) =
N −1
∑ R(T (a , b , p))
i
i
i =0
where R is defined as:
(5a)


1
R ( x , y ) = MAX  E ( x + q, y + w) −
q2 + w2  ,
∀q , w 
d

or computationally less intensive:
(5b)
(
)
1


R ( x , y ) = MAX  E ( x + q, y + w) −
q + w ,
∀q , w 
d

where d is a positive constant.
The fitness function (4) takes into account nearest edge points, so as the bigger length of
nearest edge points were penalised. Penalisation is determined by length constant d. The
choice of the constant d depends on the image structure. Generally better convergence
of the solution can be achieved when the constant d is large, but too large value is not
suited for real (and noised) images. Disadvantage of using the too large value is a
possibility of detection of non-existing circles with radius smaller or comparable to the
length constant d, which can happen especially for images damaged by noise.
Note that equation (5) can be usable for binary images according definition (1a) and
without change also according (1b).
Also note that in order to save the computer requirements, the equation (5) can be
partitioned into two parts - image plane transformation (equation 5) that can be precomputed ones before GA processing and own fitness function (4).
Selection
Steady-State selection is used for proposed algorithm. Main idea of this selection is that
big part of chromosomes should “survive” to next generation.
This is one of the simplest versions of genetic algorithm. One step of this algorithm
consists of three substeps:
- mutation – randomly selected strings are mutated
- crossover – two randomly selected strings are crossovered (theirs parts are swapped)
- selection – the worst string (solution) from population is replaced by the best string.
Mutation
Several mutation algorithms were tested in proposed genetic algorithm.
As a basic method the binary inversion was used.
Page 10
For chosen arithmetic string encoding, an arithmetic mutation was tested. The
individual string values are (independently with some probability) perturbated.
Perturbatiton is chosen so as enhance a probability of small variation.
Next method is proposed especially for shape detection task. It is partially shapedependent.
Proposed mutation method is based on assumption that a shape instance (given by
string) has one shape point located correctly. The string (e.g. coefficients of shape
transformation) is then randomly changed so as to accomplished condition that chosen
point is located on the same place also for new string (transformation).
This is demonstrated in figure 2, where the point marked by arrow is chosen to be
stable. The grey circles are potential circles after mutation (one is stretched, the other is
rotated).
Figure 2 - Circle mutation
For polygonal shapes (or shape parts) a mutation can be extended to operation of
movement in direction of derivation of shape in selected points.
The mutation rate (probability of mutation in one step of GA) can also be adapted in
according to GA convergence. The proposed approach increases a mutation rate when
the best fitness does not improve for some time.
Crossover
Two types of crossover can be used for chosen string encoding – binary crossover or
arithmetic crossover.
As a binary crossover a 2-point crossover is used. This crossover operator uses two
randomly chosen crossover points. Strings swap the segments between these two points.
Arithmetic crossover independently with some probabilities swaps individual values
from string.
Page 11
Shape Detection Experiments and Results
There are shown some experiments results and method comparison in this chapter. The
genetic algorithm suggested in previous chapter is tested.
Circular shape detection
The detection of circles and ellipses is described in this chapter. Note that presented
basic principles can be also used on circular parts of complex shapes.
Image preprocessing
If there is no other information, let us assume an image as follows:
- condition (1b) is accomplished
- 256*256 points size.
The tested images are created artificially. Because in real applications the images are
very often corrupted (corrupted by noise, incomplete, smoothed, etc) the next three
types of corruption are usually tested - image with fuzzy edges, image corrupted by
noise and image with fuzzy edges and noise together.
In image with fuzzy edges are edges fuzzed randomly in the range ±4 pixels. In noise
corrupted image, an additive noise with uniform distribution in range <-1;1> is used.
The image examples are shown in figure 1.
GA Image preprocessing
For GA fitness evaluation is the image preprocessed by equation (5b). This equation has
significantly lower computation demands (than equation 5a) and GA detection results
for both proposed equations are comparable.
The example of the source image and the image processed by equation (5b) is shown in
figure 3. Note that preprocessed image is shown in negative.
Figure 3 - (a) source image (b) preprocesed image
Shape description and encoding
Let us assume that circular shape (e.g. circle or ellipse) detection is needed. A shape
must be detected for all sizes and all spin.
Circle and ellipse shape co-ordinates can be both described by equation (6). These tasks
are differed only in required transform coefficients.
Page 12
(6)
 2πi 
ai = cos

 N 
 2πi 
bi = sin
,
 N 
where i ∈< 0; N − 1 >
The required transform is described as follows:
(7)
ai ' = x 0 + ai k x cos(α ) + bi k y sin(α ) ,
bi ' = y 0 + a i k x sin(α ) + bi k y cos(α ) ,
where ai and bi are shape description coordinates, ai´ and bi´are transformed shape
coordinates and x0 ,y0, kx, ky, α are transform coefficients.
The x0 ,y0 are coordinates of transformed shape centre, kx, ky are coefficients of shape
stretch in orthogonal direction and α is spin of transformed shapes.
GA string for representation of ellipse transformation includes all above shown
coefficients, e.g. x0 ,y0, kx, ky and α.
If there is any more knowledge about explored shape the number of used coefficients
can be lowered.
For example, for circle transformation the coefficients kx, ky must be equal and value of
α has no meaning. So GA string for representation of circle transformation includes:
transformed shape coordinates x0 ,y0 and shape stretch k (k is equal to kx, ky).
Genetic operators
In previous chapters was mentioned that genetic operators which can be used for shape
detection task. In this chapter, we will discus an influence of these operators on our
task.
For the proposed transformation (7), the maximal number of the transform coefficient
(e.g. string items) is five. This transform includes most significant task - from simple
template matching (only two coefficients - position of shape - are unknown) to full
planar shape localisation (position, stretch (in both direction) and spin of object are
unknown).
Firstly we discuss a right number of strings in population. The 16÷32 strings appear to
be optimal for miscellaneous images and number of string items, but there are not
usually significantly worse results for any number in range 8 to 64.
The three types of mutation operator was noticed - binary mutation, arithmetic
mutation and mutation proposed for shape detection. The best one is combination of the
mutation proposed for shape detection and the arithmetic mutation. The solo arithmetic
mutation is improper and gives worse results than the binary mutation. Differences
between the mutation operators are bigger for complex images or for when number of
the string coefficients is bigger.
The adaptation of mutation rate appears to be useful - the mutation rate is increased
when population fitness does not improve. Increment of the mutation rate also depends
on the population fitness. The mutation rate increments more for lower fitness and vice
versa.
The two types of crossover were noticed – the binary crossover and the arithmetic
crossover. There is no significant difference between both types.
Page 13
Detection
In this chapter some results of circle and ellipse detection by using proposed genetic
algorithm is shown. The quality of the results is represented as a number of genetic
algorithm steps (iterations) necessary to circle location.
The circular shape is represented by 32 points. The points are expressed in equation (6),
where N = 32.
As it was shown in previous chapter a circle detection task needs GA string with tree
coefficients. The transformation coefficients x0 and y0 are in range from 0 to 255, k is in
range from 16 to 128. A GA population with 16 strings is used.
The tested images are artificially created and they are shown in figure 4. Each of them
consists of 1 circle and 32 randomly placed lines. The first one (fig.4a) is the source
image without any corruption. In figures 4b, 4c and 4d there are shown corrupted
images.
Figure 4 – (a) Test image - circle, (b) with fuzzy edges,
(c) corrupted by noise, (d) with fuzzy edges and corrupted by noise
Note when edges are fuzzy, we can not expect that a circle in the image will be located
precisely. As correctly detected we will consider circles that fulfil the following
condition:
(8)
(x'− x0 ) + ( y'− y 0 ) + (k '−k ) ≤ 1 + f ,
where f is blur constant (4 points in this case), x0, y0, k are coefficients of GA string and
x', y', k' specifying the correct position of the circle in image.
The results (number of iteration to circle localisation) are summarised in table 1.
2
2
2
Page 14
Figure
Test image (fig.4a)
Test image with fuzzy edges (fig.4b)
Test image with noise (fig.4c)
Test image with fuzzy edges and noise (fig.4d)
Iterations
1500
2700
5400
8400
Table 1 – Results – number of iterations
As it was suggested, the ellipse detection task is rather similar to the circle detection
task. So ellipse detection also uses 32-points shape representation and the same range of
strings coefficients (e.g. x0 and y0 are in range 0 to 255, kx ,ky are in range from 16 to
128). The range of coefficient α is from 0° to 180°. Population with 16 strings are also
used.
The test image is shown in figure 5.
Figure 5 - Test image - ellipse
The results (number of iteration to ellipse localisation) are summarised in table 2.
Figure
Test image (fig.2)
Test image with fuzzy edges
Test image with noise
Test image with fuzzy edges and noise
Iterations
90000
120000
260000
700000
Table 2 – Results – number of iterations
There are also many special detection cases where one or several ellipse parameters are
known. Some results for these cases are shown in table 3.
It can be rather surprising that there are 1700 iterations for string with three items (x0,
y0, k) and 2000 iterations for string with only two items (x0, y0). This result is caused by
the proposed mutation, which evoke faster convergence. Also difference of number of
iterations between both strings with three items (x0, y0, k versus x0, y0, α) is caused by
the mutation.
Page 15
String items
x0, y0
x0, y0, kx or x0, y0, ky
x0, y0, α
x0, y0, kx, ky
x0, y0, kx, α or x0, y0, ky, α
Iterations
2000
1700
9000
13000
18000
Table 3 – Results – number of iterations
Polygonal shape detection
The detection of oblongs is described in this chapter. Presented basic principles can be
used also on others polynomial shapes or shape parts.
There is no difference in the image preprocessing. It is done by the same way as for
circular shape detection.
The genetic operators are also the same, only mutation is extended to random tangent
movement as was suggested.
Shape description and encoding
Let us assume that oblong shape detection is needed. A shape must be detected for all
sizes and all spins.
Oblong shape co-ordinates can be both described by (9):
(9)
a 4 i +0 = qi
a 4 i +1 = 1
a 4i + 2 = q N −i −1 a 4i +3 = −1
b4i +0 = 1
b4i +1 = q N −i −1
b4i + 2 = −1
b4i +3 = qi ,
where i ∈< 0; N − 1 > and qi =
2i
−1
N
The required transform is described as follows:
(10) ai ' = x 0 + ai k x cos(α ) + bi k y sin(α ) ,
bi ' = y 0 + a i k x sin(α ) + bi k y cos(α ) ,
where ai and bi are shape description coordinates, ai´ and bi´are transformed shape
coordinates and x0 ,y0, kx, ky, α are transform coefficients.
The x0 ,y0 are coordinates of transformed shape centre, kx, ky are coefficients of shape
stretch in orthogonal direction and α is spin of transformed shapes.
GA string for representation of oblong transformation will include all above shown
coefficients, e.g. x0 ,y0, kx, ky and α.
Detection
The oblong shape is represented by 32 points. The points are expressed in equation (9),
where N = 8.
The string coefficients x0 and y0 are in the range from 0 to 255, kx ,ky are in the range
from 16 to 128 and the range of coefficient α is from 0° to 180°. A GA population with
16 strings is used.
The tested images are artificially created and they are shown in figure 6. It consists of 1
oblong and 32 randomly placed lines.
Page 16
Figure 6 - Test image - oblong
The results (number of iteration to oblong localisation) are summarised in table 4.
Figure
Test image (fig.3)
Test image with fuzzy edges
Test image with noise
Test image with fuzzy edges and noise
Iterations
108000
134000
>1000000*
310000
*fitness function had no local maximum at shape position
Table 4 – Results – number of iterations
Page 17
Conclusion
One of the major problems of computer vision is the localisation of object of interest.
Much of the structural information in an image is encoded within the edges, which gives
the information about shape of object in image. Shapes are important information for
acquiring the notion of object. There are many imaging applications where image
analysis can be reduced to the analysis of shapes (e.g. analysis of cells, machine parts or
characters, Traffic sign detection, etc.).
A standard algorithm for circle detection is the Hough Transform or its variant, the
probabilistic Hough Transform. These algorithms are slow, memory intensive and have
a limited accuracy as the number of objects in the image increases.
The presented algorithm uses evolvable technique based on genetic algorithm for circle
localisation. It allows reduction of memory and computation requirements.
The proposed solution enables the shape detection in an image by using the genetic
algorithm. It is shown that a right choice of mutation operator can significantly reduce
the number of iterations steps needed to find the solution.
A genetic algorithm also allows simple integration of constrains conditions (e.g.
approximate location or dimensions of the circle) and herewith also reduces a
computational demands.
The computational and memory demands are lower than the demands of the standard
Hough transform. On the other hand, the advantage of the Hough transform is the
ability to find all circles in image altogether.
Page 18
References
[1] Poli R.; Genetic programming for Image Analysis; The university of Birmingham, Technical report
CSPR-96-1, 1996
[2] Sonka M., Hlavac V., Boyle R., Image Processing, Analysis and Machine vision; Brooks and Cole
Publishing 1998
[3] Whitley D., A Genetic Algorithm Tutorial, Statistics and Computing 1994, Volume 4, pp. 65-85
[4] Ilingworth J, Kittler J., A Survey of the Hough Transform, Computer Vision, Graphics, and Image
processing 1988, Volume 44, pp.87-116
[5] J.Serra; Image analysis and mathematical morphology; Academic press inc. Ltd 1982
[6] Mirriam Di Ianni, Ralf Diekrmann, Reinhard Lüling, Jürgen Shulze, Stefan Tschöke; Simulated
annealing and genetic algorithms for shape detection; Control and Cybernetics, vol. 25, No. 1, pp. 159175, 1996.
[7] R. Veltkamp and M. Hagedoorn; State-of-the-art in shape matching; Technical Report UU-CS-1999-27,
Utrecht University, the Netherlands, 1999
Page 19
Download