Genetic Algorithm for Shape Detection

University of West Bohemia in Pilsen Department of Computer Science and Engineering Univerzitni 8 30614 Pilsen Czech Republic Genetic Algorithm for Shape Detection Study Report Tomáš Mainzer Technical Report No. DCSE/TR-2002-06 March, 2002 Distribution: public Technical Report No. DCSE/TR-2002-06 March 2002 Genetic Algorithm for Shape Detection Tomáš Mainzer Abstract The localisation of graphical primitives is important task in image processing. This paper discusses problems of automatic computer-based detection and localisation of elementary shape in image. A genetic algorithm is used for this task. The using of genetic algorithm allows efficiently reduce time needed to scan a task state space. A modification of genetic algorithm for shape detection is presented. A review of basic algorithms for shape detection is given, followed by short genetic algorithm review. The specifics of using a genetic algorithm for presented task are presented. In addition, some experiment results are shown. The results are demonstrated on circle, ellipse and oblong detection. This paper is based upon work sponsored by the Ministry of Education of the Czech Republic under research and development project LN00B084. Copies of this report are available on http://www.kiv.zcu.cz/publications/ or by surface mail on request sent to the following address: University of West Bohemia in Pilsen Department of Computer Science and Engineering Univerzitni 8 30614 Pilsen Czech Republic Copyright © 2002 University of West Bohemia in Pilsen, Czech Republic Shape Detection Template matching The template matching is approach to shape detection that can be used to locate known object in an image to search for the specific patterns. The best match is based on some optimality criterion, which depends on object properties and object relation. Matched patterns can represent whole object of interest or some of it’s part. Because exact copy of the pattern of interest cannot be expected in the processed image (image is corrupted by noise, by geometric distortion etc.) some match criterion must be defined. The matching criterion can be defined in many ways from simple correlation up to complex approaches of graph matching. The match based segmentation is computational very intensive, but the process can be faster if some performance improvement is found. The most widely used improvements are - Test firstly image location with a high probability of math - Detect mismatch before all corresponding pixels have been tested - Match at lower resolution first Matching criteria can be defined in many ways. The simplest one is based on summation of pixel difference between a pattern and the searched image. The minimal difference between these is searched. Let f(x,y) be a processed image, and h(x,y) be a searched pattern. Some simple matching optimality criteria describing a match between f and h located at a position (x,y) are: (6.1) g ( x, y ) = max f ( x + i, y + j ) − h(i, j ) i, j (6.2) g ( x , y ) = Σ Σ f ( x + i , y + j ) − h (i , j ) i j Another general matching criterion is correlation between a pattern and the searched image data. When f is the processed image and h is a searched pattern, the correlation is defined: 2 (6.3) g ( x, y ) = f ( x , y ) o h ( x, y ) = Σ Σ( f ( x + i, y + j ) − h (i, j ) ) i j The location where g(x,y) is minimal identifies a position of the shape. A problem with using correlation is the enormous number of calculation that needs to be performed. The one of the most important strategies for reducing execution time is FFT convolution. The convolution in the spatial domain corresponds to the multiplication in the frequency domain. This is a way to reduce execution time, because a multiplication is simpler operation than a convolution, and there is very efficient fast Fourier transform (FFT) algorithm for transform spatial domain into frequency domain. (6.4) f ( x , y ) o h( x, y ) ⇔ F ( u, v ) ∗ H ( u, v ) While correlation is a powerful tool, it suffers from a significant limitation – the target image must be in the same size and rotational orientation as the corresponding area in the searched image. Page 1 Hough Transform The Hough transform is a technique used to find parameterised shapes in an image. There are many variations of the Hough transform that are differed in efficiency and robustness under certain conditions. In this chapter an introduction to Hough transform for lines, circles and ellipses is given. A major variation of these algorithms is also mentioned. Lines The standard equation for a line in x-y plane is (6.5) y = mx + b Let us define a new two-dimensional space, which we will refer as the parameter space. For each pixel (x,y) in the image the set of all possible lines that go through this point have a parameter pairs (m,b) given by linear equation (6.6) b = − xm + y For other pixels in image plane, next lines in the parameter space would be defined. If there are some collinear points in the image plane, so the corresponding lines in the parameter space are intersected to give the parameters of the line in the image plane. Figure 6.2 – Hough transform for lines - (a) image plane (b) corresponding parameter plane Localisation of the line in the image can be taken by finding the intersection of lines in the parameter space. The most common way of doing this is by using a histogram. To implement this approach, the parameter space is discretised – it has a form of twodimensional matrix, known as the accumulator. In figure 6.2 is shown an example of image plane and corresponding parameter plane. A problem can occur, when the parameter space (m,b) is discretised. The m-axis values varies from minus to plus infinity (e.g. vertical lines) and they can’t be covered by the histogram. Now, a different parameterisation is required. A line can be also expressed by using the equation: (6.7) x cos(θ ) + y sin(θ ) = ρ For a pixel (x,y) in the image plane the all possible lines that go though this point lie on a sinusoid in the parameter space. Collinear points in the image generate sinusoids, which also intersect in the parameter space, giving parameters of the line. Page 2 Unfortunately, for a complex images the histogram is rather complicated and finding of the maximums can be very difficult. Another problem is a fact, that maximum in the parameter space don’t need correspond with a line in the image. Circles A circle can be described by the three parameters (a,b,c) by equation (6.8) ( x − a ) 2 + ( y − b) 2 = c 2 In general, a three-dimensional parameter space is required. For each point in the image plane, the every corresponding points in the parameter space must be set. Whereas in the previous section, the task was reduced to finding the intersections of curves in two-dimensional parameter space, for a circle location the intersection of surfaces in a three dimensional parameter space must be found. Unfortunately, this algorithm would be very computationally intensive and memory demanding. There are two major disadvantages of this approach - Three-dimensional accumulator requires a large amount of memory - Increment the accumulator elements and test every element if it is a maximum is computationally intensive It may be noticed that the simplification of this task can be also based on a-priory knowledge of some parameters of detected circles. If there is known circle radius, the task is reduced and only two-dimensional accumulator can be used. For general circle detection task, several variations on the Hough transform have attempted to solve this problem. Some of these are described in following chapters. One used solution is to split the circle detection into the following two tasks: – Finding a potential circle centres in the image. – Finding a radius corresponding to each centre. Let it be considered any point on a circle. Note that the centre of the circle lies on the normal to the tangent at that point. If the normal associated with point of circle is founded, so this would intersect the centre of the circle. By recording the normals in an accumulator that is defined over the image plane, the points of intersection can be found by searching for maximum of the histogram. When a noisy image is processed, a problem with normal determination can occur. The position of the local maximum of the histogram is used as the potential circle centre. The circle radiuses corresponding to potential circle centres can be found as a maximum in one-dimensional histogram of distance from potential circle centre to image points. Note that if more than one circle is centred on the same position, so the several histogram maximums correspond to valid radiuses. Ellipses An ellipse can be uniquely defined by a five parameters – the ellipse centre (x,y), maximum and minimum radiuses of the ellipse (a,b) and angle of the major axis θ. So a five-dimension parameter space should be used. Direct apply of the Hough transform is became very impractical – it has a large requirement to computation perform and very unreal requirement to memory space. Analogous to circle detection task, the memory and computational demands can be significantly reduced by reorganising ellipse detection into two subtasks: – Finding potential ellipse centres in the image Page 3 – Finding the remaining three parameters associated with each ellipse centre To find ellipse centre the geometry feature of the ellipses can be used. In figure 6.3 is shown construction of centre intersection line. Let x1 and x2 be two points on an ellipse boundary. Let t be an intersection of these two tangents. Define m to be the midpoint of x1 and x2. The line tm is than passed through the centre of the ellipse. Figure 6.3 – construction of line though centre of ellipse The line tm is recorded in two dimensional histogram defined over the image plane by incrementing each histogram elements that the line intersects. Note that it is sufficient to record only mid-line originating from m and oriented away from t. An example of the histogram is shown in figure 6.4. Figure 6.4 – Hough transform – (a) Image plane (b) histogram Not all local maximums of the histogram correspond to ellipse centres. This can be seen in figure 6.5. There are six local maximums, but only three of them correspond to ellipse centres. The others anomalous maximums are generated by pixel pairs laying on different ellipses. Figure 6.5 – Hough transform – (a) Image plane (b) histogram Page 4 The second part of the task estimates the remaining three parameters. For each ellipse centre the image is translated so that ellipse centre is located at the origin. The remaining three parameters can be obtained by defining a three-dimensional parameter space and using standard Hough transform. However, in this case only a global maximum of the histogram is required. The adaptive Hough transform can be used to reduce memory requirement and computational demands in this case. After determination of the remaining three parameters associated with ellipse, it is necessary to perform a final validation step. To determine whether an ellipse exist in image, the number of pixels lying on (or near) the ellipse must be counted. If there is enough pixels, the ellipse in image is located. Adaptive Hough transform The adaptive Hough transform is an iterative algorithm, which reduces computation demands by evaluating the histogram at the high resolution near a global maximum and at low resolution elsewhere. The histogram is initially evaluated at very coarse resolution and the global maximum is found. The histogram is then limited to region surrounding the global maximum and is re-evaluated at higher resolution. This process is repeated and after a pre-set number of iterations is halted. Note that the same accumulator memory can be re-used in each iteration. In this way, memory requirement of the algorithm can be minimised. Probabilistic Hough transform To detect the object in an image it is not generally necessary to compute the Hough transform of every pixel in image. Excellent results can be still obtained by computing the Hough transform of only some, evenly distributed pixels in the image. Pixels are selected according the uniform probability distribution, typically with replacement (i.e. the same pixel may be chosen more than once). If sufficient pixels are chosen, then the probabilistic Hough transform is similar to computing the Hough transform of sub-sampled image. The advantage of probabilistic Hough transform is computational saving. Computing the Hough transform of only p-percent of the pixels requires only p-percent of the computations. The pixels proportion must be selected so as accuracy of standard Hough transform and Probabilistic Hough transform was comparable. It is found that a pixel proportion can be mostly in the range 5% ÷20%, depending on the specific application. There are also some methods to determine pixel proportion analytically. Randomised Hough transform The randomised Hough transform uses a different mechanism to accumulate points in the parameter space. This algorithm begins by choosing pixels in the image and finding the unique object that passes through these pixels. For instance, the randomised Hough transform for lines is based on choosing a pairs of pixels, for circles and ellipses it uses a tripled of pixels and its tangents. The parameters of this object are then recorded as a point in the parameter space. This is difference from the standard Hough transform where several-dimensional plane must be recorded for each pixel. Page 5 Repeating this process by randomly choosing more pixels, we find that clusters related to object appear in the parameter space. It is not necessary to choose all potential groups of pixels. Clusters are formed from a small random sample of pixels group. For this reason, the randomised Hough transform tends to be faster than standard Hough transform. As well as increased speed, another advantage of the randomised Hough transform is a reduction in memory requirements. Because the randomised Hough transform generates only individual points in the parameter space a more efficient data structure can be used for parameter space storage. Page 6 Genetic Algorithm Genetic algorithm (GA) is an adaptive method that mimics the metaphor of natural biological evolution. It is an optimisation technique that operates on population of individual solutions. Each individual solution (also called string or chromosom) represents a proposed solution of the solved problem. The theories of the natural selection are applied to this population to find solution. Next generation of the population is obtained by applying mutation, reproduction and selection stochastic operators to the population. With these operators, the population of solutions comes to (pseudo-)optimal solution of the problem. The quality of every individual solution is assessed by objective (fitness) function. This function is problem dependent and it is used to propagate good solutions into next generation. Generally, the genetic algorithm advantages are simple using, low memory demands, using of simple computation algorithm and ability of parallelism. Disadvantage of the genetic algorithm is non-deterministic work time and non-guarantee finding of best solution. Encoding GA (as an optimisation method) must optimise some set of variables (string) to minimise defined measure error. The least part of string (typically one bit) are called gen. Note that a method of coding is related with an algorithm of genetic operator algorithm. There are many potential ways to encode this set. The most used one is the binary encoding - the values are represented together a one binary string. The next one - value encoding - is based on encode of variables directly to proper type, e.g. as an integer, real number or enumerate. For this encoding it is often necessary to have also problem specific mutation and crossover operators. For some cases also permutation encoding can be chosen. It is useful for ordering problems (e.g. traveling salesman problem). For other special cases (e.g. evolving expressions or programs) the tree encoding scheme can be used. Of course, there are many other ways of encoding, which are depended mainly on the solved problem. Selection Generally a selection operator determines strings for recombination and mutation. Usually this selection is based on the objective value of strings. There are several selection method. Roulette wheel selection select chromosomes according to their fitness - the better chromosomes (e.g. better solution) have a major probability to be selected. This selection can cause problem when fitness are very differed each other in the population. Rank selection selects chromosome according to their order. In various selection methods, the idea of elitism can be used. Elitism prevents loosing the best solution (by mutation or crossover) from the population. Page 7 Recombination (reproduction, crossover) Recombination (also called reproduction or crossover) recombines the parts of parent strings. It is a way to combine two different results. Recombination selects genes from parent strings and creates new strings. The most used recombination methods (especially for binary strings) swap selected genes between parent. This fashion use for example 1-point crossover, 2-point crossover, uniform crossover etc. These methods are differed in the way of selection of swapped genes. There are further special types of recombination for various strings encoding arithmetic crossover for value encoding, tree crossover for tree encoding etc. Mutation Mutation operator randomly alters some elements of the individual string, which causes enhancement of genetic information in population. The genetic information enhancement allows continuation of evolution regardless of limited set of individuals. The most used mutation method is binary inversion - selected bits are inverted. There are special types of recombination for various strings encoding also - number adding for value encoding, order changing for permutation encoding, changing operator for tree encoding etc. Page 8 Genetic Algorithm for Shape Detection Image Preprocessing Such as most shape detection algorithms, we assume that input image was transformed by using any edge detection algorithm. For transformed image E(x,y) we define: (1a) E ( x, y ) = 1 for edge point E ( x, y ) = 0 otherwise, Where x and y are horizontal and vertical coordinates in the transformed image. Alternatively a value of E(x,y) don’t need to be binary, but it can express a probability that point [x,y] belongs to edge: (1b) E ( x, y ) ∈ 0;1 E ( x, y ) = 0 when point [x,y] doesn’t belongs to edge certainly E ( x, y ) = 1 when point [x,y] belongs to edge certainly String Encoding Let [ai,bi] represents of horizontal and vertical co-ordinates of object shape. Example is in figure 1, where the highlighted points can represent a circle. Figure 1 - circle points Let us define a general transform function T which transform the co-ordinates of the searched shape [ai,bi] to the image plane co-ordinates [xi,yi]. (2) [ xi , yi ] = T( ai , bi , p) , where p represents parameters of transformation. The transformation T can include for example a shape movements, spin, stretch etc. For our shape detection task we consider, that the values [ai,bi] are constant and some instance of shape (given by transformation, e.g. by miscellaneous values of p) is searched. The transformation parameters is used the natural value encoding. Fitness Function Now let us define fitness (objective) function so that it gives maximal value when the points [xi,yi] represent a shape in the image E. It is accomplished if we define function F as follows in equation 3: (3) F ( p) = N −1 ∑ E (T (a , b , p)) i i i =0 Page 9 Obviously, the shape detection (localisation) task is equivalent of task of searching of maxima of function F over all parameters p. Although definition (3) is formally correct but there is a problem with the search for the maximum. Since this fitness function has a spike of maximum only when the transformed shape is accurately equal to shape in image E, no optimisation technique (including GA) can’t be successfully used to search for this maximum. This problem can be eliminated by redefine of fitness function F. Let us define a fitness function F as follows: (4) F ( p) = N −1 ∑ R(T (a , b , p)) i i i =0 where R is defined as: (5a)   1 R ( x , y ) = MAX  E ( x + q, y + w) − q2 + w2  , ∀q , w  d  or computationally less intensive: (5b) ( ) 1   R ( x , y ) = MAX  E ( x + q, y + w) − q + w , ∀q , w  d  where d is a positive constant. The fitness function (4) takes into account nearest edge points, so as the bigger length of nearest edge points were penalised. Penalisation is determined by length constant d. The choice of the constant d depends on the image structure. Generally better convergence of the solution can be achieved when the constant d is large, but too large value is not suited for real (and noised) images. Disadvantage of using the too large value is a possibility of detection of non-existing circles with radius smaller or comparable to the length constant d, which can happen especially for images damaged by noise. Note that equation (5) can be usable for binary images according definition (1a) and without change also according (1b). Also note that in order to save the computer requirements, the equation (5) can be partitioned into two parts - image plane transformation (equation 5) that can be precomputed ones before GA processing and own fitness function (4). Selection Steady-State selection is used for proposed algorithm. Main idea of this selection is that big part of chromosomes should “survive” to next generation. This is one of the simplest versions of genetic algorithm. One step of this algorithm consists of three substeps: - mutation – randomly selected strings are mutated - crossover – two randomly selected strings are crossovered (theirs parts are swapped) - selection – the worst string (solution) from population is replaced by the best string. Mutation Several mutation algorithms were tested in proposed genetic algorithm. As a basic method the binary inversion was used. Page 10 For chosen arithmetic string encoding, an arithmetic mutation was tested. The individual string values are (independently with some probability) perturbated. Perturbatiton is chosen so as enhance a probability of small variation. Next method is proposed especially for shape detection task. It is partially shapedependent. Proposed mutation method is based on assumption that a shape instance (given by string) has one shape point located correctly. The string (e.g. coefficients of shape transformation) is then randomly changed so as to accomplished condition that chosen point is located on the same place also for new string (transformation). This is demonstrated in figure 2, where the point marked by arrow is chosen to be stable. The grey circles are potential circles after mutation (one is stretched, the other is rotated). Figure 2 - Circle mutation For polygonal shapes (or shape parts) a mutation can be extended to operation of movement in direction of derivation of shape in selected points. The mutation rate (probability of mutation in one step of GA) can also be adapted in according to GA convergence. The proposed approach increases a mutation rate when the best fitness does not improve for some time. Crossover Two types of crossover can be used for chosen string encoding – binary crossover or arithmetic crossover. As a binary crossover a 2-point crossover is used. This crossover operator uses two randomly chosen crossover points. Strings swap the segments between these two points. Arithmetic crossover independently with some probabilities swaps individual values from string. Page 11 Shape Detection Experiments and Results There are shown some experiments results and method comparison in this chapter. The genetic algorithm suggested in previous chapter is tested. Circular shape detection The detection of circles and ellipses is described in this chapter. Note that presented basic principles can be also used on circular parts of complex shapes. Image preprocessing If there is no other information, let us assume an image as follows: - condition (1b) is accomplished - 256*256 points size. The tested images are created artificially. Because in real applications the images are very often corrupted (corrupted by noise, incomplete, smoothed, etc) the next three types of corruption are usually tested - image with fuzzy edges, image corrupted by noise and image with fuzzy edges and noise together. In image with fuzzy edges are edges fuzzed randomly in the range ±4 pixels. In noise corrupted image, an additive noise with uniform distribution in range <-1;1> is used. The image examples are shown in figure 1. GA Image preprocessing For GA fitness evaluation is the image preprocessed by equation (5b). This equation has significantly lower computation demands (than equation 5a) and GA detection results for both proposed equations are comparable. The example of the source image and the image processed by equation (5b) is shown in figure 3. Note that preprocessed image is shown in negative. Figure 3 - (a) source image (b) preprocesed image Shape description and encoding Let us assume that circular shape (e.g. circle or ellipse) detection is needed. A shape must be detected for all sizes and all spin. Circle and ellipse shape co-ordinates can be both described by equation (6). These tasks are differed only in required transform coefficients. Page 12 (6)  2πi  ai = cos   N   2πi  bi = sin ,  N  where i ∈< 0; N − 1 > The required transform is described as follows: (7) ai ' = x 0 + ai k x cos(α ) + bi k y sin(α ) , bi ' = y 0 + a i k x sin(α ) + bi k y cos(α ) , where ai and bi are shape description coordinates, ai´ and bi´are transformed shape coordinates and x0 ,y0, kx, ky, α are transform coefficients. The x0 ,y0 are coordinates of transformed shape centre, kx, ky are coefficients of shape stretch in orthogonal direction and α is spin of transformed shapes. GA string for representation of ellipse transformation includes all above shown coefficients, e.g. x0 ,y0, kx, ky and α. If there is any more knowledge about explored shape the number of used coefficients can be lowered. For example, for circle transformation the coefficients kx, ky must be equal and value of α has no meaning. So GA string for representation of circle transformation includes: transformed shape coordinates x0 ,y0 and shape stretch k (k is equal to kx, ky). Genetic operators In previous chapters was mentioned that genetic operators which can be used for shape detection task. In this chapter, we will discus an influence of these operators on our task. For the proposed transformation (7), the maximal number of the transform coefficient (e.g. string items) is five. This transform includes most significant task - from simple template matching (only two coefficients - position of shape - are unknown) to full planar shape localisation (position, stretch (in both direction) and spin of object are unknown). Firstly we discuss a right number of strings in population. The 16÷32 strings appear to be optimal for miscellaneous images and number of string items, but there are not usually significantly worse results for any number in range 8 to 64. The three types of mutation operator was noticed - binary mutation, arithmetic mutation and mutation proposed for shape detection. The best one is combination of the mutation proposed for shape detection and the arithmetic mutation. The solo arithmetic mutation is improper and gives worse results than the binary mutation. Differences between the mutation operators are bigger for complex images or for when number of the string coefficients is bigger. The adaptation of mutation rate appears to be useful - the mutation rate is increased when population fitness does not improve. Increment of the mutation rate also depends on the population fitness. The mutation rate increments more for lower fitness and vice versa. The two types of crossover were noticed – the binary crossover and the arithmetic crossover. There is no significant difference between both types. Page 13 Detection In this chapter some results of circle and ellipse detection by using proposed genetic algorithm is shown. The quality of the results is represented as a number of genetic algorithm steps (iterations) necessary to circle location. The circular shape is represented by 32 points. The points are expressed in equation (6), where N = 32. As it was shown in previous chapter a circle detection task needs GA string with tree coefficients. The transformation coefficients x0 and y0 are in range from 0 to 255, k is in range from 16 to 128. A GA population with 16 strings is used. The tested images are artificially created and they are shown in figure 4. Each of them consists of 1 circle and 32 randomly placed lines. The first one (fig.4a) is the source image without any corruption. In figures 4b, 4c and 4d there are shown corrupted images. Figure 4 – (a) Test image - circle, (b) with fuzzy edges, (c) corrupted by noise, (d) with fuzzy edges and corrupted by noise Note when edges are fuzzy, we can not expect that a circle in the image will be located precisely. As correctly detected we will consider circles that fulfil the following condition: (8) (x'− x0 ) + ( y'− y 0 ) + (k '−k ) ≤ 1 + f , where f is blur constant (4 points in this case), x0, y0, k are coefficients of GA string and x', y', k' specifying the correct position of the circle in image. The results (number of iteration to circle localisation) are summarised in table 1. 2 2 2 Page 14 Figure Test image (fig.4a) Test image with fuzzy edges (fig.4b) Test image with noise (fig.4c) Test image with fuzzy edges and noise (fig.4d) Iterations 1500 2700 5400 8400 Table 1 – Results – number of iterations As it was suggested, the ellipse detection task is rather similar to the circle detection task. So ellipse detection also uses 32-points shape representation and the same range of strings coefficients (e.g. x0 and y0 are in range 0 to 255, kx ,ky are in range from 16 to 128). The range of coefficient α is from 0° to 180°. Population with 16 strings are also used. The test image is shown in figure 5. Figure 5 - Test image - ellipse The results (number of iteration to ellipse localisation) are summarised in table 2. Figure Test image (fig.2) Test image with fuzzy edges Test image with noise Test image with fuzzy edges and noise Iterations 90000 120000 260000 700000 Table 2 – Results – number of iterations There are also many special detection cases where one or several ellipse parameters are known. Some results for these cases are shown in table 3. It can be rather surprising that there are 1700 iterations for string with three items (x0, y0, k) and 2000 iterations for string with only two items (x0, y0). This result is caused by the proposed mutation, which evoke faster convergence. Also difference of number of iterations between both strings with three items (x0, y0, k versus x0, y0, α) is caused by the mutation. Page 15 String items x0, y0 x0, y0, kx or x0, y0, ky x0, y0, α x0, y0, kx, ky x0, y0, kx, α or x0, y0, ky, α Iterations 2000 1700 9000 13000 18000 Table 3 – Results – number of iterations Polygonal shape detection The detection of oblongs is described in this chapter. Presented basic principles can be used also on others polynomial shapes or shape parts. There is no difference in the image preprocessing. It is done by the same way as for circular shape detection. The genetic operators are also the same, only mutation is extended to random tangent movement as was suggested. Shape description and encoding Let us assume that oblong shape detection is needed. A shape must be detected for all sizes and all spins. Oblong shape co-ordinates can be both described by (9): (9) a 4 i +0 = qi a 4 i +1 = 1 a 4i + 2 = q N −i −1 a 4i +3 = −1 b4i +0 = 1 b4i +1 = q N −i −1 b4i + 2 = −1 b4i +3 = qi , where i ∈< 0; N − 1 > and qi = 2i −1 N The required transform is described as follows: (10) ai ' = x 0 + ai k x cos(α ) + bi k y sin(α ) , bi ' = y 0 + a i k x sin(α ) + bi k y cos(α ) , where ai and bi are shape description coordinates, ai´ and bi´are transformed shape coordinates and x0 ,y0, kx, ky, α are transform coefficients. The x0 ,y0 are coordinates of transformed shape centre, kx, ky are coefficients of shape stretch in orthogonal direction and α is spin of transformed shapes. GA string for representation of oblong transformation will include all above shown coefficients, e.g. x0 ,y0, kx, ky and α. Detection The oblong shape is represented by 32 points. The points are expressed in equation (9), where N = 8. The string coefficients x0 and y0 are in the range from 0 to 255, kx ,ky are in the range from 16 to 128 and the range of coefficient α is from 0° to 180°. A GA population with 16 strings is used. The tested images are artificially created and they are shown in figure 6. It consists of 1 oblong and 32 randomly placed lines. Page 16 Figure 6 - Test image - oblong The results (number of iteration to oblong localisation) are summarised in table 4. Figure Test image (fig.3) Test image with fuzzy edges Test image with noise Test image with fuzzy edges and noise Iterations 108000 134000 >1000000* 310000 *fitness function had no local maximum at shape position Table 4 – Results – number of iterations Page 17 Conclusion One of the major problems of computer vision is the localisation of object of interest. Much of the structural information in an image is encoded within the edges, which gives the information about shape of object in image. Shapes are important information for acquiring the notion of object. There are many imaging applications where image analysis can be reduced to the analysis of shapes (e.g. analysis of cells, machine parts or characters, Traffic sign detection, etc.). A standard algorithm for circle detection is the Hough Transform or its variant, the probabilistic Hough Transform. These algorithms are slow, memory intensive and have a limited accuracy as the number of objects in the image increases. The presented algorithm uses evolvable technique based on genetic algorithm for circle localisation. It allows reduction of memory and computation requirements. The proposed solution enables the shape detection in an image by using the genetic algorithm. It is shown that a right choice of mutation operator can significantly reduce the number of iterations steps needed to find the solution. A genetic algorithm also allows simple integration of constrains conditions (e.g. approximate location or dimensions of the circle) and herewith also reduces a computational demands. The computational and memory demands are lower than the demands of the standard Hough transform. On the other hand, the advantage of the Hough transform is the ability to find all circles in image altogether. Page 18 References [1] Poli R.; Genetic programming for Image Analysis; The university of Birmingham, Technical report CSPR-96-1, 1996 [2] Sonka M., Hlavac V., Boyle R., Image Processing, Analysis and Machine vision; Brooks and Cole Publishing 1998 [3] Whitley D., A Genetic Algorithm Tutorial, Statistics and Computing 1994, Volume 4, pp. 65-85 [4] Ilingworth J, Kittler J., A Survey of the Hough Transform, Computer Vision, Graphics, and Image processing 1988, Volume 44, pp.87-116 [5] J.Serra; Image analysis and mathematical morphology; Academic press inc. Ltd 1982 [6] Mirriam Di Ianni, Ralf Diekrmann, Reinhard Lüling, Jürgen Shulze, Stefan Tschöke; Simulated annealing and genetic algorithms for shape detection; Control and Cybernetics, vol. 25, No. 1, pp. 159175, 1996. [7] R. Veltkamp and M. Hagedoorn; State-of-the-art in shape matching; Technical Report UU-CS-1999-27, Utrecht University, the Netherlands, 1999 Page 19

Genetic Algorithm for Shape Detection

Related documents

Products

Support

Genetic Algorithm for Shape Detection

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib