Fast Robust GA-Based Ellipse Detection Jie Yao*, Nawwaf Kharma*1 and Peter Grogono** *Electrical & Computer Engineering and **Computer Science Departments, Concordia University, 1455 Blvd. de Maisonneuve O., Montreal, Quebec, Canada, H3G 1M8. 1 kharma@ece.concordia.ca Abstract This paper discusses a novel and effective technique for extracting multiple ellipses from an image, using a Multi-Population Genetic Algorithm (MPGA). MPGA evolves a number of subpopulations in parallel, each of which is clustered around an actual or perceived ellipse. It utilizes both evolution and clustering to direct the search for ellipses – full or partial. MPGA is explained in detail, and compared with both the widely used Randomized Hough Transform (RHT) and the Sharing Genetic Algorithm (SGA). In thorough and fair experimental tests, utilizing both synthetic and real-world images, MPGA exhibits solid advantages over RHT and SGA in terms of accuracy of recognition - even in the presence of noise or/and multiple imperfect ellipses, as well as speed of computation. all potential optima, resulting in a dramatic increase in the GA’s ability to detect multiple ellipses. Indeed, not only is the leading edge of GA-based ellipse detection technique advanced, but also the MPGA’s detection ability and computational efficiency are superior to those of the widely-used Randomized Hough Transform (RHT) [1] or Sharing Genetic Algorithm [2]. Our MPGA algorithm employs both evolution and clustering to detect ellipses. In addition, the algorithm uses two complementary measures of fitness and specially designed form of mutation. This results in a robust algorithm that performs very well on images with multiple ellipses, imperfect ellipses, and in the presence of noise, with which the performance of RHT and SGA degrade considerably. Keywords 2. The Multi-Population Genetic Algorithm Genetic Algorithms, clustering, Sharing GA, Randomized Hough Transform, shape detection, ellipse detection. 1. Introduction In this paper, we propose a novel Multi-Population Genetic Algorithm (MPGA) for accurate and efficient detection of multiple imperfect/partial ellipses in noisy images. This capability that the algorithm provides is genuinely useful for many real-world image processing applications, such as: object detection, pattern recognition, scene characterization and event detection. Rather than evolving a single population, as in traditional Genetic Algorithms, we evolve a number of subpopulations in parallel, which simultaneously seek The algorithm starts with a single population, which is ranked in terms of both similarity and distance and searched for good candidates, if any. A clustering technique is then used to divide the chromosomes into a number of clusters (or subpopulations). All the subpopulations are evolved in parallel. If one of the subpopulations converges on an optimal or a suboptimal chromosome, that whole subpopulation and the corresponding ellipse (in the image) are removed. This has the positive side effect of accelerating the search process, since the rest of the subpopulations will have one less ellipse to search for. The program, as a whole, terminates when all (full and partial) ellipses are found, or when a pre-set maximum number of generations are reached. 1 2.1 Representation and Initialization 2.3 Termination Conditions Since an ellipse is represented using a minimum of 5 points, each chromosome contains 5 genes, each of which is a coordinate pair of a point. The MPGA algorithm creates an initial population of between 30 and 100 chromosomes, depending on the complexity of the target image. The five points comprising each new chromosome are selected, at random, from the set of foreground (or black) pixels in the target image. The termination of the evolution of any one of these subpopulations occurs independently of the termination of the others. A subpopulation terminates: if there exists an “optimal” chromosome; if there exists a “good” chromosome and its subpopulation has been stuck there for a long period; or a subpopulation is effectively terminated when it is merged with another subpopulation. 2.2 Fitness Evaluation In MPGA, fitness has two components: Similarity and Distance. A. Similarity (S) is defined as: E ( x i, y j ) d i, j ( x, y ) S #total (1) For a given point (x,y) on an ideal ellipse, the term E ( x i, y j ) returns 1 if there is a point in the target image that coincides with, or is close to (x,y); otherwise E ( x i, y j ) returns 0. The terms i and j represent the horizontal and vertical displacements, respectively, between a point on the ideal ellipse and the corresponding actual point in the image. #total is the total number of points on the ideal template. d i , j is defined as: di , j e |i | | j | 4 (2) B. Distance (D) is defined as: D d i, j ( x, y ) # eff (3) #eff is the total number of effective points. This item is used to estimate the average pixel-pixel distance between an actual elliptical segment and its ideal template. In MPGA, fitness is computed in the following manner: an ideal ellipse is computed from the 5 genes of each chromosome. By imposing this ellipse on the image, equations (1-3) are used to compute how this ellipse matches with the actual pixel distribution in terms of similarity and distance. 2.4 Clustering: Merging Migration, Splitting and In the MPGA algorithm, the subpopulations are manipulated through a clustering process, and each cluster is centered around a chromosome with the greatest similarity. This process involves Migration, Splitting and Merging. It is applied on all “good” chromosomes in each subpopulation. All poor chromosomes are simply left in their own clusters since they would not affect the evolution direction of their clusters. The Euclidean distance ED between a good chromosome and the various existing cluster centers determines whether this chromosomes remains in its own cluster or migrates. ED is computed as follows: given two sets of ellipse parameters (a1, b1, x1, y1, ω1) and (a2, b2, x2, y2, ω2): ED (a1 a2 ) 2 (b1 b2 ) 2 ( x1 x2 ) 2 ( y1 y 2 ) 2 (1 2 ) 2 (4) 5 (ai, bi) is the long and short axis, (xi, yi) is the center and ωi is the orientation. A. Migration. If the Euclidean distance between a chromosome and its own cluster centre is higher than a pre-defined threshold, this chromosome is moved to another cluster with which it actually belongs to. When a chromosome migrates from subpopulation A to subpopulation B, it replaces the weakest chromosome in the latter, and the vacancy in the former is filled by the processes of evolution. B. Splitting. If the algorithm is unable to find a cluster (with a centre) sufficiently close to a migrating chromosome, then a new cluster is created around this chromosome. This action is called splitting. C. Merging. As two different clusters may evolve toward a single (local or global) optima, any two clusters with centers closer to each other than an empirically-derived threshold d are merged. This is done by taking the fittest 50% (in terms of similarity) 2 of the chromosomes in each cluster and placing them in the new merged cluster. All clusters are checked periodically to see whether some of them could be merged. Merging is barred for the first 50 generations since early or/and frequent mergers would greatly reduce the diversity of the whole population. 2.5 Evolution: Selection and Diversification Evolution proceeds mainly via Selection and Diversification. Selection focuses the search on promising areas of the fitness surface. On the other hand, diversity is achieved via crossover and mutation, which together serve to direct the search towards new and potentially promising areas. In MPGA, selection is realized using Elitism and Fitness Proportional Selection. Diversification is realized via Crossover and Mutation. During evolution of a subpopulation, elitism copies the fittest 15% of the current generation (by similarity) into the next generation, without modification. Following that, two chromosomes are selected from the whole current population with a probability proportional to their relative fitness (similarity). The two chromosomes are then crossed-over (single point) to generate new offspring. For mutation, we define a new mutation operator configured specifically for our application. First a gene (or point) is randomly selected from the chromosome that we intend to mutate, which acts as the starting point for a path that traverses the perimeter of a pattern. If the path is long enough, the remaining genes (points) are also picked, at random, from this path. As long as the starting point lies on a promising candidate ellipse, it is highly possible that the other points will do so as well. This greatly enhances the possibility of mutating a given chromosome into a better one. Finally, every new chromosome introduced into the next generation is tested for goodness and further for migration-splitting, if it is good enough. 3. Experimental Results and Analysis This section compares the performance of the MPGA, SGA and RHT algorithms, using both synthetic and real-world images. algorithm’s performance on images with different numbers of ellipses. Hence, the first collection contains 50 images of single ellipses; the second collection contains images of two ellipses, and so on. Set B, on the other hand, is used to test the algorithm’s performance on noisy images. This set is made of 5 collections of 50 images each. The first, second, third, fourth and fifth collections contain images with 0%, 0.5%, 1%, 3% and 5% salt-andpepper noise, respectively. Fig. 1 shows that the accuracy of both SGA and RHT decreases from 100% for one ellipse to between 45 and 80% for seven ellipses. In contrast, MPGA outperforms the other two algorithms by maintaining an almost perfect level of detection accuracy, and slightly dipping to around 98%, for images with seven ellipses. Fig. 1: Detection Accuracy for Multiple Ellipses Fig. 2 demonstrates the clear advantage that MPGA has over both RHT and GAS in regards to speed time. For 5 or more ellipses, MPGA realizes a clear and widening gap in speed. Fig. 2: Detection Speed for Multiple Ellipses Fig. 3 and Fig. 4 show the performance of these algorithms on noisy images. Again, MPGA shows considerable robustness in the presence of salt-andpepper noise, while simultaneously operating faster than the other two algorithms. 3.1 Synthetic Images The synthetic images are comprised of two sets, set A and set B. Set A is partitioned into 7 collections of 50 images each. These collections are used to test the 3 Fig. 3: Detection Accuracy for Noisy Images Algorithms Accuracy (%) Ave. Time (sec) False Positives (%) MPGA 92.761 134.58 6.905 SGA 15.113 370.39 0 RHT 64.387 809.73 18.633 Table 1: Overall Performance for Real-World Images From the table we can see that SGA is almost totally ineffective with complicated real world images with an average accuracy of less than 20%. RHT suffers from long computation times and high false positive rate. MPGA is, by all measures, the best performer. Fig. 4: Detection Speed for Noisy Images Many false positives are observed for RHT with high noise, as Fig. 5 shows. MPGA is almost not affected by noise at all. Fig. 5: Percentage of False Positives 3.2 Real World Images We have amassed three databases: hand-written English letters, microscopic cell images and road signs. From each of those databases, 50 images, are selected at random, and then used for assessing the real-world performance of MPGA, RHA and SGA. Fig. 6 shows a typical handwritten digit with elliptical curves detected, and a pre-processed images of cells with all the cells (typically elliptical in shape) detected by the MPGA algorithm. 4. Conclusions This paper presents a new GA, which uses multiplepopulations and an elaborate process of evolutionclustering to efficiently and accurately detect multiple potentially deformed ellipses in noisy images. This algorithm was thoroughly tested on a large number of synthetic and three types of real-world images, and compared to the very widely-used RHT and one of the best-available GA-based ellipse detection techniques: SGA. Despite the conceptual complexity of the MPGA algorithm, this algorithm performs more efficiently and more accurately than both RHT and SGA. References [1] Lei Xu, E.Oja, & P.Kultanen, “A New Curve Detection Method: Randomized Hough Transform (RHT)”, Pattern Recognition Letters, Vol. 11, No. 5, May 1990, pp. 331-338. [2] Evelyne Lutton and Patrice Martinez, “A Genetic Algorithm for the Detection of 2D Geometric Primitives in Images”, Proceedings of the 12th International Conference on Pattern Recognition, Jerusalem, Israel, 9-13 October 1994, Vol. 1, pp. 52652. Fig. 6 Detected Ellipses in Real-world Images 4