Image Segmentation Algorithms
Comparison and Analysis of the Segmentation Algorithms

Divya Varghese
M.Tech, Computer Science and Engg
Nitte Meenakshi Institute of Technology
Bangalore, India
divyavarghese.123@gmail.com

Dr. Jharna Majumdar
Dean R&D, H.O.D Computer Science and Engg
Nitte Meenakshi Institute of Technology
Bangalore, India
Jharna.majumdar@gmail.com

Mrs. Bhuvaneshvari S Patil
Asst. Professor, Computer Science and Engg
Nitte Meenakshi Institute of Technology
Bangalore, India

Abstract— Image segmentation is the grouping of similar pixels based on properties such as color, intensity, texture, depth, and motion. Image segmentation and its performance evaluation are difficult but important problems in computer vision, because segmentation feeds further processing such as object recognition and data compression. In this paper we compare three region-based segmentation algorithms and, based on performance metrics, find the best segmentation algorithm for an image. The first algorithm is the mean-shift algorithm, the second is the watershed algorithm, and the third is the fast scanning algorithm.

Index Terms— Fast Scanning, Mean-shift, performance metrics, Segmentation, Watershed.

I. INTRODUCTION

Image segmentation [2][3] involves partitioning an image into a set of homogeneous and meaningful regions so that the pixels in each partitioned region possess an identical set of properties or attributes. Image segmentation algorithms are generally based on two basic properties of the intensity values of image pixels: discontinuity and similarity. Segmentation algorithms rely on different parameters of an image such as gray level, colour, texture, depth, or motion. The result of image segmentation is a set of segments that collectively cover the entire image. All the pixels in the same region are similar with respect to some property such as color, intensity, or texture, and adjacent regions differ with respect to the same characteristics.

A large number of segmentation techniques are available, but there is no general algorithm that performs the segmentation task well for every image. Available segmentation techniques include thresholding, region-based, clustering, classifier, neural network-based, and MRF model-based approaches [1].

Image segmentation is very important in digital image processing and computer vision applications. It is considered one of the most critical components of an image analysis and pattern recognition system, and an important operation for meaningful analysis and interpretation of the images acquired. The extraction of objects from the background of a digital image has been a challenging task in the field of digital image processing. With the increasing demand for complex image analysis and interpretation, the demand for accurate segmentation of images has also grown, and as a result many image segmentation methods and algorithms have been developed over the past few decades.

The goal of this paper is to analyze the performance of region-based segmentation methods. Segmentation serves as the initial analysis of the image, and its output is used as input for further computer vision tasks such as object detection, recognition, shape analysis, tracking, and applications of artificial intelligence and machine learning. Here we consider three general-purpose methods: the mean-shift, watershed, and fast scanning algorithms. The mean-shift method initially considers a bin value for a window and repeats the shifting procedure until convergence. The watershed method constructs dams across the regions. The fast scanning method groups pixels into an existing cluster based on a merging criterion.

The applications of image segmentation span several domains, such as medical science, analysis of remotely sensed images, fingerprint recognition, and traffic system monitoring.

The paper is organized as follows: Section 2 presents the algorithms for region-based segmentation.
Section 3 briefly introduces the performance measures. Section 4 presents the experimental results and analysis charts. A brief conclusion is given in Section 5.

II. ALGORITHMS FOR REGION-BASED SEGMENTATION

Region-based segmentation is a technique for determining regions directly. The basic properties of region-based segmentation are:
- Segmentation must be complete, that is, every pixel must be in a region.
- Points in a region must be connected in some predefined sense.
- The different regions must be disjoint.

A. Mean-Shift Segmentation (MS)

Mean-shift is used for clustering pixels within a specified range. The idea of mean-shift [4] is simple and is based on a histogram of the image. Let X = {x1, x2, ..., xN} be a set of N image pixels with property vectors V(xi). Let H(x) be the image histogram, an array of bins H0, H1, H2, ..., HNG, where 0 is the minimum gray-tone value and NG is the maximum gray-tone value. A window of contiguous histogram bins of some fixed length ws is created; in general ws is an odd number. The idea of mean-shift is to start with a window centered on a random bin and to shift the center of the window according to the data within it until it converges at a mode of the histogram. This mode defines a cluster, and the process is repeated until all bins are associated with one of the computed modes.

The shifting procedure is given by the following steps:
1. Initialize with a random seed that selects a bin, and select the window W centered on that bin.
2. Calculate the center of gravity, or weighted mean, wm of the histogram values in window W:

$$w_m = \sum_{b_i \in W} b_i \, f(b_i)$$

   where f(bi) is the count in bin bi normalized by dividing by the sum of counts in all the bins of window W.
3. Translate the search window W to the weighted mean wm.
4. Repeat the previous two steps until convergence.

The shift procedure is illustrated in Figure 1, which shows a window of length 9 of the histogram with bins 1 through 9, centered on bin 5. The first 5 bins have count 0, while the other 4 have nonzero values totalling 15. The new mean is calculated as

$$w_m = \frac{1(6) + 1(7) + 6(6) + 7(7)}{15} \approx 6$$

Figure 1: The 1-D histogram (bins 1 to 9), showing the initial window centre and the shift.

The convergence point for each window W is stored, and all the pixel values that satisfy the clustering criterion are grouped together.

B. Watershed Method (WM)

In past years the watershed transformation has proven to be a very useful and powerful tool for morphological image segmentation. The idea of the watershed construction is simple. A grey-scale picture is considered as a topographic relief, and every pixel of this digital image is assigned to the catchment basin of a regional minimum.

Numerous techniques for computing the watershed are available. Beucher and Lantuejoul [5] were the first to propose an immersion-based watershed algorithm. Another approach for computing catchment basins is described by Vincent and Soille [6], who simulate the flooding procedure: water comes up out of the ground and floods the catchment basins, and dams are built across the regions so that the water does not overflow into neighbouring regions. A further approach for computing the watershed transformation is based on the rainfalling simulation proposed by Alina N. Moga, Bogdan Cramariuc, and Moncef Gabbouj [7].

We use the flooding procedure [8] for the watershed segmentation technique. In this method, regions correspond to the catchment basins and contours are determined by the watershed lines.

Figure 2: Image shows the watershed regions.

As shown in Fig. 2, there are three kinds of regions in the watershed segmentation method:
- Grey level values, which represent the surface.
- Catchment basins, which represent the segmented regions.
- Watershed lines or dams, which represent the contours or boundaries of the regions.
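To make the flooding idea concrete before the step-by-step description, the following is a minimal sketch in plain Python rather than the implementation evaluated in this paper. It assumes a small 2-D list of integer gradient values and 4-connectivity. The names `flood_watershed`, `neighbours`, and `DAM` are hypothetical, and plateaus are handled in a simplified way (pixels left next to a dam on a plateau may start a new basin).

```python
from collections import deque

DAM = 0  # label reserved for watershed-line (dam) pixels


def neighbours(r, c, rows, cols):
    """Yield the 4-connected neighbours of (r, c) that lie inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            yield rr, cc


def flood_watershed(gradient):
    """Label catchment basins of a 2-D list of integer gradient values."""
    rows, cols = len(gradient), len(gradient[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 1

    def basin_labels_around(r, c):
        # Distinct basin labels (dams excluded) among the neighbours.
        found = set()
        for rr, cc in neighbours(r, c, rows, cols):
            if labels[rr][cc] not in (None, DAM):
                found.add(labels[rr][cc])
        return found

    # Raise the water level through the gradient values in increasing order.
    for h in sorted({v for row in gradient for v in row}):
        level = [(r, c) for r in range(rows) for c in range(cols)
                 if gradient[r][c] == h]
        # 1) Let already flooded basins grow into the pixels of level h.
        queue = deque(p for p in level if basin_labels_around(*p))
        while queue:
            r, c = queue.popleft()
            if labels[r][c] is not None:
                continue
            around = basin_labels_around(r, c)
            if len(around) == 1:
                labels[r][c] = next(iter(around))
                for rr, cc in neighbours(r, c, rows, cols):
                    if gradient[rr][cc] == h and labels[rr][cc] is None:
                        queue.append((rr, cc))
            else:
                labels[r][c] = DAM  # two basins meet here: build a dam
        # 2) Pixels still unlabelled at this level start new basins.
        for seed in level:
            if labels[seed[0]][seed[1]] is not None:
                continue
            labels[seed[0]][seed[1]] = next_label
            stack = [seed]
            while stack:
                r, c = stack.pop()
                for rr, cc in neighbours(r, c, rows, cols):
                    if gradient[rr][cc] == h and labels[rr][cc] is None:
                        labels[rr][cc] = next_label
                        stack.append((rr, cc))
            next_label += 1
    return labels


# Two valleys separated by a ridge of value 5 (hypothetical data).
example = [[1, 2, 5, 2, 1],
           [1, 3, 5, 3, 1]]
result = flood_watershed(example)  # -> [[1, 1, 0, 2, 2], [1, 1, 0, 2, 2]]
```

In the example the two valleys receive labels 1 and 2, and the ridge column is marked as a dam (label 0), mirroring the catchment basins and watershed lines described above.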
The procedure for the flooding process is as follows:
1. The pixels in the image are sorted in increasing order of their gradient values.
2. Starting with the lowest gradient altitude, the water gradually fills up the first catchment basin.
3. Suppose the flooding reaches a given level h. Every catchment basin whose corresponding minimum is smaller than or equal to h is assigned a unique label.
4. The pixels with gradient value h+1 are now examined. If a pixel has a labeled pixel as its neighbour, it is assigned the same label as its neighbour. If a pixel does not have a labeled pixel as its neighbour, it corresponds to a local minimum at level h+1.
5. This procedure is repeated until every pixel in the image has been assigned a label.
6. The flooding procedure terminates when the level is higher than the maximum gradient value; the region boundaries are given by the dams.

Figure 3: Steps in the flooding procedure.

Fig. 3(a) shows the flooding procedure, which starts from the minima and floods until it reaches a level h. Fig. 3(b) shows that once a region is flooded, neighbouring regions begin to merge, which is when a dam is constructed. Fig. 3(c) and 3(d) show the construction of dams to prevent the merging of regions.

C. Fast Scanning Method

The concept of the fast scanning algorithm [9] is to scan the whole image from the upper-left corner to the lower-right corner and to determine whether each pixel can be merged into an existing cluster. The merging criterion is based on an assigned threshold: if the difference between the pixel value and the average pixel value of the adjacent cluster is smaller than the threshold, the pixel is merged into that cluster. The concept is simple, and the steps of the algorithm are listed below.
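The scan-and-merge idea in the paragraph above can be sketched as follows, ahead of the formal steps. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fast_scan` is hypothetical, indices are 0-based, and each pixel is compared against the regions of its left and upper neighbours (the treatment of rows after the first is an assumption, since the listed steps spell out only the first row). Removal of small regions and the final sorting are omitted.

```python
def fast_scan(image, threshold):
    """Raster-scan region merging on a 2-D list of gray values.

    Returns (labels, mean, count), where mean[j] plays the role of A(j)
    and count[j] the role of B(j) in the text.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    mean = {}    # running mean of each region
    count = {}   # number of pixels in each region
    next_label = 1
    for m in range(rows):
        for n in range(cols):
            value = image[m][n]
            # Candidate regions: left neighbour, then upper neighbour.
            candidates = []
            if n > 0:
                candidates.append(labels[m][n - 1])
            if m > 0 and labels[m - 1][n] not in candidates:
                candidates.append(labels[m - 1][n])
            # Keep the admissible candidate whose mean is closest.
            best = None
            for j in candidates:
                diff = abs(value - mean[j])
                if diff <= threshold and (
                        best is None or diff < abs(value - mean[best])):
                    best = j
            if best is None:
                # No adjacent region is close enough: start a new one.
                best = next_label
                next_label += 1
                mean[best] = float(value)
                count[best] = 1
            else:
                # Merge the pixel and update the running mean and size.
                mean[best] = (mean[best] * count[best] + value) / (count[best] + 1)
                count[best] += 1
            labels[m][n] = best
    return labels, mean, count


# Hypothetical 2 x 3 image with a dark area on the left and a bright column.
labels, mean, count = fast_scan([[10, 11, 50],
                                 [10, 12, 52]], threshold=5)
# labels -> [[1, 1, 2], [1, 1, 2]]
```

The update of `mean[best]` is the running-mean formula A(j) = {A(j) B(j) + C[m, n]} / (B(j) + 1), so no region has to be re-averaged when a pixel joins it.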
We use C[m, n] (m = 1, 2, ..., M; n = 1, 2, ..., N) to denote the value of pixel [m, n], R[m, n] to denote the region into which pixel [m, n] is classified, A(j) to denote the mean of the pixels in the jth region, and B(j) to denote the number of pixels in the jth region.

(Step 1): Classify the first pixel [1, 1] as Region 1. Set R[1, 1] = 1, A(1) = C[1, 1], B(1) = 1, m = 1, n = 1, and j = 1.
(Step 2): Set n = n + 1 and scan the next pixel. If R[1, n−1] = j and
    Case 1: if |C[m, n] − A(j)| ≤ threshold, (1)
        then set R[m, n] = j and
        A(j) = {A(j) B(j) + C[m, n]} / (B(j) + 1), B(j) = B(j) + 1. (2)
    Case 2: if |C[m, n] − A(j)| > threshold, (3)
        then set R[m, n] = j + 1, A(j+1) = C[m, n], B(j+1) = 1, and j = j + 1.
(Step 3): Repeat Step 2 until n = N.
(Step 4): Set m = m + 1, n = 1, and scan the first pixel in the next row.
(Step 6): Repeat Step 3 and Step 4 until all the pixels in the image have been scanned.
(Step 7): If B(i) < Δ, delete Region i and assign the pixels in Region i to the adjacent regions. Isolated dots in an image (due to details or noise) may cause oversegmentation; this step avoids the problem.
(Step 8): Sort the regions according to B(i), i.e., the number of pixels within them.

III. PERFORMANCE MEASURE

Many segmentation methods have been developed, but there is still no satisfactory performance measure, which makes it hard to compare different segmentation methods. The success or failure of computerized analysis procedures is determined by the segmentation accuracy. This section discusses the performance metrics [10] used to assess the accuracy of the segmentation methods. The probability function of a gray-level image is estimated from its histogram as the count at each gray level divided by the total count.

A. Gray Level Energy

The gray level energy indicates how the gray levels are distributed.
It is formulated as (1), where E(x) represents the gray level energy with 256 bins and p(i) refers to the probability distribution function obtained from the histogram counts. The energy reaches its maximum value of 1 when an image has a constant gray level.

$$E(x) = \sum_{i=1}^{k} p(i)^2 \qquad (1)$$

A larger energy value corresponds to a lower number of gray levels, which means a simpler image. A smaller energy corresponds to a higher number of gray levels, which means a more complex image.

B. Discrete Entropy

The discrete entropy is a measure of image information content, interpreted as the average uncertainty of the information source. It is calculated as the sum, over all possible outcomes, of the probability of each outcome multiplied by the log of the inverse of that probability. The outcomes are the gray levels {x1, x2, ..., xk}, where k is the number of gray levels and p(i) is the probability distribution obtained from the histogram counts. It is formulated as (2).

$$H(x) = \sum_{i=1}^{k} p(i) \log_2 \frac{1}{p(i)} \qquad (2)$$

For image processing, the discrete entropy is a measure of how many bits are needed for coding the image data, which is a statistical measure of randomness. The maximal entropy occurs when all potential outcomes are equally likely. When the outcome is a certainty, the minimal entropy occurs, which is equal to zero. The discrete entropy represents the average amount of information conveyed by each individual image. When the image pixels are distributed among more gray levels, the entropy value increases.

C. Mutual Information

The notion of mutual information can be applied as another objective metric. The mutual information is a symmetric function, formulated in (3).

$$I(X;Y) = -\sum_{x} P_X(x) \log_2 P_X(x) + \sum_{x,y} P_{XY}(x,y) \log_2 \frac{P_{XY}(x,y)}{P_Y(y)} = H(X) - H(X|Y) \qquad (3)$$

where I(X;Y) represents the mutual information, and H(X) and H(X|Y) are the entropy and conditional entropy values. It is interpreted as follows: the information that Y can tell about X is equal to the reduction in uncertainty of X due to the existence of Y. In image segmentation, the better the match between the source and processed images, the smaller the value of the mutual information.

D. Normalized Mutual Information

The normalized mutual information is a well-defined measure that combines the discrete entropies and the mutual information. It is formulated as (4).

$$N\_MI = \frac{I(X;Y)}{\sqrt{H(X)\,H(Y)}} \qquad (4)$$

where I(X;Y) is the mutual information, and H(X) and H(Y) are the discrete entropies. As with the mutual information, the better the match between the source and processed images, the smaller the normalized mutual information.

E. Information Redundancy

Another symmetric information measure can be used to indicate redundancy in image segmentation. It reaches its minimum of zero when all variables are independent. It is formulated as (5). The larger the information redundancy, the greater the dependence between the two images.

$$R = \frac{I(X;Y)}{H(X) + H(Y)} \qquad (5)$$

IV. EXPERIMENT RESULTS AND ANALYSIS CHARTS

The tables below show the values obtained by the segmentation methods for the various performance metrics.

TABLE 1: PERFORMANCE COMPARISON BASED ON TIME FOR EXECUTION

        Time taken for execution
Image   Mean-shift   Watershed   Fast Scanning
1       0.078        0           1.469
2       0.078        0           1.469
3       0.078        0.016       1.391
4       0.078        0           2.235
5       0.079        0           1.703

From Table 1, it is observed that in all cases the fast scanning algorithm takes the most time for execution. The mean-shift method takes almost the same time for all the images. The watershed method is the fastest, taking almost negligible time for execution.

TABLE 2: PERFORMANCE COMPARISON BASED ON GREY LEVEL ENERGY

        Grey level energy values
Image   Mean-shift   Watershed   Fast Scanning
1       0.059921     0.023758    0.05049
2       0.180893     0.371033    0.05049
3       0.131012     0.111008    0.26503
4       0.238693     0.183929    0.224838
5       0.301506     0.383707    0.306076

From Table 2, the mean-shift method has the highest grey level energy for images 1 and 4, the watershed method for images 2 and 5, and the fast scanning method for image 3. Higher grey level energy corresponds to a lower number of grey levels.

TABLE 3: PERFORMANCE COMPARISON BASED ON DISCRETE ENTROPY

        Discrete entropy values
Image   Mean-shift   Watershed   Fast Scanning
1       2.531187     1.770925    0.940478
2       4.770844     4.157135    0.940478
3       4.214459     2.657318    1.303519
4       3.116527     3.005952    1.610933
5       4.517675     2.901501    1.886162

From the values in Table 3, it can be concluded that mean-shift has a greater discrete entropy value than the other two methods for every image. Entropy indicates the degree of randomness or uncertainty.

TABLE 4: PERFORMANCE COMPARISON BASED ON MUTUAL INFORMATION

        Mutual information values
Image   Mean-shift   Watershed   Fast Scanning
1       2            0.637502    2
2       5.611849     1.19488     2
3       5.287725     5.564091    8.892962
4       5.180542     0.153376    1.745973
5       9.775905     5.703765    9.405248

From Table 4, the mean-shift method has greater mutual information than the other methods in most cases. A better match between the source and processed images is indicated by a smaller value of mutual information, so the watershed method shows the best match with the source image for most images.

TABLE 5: PERFORMANCE COMPARISON BASED ON NORMALIZED MUTUAL INFORMATION

        Normalized information values
Image   Mean-shift   Watershed   Fast Scanning
1       0.598145     0.22794     1.387743
2       0.749991     0.171071    1.387743
3       0.787472     0.855992    3.367751
4       0.994224     0.22794     0.65911
5       1.275958     0.928939    2.686783

From Table 5, the fast scanning method has greater normalized information than the other methods in most cases. A better match between the source and processed images is indicated by a smaller value of normalized information, so the watershed method shows the best match with the source image for most images.

TABLE 6: PERFORMANCE COMPARISON BASED ON INFORMATION REDUNDANCY

        Information redundancy values
Image   Mean-shift   Watershed   Fast Scanning
1       0.287847     0.103024    0.635131
2       0.33998      0.075184    0.635131
3       0.354571     0.341729    1.336725
4       0.437975     0.066982    0.292611
5       0.558267     0.35884     1.121955

From Table 6, it is observed that the watershed algorithm has the lowest information redundancy for every image, and the fast scanning method has the highest information redundancy in most cases. The greater the value of information redundancy, the greater the dependence between the two images.

V. CONCLUSION

In this paper we have examined three segmentation techniques, namely the mean-shift, watershed, and fast scanning methods, and used several performance metrics to find the best segmentation algorithm. The tabulated values of the performance functions show that watershed segmentation is the best of the three algorithms. It takes very little time for execution. It has a very low value for the mutual information metric, which indicates that it matches the source image. It has a comparatively low value for discrete entropy, which means it has less uncertainty or randomness. It also has a low value for normalized information, which is again a measure of matching with the source image. We therefore conclude from the above performance metrics that the watershed-based segmentation algorithm is the best segmentation algorithm when compared to the mean-shift and fast scanning algorithms.
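As a supplementary illustration of the metrics defined in Section III, the sketch below computes equations (1) to (5) from flat sequences of gray values. It is a minimal sketch under stated assumptions, not the code used to produce the tables: all function names are hypothetical, the two images are assumed to have the same number of pixels, and the mutual information is evaluated as H(X) + H(Y) − H(X,Y), which equals H(X) − H(X|Y) in (3).

```python
import math
from collections import Counter


def distribution(values):
    """Empirical probability of each distinct value in a flat sequence."""
    total = len(values)
    return {v: n / total for v, n in Counter(values).items()}


def gray_level_energy(pixels):
    """Equation (1): sum of squared gray-level probabilities."""
    return sum(p * p for p in distribution(pixels).values())


def entropy(pixels):
    """Equation (2): discrete entropy in bits."""
    return sum(p * math.log2(1.0 / p) for p in distribution(pixels).values())


def mutual_information(x, y):
    """Equation (3), computed as H(X) + H(Y) - H(X, Y)."""
    joint = list(zip(x, y))  # paired gray values give the joint histogram
    return entropy(x) + entropy(y) - entropy(joint)


def normalized_mutual_information(x, y):
    """Equation (4); undefined when either image has zero entropy."""
    return mutual_information(x, y) / math.sqrt(entropy(x) * entropy(y))


def information_redundancy(x, y):
    """Equation (5); undefined when both images have zero entropy."""
    return mutual_information(x, y) / (entropy(x) + entropy(y))


# Hypothetical 4-pixel images, flattened to 1-D.
source = [0, 0, 1, 1]
same = [0, 0, 1, 1]
unrelated = [0, 1, 0, 1]
energy = gray_level_energy(source)                    # 0.5
bits = entropy(source)                                # 1.0
dependent = information_redundancy(source, same)      # 0.5
independent = information_redundancy(source, unrelated)  # 0.0
```

The last two values illustrate the statement in Section III.E: the redundancy is zero for independent images and grows with the dependence between them.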
Figure 4: The segmentation results of the mean-shift, watershed, and fast scanning algorithms.

REFERENCES

[1] Dr. P. Raviraj, Angeline Lydia, Dr. M. Y. Sanavullah, "An Accurate Image Segmentation Using Region Splitting Technique," GECJ: Computer Science and Telecommunications, 2011, No. 2 (31).
[2] Biplab Banerjee, Tanushree Bhattacharjee, Nirmalya Chowdhury, "Color Image Segmentation Technique Using 'Natural Grouping' of Pixels," International Journal of Image Processing (IJIP), 2010, Volume (4), Issue (4).
[3] Takumi Uemera, Gou Koutaki, and Keiichi Uchimura, "Image Segmentation based on edge detection using boundary code," IJICIC, Volume 7, No. 10, Oct 2011.
[4] Dingding Liu, Bilge Soran, Gregg Petrie, and Linda Shapiro, "A Review of Computer Vision Segmentation Algorithms," pg 9-11, 2010.
[5] S. Beucher and C. Lantuejoul, "Use of watersheds in contour detection," in International Workshop on Image Processing, Sep 1979.
[6] L. Vincent and P. Soille, "Watersheds in digital spaces: An efficient algorithm based on immersion simulations," IEEE PAMI, 1991, 13(6), 583-598.
[7] Alina N. Moga, Bogdan Cramariuc, and Moncef Gabbouj, "A parallel watershed algorithm based on rainfalling simulation," in European Conference on Circuit Theory and Design, vol 1, pg 339-342.
[8] Jong-Bae Kim, Hang-Joon Kim, "Multiresolution-based watersheds for efficient image segmentation," Pattern Recognition Letters, Elsevier, June 2002.
[9] Jian-Jiun Ding, Cheng-Jin Kuo, Wen-Chih Hong, "An efficient image segmentation technique by fast scanning and adaptive merging," 2008.
[10] Zhengmao Ye, "Objective Assessment of Nonlinear Segmentation Approaches to Gray Level Underwater Images," ICGST-GVIP Journal, ISSN 1687-398X, Volume (9), Issue (II), April 2009.