Edge detection
Date: 07/16/04

Typically, there are three steps to perform edge detection:

1. Noise reduction, where we try to suppress as much noise as possible without smoothing away the meaningful edges.
2. Edge enhancement, where we apply some kind of filter that responds strongly at edges and weakly elsewhere, so that the edges may be identified as local maxima in the filter's output. One suggestion is to use some kind of highpass filter.
3. Edge localization, where we decide which of the local maxima output by the filter are meaningful edges and which are caused by noise.

Edges are places in the image with strong intensity contrast. Since edges often occur at image locations representing object boundaries, edge detection is used extensively in image segmentation, when we want to divide the image into areas corresponding to different objects. Representing an image by its edges has the further advantage that the amount of data is reduced significantly while most of the image information is retained.

Since edges consist mainly of high frequencies, we can, in theory, detect edges by applying a highpass frequency filter in the Fourier domain, or by convolving the image with an appropriate kernel in the spatial domain. In practice, edge detection is performed in the spatial domain, because it is computationally less expensive and often yields better results.

1. Derivative filters

An image can be represented as a surface, with height corresponding to gray level value: the brightness function depends on the two coordinates in the image plane, z = f(x, y). As averaging of the pixels over a region is analogous to integration, differentiation can be expected to have the opposite effect and thus sharpen an image. Edges are pixels where the brightness function changes abruptly. We can describe changes of continuous functions using derivatives; since the brightness function depends on two variables, the coordinates in the image plane, operators describing edges are expressed using partial derivatives.

What is the smallest possible window we can choose? The smallest possible window for convolution consists of two adjacent pixels. We can calculate the difference of the gray level values of the two pixels; when this difference is high we say we have an edge passing between the two pixels. If a pixel's gray level value is similar to those around it, there is probably not an edge at that point. If a pixel has neighbors with widely varying gray levels, it may represent an edge at that point. In other words, an edge is defined by a discontinuity in gray level values.

The difference of the gray values of the two pixels is an estimate of the first derivative of the intensity function (the image brightness function) with respect to the spatial variable, along the direction in which we take the difference. This is because first derivatives are approximated by first differences in the discrete case:

    g_x = f(x+1, y) - f(x, y)   and   g_y = f(x, y+1) - f(x, y)        (1)

Calculating g_x at each pixel position is equivalent to convolving the image with a mask (filter) of the form [-1 +1] in the x direction, and calculating g_y is equivalent to convolving the image with the same filter oriented along the y direction.
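As a minimal sketch of Eq. (1) (assuming NumPy, with x indexing columns and y indexing rows; the example image is illustrative, not from the text):

```python
import numpy as np

def first_differences(f):
    """Estimate g_x and g_y by first differences, Eq. (1).

    The last column of g_x and the last row of g_y are left at zero,
    because the forward difference is undefined there.
    """
    f = f.astype(float)
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[:, :-1] = f[:, 1:] - f[:, :-1]   # f(x+1, y) - f(x, y)
    gy[:-1, :] = f[1:, :] - f[:-1, :]   # f(x, y+1) - f(x, y)
    return gx, gy

# A vertical step edge: g_x responds (value 6) between columns 3 and 4,
# while g_y stays zero everywhere.
img = np.array([[2, 2, 2, 2, 8, 8, 8, 8]] * 4)
gx, gy = first_differences(img)
```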
We can also express the gradient calculation as a pair of convolution operations,

    g_x(x, y) = h_x * f(x, y)   and   g_y(x, y) = h_y * f(x, y)        (2)

In the first output, produced by convolution with mask h_x, any pixel that has an absolute value larger than the values of its left and right neighbors is a candidate to be a vertical edge pixel. In the second output, produced by convolution with mask h_y, any pixel that has an absolute value larger than the values of its top and bottom neighbors is a candidate to be a horizontal edge pixel. The process of identifying local maxima as candidate edge pixels is called non-maxima suppression.

What happens when the image has noise? In the presence of noise, every small fluctuation in the intensity values will be amplified by differentiating the image. Common sense then dictates that we should first smooth the image with a lowpass filter and then find the local differences.

Consider, for example, a one-dimensional signal. We smooth the signal by replacing each intensity value with the average of three successive intensity values:

    A_x = (I_{x-1} + I_x + I_{x+1}) / 3        (3)

Then we estimate the derivative at position x by averaging the two differences between the value at position x and its left and right neighbors:

    F_x = [ (A_{x+1} - A_x) + (A_x - A_{x-1}) ] / 2 = (A_{x+1} - A_{x-1}) / 2        (4)

If we substitute from (3) into (4), we obtain:

    F_x = (I_{x+2} + I_{x+1} - I_{x-1} - I_{x-2}) / 6        (5)

It is obvious from this example that we can combine the two linear operations of smoothing and finding differences into one operation, if we use a large enough mask such as

    [-1/6  -1/6  0  1/6  1/6]

It is clear that the larger the mask used, the better the smoothing. But it is also clear that the more blurred the edge becomes, the more inaccurately its position will be specified. For an image, which is a 2D signal, one should use two-dimensional masks. The smallest mask that combines minimal smoothing with differencing is a 3 x 3 mask.

1.1 The Prewitt operator

Consider the 3 x 3 arrangement of pixels about the pixel (x, y):

    g_x(x, y) = f(x+1, y) - f(x-1, y)   and   g_y(x, y) = f(x, y+1) - f(x, y-1)        (6)

where the kernels are

          -1  0  1                -1  -1  -1
    h_x = -1  0  1   and   h_y =   0   0   0        (7)
          -1  0  1                 1   1   1

These are known as the Prewitt kernels.

1.2 Sobel operator

A similar pair of kernels are the Sobel kernels:

          -1  0  1                -1  -2  -1
    h_x = -2  0  2   and   h_y =   0   0   0        (8)
          -1  0  1                 1   2   1

This operator places an emphasis on pixels that are closer to the center of the mask. The Sobel operator is one of the most commonly used edge detectors. Clearly, the kernel h_x is sensitive to changes in the x direction, i.e. edges that run vertically or have a vertical component. Similarly, the kernel h_y is sensitive to changes in the y direction, i.e. edges that run horizontally or have a horizontal component.

Since edges correspond to strong illumination gradients, we can highlight them by calculating the derivatives of the image. This is illustrated for the one-dimensional case in Figure 1.

Figure 1: 1st and 2nd derivative of an edge illustrated in one dimension.

We can see that the position of the edge can be estimated from the maximum of the 1st derivative or from the zero-crossing of the 2nd derivative. Therefore we want a technique to calculate the derivative of a two-dimensional image. For a discrete one-dimensional function f(i), the first derivative can be approximated by f'(i) = f(i+1) - f(i); calculating this formula is equivalent to convolving the function with [-1 1]. Similarly, the 2nd derivative can be estimated by convolving f(i) with [1 -2 1].
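As a minimal sketch of Eq. (2) with the Sobel kernels of Eq. (8) (assuming NumPy and SciPy's ndimage.convolve; x is taken along columns, y along rows):

```python
import numpy as np
from scipy import ndimage

# Sobel kernels from Eq. (8); h_y is the 90-degree rotation of h_x.
hx = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
hy = hx.T

def sobel_gradients(f):
    """Estimate g_x and g_y by convolving with the two Sobel kernels."""
    f = f.astype(float)
    gx = ndimage.convolve(f, hx, mode='nearest')
    gy = ndimage.convolve(f, hy, mode='nearest')
    return gx, gy
```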
Different edge detection kernels based on the above formula enable us to calculate either the 1st or the 2nd derivative of a two-dimensional image. There are two common approaches to estimating the 1st derivative in a two-dimensional image: Prewitt compass edge detection and gradient edge detection.

Prewitt compass edge detection involves convolving the image with a set of (usually 8) kernels, each of which is sensitive to a different edge orientation. The kernel producing the maximum response at a pixel location determines the edge magnitude and orientation.

Gradient edge detection is the second and more widely used technique. Here, the image is convolved with only two kernels, one estimating the gradient in the x direction, g_x, the other the gradient in the y direction, g_y. Changes of the image function are detected from an estimate of the gray level gradient at a pixel: significant changes in the gray values in an image can be detected by using a discrete approximation to the gradient. The gradient is the two-dimensional equivalent of the first derivative and is defined as the gradient vector

    g = grad f(x, y) = (g_x, g_y)^T        (9)

The two gradients (in the x and y directions) computed at each pixel by Eq. (1), Eq. (7) or Eq. (8) can be regarded as the x and y components of the gradient vector (9). This vector is oriented along the direction of the maximum rate of increase of the gray level function f(x, y). There are two important properties associated with the gradient: the gradient magnitude and the gradient direction.

The outputs of the two convolutions are squared, added and square-rooted to produce the gradient magnitude

    |g| = sqrt( g_x^2 + g_y^2 )        (10)

which equals the maximum rate of increase of f(x, y) per unit distance in the direction of g. The magnitude of the gradient vector is the length of the hypotenuse of the right triangle having sides g_x and g_y, and it reflects the strength of the edge, or edge response, at any given pixel. It is common practice to approximate the gradient magnitude by absolute values:

    |g| = |g_x| + |g_y|   or   |g| = max( |g_x|, |g_y| )        (11)

From vector analysis, the direction of the gradient is

    theta = arctan( g_y / g_x )        (12)

Figure 2: The gradient direction gives the direction of maximal growth of the function, e.g. from black (f(x,y) = 0) to white (f(x,y) = 255). The closed lines in the illustration are lines of equal brightness; the orientation 0 points East, and the edge direction is perpendicular to the gradient direction.

Whichever operator is used to compute the gradient, the resulting vector contains information about how strong the edge is at that pixel and what its direction is. The edge magnitude is a real number. Any pixel whose gradient magnitude exceeds a specified threshold value T is said to be an edge pixel, and the others are not:

    e(x, y) = 1 (edge point)   if |g(x, y)| >= T
            = 0                if |g(x, y)| <  T        (13)

After having calculated the magnitude of the 1st derivative, we now have to identify those pixels corresponding to an edge. The easiest way is to threshold the gradient image, assuming that all pixels having a local gradient above the threshold represent an edge. An alternative technique is to look for local maxima in the gradient image, thus producing one-pixel-wide edges. A more sophisticated technique is used by the Canny edge detector: it first applies a gradient edge detector to the image and then finds the edge pixels using non-maximal suppression and hysteresis tracking.
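As a minimal sketch combining Eqs. (10), (12) and (13) (assuming NumPy and a user-chosen threshold T; arctan2 is used so that g_x = 0 does not divide by zero):

```python
import numpy as np

def gradient_edge_map(gx, gy, threshold):
    """Gradient magnitude (Eq. 10), direction (Eq. 12) and the
    thresholded edge map (Eq. 13)."""
    magnitude = np.sqrt(gx**2 + gy**2)
    direction = np.arctan2(gy, gx)
    edges = (magnitude >= threshold).astype(np.uint8)
    return magnitude, direction, edges
```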
There are three problems with this approach (thresholding of the gradient magnitude):

1. The boundaries between scene elements are not always sharp.
2. The noise in an image can sometimes produce gradients as high as, or even higher than, those resulting from meaningful edges.
3. With a simple threshold approach, the local maximum in gray level gradient associated with an edge lies at the summit of a ridge. Thresholding detects a portion of this ridge, rather than the single point of maximum gradient. The ridge can be rather broad in the case of diffuse edges, resulting in a thick band of pixels in the edge map.

How can we choose the weights of a 3 x 3 mask for edge detection?

    a11  a12  a13
    a21  a22  a23
    a31  a32  a33

Suppose we are going to use one such mask to calculate g_x and another to calculate g_y. Such masks must obey the following conditions:

1. The mask that calculates g_x must be produced from the mask that calculates g_y by a rotation of 90 degrees.

2. We do not want to give any extra weight to the left or the right neighbors of the central pixel, so we must have identical weights in the left and right columns:

    a11  a12  a11
    a21  a22  a21
    a31  a32  a31

3. Let us say that we want to subtract the signal "in front of" the central pixel from the signal "behind" it, in order to find local differences, and we want these two subtracted signals to have equal weights:

    -a11  -a12  -a11
     a21   a22   a21
     a11   a12   a11

4. If the image is absolutely smooth we want zero response, so the sum of all the weights must be zero. Therefore a22 = -2 a21:

    -a11   -a12  -a11
     a21  -2a21   a21
     a11    a12   a11

5. In the case of a smooth signal, and as we differentiate in the direction of the columns, we expect each column to produce zero output. Therefore a21 = 0:

    -a11  -a12  -a11
      0     0     0
     a11   a12   a11

We can divide these weights throughout by a11, so that finally the mask depends on only one parameter K = a12/a11:

    -1  -K  -1
     0   0   0
     1   K   1

If we choose K = 2, we have the Sobel masks for differentiating an image along the two directions (the mask for the other direction follows from the 90-degree rotation of condition 1).

2. Prewitt compass edge detection (not to be confused with the Prewitt operator for differential gradient edge detection)

Compass edge detection is an alternative approach to differential gradient edge detection (see the Roberts Cross and Sobel operators). When using compass edge detection the image is convolved with a set of (in general 8) convolution kernels, each of which is sensitive to edges in a different orientation. For each pixel the local edge gradient magnitude is estimated as the maximum response of all 8 kernels at this pixel location,

    |G| = max( |G_i| : i = 1, ..., n )

where G_i is the response of kernel i at the particular pixel position and n is the number of convolution kernels. The local edge orientation is estimated as the orientation of the kernel that yields the maximum response.

Various kernels can be used for this operation; for the following discussion we will use the Prewitt kernel. Two templates out of the set of 8 are shown in the figure. The whole set of 8 kernels is produced by taking one of the kernels and rotating its coefficients circularly. Each of the resulting kernels is sensitive to an edge orientation ranging from 0 to 315 degrees in steps of 45 degrees, where 0 degrees corresponds to a vertical edge. The maximum response |G| for each pixel is the value of the corresponding pixel in the output magnitude image. The values for the output orientation image lie between 1 and 8, depending on which of the 8 kernels produced the maximum response.

This edge detection method is also called edge template matching, because a set of edge templates is matched to the image, each representing an edge in a certain orientation.
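As a minimal sketch (assuming NumPy and SciPy; the circular rotation of the eight outer coefficients implements the "rotating its coefficients circularly" step described above):

```python
import numpy as np
from scipy import ndimage

def rotate45(kernel):
    """Shift the 8 outer coefficients of a 3x3 kernel one step around
    the border (a 45-degree rotation); the centre stays in place."""
    out = kernel.copy()
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    vals = [kernel[p] for p in ring]
    for p, v in zip(ring, vals[-1:] + vals[:-1]):
        out[p] = v
    return out

def compass_edges(f):
    """Prewitt compass edge detection: convolve with 8 rotated kernels
    and keep, per pixel, the maximum response (magnitude) and the index
    1..8 of the winning kernel (orientation)."""
    f = f.astype(float)
    k = np.array([[-1., 0., 1.],
                  [-1., 0., 1.],
                  [-1., 0., 1.]])        # 0 degrees: vertical edges
    responses = []
    for _ in range(8):
        responses.append(ndimage.convolve(f, k, mode='nearest'))
        k = rotate45(k)
    stack = np.stack(responses)
    magnitude = stack.max(axis=0)
    orientation = stack.argmax(axis=0) + 1   # values 1..8
    return magnitude, orientation
```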
The edge magnitude and orientation of a pixel are thus determined by the template that best matches the local area of the pixel. The compass edge detector is an appropriate way to estimate both the magnitude and the orientation of an edge. Whereas differential gradient edge detection needs a rather time-consuming calculation to estimate the orientation from the magnitudes in the x and y directions, compass edge detection obtains the orientation directly from the kernel with the maximum response. The compass operator is limited to (here) 8 possible orientations; however, experience shows that most direct orientation estimates are not much more accurate. On the other hand, the compass operator needs (here) 8 convolutions for each pixel, whereas the gradient operator needs only 2, one kernel being sensitive to edges in the vertical direction and one to the horizontal direction. The result for the edge magnitude image is very similar with both methods, provided the same convolution kernel is used.

As already mentioned, there are various kernels that can be used for compass edge detection. The most common ones are shown in Figure 3.

Figure 3: For every template, the set of all eight kernels is obtained by shifting the coefficients of the kernel circularly.

The results of using different templates are similar; the main difference is the different scale in the magnitude image. The advantage of the Sobel and Robinson kernels is that only 4 out of the 8 magnitude values must be calculated: since each pair of kernels rotated 180 degrees apart is antisymmetric, each of the remaining four values can be generated by inverting the result of the opposite kernel.

3. The Laplacian operator

An operator based on the 2nd derivative of an image is the Marr edge detector, also known as the zero crossing detector. Here, the 2nd derivative is calculated using a Laplacian of Gaussian (LoG) filter. The Laplacian has the advantage that it is an isotropic measure of the 2nd derivative of an image, i.e. the edge magnitude is obtained independently of the edge orientation by convolving the image with only one kernel. The edge positions are then given by the zero-crossings in the LoG image. The scale of the edges to be detected can be controlled by changing the variance of the Gaussian.

If we imagine an image to be a surface, with height corresponding to gray level, then it is clear that convolution with the Sobel kernels gives us first order derivatives that measure the local slope of this surface in the x and y directions. It is also possible to compute second order derivatives, which measure the rate at which the slope of the gray level surface changes with distance traveled in the x and y directions. These second order derivatives can be used for edge localization. The Laplacian operator combines the second order derivatives as follows:

    laplacian f(x, y) = d^2 f / dx^2 + d^2 f / dy^2        (14)

A simple digital approximation to the Laplacian over a 3 x 3 neighborhood is given by the kernel

    0   1  0
    1  -4  1        (15)
    0   1  0

Consider the example

    2  2  2  2  2  8  8  8  8  8
    2  2  2  2  2  8  8  8  8  8
    2  2  2  2  2  8  8  8  8  8
    2  2  2  2  2  8  8  8  8  8
    2  2  2  2  2  8  8  8  8  8
    2  2  2  2  2  8  8  8  8  8

Figure: A simple image containing a vertical step edge.

Convolving with the kernel (15), each row of the output reads

    0  0  0  0  6  -6  0  0  0  0

In areas of constant value the convolution mask returns zero. In this example, the zero crossing, corresponding to the edge in the original image, lies halfway between the two center pixels (the +6 and -6 responses).
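The example can be reproduced directly; this minimal sketch assumes NumPy and SciPy (mode='nearest' replicates border pixels, so the flat regions still return zero):

```python
import numpy as np
from scipy import ndimage

# Digital Laplacian of Eq. (15).
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

# The step-edge image from the example above.
img = np.array([[2, 2, 2, 2, 2, 8, 8, 8, 8, 8]] * 6, dtype=float)

response = ndimage.convolve(img, laplacian, mode='nearest')
# Every row of `response` reads 0 0 0 0 6 -6 0 0 0 0; the sign change
# between the +6 and -6 columns is the zero crossing marking the edge.
```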
The Laplacian alone is seldom used for edge detection, because the second derivative of the intensity is very sensitive to noise. We first need to apply a Gaussian filter.

4. Laplacian of Gaussian

1. The smoothing filter is a Gaussian.
2. The enhancement step is the second derivative (the Laplacian in two dimensions).
3. The detection criterion is the presence of a zero crossing in the second derivative, together with a corresponding large peak in the first derivative.

In this approach, the image is first convolved with a Gaussian filter:

    g(x, y) = laplacian[ G(x, y, sigma) * f(x, y) ]        (16)

The order of performing differentiation and convolution can be interchanged because of the linearity of the operators involved:

    g(x, y) = [ laplacian G(x, y) ] * f(x, y)        (17)

where

    G(x, y) = exp( -(x^2 + y^2) / 2 sigma^2 ) = G(r) = exp( -r^2 / 2 sigma^2 )        (18)

and r^2 = x^2 + y^2 measures the distance from the origin. This substitution is reasonable, as the Gaussian is circularly symmetric, and it converts the 2D Gaussian into a 1D function that is easier to differentiate. The first derivative is then

    G'(r) = -(r / sigma^2) exp( -r^2 / 2 sigma^2 )        (19)

and the second derivative, the Laplacian of a Gaussian, is

    G''(r) = (1 / sigma^2) ( r^2 / sigma^2 - 1 ) exp( -r^2 / 2 sigma^2 )        (20)

After returning to the original coordinates x, y and introducing a normalizing multiplicative coefficient c, we get the convolution mask of the LoG operator:

    h(x, y) = c ( (x^2 + y^2 - 2 sigma^2) / sigma^4 ) exp( -(x^2 + y^2) / 2 sigma^2 )        (21)

where c normalizes the sum of the mask elements to zero. The LoG operator is commonly called the Mexican hat. For example:

     0   0  -1   0   0
     0  -1  -2  -1   0
    -1  -2  16  -2  -1
     0  -1  -2  -1   0
     0   0  -1   0   0

4.1 Laplacian of Gaussian - Guidelines for Use

The LoG operator calculates the second spatial derivative of an image. This means that in areas where the image has a constant intensity (i.e. where the intensity gradient is zero), the LoG response will be zero. In the vicinity of a change in intensity, however, the LoG response will be positive on the darker side and negative on the lighter side. This means that at a reasonably sharp edge between two regions of uniform but different intensities, the LoG response will be:

- zero at a long distance from the edge,
- positive just to one side of the edge,
- negative just to the other side of the edge.

Figure 4: Response of a 1-D LoG filter to a step edge. The left-hand graph shows a 1-D image, 200 pixels long, containing a step edge. The right-hand graph shows the response of a 1-D LoG filter with Gaussian sigma = 3 pixels.
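A minimal sketch of this detector in code (assuming NumPy and SciPy; the sigma and the 6-sigma kernel-size rule are illustrative choices, not from the text): Eq. (21) is sampled to build the mask, the mean is subtracted so the coefficients sum to zero (the role the text assigns to c), and edges are read off as sign changes in the filtered image.

```python
import numpy as np
from scipy import ndimage

def log_kernel(sigma, size):
    """Sample the LoG mask of Eq. (21) on a size x size grid."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    r2 = x**2 + y**2
    h = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2))
    return h - h.mean()          # force the coefficients to sum to zero

def zero_crossings(response):
    """Mark pixels where the response changes sign against the
    right-hand or lower neighbour."""
    pos = response > 0
    edges = np.zeros_like(pos)
    edges[:, :-1] |= pos[:, :-1] != pos[:, 1:]
    edges[:-1, :] |= pos[:-1, :] != pos[1:, :]
    return edges

def marr_hildreth(f, sigma=2.0):
    """Smooth and differentiate in one convolution, then locate the
    zero crossings (the Marr edge detector described above)."""
    size = int(6 * sigma) | 1    # odd kernel spanning about +/- 3 sigma
    g = ndimage.convolve(f.astype(float), log_kernel(sigma, size))
    return zero_crossings(g)
```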
5. Difference of Gaussians (DoG) operator

It is possible to approximate the LoG filter with a filter that is just the difference of two differently sized Gaussians. Such a filter is known as a DoG filter (short for "Difference of Gaussians"). The common feature of the two types of receptive fields it models is that the centre and surround regions are antagonistic. The DoG operator uses simple linear differences to model this centre-surround mechanism, and the response R(x, t) is analogous to a measure of contrast between events happening in the centre and surround regions of the operator (see Figure 5).

Figure 5: A linear model of the space-time receptive field. (A) The spatial sensitivities of the centre and surround mechanisms both follow a Gaussian curve. (B) The signal from the surround mechanism is temporally delayed and added to the signal from the centre mechanism.

The difference of Gaussians (DoG) model is composed of the difference of two response functions that model the centre and surround mechanisms of retinal cells. Mathematically, the DoG operator is defined as

    DoG(x) = a_c G(x; sigma_c) - a_s G(x; sigma_s)        (22)

where G is a two-dimensional Gaussian operator at x:

    G(x; sigma) = ( 1 / 2 pi sigma^2 ) exp( -|x|^2 / 2 sigma^2 )        (23)

The parameters sigma_c and sigma_s are the standard deviations of the centre and surround Gaussian functions respectively. These Gaussian functions are weighted by the integrated sensitivities a_c (for the centre) and a_s (for the surround). The response R(x, t) of the DoG filter to an input signal s(x, t) at x during time t is given by

    R(x, t) = integral over w of DoG(w) s(x - w, t) dw        (24)

where the convolution is applied over a circular region centred at x (i.e. |w| denotes the distance from x, the centre of the operator).

5.1 Extending the DoG operator to detect changes in a sequence of images

6. Parametric edge models

Parametric models are based on the idea that the discrete image intensity function can be considered a sampled and noisy approximation of an underlying continuous or piecewise continuous image intensity function. We therefore try to model the image as a simple piecewise analytical function, and the task becomes the reconstruction of the individual piecewise functions: we try to find simple functions which best approximate the intensity values only in the local neighborhood of each pixel. This approximation is called the facet model (Haralick and Shapiro, 1992). These functions, and not the pixel values, are used to locate edges in the image.

To provide an edge detection example, consider the bi-cubic polynomial facet model

    f(x, y) = k1 + k2 x + k3 y + k4 x^2 + k5 xy + k6 y^2 + k7 x^3 + k8 x^2 y + k9 x y^2 + k10 y^3        (25)

where x and y are coordinates with respect to (0, 0), the center of the image plane neighborhood that is being approximated. We can calculate the coefficients using least squares methods and singular-value decomposition, or using 5 x 5 masks. We find an edge where the first directional derivative has a relative extremum and the second directional derivative has a zero crossing, in the direction of the first derivative (the gradient direction).

The first directional derivative in the direction theta is given by

    f'_theta(x, y) = (df/dx) sin(theta) + (df/dy) cos(theta)

where

    sin(theta) = k2 / sqrt(k2^2 + k3^2)   and   cos(theta) = k3 / sqrt(k2^2 + k3^2)

The second directional derivative in the direction theta is given by

    f''_theta(x, y) = (d^2 f/dx^2) sin^2(theta) + 2 (d^2 f/dx dy) sin(theta) cos(theta) + (d^2 f/dy^2) cos^2(theta)        (26)

We consider only points on the line through the origin in the direction theta, i.e. (x0, y0) = (rho sin(theta), rho cos(theta)). At such a point the second directional derivative is

    f''_theta(rho) = 6 ( k7 sin^3(theta) + k8 sin^2(theta) cos(theta) + k9 sin(theta) cos^2(theta) + k10 cos^3(theta) ) rho
                   + 2 ( k4 sin^2(theta) + k5 sin(theta) cos(theta) + k6 cos^2(theta) )
                   = 6 A rho + 2 B        (27)

Thus, there is an edge at (x0, y0) in the image if, for some rho with |rho| < rho0, where rho0 is the length of the side of a pixel, f''(rho) = 0 and f'(rho) != 0; we mark the pixel as an edge if this zero crossing is located within the boundaries of the pixel.

Other local shapes are possible besides an edge. For example, when A = 0 the first directional derivative along the line is linear, f'(rho) = 2 B rho + C, the second derivative is the constant f''(rho) = 2B, and the extremum of f lies at rho = -C/(2B). Then:

1. if B > 0 (f'' > 0), the point lies in a valley;
2. if B < 0 (f'' < 0), the point lies on a ridge;
3. if B = 0 (f'' = 0), the surface is locally a plane.        (28)
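As a minimal sketch of the edge test just described (assuming NumPy and that the ten facet coefficients have already been estimated for the neighborhood; k is indexed from 1 to match Eq. (25), rho0 defaults to one pixel side, and the f'(rho) != 0 check is noted but omitted for brevity):

```python
import numpy as np

def facet_edge_test(k, rho0=1.0):
    """Decide whether the facet with coefficients k[1..10] (Eq. 25,
    index 0 unused) has an edge inside the pixel: locate the zero of
    the second directional derivative f''(rho) = 6*A*rho + 2*B
    (Eq. 27) along the gradient direction and test |rho| < rho0."""
    grad = np.hypot(k[2], k[3])
    if grad == 0.0:
        return False                      # flat facet: no gradient, no edge
    s, c = k[2] / grad, k[3] / grad       # sin(theta), cos(theta)
    A = k[7]*s**3 + k[8]*s**2*c + k[9]*s*c**2 + k[10]*c**3
    B = k[4]*s**2 + k[5]*s*c + k[6]*c**2
    if A == 0.0:
        return False                      # f'' is constant, never crosses zero
    rho = -B / (3.0 * A)                  # root of 6*A*rho + 2*B = 0
    return abs(rho) < rho0                # a full test also checks f'(rho) != 0
```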