Edge detection
Date: 07/16/04
Typically, there are three steps to perform edge detection:
1. Noise reduction, where we try to suppress as much noise as possible without smoothing away the meaningful edges.
2. Edge enhancement, where we apply some kind of filter that responds strongly at edges and weakly elsewhere, so that the edges may be identified as local maxima in the filter's output. One suggestion is to use some kind of highpass filter.
3. Edge localization, where we decide which of the local maxima output by the filter are meaningful edges and which are caused by noise.
Edges are places in the image with strong intensity contrast. Since edges often occur at image locations representing object
boundaries, edge detection is extensively used in image segmentation when we want to divide the image into areas
corresponding to different objects. Representing an image by its edges has the further advantage that the amount of data is
reduced significantly while retaining most of the image information.
Since edges consist mainly of high frequencies, we can, in theory, detect them by applying a highpass frequency filter in the Fourier domain or by convolving the image with an appropriate kernel in the spatial domain. In practice, edge detection is performed in the spatial domain, because it is computationally less expensive and often yields better results.
1. Derivative filters
An image can be represented as a surface, with height corresponding to gray-level value: the brightness function depends on two variables, the coordinates in the image plane, so the gray-level value is z = f(x, y).
As averaging of the pixels over a region is analogous to integration, differentiation can be expected to have the opposite effect and thus sharpen an image.
Edges are pixels where the brightness function changes abruptly. We can describe changes of continuous functions using derivatives. Since the brightness function depends on the two image-plane coordinates, operators describing edges are expressed using partial derivatives.
What is the smallest possible window we can choose?
The smallest possible window we can choose for convolution consists of two adjacent pixels. We can calculate the difference of the gray-level values of the two pixels. When this difference is high, we say an edge passes between the two pixels. If a pixel's gray-level value is similar to those around it, there is probably not an edge at that point. If a pixel has neighbors with widely varying gray levels, it may represent an edge at that point. In other words, an edge is defined by a discontinuity in gray-level values.
The difference of the gray values of two pixels is an estimate of the first derivative of the intensity function (image brightness function) with respect to the spatial variable along the direction in which we take the difference. This is because first derivatives are approximated by first differences in the discrete case:
$g_x = f(x+1, y) - f(x, y)$ and $g_y = f(x, y+1) - f(x, y)$   (1)
Calculating $g_x$ at each pixel position is equivalent to convolving the image with a mask (filter) of the form $\begin{bmatrix} -1 & +1 \end{bmatrix}$ in the x direction, and calculating $g_y$ is equivalent to convolving the image with the filter $\begin{bmatrix} -1 \\ +1 \end{bmatrix}$ in the y direction. We can also express the gradient calculation as a pair of convolution operations,

$g_x(x, y) = h_x * f(x, y)$ and $g_y(x, y) = h_y * f(x, y)$   (2)
In the first output, produced by convolution with mask $h_x$, any pixel that has an absolute value larger than the values of its left and right neighbors is a candidate vertical edge pixel. In the second output, produced by convolution with mask $h_y$, any pixel that has an absolute value larger than the values of its top and bottom neighbors is a candidate horizontal edge pixel. The process of identifying the local maxima as candidate edge pixels is called non-maxima suppression.
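As a minimal sketch (assuming numpy; the tiny test image is illustrative only), the first differences of equation (1) and the candidate test just described look like this:

```python
# First differences g_x = f(x+1,y) - f(x,y) and g_y = f(x,y+1) - f(x,y),
# i.e. convolution with the masks [-1 +1] and its transpose.
import numpy as np

f = np.array([[2., 2., 2., 8., 8., 8.],
              [2., 2., 2., 8., 8., 8.],
              [2., 2., 2., 8., 8., 8.]])

gx = np.zeros_like(f)
gy = np.zeros_like(f)
gx[:, :-1] = f[:, 1:] - f[:, :-1]   # difference toward the right neighbor
gy[:-1, :] = f[1:, :] - f[:-1, :]   # difference toward the lower neighbor

# A pixel whose |gx| beats both horizontal neighbors is a candidate
# vertical edge pixel (non-maxima suppression in its simplest form).
# np.roll wraps around, so the image borders would need care in real use.
mag = np.abs(gx)
left = np.roll(mag, 1, axis=1)
right = np.roll(mag, -1, axis=1)
candidates = (mag > 0) & (mag >= left) & (mag >= right)
```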
What happens when the image has noise?
In the presence of noise, every small fluctuation in the intensity value will be amplified by differentiating the image. It is common sense, then, that we should first smooth the image with a lowpass filter and then find the local differences.
Let us consider, for example, a one-dimensional signal. We smooth the signal by replacing each intensity value with the average of three successive intensity values:

$A_x = \dfrac{I_{x-1} + I_x + I_{x+1}}{3}$   (3)
Then we estimate the derivative at position x by averaging the two differences between the value at position x and its left and right neighbors:

$F_x = \dfrac{(A_{x+1} - A_x) + (A_x - A_{x-1})}{2} = \dfrac{A_{x+1} - A_{x-1}}{2}$   (4)
If we substitute from (3) into (4), we obtain:

$F_x = \dfrac{1}{6}\left( I_{x+2} + I_{x+1} - I_{x-1} - I_{x-2} \right)$   (5)
It is obvious from this example that we can combine the two linear operations of smoothing and finding differences into one operation if we use a large enough mask, like

$\begin{bmatrix} -1/6 & -1/6 & 0 & 1/6 & 1/6 \end{bmatrix}$
It is clear that the larger the mask used, the better the smoothing. But it is also clear that the more blurred the edge becomes, the more inaccurately its position will be specified.
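A small numpy sketch (the step signal is illustrative) confirming that the two-step smoothing-plus-differencing and the combined five-point mask of equation (5) agree away from the borders:

```python
import numpy as np

I = np.array([2., 2., 2., 2., 8., 8., 8., 8.])   # 1-D step edge

# Two steps: 3-point average, eq. (3), then central difference, eq. (4).
A = np.convolve(I, np.ones(3) / 3, mode='same')
F_two_step = np.convolve(A, [0.5, 0.0, -0.5], mode='same')

# One step: the combined mask of eq. (5). np.convolve flips its kernel,
# so the coefficients are written reversed relative to the printed mask.
F_one_step = np.convolve(I, np.array([1., 1., 0., -1., -1.]) / 6, mode='same')

print(np.allclose(F_two_step[2:-2], F_one_step[2:-2]))   # True
```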
For an image which is a 2D signal one should use 2 dimensional masks. The smallest mask that one should use to combine
minimum smoothing with differencing is a 3 x 3 mask.
1.1 The Prewitt operator
Consider the 3 x 3 arrangement of pixels about the pixel (x, y):

$g_x(x, y) = f(x+1, y) - f(x-1, y)$ and $g_y(x, y) = f(x, y+1) - f(x, y-1)$   (6)

where the kernels are

$h_x = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix}$ and $h_y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}$   (7)

These are known as the Prewitt kernels.
1.2 Sobel operator
A similar pair of kernels are the Sobel kernels:

$h_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$ and $h_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$   (8)
This operator places an emphasis on pixels that are closer to the center of the mask. The Sobel operator is one of the most commonly used edge detectors.
Clearly, the kernel $h_x$ is sensitive to changes in the x direction, i.e. edges that run vertically or have a vertical component. Similarly, the kernel $h_y$ is sensitive to changes in the y direction, i.e. edges that run horizontally or have a horizontal component.
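As a hedged sketch (numpy/scipy assumed; the toy image is illustrative), the Sobel kernels of equation (8) can be applied with ndimage.correlate, which uses the kernel exactly as printed; scipy's ready-made ndimage.sobel computes the same responses up to sign and scaling conventions:

```python
import numpy as np
from scipy import ndimage

sobel_hx = np.array([[-1., 0., 1.],
                     [-2., 0., 2.],
                     [-1., 0., 1.]])   # responds to vertical edges
sobel_hy = sobel_hx.T                  # responds to horizontal edges

# 8x8 toy image: left half dark (2), right half bright (8).
image = np.outer(np.ones(8), np.r_[np.full(4, 2.0), np.full(4, 8.0)])

gx = ndimage.correlate(image, sobel_hx, mode='nearest')
gy = ndimage.correlate(image, sobel_hy, mode='nearest')
```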
Since edges correspond to strong illumination gradients, we can highlight them by calculating the derivatives of the image.
This is illustrated for the one-dimensional case in Figure 1.
Figure 1: 1st and 2nd derivative of an edge illustrated in one dimension.
We can see that the position of the edge can be estimated from the maximum of the 1st derivative or from the zero-crossing of the 2nd derivative. Therefore we want a technique to calculate the derivative of a two-dimensional image. For a discrete one-dimensional function f(i), the first derivative can be approximated by

$\dfrac{df}{di} \approx f(i+1) - f(i)$

Calculating this formula is equivalent to convolving the function with $[-1 \;\; 1]$. Similarly, the 2nd derivative can be estimated by convolving f(i) with $[1 \; {-2} \; 1]$.
Different edge detection kernels which are based on the above formula enable us to calculate either the 1st or the 2nd
derivative of a two-dimensional image.
There are two common approaches to estimate the 1st derivative in a two-dimensional image, Prewitt compass edge
detection and gradient edge detection.
Prewitt compass edge detection involves convolving the image with a set of (usually 8) kernels, each of which is sensitive to
a different edge orientation. The kernel producing the maximum response at a pixel location determines the edge magnitude
and orientation.
Gradient edge detection is the second and more widely used technique. Here, the image is convolved with only two kernels,
one estimating the gradient in the x-direction, gx, the other the gradient in the y-direction, gy.
Changes of the image function are detected based on an estimate of the gray-level gradient at a pixel. Significant changes in the gray values in an image can be detected by using a discrete approximation to the gradient. The gradient is the two-dimensional equivalent of the first derivative and is defined as the gradient vector
$\nabla f(x, y) = \mathbf{g} = \begin{bmatrix} g_x \\ g_y \end{bmatrix}$   (9)
The two gradients (in the x and in the y direction) computed at each pixel by Equation (1), Eq. (7) or Eq. (8) can be regarded as the x and y components of a gradient vector (9). This vector is oriented along the direction of the maximum rate of increase of the gray-level function f(x, y). There are two important properties associated with the gradient: the gradient magnitude and the direction of the gradient.
The outputs of the two convolutions are squared, added, and square-rooted to produce the gradient magnitude

$|\mathbf{g}| = \sqrt{g_x^2 + g_y^2}$   (10)

which equals the maximum rate of increase of the function f(x, y) per unit distance in the direction of $\mathbf{g}$.
The magnitude of the gradient vector is the length of the hypotenuse of the right triangle having sides $g_x$ and $g_y$, and it reflects the strength of the edge, or edge response, at any given pixel.
It is common practice to approximate the gradient magnitude by absolute values:

$|\mathbf{g}| \approx |g_x| + |g_y|$ or $|\mathbf{g}| \approx \max(|g_x|, |g_y|)$   (11)
From vector analysis, the direction of the gradient is

$\theta = \tan^{-1}\!\left( \dfrac{g_y}{g_x} \right)$   (12)
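Continuing the gx, gy arrays from the Sobel sketch above, equations (10)-(12) are one line each in numpy; np.arctan2 is preferred over a bare arctan because it preserves the full range of orientations:

```python
import numpy as np

magnitude = np.hypot(gx, gy)               # eq. (10)
magnitude_l1 = np.abs(gx) + np.abs(gy)     # cheaper approximation, eq. (11)
theta = np.arctan2(gy, gx)                 # eq. (12), in radians
```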
Figure 2: Gradient direction and edge direction, illustrated on closed lines of equal brightness.
1. The gradient direction gives the direction of maximal growth of the function, e.g. from black (f(x,y) = 0) to white (f(x,y) = 255).
2. The orientation 0° points East.
Whichever operator is used to compute the gradient, the resulting vector contains information about how strong the edge is at that pixel and what its direction is. The edge magnitude is a real number. Any pixel whose gradient exceeds a specified threshold value T is said to be an edge pixel, and the others are not:

$g(x, y) = \begin{cases} 1 \ (\text{edge point}) & \text{if } |\nabla f(x, y)| \geq T \\ 0 & \text{if } |\nabla f(x, y)| < T \end{cases}$   (13)
After having calculated the magnitude of the 1st derivative, we now have to identify those pixels corresponding to an edge. The easiest way is to threshold the gradient image, assuming that all pixels having a local gradient above the threshold must represent an edge. An alternative technique is to look for local maxima in the gradient image, thus producing one-pixel-wide edges. A more sophisticated technique is used by the Canny edge detector: it first applies a gradient edge detector to the image and then finds the edge pixels using non-maximal suppression and hysteresis tracking.
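The simplest of these localization schemes, thresholding as in equation (13), is a one-liner on the magnitude image from the sketch above (the threshold value here is arbitrary and would be tuned per image):

```python
import numpy as np

T = 4.0                                      # illustrative threshold
edge_map = (magnitude > T).astype(np.uint8)  # 1 = edge pixel, 0 = not
```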
There are three problems with this approach (thresholding of the gradient magnitude):
1. First, the boundaries between scene elements are not always sharp.
2. The noise in an image can sometimes produce gradients as high as, or even higher than, those resulting from meaningful edges.
3. Third, with a simple threshold approach the local maximum in the gray-level gradient associated with an edge lies at the summit of a ridge. Thresholding detects a portion of this ridge, rather than the single point of maximum gradient. The ridge can be rather broad in the case of diffuse edges, resulting in a thick band of pixels in the edge map.
How can we choose the weights of a 3 x 3 mask for edge detection?

$\begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}$
Suppose we are going to use one such mask to calculate $g_x$ and another to calculate $g_y$. Such masks must obey the following conditions:
1. The mask that calculates $g_x$ must be produced from the mask that calculates $g_y$ by rotation by 90°.
2. We do not want to give any extra weight to the left or the right neighbors of the central pixel, so we must have identical weights in the left and right columns:
$\begin{bmatrix} A_{11} & A_{12} & A_{11} \\ A_{21} & A_{22} & A_{21} \\ A_{31} & A_{32} & A_{31} \end{bmatrix}$
3. Let us say that we want to subtract the signal "in front of" the central pixel from the signal "behind" it, in order to find local differences, and we want these two subtracted signals to have equal weights:

$\begin{bmatrix} -A_{11} & -A_{12} & -A_{11} \\ A_{21} & A_{22} & A_{21} \\ A_{11} & A_{12} & A_{11} \end{bmatrix}$
4. If the image is absolutely smooth we want to have zero response, so the sum of all the weights must be zero. Therefore $A_{22} = -2A_{21}$:

$\begin{bmatrix} -A_{11} & -A_{12} & -A_{11} \\ A_{21} & -2A_{21} & A_{21} \\ A_{11} & A_{12} & A_{11} \end{bmatrix}$
5. In the case of a smooth signal, and as we differentiate in the direction of the columns, we expect each column to produce zero output. Therefore $A_{21} = 0$:

$\begin{bmatrix} -A_{11} & -A_{12} & -A_{11} \\ 0 & 0 & 0 \\ A_{11} & A_{12} & A_{11} \end{bmatrix}$
We can divide these weights throughout by $A_{11}$ so that, finally, the mask depends on only one parameter $K = A_{12}/A_{11}$:

$\begin{bmatrix} -1 & -K & -1 \\ 0 & 0 & 0 \\ 1 & K & 1 \end{bmatrix}$
If we choose K = 2 we obtain the Sobel masks for differentiating an image along the two directions.
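A tiny sketch (numpy assumed) of this one-parameter family; K = 2 reproduces the Sobel kernel of equation (8) and K = 1 the Prewitt kernel of equation (7):

```python
import numpy as np

def derivative_mask(K):
    """3x3 mask differentiating along y; rotate (transpose) it for x."""
    return np.array([[-1., -K, -1.],
                     [ 0.,  0.,  0.],
                     [ 1.,  K,  1.]])

print(derivative_mask(2))   # Sobel h_y
print(derivative_mask(1))   # Prewitt h_y
```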
2. Prewitt compass edge detection (not to be confused with the Prewitt operator for differential gradient edge detection)
Compass Edge Detection is an alternative approach to the differential gradient edge detection (see the Roberts Cross and
Sobel operators).
When using compass edge detection, the image is convolved with a set of (in general 8) convolution kernels, each of which is sensitive to edges in a different orientation. For each pixel, the local edge gradient magnitude is estimated as the maximum response of all 8 kernels at this pixel location:

$|G| = \max_{i=1,\dots,n} |G_i|$

where $G_i$ is the response of kernel i at the particular pixel position and n is the number of convolution kernels. The local edge orientation is estimated with the orientation of the kernel that yields the maximum response.
Various kernels can be used for this operation; for the following discussion we will use the Prewitt kernel. Two templates out
of the set of 8 are shown in Figure
The whole set of 8 kernels is produced by taking one of the kernels and rotating its coefficients circularly. Each of the
resulting kernels is sensitive to an edge orientation ranging from 0° to 315° in steps of 45°, where 0° corresponds to a vertical
edge.
The maximum response |G| for each pixel is the value of the corresponding pixel in the output magnitude image. The values
for the output orientation image lie between 1 and 8, depending on which of the 8 kernels produced the maximum response.
This edge detection method is also called edge template matching, because a set of edge templates is matched to the image,
each representing an edge in a certain orientation. The edge magnitude and orientation of a pixel is then determined by the
template that matches the local area of the pixel the best.
The compass edge detector is an appropriate way to estimate both the magnitude and the orientation of an edge. Although differential gradient edge detection needs a rather time-consuming calculation to estimate the orientation from the magnitudes in the x- and y-directions, compass edge detection obtains the orientation directly from the kernel with the maximum response. The compass operator is limited to (here) 8 possible orientations; however, experience shows that most direct orientation estimates are not much more accurate.
On the other hand, the compass operator needs (here) 8 convolutions for each pixel, whereas the gradient operator needs only
2, one kernel being sensitive to edges in the vertical direction and one to the horizontal direction.
The result for the edge magnitude image is very similar with both methods, provided the same convolving kernel is used.
As already mentioned earlier, there are various kernels that can be used for Compass Edge Detection. The most
common ones are shown in Figure 3:
Figure 3: For every template, the set of all eight kernels is obtained by shifting the coefficients of the kernel circularly.
The results of using different templates are similar; the main difference is the different scale in the magnitude image. An advantage of the Sobel and Robinson kernels is that only 4 of the 8 magnitude values must be calculated: since the two kernels of each pair rotated 180° apart differ only in sign, each of the remaining four values can be generated by inverting the result of the opposite kernel.
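A hedged sketch (numpy/scipy assumed) of the whole compass procedure: the 8 border coefficients of a base template are shifted circularly to generate the other 7 orientations, and the per-pixel maximum of the responses gives |G|. The base kernel below is one common Prewitt compass template; Kirsch or Robinson templates would drop in the same way:

```python
import numpy as np
from scipy import ndimage

base = np.array([[-1.,  1.,  1.],
                 [-1., -2.,  1.],
                 [-1.,  1.,  1.]])       # one Prewitt compass template

# Border positions of a 3x3 kernel in circular order; shifting the
# values one step along this ring rotates the template by 45 degrees.
ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def rotated_kernels(k):
    values = np.array([k[r, c] for r, c in ring])
    kernels = []
    for shift in range(8):
        rot = k.copy()
        for (r, c), v in zip(ring, np.roll(values, shift)):
            rot[r, c] = v
        kernels.append(rot)
    return kernels

def compass(image, template=base):
    responses = np.stack([ndimage.correlate(image, k, mode='nearest')
                          for k in rotated_kernels(template)])
    magnitude = responses.max(axis=0)        # |G| image
    orientation = responses.argmax(axis=0)   # index 0..7 of winning kernel
    return magnitude, orientation
```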
3. The Laplacian operator
An operator based on the 2nd derivative of an image is the Marr edge detector, also known as the zero-crossing detector. Here,
the 2nd derivative is calculated using a Laplacian of Gaussian (LoG) filter. The Laplacian has the advantage that it is an
isotropic measure of the 2nd derivative of an image, i.e. the edge magnitude is obtained independently from the edge
orientation by convolving the image with only one kernel. The edge positions are then given by the zero-crossings in the LoG
image. The scale of the edges which are to be detected can be controlled by changing the variance of the Gaussian.
If we imagine an image to be a surface, with height corresponding to gray level, then it is clear that convolution with the Sobel kernels gives us first-order derivatives that measure the local slope of this surface in the x and y directions. It is also possible to compute second-order derivatives. These measure the rate at which the slope of the gray-level surface changes with distance traveled in the x and y directions. These second-order derivatives can be used for edge localization.
The Laplacian operator combines the second-order derivatives as follows:

$\nabla^2 f(x, y) = \dfrac{\partial^2 f(x, y)}{\partial x^2} + \dfrac{\partial^2 f(x, y)}{\partial y^2}$   (14)
The simple digital approximation to the Laplacian over a 3 x 3 neighborhood is given by the kernel

$\begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix}$   (15)
Consider the example of a simple 6 x 10 image containing a vertical step edge, in which the left five columns have gray level 2 and the right five columns have gray level 8:

2 2 2 2 2 8 8 8 8 8
2 2 2 2 2 8 8 8 8 8
2 2 2 2 2 8 8 8 8 8
2 2 2 2 2 8 8 8 8 8
2 2 2 2 2 8 8 8 8 8
2 2 2 2 2 8 8 8 8 8

Figure: A simple image containing a vertical step edge.
Convolving this image with the Laplacian kernel (15) gives, in each row (border pixels handled by replicating the edge of the image):

0 0 0 0 -6 6 0 0 0 0

In areas of constant value the preceding convolution mask returns zero. In this example, the zero crossing, corresponding to the edge in the original image, lies halfway between the two center pixels.
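A minimal sketch (numpy/scipy assumed; the toy step image is the one from the example above) of applying the kernel of equation (15) and then marking sign changes between neighbors as a crude zero-crossing test:

```python
import numpy as np
from scipy import ndimage

lap = np.array([[ 0., -1.,  0.],
                [-1.,  4., -1.],
                [ 0., -1.,  0.]])       # eq. (15)

image = np.outer(np.ones(6), np.r_[np.full(5, 2.0), np.full(5, 8.0)])
response = ndimage.correlate(image, lap, mode='nearest')

# Mark a pixel when the response changes sign toward its right or
# lower neighbor -- the zero crossing lies between those two pixels.
zero_cross = np.zeros(response.shape, dtype=bool)
zero_cross[:, :-1] |= response[:, :-1] * response[:, 1:] < 0
zero_cross[:-1, :] |= response[:-1, :] * response[1:, :] < 0
```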
The Laplacian alone is seldom used for edge detection, because the second derivative of the intensity is very sensitive to noise. First we need to apply a Gaussian filter.
4. Laplacian of Gaussian
1. The smoothing filter is a Gaussian.
2. The enhancement step is the second derivative (the Laplacian in two dimensions).
3. The detection criterion is the presence of a zero crossing in the second derivative together with a corresponding large peak in the first derivative.
In this approach, the image should first be convolved with a Gaussian filter:

$g(x, y) = \nabla^2 \left[ G(x, y, \sigma) * f(x, y) \right]$   (16)
The order of performing differentiation and convolution can be interchanged because of the linearity of the operators involved:

$g(x, y) = \left[ \nabla^2 G(x, y) \right] * f(x, y)$   (17)

$G(x, y) = \exp\!\left( -\dfrac{x^2 + y^2}{2\sigma^2} \right) \;\Rightarrow\; G(r) = e^{-\frac{r^2}{2\sigma^2}}$   (18)
where $r^2 = x^2 + y^2$ measures distance from the origin; this is reasonable, as the Gaussian is circularly symmetric. This substitution converts the 2D Gaussian into a 1D function that is easier to differentiate.
The first derivative is then

$G'(r) = -\dfrac{1}{\sigma^2}\, r\, e^{-\frac{r^2}{2\sigma^2}}$   (19)
and the second derivative, the Laplacian of a Gaussian, is

$G''(r) = \dfrac{1}{\sigma^2} \left( \dfrac{r^2}{\sigma^2} - 1 \right) e^{-\frac{r^2}{2\sigma^2}}$   (20)
After returning to the original coordinates x, y and introducing a normalizing multiplicative coefficient c, we get a convolution mask of the LoG operator:

$h(x, y) = c\, \dfrac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$   (21)
where c normalizes the sum of the mask elements to zero. The LoG operator is commonly called the Mexican hat. For example:
$\begin{bmatrix} 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & -2 & -1 & 0 \\ -1 & -2 & 16 & -2 & -1 \\ 0 & -1 & -2 & -1 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{bmatrix}$
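A sketch (numpy/scipy assumed; the sigma and mask size are illustrative choices) that builds such a mask directly from equation (21), with the zero-sum normalization playing the role of c; scipy's ndimage.gaussian_laplace applies an equivalent filter in one call:

```python
import numpy as np
from scipy import ndimage

def log_kernel(sigma, size):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    h = ((x**2 + y**2 - 2 * sigma**2) / sigma**4
         * np.exp(-(x**2 + y**2) / (2 * sigma**2)))
    return h - h.mean()      # force the mask elements to sum to zero

kernel = log_kernel(sigma=1.4, size=9)
# response = ndimage.correlate(image, kernel, mode='nearest')
# or, equivalently up to scaling: ndimage.gaussian_laplace(image, 1.4)
```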
4.1 Laplacian of Gaussian - Guidelines for Use
The LoG operator calculates the second spatial derivative of an image. This means that in areas where the image has a constant intensity (i.e. where the intensity gradient is zero), the LoG response will be zero. In the vicinity of a change in intensity, however, the LoG response will be positive on the darker side and negative on the lighter side. This means that at a reasonably sharp edge between two regions of uniform but different intensities, the LoG response will be:
- zero at a long distance from the edge,
- positive just to one side of the edge,
- negative just to the other side of the edge.
Figure 4: Response of a 1-D LoG filter to a step edge. The left-hand graph shows a 1-D image, 200 pixels long, containing a step edge. The right-hand graph shows the response of a 1-D LoG filter with Gaussian standard deviation 3 pixels.
5. Difference of Gaussians (DoG) Operator
It is possible to approximate the LoG filter with a filter that is just the difference of two differently sized Gaussians. Such a
filter is known as a DoG filter (short for `Difference of Gaussians').
The common feature of the two types of receptive fields is that the centre and surround regions are antagonistic. The DoG
operator uses simple linear differences to model this centre-surround mechanism and the response R(x,t) is analogous to a
measure of contrast of events happening between centre and surround regions of the operator. (see Figure 5)
Figure 5: A linear model of the space-time receptive field. (A) The spatial sensitivities of the centre and surround mechanism
both follow a Gaussian curve. (B) The signal from the surround mechanism is temporally delayed and added to the signal
from the centre mechanism
The difference of Gaussians (DoG) model is composed of the difference of two response functions that model the centre and surround mechanisms of retinal cells. Mathematically, the DoG operator is defined as

$\mathrm{DoG}(x) = \alpha_c\, G(x; \sigma_c) - \alpha_s\, G(x; \sigma_s)$   (22)

where G is a two-dimensional Gaussian operator at x:

$G(x; \sigma) = \dfrac{1}{2\pi\sigma^2}\, e^{-\frac{|x|^2}{2\sigma^2}}$   (23)
Parameters $\sigma_c$ and $\sigma_s$ are the standard deviations of the centre and surround Gaussian functions respectively. These Gaussian functions are weighted by the integrated sensitivities $\alpha_c$ (for the centre) and $\alpha_s$ (for the surround). The response R(x, t) of the DoG filter to an input signal s(x, t) at x during time t is given by:

$R(x, t) = \iint_w \mathrm{DoG}(w)\, s(x - w, t)\, dx\, dy$   (24)
where the convolution is applied over a circular region centred at x (i.e., |w| denotes the distance from x, the centre of the operator).
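A minimal sketch (numpy/scipy assumed) of the DoG of equation (22) for a single static image; the unit sensitivities and the sigma ratio of about 1.6, which makes the DoG a good LoG approximation, are simplifying assumptions:

```python
from scipy import ndimage

def dog(image, sigma_c=1.0, sigma_s=1.6, alpha_c=1.0, alpha_s=1.0):
    centre = ndimage.gaussian_filter(image, sigma_c)     # centre response
    surround = ndimage.gaussian_filter(image, sigma_s)   # surround response
    return alpha_c * centre - alpha_s * surround         # eq. (22)
```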
5.1 Extending the DoG operator to detect changes in a sequence of images
6. Parametric edge models
Parametric models are based on the idea that the discrete image intensity function can be considered a sampled and noisy approximation of an underlying continuous or piecewise continuous image intensity function. Therefore we try to model the image as a simple piecewise analytical function, and the task becomes the reconstruction of the individual piecewise functions. We try to find simple functions which best approximate the intensity values only in the local neighborhood of each pixel. This approximation is called the facet model (Haralick and Shapiro, 1992). These functions, and not the pixel values, are used to locate edges in the image.
To provide an edge detection example, consider a bi-cubic polynomial facet model:

$f(x, y) = k_1 + k_2 x + k_3 y + k_4 x^2 + k_5 xy + k_6 y^2 + k_7 x^3 + k_8 x^2 y + k_9 x y^2 + k_{10} y^3$   (25)
where x and y are coordinates with respect to (0, 0), the center of the image-plane neighborhood that is being approximated. We can calculate the coefficients using least-squares methods and singular-value decomposition, or using 5 x 5 masks.
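As a sketch (numpy assumed; the basis ordering follows equation (25)), the ten coefficients can be fitted to one 5 x 5 neighborhood by ordinary least squares:

```python
import numpy as np

def fit_facet(window):
    """Fit k1..k10 of eq. (25) to a 5x5 window by least squares."""
    y, x = np.mgrid[-2:3, -2:3].astype(float)   # coordinates about the center
    x, y = x.ravel(), y.ravel()
    A = np.column_stack([np.ones(25), x, y,
                         x**2, x*y, y**2,
                         x**3, x**2*y, x*y**2, y**3])
    k, *_ = np.linalg.lstsq(A, np.asarray(window, float).ravel(), rcond=None)
    return k                                    # k[0] = k1, ..., k[9] = k10
```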
An edge is present where the first directional derivative has a relative extremum and the second directional derivative has a zero crossing, in the direction of the first derivative.
The first directional derivative in the direction $\alpha$ is given by

$f'_\alpha(x, y) = \dfrac{\partial f}{\partial x} \sin\alpha + \dfrac{\partial f}{\partial y} \cos\alpha$
The second directional derivative in the direction $\alpha$ is given by

$f''_\alpha(x, y) = \dfrac{\partial^2 f}{\partial x^2} \sin^2\alpha + 2\,\dfrac{\partial^2 f}{\partial x \partial y} \sin\alpha \cos\alpha + \dfrac{\partial^2 f}{\partial y^2} \cos^2\alpha$   (26)
where

$\sin\alpha = \dfrac{k_2}{\sqrt{k_2^2 + k_3^2}} \quad\text{and}\quad \cos\alpha = \dfrac{k_3}{\sqrt{k_2^2 + k_3^2}}$
We consider only points on the line through the neighborhood center in the direction $\alpha$: $x_0 = \rho \sin\alpha$ and $y_0 = \rho \cos\alpha$.
At a point $(x_0, y_0)$ at distance $\rho$ along this line, the second directional derivative in the direction $\alpha$ is

$f''_\alpha = 6\left( k_7 \sin^3\alpha + k_8 \sin^2\alpha \cos\alpha + k_9 \sin\alpha \cos^2\alpha + k_{10} \cos^3\alpha \right) \rho + 2\left( k_4 \sin^2\alpha + k_5 \sin\alpha \cos\alpha + k_6 \cos^2\alpha \right) \equiv 6A\rho + 2B$   (27)
Thus, there is an edge at $(x_0, y_0)$ in the image if, for some $\rho$ with $|\rho| < \rho_0$, where $\rho_0$ is the length of the side of a pixel,

$f''_\alpha(x_0, y_0; \rho) = 0$ and $f'_\alpha(x_0, y_0; \rho) \neq 0$

i.e. we mark the pixel as an edge pixel if the zero crossing is located within the boundaries of the pixel.
There are many other possibilities. For example, if $A = 0$, then along the line $f'_\alpha = 2B\rho + C$ (with $C = k_2 \sin\alpha + k_3 \cos\alpha$) and $f''_\alpha = 2B$, with the extremum of f at $\rho = -C/2B$, and:
1. if $B > 0$ and $f''_\alpha > 0$: valley
2. if $B < 0$ and $f''_\alpha < 0$: ridge
3. if $B = 0$ and $f''_\alpha = 0$: plane
(28)