CS3432: Lecture 6

Edge Detection and Convolution: Part 2
In lecture 5 we examined the process of convolution and found that, with an appropriate choice of convolution kernel, first derivatives, second derivatives or integration (smoothing) of the image function can be obtained. Here we show that smoothing combined with edge detection results in scale-selective edge detection.
Selecting Edge Scale.
The figure below shows, in the 1D case, that smoothing with a weighted averaging filter removes small-scale
structure (noise). Differentiating the smoothed image with a first derivative kernel results in a peak in the
derivative image corresponding to the (blurred) edge. The position of the peak in the derivative is the same as that
of the step edge in the original, unsmoothed image.
[Figure: 1-D example. Image sample values I(x) containing a step edge are convolved (∗) with a weighted averaging kernel; the smoothed result is then convolved with a first-derivative kernel f(x). The peak in the derivative image coincides with the position of the step edge in the original, unsmoothed image.]
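As a concrete illustration of the sequence above, the sketch below smooths a made-up 1-D signal with a weighted averaging kernel and then differentiates it with a central-difference kernel; the sample values and kernels are invented for illustration, not taken from the figure:

```python
# Illustrative sketch (hypothetical data): smoothing a noisy 1-D step edge,
# then differentiating, leaves a derivative peak at the position of the step.
import numpy as np

I = np.array([2, 2, 1, 2, 2, 1, 6, 5, 5, 6, 5, 6], dtype=float)

smooth = np.array([1, 2, 1]) / 4.0   # weighted averaging (smoothing) kernel
deriv = np.array([-1, 0, 1]) / 2.0   # central first-derivative kernel

# 'same'-size convolutions; numpy.convolve flips the kernel, as convolution requires
blurred = np.convolve(I, smooth, mode="same")
gradient = np.convolve(blurred, deriv, mode="same")

# ignore the two border samples, where zero-padding creates artefacts
peak = 2 + int(np.argmax(np.abs(gradient[2:-2])))
print(peak)  # → 5: the step in I lies between samples 5 and 6
```

The peak in the derivative sits at the step in the original signal, even though the differentiation was applied to the smoothed version.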
This is a widely used strategy for edge detection at a selected scale. Of course, the blurring function of choice is the Gaussian, the scale being selected by the spread parameter σ. We can make the process more efficient by noting that convolution is associative. That is to say, (I(x, y) ∗ g(x, y)) ∗ d(x, y) (the sequence illustrated above in 1D) is equivalent to I(x, y) ∗ (g(x, y) ∗ d(x, y)). In other words, we can convolve the image function directly with the derivative of the smoothing function. If our smoothing function is a Gaussian of the form e^(−x²/2σ²) (we may as well consider a 1-D Gaussian, as we would implement the filter as a cascade of 1-D convolutions), its derivative is −(x/σ²) e^(−x²/2σ²), which has the form shown on the right. The first derivative filter
would take a form similar to the Sobel operator. First we detect vertical edges using an x-derivative filter: a 1-D derivative of Gaussian applied horizontally, cascaded with a 1-D vertical Gaussian smoother. Then we detect the horizontal edges with a 1-D y-derivative filter cascaded with a horizontal Gaussian smoother. The edge magnitude and direction can then be calculated in the same way as in the Sobel operator.
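The cascade just described can be sketched as follows. This is an assumed implementation, not code from the notes: the function names and the kernel radius of roughly 3σ are our choices, and the 1-D Gaussian is normalised to sum to one for convenience.

```python
# Sketch: derivative-of-Gaussian gradient via cascaded 1-D convolutions.
import numpy as np

def gaussian_1d(sigma, radius=None):
    """Sampled 1-D Gaussian e^(-x^2 / 2 sigma^2), normalised to sum to 1."""
    r = radius if radius is not None else int(3 * sigma + 0.5)
    x = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    return x, g / g.sum()

def gaussian_deriv_1d(sigma, radius=None):
    """Sampled derivative of the Gaussian: -(x / sigma^2) e^(-x^2 / 2 sigma^2)."""
    x, g = gaussian_1d(sigma, radius)
    return -(x / sigma**2) * g

def dog_gradient(image, sigma=1.0):
    """Gradient magnitude and direction by separable filtering:
    derivative in one direction cascaded with smoothing in the other."""
    _, g = gaussian_1d(sigma)
    d = gaussian_deriv_1d(sigma)
    conv_rows = lambda im, k: np.apply_along_axis(np.convolve, 1, im, k, mode="same")
    conv_cols = lambda im, k: np.apply_along_axis(np.convolve, 0, im, k, mode="same")
    gx = conv_cols(conv_rows(image, d), g)   # x-derivative, vertical smoothing
    gy = conv_rows(conv_cols(image, d), g)   # y-derivative, horizontal smoothing
    return np.hypot(gx, gy), np.arctan2(gy, gx)
```

On an image containing a vertical step edge, the gradient magnitude returned by `dog_gradient` peaks at the columns either side of the step.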
Canny Edge Detector
The above is the first step in the Canny edge detector – the most widely used edge detector in computer vision.
The remaining steps are:
Suppression of non-maxima
The result of the derivative-of-Gaussian filtering is a grey-level image in which the pixel values represent the local derivative of the image. The best position for edges is on local maxima of the derivative. We wish, therefore, to locate the pixels where there is a maximum across the edge. (We are not concerned about maxima along the edge.) The process of suppression of non-maxima visits each pixel in turn, looking at the gradient magnitude and direction of the edge. If the gradient value at the pixel is greater than at the neighbouring positions on either side across the edge, it is a local maximum and is retained. If not, it is discarded. Note that, since the local edge direction will not, in general, be exactly along a row, column or diagonal, the “neighbouring positions” may not be exactly on integer pixel positions. The neighbouring values can be determined with sub-pixel precision by interpolation between the gradient values at the integer pixel positions. This is illustrated by the example on the right. The blue line represents the direction of the local edge. We are looking for a maximum in the direction of the red arrow (across the edge). The gradient value at the centre pixel is compared with the neighbouring values interpolated from the integer pixel positions. If it is greater than both, it is accepted as a maximum, otherwise it is discarded. This results in a set of edge points a single pixel thick.
[Figure: checking for a maximum across the local edge direction, with gradient values interpolated between the integer pixel positions.]
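A minimal sketch of the procedure, assuming bilinear interpolation for the sub-pixel neighbours (the function names and the exact interpolation scheme are our choices, not prescribed by the notes):

```python
# Sketch: suppression of non-maxima. Each pixel's gradient magnitude is
# compared with values interpolated one step either side along the gradient
# direction; pixels that are not local maxima across the edge are zeroed.
import numpy as np

def bilinear(im, x, y):
    """Bilinear interpolation of im at the sub-pixel position (x, y)."""
    x0 = min(int(np.floor(x)), im.shape[1] - 2)
    y0 = min(int(np.floor(y)), im.shape[0] - 2)
    fx, fy = x - x0, y - y0
    return ((1 - fx) * (1 - fy) * im[y0, x0] + fx * (1 - fy) * im[y0, x0 + 1]
            + (1 - fx) * fy * im[y0 + 1, x0] + fx * fy * im[y0 + 1, x0 + 1])

def nonmax_suppress(mag, ang):
    """mag: gradient magnitude; ang: gradient direction in radians."""
    h, w = mag.shape
    out = np.zeros_like(mag)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx, dy = np.cos(ang[y, x]), np.sin(ang[y, x])
            fwd = bilinear(mag, x + dx, y + dy)   # neighbour across the edge
            bwd = bilinear(mag, x - dx, y - dy)   # neighbour on the other side
            if mag[y, x] >= fwd and mag[y, x] >= bwd:
                out[y, x] = mag[y, x]             # local maximum: retained
    return out
```

Applied to a blurred ridge of gradient magnitude, this keeps only the central, maximal column, giving edges a single pixel thick.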
Hysteresis Thresholding
The gradient values at the edge points will vary from high, representing significant edges, to low at weak edges,
perhaps representing noise or unimportant features. We can select the edges we require by thresholding. The
difficulty with thresholding is to select a value that accepts all the desired pixels and rejects all the unwanted ones.
This, as we have seen, is almost impossible to achieve. The Canny approach is to use a pair of thresholds, say
th_hi and th_lo. We begin by accepting only edge (maximal) pixels with values greater than th_hi. We try to
minimise the resultant gaps by tracking along the detected edges, including all pixels with gradient values greater
than th_lo. The result is a set of single-pixel-wide edge contours detected at the chosen scale. All points with gradient less than th_lo are eliminated entirely; those with values lower than th_hi are also eliminated unless they are connected, directly or through a chain of retained edge pixels, to an edge pixel with gradient greater than th_hi. The Canny operator is therefore
controlled by three parameters: σ, th_hi and th_lo. The example below shows edge detection on a brake-assembly
image using first the Sobel operator and then the Canny operator at different scales with appropriate values of the
thresholds.
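The two-threshold tracking step might be sketched as below. This is an assumed implementation: the notes do not prescribe the connectivity, so 8-connected breadth-first tracking is our choice.

```python
# Sketch: hysteresis thresholding. Pixels >= th_hi are accepted outright;
# tracking along edges then accepts connected pixels down to th_lo.
import numpy as np
from collections import deque

def hysteresis(mag, th_lo, th_hi):
    """mag: non-maxima-suppressed gradient image. Returns a boolean edge map."""
    h, w = mag.shape
    edges = mag >= th_hi                       # strong edges: accepted outright
    queue = deque(zip(*np.nonzero(edges)))
    while queue:                               # track along detected edges,
        y, x = queue.popleft()                 # accepting weak pixels >= th_lo
        for dy in (-1, 0, 1):                  # that are 8-connected to an
            for dx in (-1, 0, 1):              # already-accepted pixel
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny, nx] and mag[ny, nx] >= th_lo):
                    edges[ny, nx] = True
                    queue.append((ny, nx))
    return edges
```

A weak pixel survives only if a chain of pixels above th_lo links it to a pixel above th_hi; an equally weak but isolated pixel is discarded.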
[Figures: Brake image; Sobel edges; Canny edges at σ = 0.5, 1.0, 3.0 and 5.0.]
The removal of detected features at larger scales is clear.
Edge Detection by Second Derivative.
In lecture 5 we noted that edges may also be detected by seeking zero-crossings in the
second derivative. Two versions of the basic 2D second derivative kernel are shown to
the right. This is known as the Laplacian kernel. The two forms are different digital
approximations to a continuous second derivative kernel. The important points are the
negative centre and positive surround (or vice-versa), and that the sum of all values
should be zero (it is a differentiating kernel, so should not apply any scaling).
Convolving either of these kernels with the image will result in the second derivative,
in which edge positions are located by a zero-crossing. Unlike the first derivative
filters, the Laplacian is isotropic, i.e. it gives the same response to edges in any
orientation. This is good news up to a point. The bad news is that we lose information
about edge direction, which can be useful, and the response of the filter is not very
directly related to the magnitude of the edge.
The same arguments about scale selection and smoothing can be applied to second derivative operators, and we could use a Laplacian of Gaussian (LoG) edge detector, which, by analogy with the Canny detector, would be the second derivative of a 2D Gaussian at an appropriate scale. This takes the form

  (1/σ²) ((x² + y²)/(2σ²) − 1) e^(−(x² + y²)/(2σ²))

The two versions of the digital Laplacian kernel are:

   2  -1   2        1   1   1
  -1  -4  -1        1  -8   1
   2  -1   2        1   1   1

The 1-D profile of this function is
shown on the right. Defining where the zero-crossing occurs in a
LoG filtered image can be more problematic than locating the
maxima in a Canny image. The LoG filter is not as widely used as
the first derivative filter. The main interest derives from the fact that
it simulates the response of certain detectors in the mammalian
visual system, and for that reason was introduced by David Marr,
one of the originators of Computer Vision as a discipline. It is
sometimes called the Marr-Hildreth operator.
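A sampled LoG kernel and a simple zero-crossing test might look like the sketch below. The 4σ kernel radius and the mean-subtraction used to force the sampled kernel to sum to zero are our assumptions, not from the notes.

```python
# Sketch: a sampled LoG kernel of the form given above, and a simple
# zero-crossing test on the filtered image.
import numpy as np

def log_kernel(sigma, radius=None):
    """Sampled (1/s^2)((x^2+y^2)/(2 s^2) - 1) e^(-(x^2+y^2)/(2 s^2))."""
    r = radius if radius is not None else int(4 * sigma + 0.5)
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    r2 = x**2 + y**2
    k = (1 / sigma**2) * (r2 / (2 * sigma**2) - 1) * np.exp(-r2 / (2 * sigma**2))
    return k - k.mean()   # force the sampled kernel to sum to zero, as required

def zero_crossings(im):
    """True wherever the sign changes against the right or lower neighbour."""
    zc = np.zeros(im.shape, dtype=bool)
    zc[:, :-1] |= np.signbit(im[:, :-1]) != np.signbit(im[:, 1:])
    zc[:-1, :] |= np.signbit(im[:-1, :]) != np.signbit(im[1:, :])
    return zc
```

The kernel has the negative centre and (predominantly) positive surround described above, and sums to zero so that it applies no scaling to uniform regions.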
Non-linear Filters
Convolutions are sometimes known as linear filters because the output image value is a linear combination of the
input image values over the extent of the kernel (sometimes called the support region), and because they have a
direct relationship with frequency-domain filters. We need not be restricted to calculating a linear combination
over the support region. We could calculate anything we want, and it is quite easy to think up non-linear filters
that can transform the image in some desired way. One example is the median filter (see Jain section 4.4). This is
a non-linear analogue of the unweighted average convolution for image smoothing. Rather than replace the centre
pixel value with the mean value in the support region, we replace it with the median. In a list of values, the
median is the one whose value is in the middle of the sorted list. For many distributions of values the median is very close to the mean. The mean, however, is sensitive to outliers – single values that are very different from the others in the distribution. So if the image is corrupted by noise that can cause very large changes to individual pixels (impulse, or salt-and-pepper noise), the mean value can undergo undesirably large variations. The median, however, is robust to outliers and can be a good choice for smoothing out such noise. Median smoothing also
tends to produce less blurring of edges. (Edges are by definition
rapidly changing regions, where some values will be very different
from the local average.) The median is an example of a rank-order
filter. This is a class of filters that takes the pixel values in the
support region, ranks them and then selects a particular ranked value.
The median is the centre-ranked value. If we chose the highest
ranked value, we would have a max filter, and the lowest would give
us a min filter. We have seen max and min filters already in binary
morphology as dilation and erosion operators respectively. The
support region is called the structuring element in the terminology of
mathematical morphology. By noting that dilation and erosion are
simply rank-order filters we can extend the morphological operations
to work on grey-scale images. Grey-scale dilation and erosion will
have the result of expanding light or dark image regions respectively.
Combinations of these operations can be used in a similar way to
their use in the binary case. The example to the right illustrates a
“top-hat” transform (a grey-level opening followed by a subtraction)
on a 1-D profile through a grey-scale image. The result is to separate
large-scale and small-scale structures.
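The rank-order family described above can be sketched in a few lines. This brute-force implementation is for illustration, not efficiency; the function names and the square support region are our choices.

```python
# Sketch: rank-order filtering over a square support region. Median,
# max (grey-level dilate) and min (grey-level erode) are the same
# operation with a different choice of rank.
import numpy as np

def rank_filter(im, radius, rank):
    """rank: index into the sorted support-region values (0 = minimum)."""
    h, w = im.shape
    padded = np.pad(im, radius, mode="edge")
    out = np.empty_like(im)
    for y in range(h):
        for x in range(w):
            window = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            out[y, x] = np.sort(window, axis=None)[rank]
    return out

def median_filter(im, radius=1):
    n = (2 * radius + 1) ** 2
    return rank_filter(im, radius, n // 2)

def grey_dilate(im, radius=1):   # max filter
    return rank_filter(im, radius, (2 * radius + 1) ** 2 - 1)

def grey_erode(im, radius=1):    # min filter
    return rank_filter(im, radius, 0)
```

On an image with a single impulse-noise pixel, the median filter removes the impulse completely, where a mean filter would merely spread it over the support region.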
[Figure: top-hat transform of a 1-D grey-level profile g(x) – the image profile, its grey-level opening, and their difference (subtract).]

It is also possible to construct a “morphological edge detector”. This is illustrated below on the right. The image profile expands on dilation (max operator) and shrinks on erosion (min operator). Subtracting the two generates an image with peak values on the edges of the original image.

[Figure: morphological edge detection – the image profile after dilate and erode; subtracting the two gives peaks at the edges.]
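The dilate-erode-subtract sequence on a 1-D profile can be sketched as below; this is an assumed implementation using a flat three-sample structuring element (radius 1), with the profile values invented for illustration.

```python
# Sketch: morphological edge detector on a 1-D profile.
# dilate (max) - erode (min) peaks wherever the profile changes rapidly.
import numpy as np

def morph_edges_1d(profile, radius=1):
    n = len(profile)
    padded = np.pad(profile, radius, mode="edge")
    windows = np.array([padded[i:i + 2 * radius + 1] for i in range(n)])
    dilated = windows.max(axis=1)   # grey-level dilation: local maximum
    eroded = windows.min(axis=1)    # grey-level erosion: local minimum
    return dilated - eroded

g = np.array([1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
print(morph_edges_1d(g))  # → [0. 0. 0. 4. 4. 0. 0.]
```

The response is zero on flat regions and peaks on the two samples either side of the step, mirroring the figure above.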
Reading for lecture 6
Jain
The reading for this lecture is mostly the same as that recommended for lecture 5: Jain chapters 4 and 5. The material in these two chapters covers the topics of the two lectures in a slightly different order. The following reading is essential unless noted as optional. Jain separates the topics into filtering (chapter 4) and edge detection (chapter 5). Most of chapter 4 is relevant except 4.1 (we won't be too interested in point transformations). Section 4.2 on linear systems is optional, but puts convolution on a rather more systematic basis than we have here. Any section or subsection dealing with Fourier transforms is also optional (although if you're not scared of them, these sections round out the picture). There are plenty of examples of 2D convolutions, digital Gaussian masks and cascaded 1-D convolutions. Chapter 5 up to and including section 5.7 is all relevant. Certain topics, like Laplacian filters, are dealt with in more detail than in the lectures. Section 5.5 on fitting surfaces can be omitted.
Sonka
Similarly, Sonka addresses the material in section 4.3, going into rather more detail than we need, as usual.
Section 4.3.1 covers what we say about smoothing, with some straightforward extensions, including some
discussion of median filtering. Similarly, sections 4.3.2 and 4.3.3 cover what we say about edge detection. 4.3.4
deals with the issue of scale explicitly, including the topic of scale-space analysis, which goes beyond our
requirements but indicates the importance of scale as an issue in computer vision. Section 4.3.5 deals with the
Canny edge detector (lecture 6) starting from the notion of optimality in edge detection, which was the original
starting point for Canny’s approach. The notion is historically interesting, if not particularly useful. There are
some examples of grey-level morphological operations in section 11.4.