CS3432: Lecture 6 Edge Detection and Convolution: Part 2

In lecture 5 we examined the process of convolution and found that, with an appropriate choice of convolution kernel, first derivatives, second derivatives or integration (smoothing) of the image function can be obtained. Here we show that smoothing combined with edge detection results in scale-selective edge detection.

Selecting Edge Scale

The figure below shows, in the 1D case, that smoothing with a weighted averaging filter removes small-scale structure (noise). Differentiating the smoothed image with a first derivative kernel results in a peak in the derivative image corresponding to the (blurred) edge. The position of the peak in the derivative is the same as that of the step edge in the original, unsmoothed image.

[Figure: 1D image sample values I(x) are convolved with a weighted averaging filter, then with a first-derivative kernel f(x) = [1, -1]; the resulting derivative peaks at the position of the original step edge, x = 0.]

This is a widely used strategy for edge detection at a selected scale. The blurring function of choice is, of course, the Gaussian function, the scale being selected by the spread parameter σ. We can make the process more efficient by noting that convolution is associative. That is to say, (I(x, y) ∗ g(x, y)) ∗ d(x, y) (the sequence illustrated above in 1D) is equivalent to I(x, y) ∗ (g(x, y) ∗ d(x, y)). In other words, we can convolve the image function with the derivative of the smoothing function. If our smoothing function is a Gaussian of the form e^(−x²/2σ²) (we may as well consider a 1D Gaussian, as we would implement the filter as a cascade of 1D convolutions), its derivative is −(x/σ²) e^(−x²/2σ²), which has the form shown on the right. The first derivative filter would take a form similar to the Sobel operator. First we detect vertical edges using an x-derivative filter: a 1D derivative of Gaussian applied horizontally, cascaded with a 1D vertical Gaussian smoother.
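The associativity argument can be checked numerically: smoothing then differentiating gives the same result as a single convolution with the derivative-of-Gaussian kernel. A minimal sketch, in which the signal, σ, the kernel support and the simple [1, 0, −1] central-difference derivative kernel are all illustrative choices:

```python
import numpy as np

sigma = 1.0
x = np.arange(-4, 5, dtype=float)           # kernel support [-4, 4]
gauss = np.exp(-x**2 / (2 * sigma**2))
gauss /= gauss.sum()                        # smoothing kernel sums to 1
deriv = np.array([1.0, 0.0, -1.0])          # central first-difference kernel

signal = np.array([2, 2, 2, 2, 2, 6, 6, 6, 6, 6], dtype=float)  # step edge

# (I * g) * d : smooth first, then differentiate
smoothed_then_diff = np.convolve(np.convolve(signal, gauss), deriv)

# I * (g * d) : one pass with the derivative-of-Gaussian kernel
dog = np.convolve(gauss, deriv)
one_pass = np.convolve(signal, dog)

assert np.allclose(smoothed_then_diff, one_pass)
```

The single-pass form saves one full-image convolution, since g ∗ d is computed once on the small kernels.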
Then we detect the horizontal edges with a 1D y-derivative filter cascaded with a horizontal Gaussian smoother. The edge magnitude and direction can then be calculated in the same way as in the Sobel operator.

Canny Edge Detector

The above is the first step in the Canny edge detector, the most widely used edge detector in computer vision. The remaining steps are:

Suppression of non-maxima

The result of the derivative-of-Gaussian filtering is a grey-level image in which the pixel values represent the local derivative of the image. The best position for edges is on local maxima of the derivative. We wish, therefore, to locate the pixels where there is a maximum across the edge. (We are not concerned about maxima along the edge.) The process of suppression of non-maxima visits each pixel in turn, looking at the gradient magnitude and direction of the edge. If the gradient value at the pixel is greater than at the neighbouring positions on either side across the edge, it is a local maximum and is retained; if not, it is discarded. Note that, since the local edge direction will not, in general, be exactly along a row, column or diagonal, the "neighbouring positions" may not fall exactly on integer pixel positions. The neighbouring values can be determined with sub-pixel precision by interpolation between the gradient values at the integer pixel positions. This is illustrated by the example on the right: the blue line represents the direction of the local edge, and we are looking for a maximum in the direction of the red arrow (across the edge). The gradient value at the centre pixel is compared with the neighbouring values interpolated from the integer pixel positions. If it is greater than both, it is accepted as a maximum; otherwise it is discarded. This results in a set of edge points a single pixel thick.
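The suppression step can be sketched in code. This is an illustrative implementation, not Canny's original: it assumes precomputed gradient components gx and gy, steps one unit along the gradient direction on either side of each pixel, interpolates the gradient magnitude there bilinearly, and keeps the pixel only if it is a strict maximum across the edge (so, as in the text, ties are discarded).

```python
import numpy as np

def bilinear(img, y, x):
    """Sample img at real-valued position (y, x) by bilinear interpolation."""
    y0 = min(int(np.floor(y)), img.shape[0] - 2)
    x0 = min(int(np.floor(x)), img.shape[1] - 2)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * img[y0, x0] +
            (1 - dy) * dx       * img[y0, x0 + 1] +
            dy       * (1 - dx) * img[y0 + 1, x0] +
            dy       * dx       * img[y0 + 1, x0 + 1])

def non_max_suppress(gx, gy):
    """Keep only pixels that are strict maxima of gradient magnitude
    along the gradient direction, i.e. across the edge."""
    mag = np.hypot(gx, gy)
    out = np.zeros_like(mag)
    h, w = mag.shape
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            m = mag[r, c]
            if m == 0:
                continue
            uy, ux = gy[r, c] / m, gx[r, c] / m     # unit vector across the edge
            ahead  = bilinear(mag, r + uy, c + ux)  # interpolated neighbours
            behind = bilinear(mag, r - uy, c - ux)
            if m > ahead and m > behind:            # strict local maximum
                out[r, c] = m
    return out
```

Applied to a blurred vertical step, only the single column of peak gradient survives, giving the one-pixel-thick edge the text describes.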
Hysteresis Thresholding

The gradient values at the edge points will vary from high, at significant edges, to low at weak edges, which may represent noise or unimportant features. We can select the edges we require by thresholding. The difficulty with thresholding is to select a value that accepts all the desired pixels and rejects all the unwanted ones; this, as we have seen, is almost impossible to achieve. The Canny approach is to use a pair of thresholds, say th_hi and th_lo. We begin by accepting only edge (maximal) pixels with values greater than th_hi. We then try to minimise the resultant gaps by tracking along the detected edges, including all pixels with gradient values greater than th_lo. The result is a set of single-pixel-wide edge contours detected at the chosen scale. All points with gradient less than th_lo are eliminated; those with values lower than th_hi are also eliminated unless they are connected, directly or through other accepted pixels, to an edge pixel with gradient greater than th_hi. The Canny operator is therefore controlled by three parameters: σ, th_hi and th_lo. The example below shows edge detection on a brake-assembly image using first the Sobel operator and then the Canny operator at different scales with appropriate values of the thresholds.

[Figure: Brake image; Sobel edges; Canny edges at σ = 0.5, 1.0, 3.0 and 5.0.]

The removal of detected features at larger scales is clear.

Edge Detection by Second Derivative

In lecture 5 we noted that edges may also be detected by seeking zero-crossings in the second derivative. Two versions of the basic 2D second derivative kernel, known as the Laplacian kernel, are shown to the right. The two forms are different digital approximations to a continuous second derivative kernel. The important points are the negative centre and positive surround (or vice versa), and that the sum of all values should be zero (it is a differentiating kernel, so it should not apply any scaling).
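The tracking step amounts to a connected-component search seeded by the strong pixels. A sketch, assuming 8-connectivity and a breadth-first search (the function and variable names follow the text but are otherwise my own):

```python
import numpy as np
from collections import deque

def hysteresis(mag, th_lo, th_hi):
    """Keep pixels above th_hi, plus pixels above th_lo that are
    connected (8-neighbourhood) to a pixel above th_hi."""
    strong = mag > th_hi
    weak = mag > th_lo
    out = np.zeros_like(mag, dtype=bool)
    out[strong] = True
    q = deque(zip(*np.nonzero(strong)))     # seed the search with strong pixels
    h, w = mag.shape
    while q:
        r, c = q.popleft()
        for dr in (-1, 0, 1):               # visit the 8 neighbours
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and weak[rr, cc] and not out[rr, cc]:
                    out[rr, cc] = True      # weak pixel linked to a strong one
                    q.append((rr, cc))
    return out
```

For example, with th_lo = 2 and th_hi = 5, a row of gradient values [9, 4, 4, 1, 0, 4] keeps the first three pixels (the two 4s are tracked from the 9) but discards the isolated 4 at the end.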
Two digital forms of the Laplacian kernel are:

  2 -1  2        1  1  1
 -1 -4 -1        1 -8  1
  2 -1  2        1  1  1

Convolving either of these kernels with the image results in the second derivative, in which edge positions are located by zero-crossings. Unlike the first derivative filters, the Laplacian is isotropic, i.e. it gives the same response to edges in any orientation. This is good news, up to a point. The bad news is that we lose information about edge direction, which can be useful, and that the response of the filter is not directly related to the magnitude of the edge. The same arguments about scale selection and smoothing can be applied to second derivative operators, and we could use a Laplacian of Gaussian (LoG) edge detector which, by analogy with the Canny detector, applies the second derivative of the 2D Gaussian at an appropriate scale. This takes the form

(1/σ²) ((x² + y²)/(2σ²) − 1) e^(−(x² + y²)/2σ²).

The 1D profile of this function is shown on the right. Defining where the zero-crossing occurs in a LoG-filtered image can be more problematic than locating the maxima in a Canny image, and the LoG filter is not as widely used as the first derivative filter. The main interest derives from the fact that it simulates the response of certain detectors in the mammalian visual system, and for that reason it was introduced by David Marr, one of the originators of computer vision as a discipline. It is sometimes called the Marr-Hildreth operator.

Non-linear Filters

Convolutions are sometimes known as linear filters, because the output image value is a linear combination of the input image values over the extent of the kernel (sometimes called the support region), and because they have a direct relationship with frequency-domain filters. We need not be restricted to calculating a linear combination over the support region: we could calculate anything we want, and it is quite easy to think up non-linear filters that transform the image in some desired way. One example is the median filter (see Jain section 4.4).
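The LoG kernel can be sampled directly from the formula above, and the zero-crossing behaviour can be seen even with the simple 1D second-difference kernel [1, −2, 1]. A sketch, in which σ, the kernel size and the test signal are illustrative choices:

```python
import numpy as np

def log_kernel(sigma, half):
    """Sample the Laplacian-of-Gaussian formula on a (2*half+1)^2 grid."""
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    k = (1 / sigma**2) * (r2 / (2 * sigma**2) - 1) * np.exp(-r2 / (2 * sigma**2))
    return k - k.mean()      # force the sum to zero: a differentiating kernel

k = log_kernel(1.0, 4)       # negative centre, positive surround

# Zero-crossing at a 1D step edge, using the second-difference kernel:
step = np.array([0., 0, 0, 0, 1, 1, 1, 1])
second = np.convolve(step, [1, -2, 1])
# the response changes sign either side of the step position
```

The sign change in `second` (positive then negative around the step) is the zero-crossing that marks the edge.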
This is a non-linear analogue of the unweighted-average convolution for image smoothing. Rather than replace the centre pixel value with the mean value in the support region, we replace it with the median: the middle value when the values in the support region are sorted into order. For many distributions of values the median is very close to the mean. The mean, however, is sensitive to outliers: single values that are very different from the others in the distribution. So if the image is corrupted by noise that can cause very large changes to individual pixels (impulse, or salt-and-pepper, noise), the mean value can undergo undesirably large variations. The median, by contrast, is robust to outliers and can be a good choice for smoothing out such noise. Median smoothing also tends to produce less blurring of edges. (Edges are by definition rapidly changing regions, where some values will be very different from the local average.)

The median is an example of a rank-order filter. This is a class of filters that takes the pixel values in the support region, ranks them, and then selects a particular ranked value. The median is the centre-ranked value. If we choose the highest-ranked value, we have a max filter; the lowest gives us a min filter. We have seen max and min filters already in binary morphology, as the dilation and erosion operators respectively. (The support region is called the structuring element in the terminology of mathematical morphology.) By noting that dilation and erosion are simply rank-order filters, we can extend the morphological operations to work on grey-scale images. Grey-scale dilation and erosion have the effect of expanding light or dark image regions respectively. Combinations of these operations can be used in a similar way to their use in the binary case. The example to the right illustrates a "top-hat" transform (a grey-level opening followed by a subtraction) on a 1D profile through a grey-scale image.
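A rank-order filter is straightforward to sketch in 1D. One routine gives the median, min (grey-scale erosion) and max (grey-scale dilation) filters simply by changing the rank selected; the signal and window size below are illustrative choices, and the difference of the dilated and eroded signals gives a simple morphological edge response:

```python
import numpy as np

def rank_filter(signal, size, rank):
    """Slide a window of `size` samples over the signal, sort each
    window, and output the value at position `rank`:
    rank 0 -> min (erosion), size-1 -> max (dilation), size//2 -> median."""
    half = size // 2
    padded = np.pad(signal, half, mode='edge')   # replicate the borders
    out = np.empty(len(signal))
    for i in range(len(signal)):
        out[i] = np.sort(padded[i:i + size])[rank]
    return out

noisy = np.array([5., 5, 5, 50, 5, 5, 9, 9, 9, 9])   # impulse noise at index 3

median  = rank_filter(noisy, 3, 1)   # removes the impulse, keeps the step sharp
eroded  = rank_filter(noisy, 3, 0)   # grey-scale erosion (min filter)
dilated = rank_filter(noisy, 3, 2)   # grey-scale dilation (max filter)
edge    = dilated - eroded           # peaks around rapid changes in the signal
```

Note how the median filter deletes the impulse at index 3 entirely, where a 3-sample mean filter would have smeared it across three output samples.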
The result is to separate large-scale and small-scale structures. It is also possible to construct a "morphological edge detector", illustrated below on the right. The image profile expands on dilation (the max operator) and shrinks on erosion (the min operator); subtracting the two generates an image with peak values on the edges of the original image.

[Figures: top-hat transform — image profile g(x), grey-level opening, subtracted profile; morphological edge detector — dilated and eroded profiles and their difference.]

Reading for lecture 6

Jain The reading for this lecture is mostly the same as that recommended for lecture 5: Jain chapters 4 and 5. The material in these two chapters covers the topics of the two lectures in a slightly different order. The following reading is essential unless noted as optional. Jain separates the topics into filtering (chapter 4) and edge detection (chapter 5). Most of chapter 4 is relevant, except 4.1 (we won't be too interested in point transformations). Section 4.2 on linear systems is optional, but puts convolution on a rather more systematic basis than we have here. Any section or subsection dealing with Fourier transforms is also optional (although if you're not scared of them, these sections round out the picture). There are plenty of examples of 2D convolutions, digital Gaussian masks and cascaded 1D convolutions. Chapter 5 up to and including section 5.7 is all relevant. Certain topics, like Laplacian filters, are dealt with in more detail than in the lectures. Section 5.5 on fitting surfaces can be omitted.

Sonka Similarly, Sonka addresses the material in section 4.3, going into rather more detail than we need, as usual. Section 4.3.1 covers what we say about smoothing, with some straightforward extensions, including some discussion of median filtering. Similarly, sections 4.3.2 and 4.3.3 cover what we say about edge detection. Section 4.3.4 deals with the issue of scale explicitly, including the topic of scale-space analysis, which goes beyond our requirements but indicates the importance of scale as an issue in computer vision.
Section 4.3.5 deals with the Canny edge detector (lecture 6), starting from the notion of optimality in edge detection, which was the original starting point for Canny's approach. The notion is historically interesting, if not particularly useful. There are some examples of grey-level morphological operations in section 11.4.