Convolution and correlation

Neighborhood Operations
Date : 07/14/04
o the operation at a pixel depends on the gray-level values of a region of interest surrounding that pixel
o if we assume that a neighborhood is centered on a pixel, then it must be a square neighborhood with odd dimensions
o for point operations, once an input pixel has been processed its original value is no longer needed; this is not possible with neighborhood operators because, even after an output pixel has been calculated, the corresponding input pixel at that location is included in other neighborhoods
o neighborhood operations let us examine spatial variations in the image
o applications include blurring, sharpening, noise reduction, edge detection, feature identification, and measuring the similarity of two images
Many image processing operations can be modeled as a linear system.
1. Linear Systems Theory
o a linear system
  - x(t) => y(t) is linear if x1(t) + x2(t) => y1(t) + y2(t)
  - and ax(t) => ay(t)
o shift invariance
  - x(t - T) => y(t - T)
o convolution integral
  - we want an expression that relates the output signal to the input signal
  - use the properties of linearity and shift invariance to produce the convolution integral
  - the output signal is given by the convolution of the input signal with the impulse response
o convolution is commutative, associative, and distributive over addition
o 2D convolution
o discrete 2D convolution
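The properties above can be illustrated with a small sketch (hypothetical code, not part of these notes; the class name Convolve1D is invented for illustration). It implements discrete 1D convolution directly from the definition and shows that swapping the two operands gives the same result, i.e. that convolution commutes:

```java
// Hypothetical helper class: a minimal sketch of discrete 1D convolution,
// used here to illustrate the commutativity property from the list above.
public class Convolve1D {

    // Full discrete convolution: y[n] = sum over k of x[k] * h[n-k].
    // The result has length x.length + h.length - 1.
    public static double[] convolve(double[] x, double[] h) {
        double[] y = new double[x.length + h.length - 1];
        for (int n = 0; n < y.length; n++) {
            for (int k = 0; k < x.length; k++) {
                int m = n - k;                 // index into h
                if (m >= 0 && m < h.length) {
                    y[n] += x[k] * h[m];
                }
            }
        }
        return y;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] h = {0.5, 0.5};               // simple averaging impulse response
        double[] a = convolve(x, h);           // x * h
        double[] b = convolve(h, x);           // h * x
        System.out.println(java.util.Arrays.toString(a)); // [0.5, 1.5, 2.5, 1.5]
        System.out.println(java.util.Arrays.toString(b)); // identical: convolution commutes
    }
}
```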
For a linear system, when the input to the system is an impulse d(x,y) centered at the origin, the output h(x,y) is the system's impulse response:

    input d(x,y)  ->  [ linear system ]  ->  output h(x,y)

A system whose response remains the same irrespective of the position of the input pulse is called a space invariant system:

    input d(x - x0, y - y0)  ->  [ linear space invariant system ]  ->  output h(x - x0, y - y0)

A linear space invariant system can be completely described by its impulse response h(x,y) as follows:

    input f(x,y)  ->  [ linear space invariant system h(x,y) ]  ->  output g(x,y)

where f(x,y) and g(x,y) are the input and output images. The above system must satisfy the following relationship:

    a f1(x,y) + b f2(x,y)  =>  a g1(x,y) + b g2(x,y)

where a and b are constant scaling factors.
For such a system the output g(x,y) is the convolution of f(x,y) with the impulse response h(x,y), and for discrete functions:

    g(i,j) = sum_{x=1}^{n} sum_{y=1}^{m} f(x,y) h(i-x, j-y)    (1)

where h(i-x, j-y) = h(x,i,y,j) is the impulse response.
We can also define h(x,i,y,j) as a point spread function. In optics, h(x,y), the inverse Fourier transform of the optical transfer function, is called the point spread function.
This name is based on the optical phenomenon that the impulse corresponds to a point of light and that an
optical system responds by blurring the point, with the degree of blurring being determined by the quality of
the optical components.
1.1 How are operators defined? The point spread function of an operator is what we get if we apply the operator to a point source:

    O[point source] = point spread function
    O[d(x-i, y-j)] = h(x,i,y,j)

where d(x-i, y-j) is a point source of brightness 1 centered at point (i,j). This function expresses how much the input value at position (x,y) influences the output value at position (i,j). If the influence expressed by the point spread function is independent of the actual positions and depends only on the relative position of the influencing and the influenced pixels, we have a shift invariant point spread function:

    h(i-x, j-y) = h(x,i,y,j)
1.2 How does an operator transform an image? If the operator is linear, then when a point source is a times brighter, the result will be a times larger:

    O[a d(x-i, y-j)] = a h(x,i,y,j)

An image is a collection of point sources (the pixels); we may say that an image is the sum of these point sources. The effect of an operator characterized by the point spread function h(x,i,y,j) on an N x N input image f(x,y) is then the convolution

    g(i,j) = sum_{x=0}^{N-1} sum_{y=0}^{N-1} f(x,y) h(x,i,y,j)    (2)

We can express equation (2) in the shorthand form g = h*f, where * is the convolution operation.
If the columns are influenced independently from the rows of the image, then the point spread function is separable:

    h(x,i,y,j) = hc(x,i) hr(y,j)

Equation (2) can then be written as a cascade of two 1D transformations:

    g(i,j) = sum_{x=0}^{N-1} hc(x,i) sum_{y=0}^{N-1} f(x,y) hr(y,j)    (3)

Notice that the factor sum_{y=0}^{N-1} f(x,y) hr(y,j) actually represents the product of two N x N matrices, which must be another matrix of the same size. Let us define it as

    s(x,j) = sum_{y=0}^{N-1} f(x,y) hr(y,j) = f hr    (4)

so that

    g = hc^T f hr    (5)

where f and g are the input and output images and hr, hc are matrices expressing the point spread function of the operator.
A separable n x n kernel can be represented as a vector product of two orthogonal, one-dimensional kernels,
each of width n. Convolution with the kernel can be carried out using these one-dimensional components.
One of them is applied down the columns of an image, generating an intermediate result. The other kernel is
then applied along the rows of the intermediate image, producing the final result.
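The two-pass scheme just described can be sketched as follows (hypothetical code, not from the notes; class and method names are invented, borders are simply skipped for brevity). It separates a 3 x 3 box filter into a column pass followed by a row pass:

```java
// Hypothetical sketch of separable convolution as two 1D passes.
// Assumes a small square image; border pixels are left unprocessed.
public class SeparableDemo {

    // One 1D pass down the columns with a length-3 kernel (borders skipped).
    public static double[][] passColumns(double[][] f, double[] kc) {
        int n = f.length;
        double[][] out = new double[n][n];
        for (int i = 1; i < n - 1; i++)
            for (int j = 0; j < n; j++)
                out[i][j] = kc[0]*f[i-1][j] + kc[1]*f[i][j] + kc[2]*f[i+1][j];
        return out;
    }

    // One 1D pass along the rows of the intermediate result.
    public static double[][] passRows(double[][] f, double[] kr) {
        int n = f.length;
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 1; j < n - 1; j++)
                out[i][j] = kr[0]*f[i][j-1] + kr[1]*f[i][j] + kr[2]*f[i][j+1];
        return out;
    }

    public static void main(String[] args) {
        // A 3x3 box filter is separable: [1,1,1]/3 down columns, then along rows.
        double[][] f = {
            {1, 2, 3, 4},
            {5, 6, 7, 8},
            {9, 10, 11, 12},
            {13, 14, 15, 16}
        };
        double[] k = {1.0/3, 1.0/3, 1.0/3};
        double[][] g = passRows(passColumns(f, k), k);
        System.out.println(g[1][1]); // mean of the 3x3 neighborhood around (1,1): 6.0
    }
}
```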
If the digital image looks like this:

    f(x,y) = | f(0,0)     f(0,1)     ...  f(0,N-1)   |
             | f(1,0)     f(1,1)     ...  f(1,N-1)   |
             | ...        ...        ...  ...        |
             | f(N-1,0)   f(N-1,1)   ...  f(N-1,N-1) |
We can rewrite equation (2) as follows:

    g(i,j) = f(0,0) h(0,i,0,j) + f(1,0) h(1,i,0,j) + ... + f(N-1,0) h(N-1,i,0,j)
           + f(0,1) h(0,i,1,j) + f(1,1) h(1,i,1,j) + ... + f(N-1,1) h(N-1,i,1,j)
           + ... + f(0,N-1) h(0,i,N-1,j) + f(1,N-1) h(1,i,N-1,j) + ...
           + f(N-1,N-1) h(N-1,i,N-1,j)

or, in matrix form,

    g = H f    (6)

The right-hand side of this expression can be thought of as the dot product of the vector

    h_ij^T = [h(0,i,0,j), h(1,i,0,j), ..., h(N-1,i,0,j), h(0,i,1,j), h(1,i,1,j), ...,
              h(N-1,i,1,j), ..., h(0,i,N-1,j), ..., h(N-1,i,N-1,j)]

with the vector

    f^T = [f(0,0), f(1,0), ..., f(N-1,0), f(0,1), f(1,1), ..., f(N-1,1), ..., f(0,N-1), f(1,N-1), ..., f(N-1,N-1)]
This last vector is actually the image f(x,y) written as a vector by stacking its columns one under the other. If we imagine writing g(i,j) in the same way, then the vectors h_ij^T arrange themselves as the rows of a matrix H, where for i=0, j runs from 0 to N-1 to give the first N rows of the matrix, then for i=1, j runs again from 0 to N-1 to give the second N rows, and so on. H is the N^2 x N^2 matrix made up of N x N submatrices of size N x N each, arranged in the following way:
    H = | H_{0,0}    H_{0,1}    ...  H_{0,N-1}   |
        | H_{1,0}    H_{1,1}    ...  H_{1,N-1}   |
        | ...        ...        ...  ...         |
        | H_{N-1,0}  H_{N-1,1}  ...  H_{N-1,N-1} |

where the N x N submatrix H_{i,y} in block row i and block column y has entries h(x,i,y,j), with j = 0, ..., N-1 indexing its rows and x = 0, ..., N-1 indexing its columns.
According to the separability assumption, we can replace h(x,i,y,j) by hc(x,i) hr(y,j). We can then present matrix H as a Kronecker product:

    H = hc^T (x) hr^T    (where (x) denotes the Kronecker product)

in which case equation (6), g = H f, is equivalent to g = hc^T f hr.
2. What is the purpose of Image Processing?
o Given an image f, choose matrices hc and hr so that the output image g is "better" than f according to some subjective criteria. This is the problem of Image Enhancement.
o Given an image f, choose matrices hc and hr so that g can be represented by fewer bits than f without much loss of detail. This is the problem of Image Compression.
o Given an image g and an estimate of h(x,i,y,j), recover image f. This is the problem of Image Restoration.
o Given an image f, choose matrices hc and hr so that the output image g includes certain features of f. This is the problem of preparing an image for Automatic Vision.
3. Calculating a convolution
During convolution, we take each kernel coefficient in turn and multiply it by a value from the neighborhood of the image lying under the kernel. We apply the kernel to the image in such a way that the value at the top-left corner of the kernel is multiplied by the value at the bottom-right corner of the neighborhood. For a 3 x 3 kernel:

    g(x,y) = h(1,1) f(x-1,y-1)  + h(1,0) f(x-1,y)  + h(1,-1) f(x-1,y+1)
           + h(0,1) f(x,y-1)    + h(0,0) f(x,y)    + h(0,-1) f(x,y+1)
           + h(-1,1) f(x+1,y-1) + h(-1,0) f(x+1,y) + h(-1,-1) f(x+1,y+1)

The pixels associated with these kernel coefficients are sequenced in precisely the opposite direction.
In general, for a 3 x 3 kernel:

    g(x,y) = sum_{k=-1}^{1} sum_{j=-1}^{1} h(j,k) f(x-j, y-k)    (7)

For example, applying the kernel

    -1  0  1
    -2  0  2
    -1  0  1

to the image neighborhood

    72  53  60
    76  56  65
    88  78  82

gives

    g(x,y) = (-1 x 82) + (1 x 88) + (-2 x 65) + (2 x 76) + (-1 x 60) + (1 x 72) = 40
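The worked example above can be checked with a short sketch (hypothetical code, not part of the notes; the class and method names are invented). It pairs each kernel coefficient with the neighborhood pixel traversed in the opposite direction, as the text describes:

```java
// Sketch reproducing the worked example above. Convolution pairs the
// top-left kernel coefficient with the bottom-right neighborhood pixel:
// g = sum over j,k of h[j][k] * f[2-j][2-k].
public class ConvExample {

    public static int convolveAtCenter(int[][] h, int[][] f) {
        int g = 0;
        for (int j = 0; j < 3; j++)
            for (int k = 0; k < 3; k++)
                g += h[j][k] * f[2 - j][2 - k];   // opposite traversal order
        return g;
    }

    public static void main(String[] args) {
        int[][] h = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };       // kernel from the text
        int[][] f = { {72, 53, 60}, {76, 56, 65}, {88, 78, 82} }; // image neighborhood
        System.out.println(convolveAtCenter(h, f));               // 40, as in the text
    }
}
```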
Note that if we rotate the kernel by 180°, then both sequences run in the same direction; this is an implementation of correlation rather than convolution. Note that the distinction between convolution and correlation disappears when the kernel is symmetric under 180° rotation.
3.1 Convolution - Conclusions
o used for filtering and edge detection
o the output gray-level at a pixel is a weighted sum of the gray-levels of pixels in the neighborhood
o the weights are expressed in matrix form in a convolution kernel
o the kernel is typically small and the coefficients may be real numbers, and they may be negative
o kernel elements are paired with image elements as follows:
  - kernel elements are taken left-to-right, top-to-bottom
  - image elements (in the neighborhood) are taken right-to-left, bottom-to-top
4. Correlation
The correlation can be expressed as follows:

    g(x,y) = sum_{k=-1}^{1} sum_{j=-1}^{1} h(j,k) f(x+j, y+k)    (8)

This differs from convolution only in that the kernel indices j and k are added to, rather than subtracted from, the pixel coordinates x and y. This has the effect of pairing each kernel coefficient with the image pixel that lies directly beneath it.
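This pairing can be sketched with the same kernel and neighborhood used in the convolution example of section 3 (hypothetical code; the class name is invented). Because that particular kernel negates under 180° rotation, correlation yields the negated convolution result:

```java
// Sketch of correlation at the neighborhood centre: each kernel coefficient
// pairs with the pixel directly beneath it, g = sum of h[j][k] * f[j][k].
public class CorrExample {

    public static int correlateAtCenter(int[][] h, int[][] f) {
        int g = 0;
        for (int j = 0; j < 3; j++)
            for (int k = 0; k < 3; k++)
                g += h[j][k] * f[j][k];   // same traversal order for both
        return g;
    }

    public static void main(String[] args) {
        int[][] h = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };       // kernel from section 3
        int[][] f = { {72, 53, 60}, {76, 56, 65}, {88, 78, 82} }; // same neighborhood
        // -40: this kernel negates under 180° rotation, so correlation
        // gives the negative of the convolution result (which was 40).
        System.out.println(correlateAtCenter(h, f));
    }
}
```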
The correlation function given in (8) has the disadvantage of being sensitive to changes in the amplitude of f(x,y) and h(j,k) in brighter parts of an image. For example, doubling all values of f(x,y) doubles the value of g(x,y). It is customary to normalize g(x,y) by dividing by the sum of gray levels in the image neighborhood, i.e.

    g'(x,y) = g(x,y) / ( sum_k sum_j f(x+j, y+k) )    (9)

An approach frequently used to overcome this difficulty is to perform matching via the correlation coefficient, which is defined as:

    gamma(x,y) = sum_k sum_j [h(j,k) - h_bar][f(x+j, y+k) - f_bar(x,y)]
                 / sqrt( sum_k sum_j [f(x+j, y+k) - f_bar(x,y)]^2 * sum_k sum_j [h(j,k) - h_bar]^2 )    (10)

where h_bar is the average pixel value in the template, and f_bar(x,y) is the average pixel value in the image neighborhood.
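Equation (10) can be sketched as follows (hypothetical code, not from the notes; names are invented). Both the template and the neighborhood are centred on their means, and the product sum is normalised by the two root-sum-square deviations, so a neighborhood that is a scaled copy of the template scores 1:

```java
// Sketch of the correlation coefficient of equation (10) for a template h
// and an equally-sized image neighborhood f.
public class CorrCoeff {

    public static double gamma(double[][] h, double[][] f) {
        int n = h.length;
        double hMean = 0, fMean = 0;
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++) {
                hMean += h[j][k];
                fMean += f[j][k];
            }
        hMean /= n * n;
        fMean /= n * n;
        double num = 0, hVar = 0, fVar = 0;   // product sum and squared deviations
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++) {
                double dh = h[j][k] - hMean, df = f[j][k] - fMean;
                num += dh * df;
                hVar += dh * dh;
                fVar += df * df;
            }
        return num / Math.sqrt(hVar * fVar);
    }

    public static void main(String[] args) {
        double[][] h = { {1, 2}, {3, 4} };
        double[][] f = { {10, 20}, {30, 40} };  // f = 10*h: perfectly correlated
        System.out.println(gamma(h, f));        // 1.0
    }
}
```

Note that, unlike the raw correlation of (8), gamma is unchanged when all values of f are scaled by a constant, which is exactly the amplitude sensitivity that (9) and (10) are designed to remove.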
4.1 Correlation - Conclusions
o used for feature recognition
o the kernel is now called a template
o the template may be larger and usually contains a small image (with integer gray-scale elements)
o the algorithm is the same as convolution except for the pairing of weights with pixels:
  - kernel elements are taken left-to-right, top-to-bottom
  - image elements (in the neighborhood) are taken left-to-right, top-to-bottom
o you can get correlation using the convolution algorithm by rotating the kernel 180°
o convolution and correlation are the same for kernels that are symmetric under 180° rotation
o need to normalize the output
5. Computational problems with convolution and correlation
5.1 The first problem is one of representation.
o the weighted sum result may exceed the range supported by the number of bits per pixel
  - we could normalize the result (usually factored into the kernel coefficients)
  - or we could define the output image with more bits per pixel
o if any of the kernel coefficients are negative, it is possible for g(x,y) to be negative; in such cases, we must use a signed data type for the output image
5.2 The second problem concerns the borders of the image. The entire neighborhood is not defined when processing pixels that border the image.
1) set the border pixels to black: the simple solution is to ignore those pixels for which convolution is not possible
  - causes a black border in the output image
  - will affect the histogram of the output image
2) set the output border pixels to the gray-level of the input border pixels, i.e. copy the corresponding pixel value from the input image wherever it is not possible to carry out convolution
  - causes a border of unprocessed pixels in the output image
3) make the output image smaller than the input image and include only processed pixels
  - will affect the ability to arithmetically combine the input and output images, e.g. to compare them on a pixel by pixel basis
4) truncate the kernel, i.e. modify the kernel near the borders
  - complicates the algorithm for performing convolution or correlation
  - not always possible to come up with sensible truncated kernels
5) reflected indexing
  - let the image pixels outside the border of the image be a reflection of the row or column they are adjacent to
6) circular indexing
  - let the image pixels outside the border of the image be a repetition of the row or column they are adjacent to, making the input image periodic by assuming the first column comes immediately after the last
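Methods 5 and 6 can be sketched as index-mapping helpers (hypothetical code; the class and method names are invented). Each maps an out-of-range coordinate back into the valid range [0, n) before the image is sampled:

```java
// Sketch of border handling by index remapping.
public class BorderIndex {

    // Reflected indexing (method 5): coordinates just outside [0, n) are
    // mirrored back into range, e.g. -1 -> 0, -2 -> 1, n -> n-1.
    // Valid for overshoots of less than n pixels.
    public static int reflect(int i, int n) {
        if (i < 0) return -i - 1;
        if (i >= n) return 2 * n - i - 1;
        return i;
    }

    // Circular indexing (method 6): coordinates wrap around, making the
    // image periodic, e.g. -1 -> n-1, n -> 0.
    public static int wrap(int i, int n) {
        return ((i % n) + n) % n;
    }

    public static void main(String[] args) {
        System.out.println(reflect(-1, 5)); // 0
        System.out.println(reflect(5, 5));  // 4
        System.out.println(wrap(-1, 5));    // 4
        System.out.println(wrap(5, 5));     // 0
    }
}
```

Inside the convolution loop, f(x-j, y-k) would then become f(reflect(x-j, n), reflect(y-k, n)) or f(wrap(x-j, n), wrap(y-k, n)), so that every neighborhood is fully defined.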
Performance
o convolution and correlation are O(n^2) in the kernel width n
o if the kernel is separable, it can be represented by two orthogonal 1D kernels
o separable convolution is done in two passes:
  - apply one of the kernels down the columns of the image, generating an intermediate result
  - apply the other kernel across the rows of the intermediate image to generate the final result
o separable convolution is O(n)
o Java HotSpot technology replaces JIT compilation
Convolution in Java2D
o Kernel class

    int width = 3;
    int height = 3;
    float[] coeff = new float[width * height];
    for (int i = 0; i < coeff.length; i++)
        coeff[i] = 1.0f / coeff.length;
    Kernel kernel = new Kernel(width, height, coeff);

o this kernel is normalized (the sum of the coefficients equals 1)
o ConvolveOp implements BufferedImageOp; the edge condition is given to the constructor, and filter's second argument is an optional destination image

    ConvolveOp op = new ConvolveOp(kernel, ConvolveOp.EDGE_ZERO_FILL, null);
    BufferedImage outImage = op.filter(inImage, null);
o ConvolveOp drawbacks
  - does not support all border processing methods
  - does not support separable convolution
  - does not support rescaling of the output image (clamps output)
Efford's implementation
o StandardKernel.java
o NeighborhoodOp.java
o ConvolutionOp.java
ConvolutionTool application