Introduction to Image Processing

Signal processing dealt with the manipulation of one-dimensional arrays corresponding to signals. Image processing deals with the manipulation of two-dimensional arrays corresponding to images.

One method of signal processing was the transformation of discrete-time signals into the frequency domain. An example of this transformation was the DFT:

$$x[n] \xrightarrow{\mathrm{DFT}} X(k).$$

After taking the DFT, we then performed a “filtering” process by enhancing some frequency components while attenuating others. After operating on the frequency components, an inverse DFT was performed to obtain the output signal:

$$x[n] \xrightarrow{\mathrm{DFT}} X(k), \qquad Y(k) = H(k)\,X(k), \qquad Y(k) \xrightarrow{\mathrm{IDFT}} y[n].$$

The DFT operation could be represented by a matrix multiplication of a discrete-time sample vector:

$$X = Wx.$$

The vectors x and X are N×1 matrices (vectors). W is an N×N matrix. The frequency-domain vector X can then be filtered (with H a diagonal N×N matrix holding the values H(k)), and the filtered vector can be converted back to the time domain:

$$Y = HX, \qquad y = W^{-1}Y.$$

Similar processing can be performed on two-dimensional arrays. The transformation, however, is itself two-dimensional:

$$x[n_1, n_2] \xrightarrow{\mathrm{DFT}_2} X(k_1, k_2), \qquad n_1 = 0, 1, \ldots, N_1 - 1, \qquad n_2 = 0, 1, \ldots, N_2 - 1.$$

The two-dimensional transformation operates on the rows and columns of x. The following MATLAB script demonstrates the two-dimensional discrete Fourier transform [fft2()].

    % Normalized sample positions in each dimension
    n1 = 0:31; n1 = n1/32;
    n2 = 0:31; n2 = n2/32;
    % 32-by-32 two-dimensional sinusoid: 1 cycle along n1, 2 cycles along n2
    x = kron(cos(2*pi*n1),cos(2*pi*2*n2)');
    mesh(n1,n2,x);
    % Two-dimensional DFT, scaled by the number of samples (32*32 = 1024)
    X = fft2(x)/1024;
    figure;
    mesh(0:15,0:15,abs(X(1:16,1:16)));

The script produces two mesh plots: the signal x and the magnitude of its transform X.

[Figure: Frequency Domain. Mesh plot of |X| versus k1 and k2 (each 0 to 15); the vertical axis runs from 0 to 0.25.]

The matrix transformation process can be achieved (as with signals) using matrix multiplication:

$$X = W_1\, x\, W_2^{T}.$$

The matrices x and X are N1×N2 matrices. W1 is an N1×N1 matrix; W2 is an N2×N2 matrix. The frequency-domain matrix X can then be filtered using a matrix H, applied element by element (H.*X in MATLAB):

$$Y = H \circ X.$$
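The matrix form of the two-dimensional DFT can be checked numerically. The following is a minimal sketch, not taken from the course files: the sizes, the test matrix, and the variable names (W1, W2, X_matrix, x_back) are illustrative assumptions. It builds the two DFT matrices directly from the definition and compares the matrix product with fft2().

```matlab
% Sketch (assumed sizes and names): compare W1*x*W2.' with fft2(x).
N1 = 8;  N2 = 6;                          % arbitrary small sizes
x  = reshape(1:N1*N2, N1, N2);            % arbitrary N1-by-N2 test matrix

% DFT matrices: element (k,n) is exp(-j*2*pi*k*n/N)
W1 = exp(-2j*pi*(0:N1-1)'*(0:N1-1)/N1);   % N1-by-N1, acts on the columns of x
W2 = exp(-2j*pi*(0:N2-1)'*(0:N2-1)/N2);   % N2-by-N2, acts on the rows of x

X_matrix = W1*x*W2.';                     % note .' (transpose without conjugation)
X_fft2   = fft2(x);
err_forward = max(abs(X_matrix(:) - X_fft2(:)));

% Inverse: undo the two matrix multiplications
x_back = W1\X_matrix/(W2.');
err_inverse = max(abs(x_back(:) - x(:)));

fprintf('forward mismatch: %g, inverse mismatch: %g\n', err_forward, err_inverse);
```

Both mismatches should be at the level of floating-point round-off. The non-conjugating transpose (.') matters here because the DFT matrices are complex.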
The filtered matrix (Y) can then be converted back to the time domain:

$$y = W_1^{-1}\, Y\, W_2^{-T}, \qquad W_2^{-T} = \left(W_2^{T}\right)^{-1}.$$

The filter matrix H is called a mask.

As an example of image processing, we could transform a picture with “high-frequency” components, and filter out these high-frequency components.

    n1 = 0:31; n1 = n1/32;
    n2 = 0:31; n2 = n2/32;
    % Build an image from odd harmonics with 1/k amplitudes
    x = zeros(32,32);
    for k1=1:2:11
      for k2=1:2:11
        x = x + kron(sin(2*k1*pi*n1)/k1,sin(2*k2*pi*2*n2)'/k2);
      end;
    end;
    mesh(n1,n2,x);
    % Two-dimensional DFT, scaled by the number of samples
    X = fft2(x)/1024;
    figure; mesh(0:15,0:15,abs(X(1:16,1:16)));
    % Low-pass mask: one block at each corner of the 32-by-32 spectrum
    H = zeros(32,32);
    H(1:4,1:2) = ones(4,2);
    H(29:32,1:2) = ones(4,2);
    H(1:4,31:32) = ones(4,2);
    H(29:32,31:32) = ones(4,2);
    Y = H.*X;
    % Multiply by 1024 to undo the scaling applied to X
    y = real(ifft2(Y))*1024;
    figure; mesh(0:15,0:15,abs(H(1:16,1:16)));
    figure; mesh(0:15,0:15,abs(Y(1:16,1:16)));
    figure; mesh(n1,n2,y);

[Figure: Frequency Domain: X. Mesh plot of |X| versus k1 and k2 (each 0 to 15); the vertical axis runs from 0 to 0.25.]

[Figure: Frequency Domain: H. Mesh plot of |H| versus k1 and k2 (each 0 to 15); the vertical axis runs from 0 to 1.]

[Figure: Frequency Domain: Y. Mesh plot of |Y| versus k1 and k2 (each 0 to 15); the vertical axis runs from 0 to 0.25.]

In the MATLAB script, the mask matrix H was created in the following way:

    H = zeros(32,32);
    H(1:4,1:2) = ones(4,2);
    H(29:32,1:2) = ones(4,2);
    H(1:4,31:32) = ones(4,2);
    H(29:32,31:32) = ones(4,2);

The full mask (for k1, k2 = 0 to 31) has a block of ones at each of its four corners.

[Figure: Filtering Mask: H]

The reason for the extra filtering blocks at the “corners” is to take aliasing into account. The corner blocks are like “frequency mirrors”: for a real-valued image, every component at (k1, k2) has a mirror-image component at (N1 − k1, N2 − k2), so the mask must pass both. Aliasing occurs in two dimensions.

This process of masking a transformed image can be used in image compression. Suppose an image consists mostly of “low-frequency” components. We can apply a mask to the FFT of the image, and store only the non-zero components of the masked image.

As an example, suppose we have an image corresponding to the two-dimensional signal in the first example: a two-dimensional sine wave with frequencies 1 Hz and 2 Hz in the first and second dimensions. We could take the FFT, and only store the components k1=0,1 and k2=0,1,2.
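The storage idea above can be tried on the signal from the first example. This is a rough sketch under stated assumptions: the mask layout (the kept block plus its mirrored entries) and the names rows, cols, and recon_error are illustrative and not from the course scripts. Because of the kron() ordering used in the scripts, k2 runs along the rows of X and k1 along the columns.

```matlab
% Sketch (assumed mask layout): keep k1 = 0,1 and k2 = 0,1,2 plus their mirrors.
n1 = (0:31)/32;
n2 = (0:31)/32;
x  = kron(cos(2*pi*n1), cos(2*pi*2*n2)');   % rows vary with n2, columns with n1
X  = fft2(x)/1024;

H = zeros(32,32);
rows = [1:3, 31:32];       % k2 = 0,1,2 and the mirrored indices 30,31
cols = [1:2, 32];          % k1 = 0,1   and the mirrored index  31
H(rows, cols) = 1;

Y = H.*X;                  % element-wise masking
y = real(ifft2(Y))*1024;   % undo the 1/1024 scaling applied to X

kept        = nnz(H);                     % number of retained spectrum entries
recon_error = max(abs(y(:) - x(:)));      % largest reconstruction error
fprintf('kept %d of %d entries, max error %g\n', kept, numel(H), recon_error);
```

For this particular signal all of the energy lies inside the mask, so the reconstruction error should be at round-off level. The mirrored entries carry no new information for a real image, which is why the count in the text considers only the non-aliased part of the spectrum.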
We can then perform a “compression” by a factor of $16^2/(2 \cdot 3) \approx 43$: of the 16×16 non-aliased components, only 2×3 are stored. (Here we only concentrate on the non-aliased components of the spectrum.)

The Discrete Cosine Transform

An easier-to-use transform for compression is the discrete cosine transform (DCT). The DCT has only real components, and the mask does not need to take aliases into account.

The definition of the (one-dimensional) DCT is as follows:

$$X[k] = \mathrm{DCT}\{x[n]\} = \frac{2}{N}\, W[k] \sum_{n=0}^{N-1} x[n] \cos\!\left(\frac{(2n+1)k\pi}{2N}\right).$$

The W[k] function is a special “frequency window”:

$$W[k] = \begin{cases} \dfrac{1}{\sqrt{2}} & (k = 0), \\[4pt] 1 & (k = 1, \ldots, N-1). \end{cases}$$

The inverse DCT is as follows (note that the sum runs over k):

$$x[n] = \mathrm{IDCT}\{X[k]\} = \sum_{k=0}^{N-1} W[k]\, X[k] \cos\!\left(\frac{(2n+1)k\pi}{2N}\right).$$

As with the DFT, the DCT can be represented by a matrix multiplication of a discrete-time sample vector:

$$X = Wx.$$

The elements of the matrix W (row k, column n) correspond to

$$\frac{2}{N}\, W[k] \cos\!\left(\frac{(2n+1)k\pi}{2N}\right),$$

which includes the scale factor 2/N from the definition above.

The two-dimensional DCT can be constructed using two matrices W1 and W2:

$$X = W_1\, x\, W_2^{T}.$$

We can then apply a mask H, element by element, to the matrix (image) X:

$$Y = H \circ X.$$

The non-zero portion of the masked matrix (Y) can then be stored, using fewer bits or bytes than the original image x. The stored image Y can be transformed back into an image using

$$y = W_1^{-1}\, Y\, W_2^{-T}, \qquad W_2^{-T} = \left(W_2^{T}\right)^{-1}.$$

The restored image y generally suffers some degradation, but not as much as if we simply “threw away pixels” or went to a lower resolution in order to achieve compression.

To summarize, the transformation, masking (compression), and decompression process is

$$X = W_1\, x\, W_2^{T}, \qquad Y = H \circ X, \qquad y = W_1^{-1}\, Y\, W_2^{-T}.$$

A process of this kind (a block-based DCT followed by discarding or coarsely quantizing high-frequency coefficients) is used in .jpeg image storage.

In the examples that follow, the two-dimensional DCT was applied to some images with various factors of compression. The factors of compression are equal to $k^2$, where k is the compression factor in one dimension.

Since many versions of MATLAB do not come with a two-dimensional DCT function, a MATLAB script file dct2.m is available (on my website in the zipped course file).
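The transform, mask, and restore steps summarized above can be sketched with a DCT matrix built directly from the definition. This is an illustrative sketch only and is not the course's dct2.m: the test image (a smooth Gaussian bump), the matrix name C, and the other variable names are assumptions. It uses a one-dimensional compression factor of k = 2, so the overall factor is 4.

```matlab
% Sketch (assumed names and test image; not the course's dct2.m).
N = 32;
[c, r] = meshgrid((0:N-1)/N);
x = exp(-((c-0.5).^2 + (r-0.5).^2)/0.05);        % smooth 32-by-32 test image

% DCT matrix from the definition: (2/N)*W[k]*cos((2n+1)*k*pi/(2N))
n  = 0:N-1;                                      % column index (sample)
k  = (0:N-1)';                                   % row index (frequency)
Wk = ones(N,1);  Wk(1) = 1/sqrt(2);              % frequency window W[k]
C  = (2/N) * diag(Wk) * cos(pi*k*(2*n+1)/(2*N)); % N-by-N forward DCT matrix

X = C*x*C.';                                     % two-dimensional DCT

% Round trip without a mask should return the original image
x_rt = C\X/(C.');
err_roundtrip = max(abs(x_rt(:) - x(:)));

% Mask: keep the lowest N/2 coefficients in each dimension (factor of 4)
H = zeros(N);
H(1:N/2, 1:N/2) = 1;
Y = H.*X;
y = C\Y/(C.');                                   % restored image
err_masked = max(abs(y(:) - x(:)));

fprintf('compression factor: %g\n', numel(H)/nnz(H));
fprintf('round-trip error: %g, masked error: %g\n', err_roundtrip, err_masked);
```

Note that the mask is a single block in one corner; unlike the DFT case, no mirrored blocks are needed. With this scaling the matrix C is not orthogonal, so the sketch inverts it explicitly rather than using its transpose.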
[Figure: Original Image: aerial.tif]

[Figure: aerial.tif Compressed by Pixel Elimination, factor of 4]

[Figure: aerial.tif Compressed by DCT, factor of 4]

[Figure: Original Image: camera.tif]

[Figure: camera.tif Compressed by Pixel Elimination, factor of 9]

[Figure: camera.tif Compressed by DCT, factor of 9]

[Figure: Original Image: text.tif]

[Figure: text.tif Compressed by Pixel Elimination, factor of 16]

[Figure: text.tif Compressed by DCT, factor of 16]

See also the MATLAB script file imagedemo.m (on my website in the zipped course file).