Introduction to Image Processing

Signal processing dealt with the manipulation of
one-dimensional arrays corresponding to signals.
Image processing deals with the manipulation of
two-dimensional arrays corresponding to images.
One method of signal processing dealt with the
transformation of discrete-time signals into the
frequency domain. An example of this
transformation was the DFT:
$$x[n] \xrightarrow{\text{DFT}} X(k).$$
After taking the DFT, we then performed a “filtering”
process by enhancing some frequency components
while attenuating other frequency components. After
performing operations on the frequency
components, an inverse DFT was performed to
obtain the output signal.
$$x[n] \xrightarrow{\text{DFT}} X(k),$$
$$Y(k) = H(k)\,X(k),$$
$$Y(k) \xrightarrow{\text{IDFT}} y[n].$$
The DFT operation could be represented by a matrix
multiplication of a discrete-time sample vector:
$$X = Wx.$$
The vectors x and X are Nx1 matrices (vectors). W
is an NxN matrix.
The frequency-domain vector X can then be filtered,
and the filtered vector can be converted back to
the time domain. Here H is an NxN diagonal matrix
whose diagonal entries are the filter values H(k).
$$Y = HX,$$
$$y = W^{-1} Y.$$
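The matrix form of the DFT, the filtering step, and the inverse can be checked numerically. The following is only a minimal sketch: the length N = 8, the test signal, and the choice of which components to keep are assumptions made for illustration, not part of the original example.

```matlab
% Sketch: the DFT as a matrix multiplication, followed by filtering and inversion.
N = 8;                                   % assumed length, for illustration only
n = (0:N-1)';                            % column vector of sample indices
x = cos(2*pi*n/N) + 0.5*cos(2*pi*3*n/N); % test signal with components at k = 1 and k = 3

W = exp(-1j*2*pi*(n*n')/N);              % N x N DFT matrix: W(k+1,n+1) = exp(-j*2*pi*k*n/N)
X = W*x;                                 % matches fft(x) up to round-off

h = zeros(N,1);                          % filter values H(k), placed on the diagonal of H
h([2 N]) = 1;                            % keep k = 1 and its mirror k = N-1
H = diag(h);

Y = H*X;                                 % filtering in the frequency domain
y = real(W\Y);                           % y = W^(-1) Y; the imaginary part is round-off only

dftError    = max(abs(X - fft(x)));            % difference from the built-in FFT
filterError = max(abs(y - cos(2*pi*n/N)));     % only the k = 1 cosine should remain
```

Both error values should be at round-off level. Solving with W\Y avoids forming the inverse matrix explicitly.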
Similar processing can be performed on
two-dimensional arrays. The transformation, however, is
itself two-dimensional.
$$x[n_1, n_2] \xrightarrow{\text{DFT}_2} X(k_1, k_2),$$
$$n_1 = 0, 1, \ldots, N_1 - 1, \qquad n_2 = 0, 1, \ldots, N_2 - 1.$$
The two-dimensional transformation operates on the
rows and columns of x.
On the following slide is a MATLAB script
demonstrating the two-dimensional discrete Fourier
transformation [fft2()].
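» n1 = 0:31;
» n1 = n1/32;                             % 32 samples over one unit interval
» n2 = 0:31;
» n2 = n2/32;
» x = kron(cos(2*pi*n1),cos(2*pi*2*n2)'); % 32x32 array: 1 cycle along n1, 2 cycles along n2
» mesh(n1,n2,x);
» X = fft2(x)/1024;                       % normalize by N1*N2 = 32*32
» mesh(0:15,0:15,abs(X(1:16,1:16)));      % magnitude for indices 0 to 15 in each dimension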
The graphics outputs are shown on the following two
slides.
[Figure: Frequency Domain. Mesh plot of the magnitude of X over k1 and k2, each from 0 to 15.]
The matrix transformation process can be achieved
(as with signals) using matrix multiplication:
$$X = W_1 x W_2^{T}.$$
The matrices x and X are N1xN2 matrices. W1 is an
N1xN1 matrix; W2 is an N2xN2 matrix.
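The two-sided matrix product can be compared against fft2() on a small array. This is a sketch under assumed sizes (N1 = 4, N2 = 6) and an arbitrary test array; the variable names are illustrative.

```matlab
% Sketch: two-dimensional DFT as W1 * x * W2^T, checked against fft2().
N1 = 4;  N2 = 6;                         % assumed (small) image size
x  = reshape(1:N1*N2, N1, N2);           % arbitrary test "image"

k1 = (0:N1-1)';  k2 = (0:N2-1)';
W1 = exp(-1j*2*pi*(k1*k1')/N1);          % N1 x N1 DFT matrix (acts on the columns of x)
W2 = exp(-1j*2*pi*(k2*k2')/N2);          % N2 x N2 DFT matrix (acts on the rows of x)

X = W1*x*W2.';                           % .' is the plain transpose (no conjugation)
matrixError = max(max(abs(X - fft2(x))));

xBack = real(W1\X/(W2.'));               % inverse: W1^(-1) * X * (W2^T)^(-1)
roundTripError = max(max(abs(xBack - x)));
```

Note the use of .' rather than ': for complex matrices, ' would also conjugate W2 and give a different result.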
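The frequency-domain matrix X can then be
filtered using a matrix H. The filter is applied element
by element (H.*X in MATLAB):
$$Y = H \circ X.$$
The filtered matrix (Y) can then be converted back to
the original (spatial) domain:
$$y = W_1^{-1}\, Y \left(W_2^{T}\right)^{-1}.$$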
The filter matrix H is called a mask.
As an example of image processing, we could
transform a picture with “high-frequency”
components, and filter out these high-frequency
components.
» n1 = 0:31;
» n1 = n1/32;
» n2 = 0:31;
» n2 = n2/32;
» x = zeros(32,32);
» for k1=1:2:11                           % odd harmonics 1, 3, ..., 11
» for k2=1:2:11
» x = x + kron(sin(2*k1*pi*n1)/k1,sin(2*k2*pi*2*n2)'/k2);
» end;
» end;
» mesh(n1,n2,x);
» X = fft2(x)/1024;                       % normalize by N1*N2 = 32*32
» mesh(0:15,0:15,abs(X(1:16,1:16)));
» H = zeros(32,32);                       % mask: zero everywhere except the four corners
» H(1:4,1:2) = ones(4,2);
» H(29:32,1:2) = ones(4,2);
» H(1:4,31:32) = ones(4,2);
» H(29:32,31:32) = ones(4,2);
» Y = H.*X;                               % element-by-element masking
» y = real(ifft2(Y))*1024;                % undo the 1/1024 scaling applied to X
» mesh(0:15,0:15,abs(H(1:16,1:16)));
» mesh(0:15,0:15,abs(Y(1:16,1:16)));
» mesh(n1,n2,y);
[Figure: Frequency Domain: X. Mesh plot of the magnitude of X over k1 and k2, each from 0 to 15.]
[Figure: Frequency Domain: H. Mesh plot of the mask H over k1 and k2, each from 0 to 15.]
[Figure: Frequency Domain: Y. Mesh plot of the magnitude of Y over k1 and k2, each from 0 to 15.]
In the MATLAB script, the mask matrix H was
created in the following way:
» H = zeros(32,32);
» H(1:4,1:2) = ones(4,2);
» H(29:32,1:2) = ones(4,2);
» H(1:4,31:32) = ones(4,2);
» H(29:32,31:32) = ones(4,2);
The full mask (from k1, k2 = 0 to 31) looks like the
image on the following slide.
[Figure: Filtering Mask: H]
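The extra filtering blocks at the “corners” take
aliasing into account. For a real image, the components
at the highest indices (near k = 31) mirror the
low-frequency components near k = 0, so the corner
blocks are like “frequency mirrors.” Aliasing occurs
in both dimensions.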
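One way to build a mask whose corner blocks mirror each other exactly is to measure each index by its wrap-around distance from k = 0. The sketch below is illustrative only: the cutoffs K1 and K2, the random test image, and the variable names are assumptions. With 0-based cutoffs of 3 and 1, the strict mirror of rows 1:4 is rows 30:32 and the mirror of columns 1:2 is column 32, so this mask is slightly narrower at the high-index corners than the one in the script above.

```matlab
% Sketch: a mirror-symmetric low-pass mask for a real 32x32 image.
N  = 32;                                 % image size used in the script above
K1 = 3;  K2 = 1;                         % assumed cutoffs: keep |k1| <= 3 and |k2| <= 1

k = 0:N-1;
d = min(k, N-k);                         % distance of each index from k = 0, with wrap-around
H = double(d' <= K1) * double(d <= K2);  % outer product: 1 inside the pass band, 0 elsewhere

x = rand(N,N);                           % arbitrary real test image
y = ifft2(H.*fft2(x));                   % low-pass filtered image

imagPart  = max(max(abs(imag(y))));      % round-off level when the mask is mirror-symmetric
keptCount = nnz(H);                      % (2*K1+1)*(2*K2+1) coefficients are kept
```

Because the mask keeps every component together with its mirror, the inverse transform of the masked spectrum is real apart from round-off, so taking real() discards nothing significant.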
This process of masking a transformed image can
be used in image compression. Suppose an
image consists of mostly “low-frequency”
components. We can apply a mask to the FFT of
the image, and store only the non-zero components
of the masked image.
As an example, suppose we have an image
corresponding to the two-dimensional signal in the
first example: a two-dimensional sinewave with
frequencies 1 Hz and 2 Hz in the first and second
dimensions.
We could take the FFT, and only store the
components k1=0,1 and k2=0,1,2. We can then
perform a “compression” by a factor of
$16^2/(2 \cdot 3) = 256/6$, or about 43.
(Here we only concentrate on the non-aliased
components of the spectrum.)
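The arithmetic can be checked on the signal from the first example. This is a sketch only; the names Q, B, energyOutside and factor are introduced here for illustration. In the array built by kron(), the frequency-2 dimension runs along the rows, so the retained block is 3 rows by 2 columns.

```matlab
% Sketch: how many coefficients the first example needs in the non-aliased quadrant.
n1 = (0:31)/32;  n2 = (0:31)/32;
x  = kron(cos(2*pi*n1), cos(2*pi*2*n2)');     % signal from the first example
X  = fft2(x)/1024;

Q = abs(X(1:16,1:16));                   % non-aliased quadrant, indices 0 to 15
B = Q(1:3,1:2);                          % retained block: 3 x 2 = 6 coefficients

peak          = Q(3,2);                  % the single non-zero component in the quadrant
energyOutside = sum(Q(:).^2) - sum(B(:).^2);  % quadrant energy missed by the block
factor        = 16^2/(2*3);              % 256/6, about 42.7
```

The retained block captures essentially all of the energy in the quadrant, which is why so few stored values suffice for this particular signal.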
The Discrete Cosine Transform
An easier-to-use transform for compression
is the discrete cosine transform (DCT). For a real
image, the DCT has only real components, and the
mask does not need to take into account aliases.
The definition of the (one-dimensional) DCT is as
follows:
$$X[k] = \mathrm{DCT}\{x[n]\} = W[k]\,\frac{2}{N} \sum_{n=0}^{N-1} x[n] \cos\frac{(2n+1)k\pi}{2N}.$$
The W[k] function is a special “frequency window”:
$$W[k] = \begin{cases} \dfrac{1}{\sqrt{2}} & (k = 0), \\ 1 & (k = 1, \ldots, N-1). \end{cases}$$
The inverse DCT is as follows:
$$x[n] = \mathrm{IDCT}\{X[k]\} = \sum_{k=0}^{N-1} W[k]\, X[k] \cos\frac{(2n+1)k\pi}{2N}.$$
As with the DFT, the DCT can be represented by a
matrix multiplication of a discrete-time sample
vector:
$$X = Wx.$$
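The elements of W correspond (up to the constant
scale factor in the DCT definition) to
$$W[k] \cos\frac{(2n+1)k\pi}{2N},$$
with k indexing the rows and n indexing the columns.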
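The DCT matrix and its inverse can be built directly from the cosine terms. The sketch below assumes a forward scale factor of 2/N, no scale factor on the inverse, and W[0] = 1/sqrt(2); other scalings are also in common use. The length N = 8 and the names C, Wf and Wi are chosen here for illustration.

```matlab
% Sketch: forward and inverse DCT matrices under the assumed 2/N scaling.
N = 8;                                   % assumed length, for illustration only
n = 0:N-1;                               % sample index (columns of the matrix)
k = (0:N-1)';                            % frequency index (rows of the matrix)

w  = [1/sqrt(2); ones(N-1,1)];           % frequency window W[k]
C  = cos(k*(2*n+1)*pi/(2*N));            % C(k+1,n+1) = cos((2n+1)*k*pi/(2N))
Wf = (2/N)*diag(w)*C;                    % forward DCT matrix:  X = Wf*x
Wi = C'*diag(w);                         % inverse DCT matrix:  x = Wi*X

identityError = max(max(abs(Wi*Wf - eye(N))));  % Wi*Wf should be the identity

x     = (1:N)';                          % arbitrary real test vector
X     = Wf*x;                            % real-valued DCT coefficients
xBack = Wi*X;
roundTripError = max(abs(xBack - x));
```

All quantities here are real, which is one reason the DCT is convenient for storage.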
The two-dimensional DCT can be constructed
using two matrices W1 and W2:
$$X = W_1 x W_2^{T}.$$
We can then apply a mask H to the matrix (image)
X, element by element:
$$Y = H \circ X.$$
The non-zero portion of the masked matrix (Y) can
then be stored (using fewer bits or bytes than the
original image x).
The stored image Y can be transformed back into an
image using
$$y = W_1^{-1}\, Y \left(W_2^{T}\right)^{-1}.$$
The restored image y generally suffers some
degradation, but not as much as if we had
simply “thrown away pixels” or gone to a lower
resolution in order to achieve compression.
To summarize, the transformation, masking
(compression), and decompression process is
$$X = W_1 x W_2^{T},$$
$$Y = H \circ X,$$
$$y = W_1^{-1}\, Y \left(W_2^{T}\right)^{-1}.$$
A process of this kind is used in JPEG (.jpeg) image storage.
In the examples that follow, the two-dimensional DCT
transformation was applied to some images with
various factors of compression. The factors of
compression are equal to k^2, where k is the
compression factor in one dimension. Since many
versions of MATLAB do not come with a
two-dimensional DCT function, a MATLAB script file
dct2.m is available (on my website in the zipped
course file).
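The transformation, masking, and decompression steps can be sketched without a dct2 function by building the DCT matrix by hand. This is not the author's dct2.m, whose contents are not shown here; the synthetic test image, the 2/N scaling, the size N = 32, and the one-dimensional factor k = 2 are all assumptions made for illustration.

```matlab
% Sketch: DCT-based compression of a synthetic image by a factor of k^2.
N = 32;  k = 2;                          % assumed image size and one-dimensional factor
M = N/k;                                 % coefficients kept per dimension

idx = 0:N-1;
w = [1/sqrt(2); ones(N-1,1)];            % frequency window W[k]
C = cos((idx')*(2*idx+1)*pi/(2*N));      % C(k+1,n+1) = cos((2n+1)*k*pi/(2N))
W = (2/N)*diag(w)*C;                     % forward DCT matrix, used for both dimensions

[c, r] = meshgrid(idx/N, idx/N);
x = cos(2*pi*c) .* cos(2*pi*2*r);        % smooth synthetic test image (hypothetical data)

X = W*x*W.';                             % two-dimensional DCT
H = zeros(N,N);  H(1:M,1:M) = 1;         % mask: keep the low-frequency M x M block
Y = H.*X;                                % only M^2 values need to be stored
y = W\Y/(W.');                           % decompression: W^(-1) * Y * (W^T)^(-1)

factor   = N^2/nnz(H);                   % equals k^2
relError = norm(y - x,'fro')/norm(x,'fro');   % relative reconstruction error
```

For a smooth image such as this one, most of the energy sits in the retained low-frequency block, so the relative error stays small. Real photographs would generally show more degradation at the same factor.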
[Figure: Original image: aerial.tif]
[Figure: aerial.tif compressed by pixel elimination, factor of 4]
[Figure: aerial.tif compressed by DCT, factor of 4]
[Figure: Original image: camera.tif]
[Figure: camera.tif compressed by pixel elimination, factor of 9]
[Figure: camera.tif compressed by DCT, factor of 9]
[Figure: Original image: text.tif]
[Figure: text.tif compressed by pixel elimination, factor of 16]
[Figure: text.tif compressed by DCT, factor of 16]
See also the MATLAB script file imagedemo.m (on
my website in the zipped course file).