
Faculty of Engineering
LAB SHEET
MULTIMEDIA TECHNOLOGY AND
APPLICATIONS
ECP3086
Lab 2) MTA2: Multimedia Data Compression
Notes: Students are required to read the lab-sheet and implement most of the
experiment code before attending the lab session. The lab instructor may request to
see your Matlab code when checking your experiment result. The lab session is
mainly to assist students who face problems doing the experiment.
Revised: May 2009 (Dr Chang Yoong Choon), May 2010 (Haris)
Lab 2 Marking Scheme

No. 1, Component: Viva
Criteria: Students are able to demonstrate selected working code and to modify
the code for a specific task as requested by the lab instructor during the viva
session. This is to check that the code is not copied from other students.
Marks: Not answered: 0 marks; Poor: 2 marks; Acceptable: 4 marks (max)
Typical question: 1) Modify the code to collect some information from the
experiment.

No. 2, Component: Report
Criteria: The report follows scientific format.

No. 3, Component: Report
Criteria: The experiment is conducted thoroughly and the result is well analyzed.

No. 4, Component: Report
Criteria: The report provides good reasons and justifications to support the
conclusion.

Total marks = 16
Lab 1 contributes 5% to the course work mark.
Notes
Please familiarize yourself with Matlab coding for image processing and
compression by using the prepared tutorial "ECP3086 image processing
analysis and compression using matlab 2010.doc", available at the MMLS website.
You should try all the exercises in the tutorial sheet.
The lab experiment contributes towards the achievement of the
program outcomes 1-5.
Program Outcome Contribution
% of
contribution
Programme
Outcomes (pls
check the
individual syllabus
for the actual
programme
outcomes)
Assessment Activities
(pls revise to match that of the specific subject)
30
2.
Capability to
communicate
effectively
5
Tutorial /
Assignment
Lab Experiments
 Group assignment
 Focus group discussion at tutorial
 Lab report writing
3.
Acquisition of
technical
competence in
specialised areas of
engineering
discipline
30
Final Exam
Test Quiz
Lab Experiments
 Written exam
 Written exam
 Lab report writing
4.
Ability to identify,
formulate and
model problems
and find
engineering
solutions based on
a system approach
10
Tutorial /
Assignment
Lab Experiments





5
Lab Experiments
Assignment
 Individual and group work
 Lab report writing
 Oral assessment at the end of lab
5. Ability to conduct
research in chosen
fields of
engineering
Final Exam
Test Quiz
Tutorial /
Assignment







1. Ability to acquire
and apply
fundamental
principles of science
and engineering
Lab Experiments
Tutorial /
Assignment

Written exam
Written exam
Group assignment
Focus group discussion at tutorial
Work individually
Lab report writing
Oral assessment at the end of lab
Group assignment
Focus group discussion at tutorial
Individual Work
Lab report writing
Oral assessment at the end of lab
Group assignment
Lab 2: Multimedia Data Compression
The purpose of this lab is to introduce some basic techniques of multimedia data
compression, in particular for audio and images.
Experiment 1: Audio Compression Using Downsampling and Quantization
1. Introduction
Multimedia data is usually very large because of its high redundancy. Audio
data is composed of a set of discrete sound samples. Each sound sample is
quantized and represented by a binary code. The purpose of this experiment is
for you to understand the principles of sampling a continuous-time signal,
increasing or decreasing the sampling rate of a discrete-time signal, and
changing the quantization resolution.
In this experiment you will record your own speech and music audio with
different sampling rates and bits per sample, apply sampling rate conversion and
apply quantization on digital signals. By comparing the sound quality obtained
with different filters and quantizers, you should have a better appreciation of the
differences in performances by using different prefilters, interpolation filters and
quantizers.
Please get a pair of headphones from the lab technician so that you can listen
to the sound with better quality and without disturbing others.
You are required to do the recording of the speech and music audio and store it in
mp3 format before your lab session. Bring the two recorded mp3 files (speech and
music) to the lab during the experiment.
2.0 Experiment Procedure
Part 1) Investigate the effect of sampling rate and bits per sample on sound
quality
1) Audio recording at different sampling frequencies and bit rate.
a. Start the Windows Media Player on the computer.
b. You need to configure the recording control to change the source of
sound input. Open the volume control panel (double-click the
“speaker” icon in the system tray), select “options->properties->recording”,
then tick all three items “CD Audio”, “Microphone” and “Stereo Mix” in the
recording control panel, as shown in Fig. 1, and click “OK”.
In the recording panel (Fig. 2), select “Stereo Mix” so that you can record
the internal sound of the computer.
Fig. 1. A snapshot of audio properties
Fig. 2. A snapshot of recording control properties
c. Open the sound recorder (accessory -> entertainment -> sound
recorder). A snapshot of the Windows sound recorder is shown in Fig. 3.
Adjust the recorder properties to set the sampling rate to 11.025 kHz,
8 bits/sample. This can be done by selecting “file->properties->convert now”,
as shown in Fig. 4. Choose “PCM” for “format”, choose
“11.025 KHz, 8 bits, mono” for “attributes”, and type “.wav” for “save as”.
Fig. 3. A snapshot of sound recorder
Fig. 4. A snapshot of sound selection
d. Push the “record” button. Record 30 seconds of the played audio. Save
it into a file using the “.wav” format.
e. Record the same audio segment using different sample rates and
bits/sample and compare the sound quality. The audio quality can be
subjectively evaluated as follows:

No   Perceptual quality   Comment
1    Poor                 The sound is corrupted by excessive noise and is no
                          longer understandable.
2    Acceptable           The sound is corrupted by a moderate noise level.
3    Good                 The sound quality is perceived to be similar to the
                          original sound.
f. Repeat the procedure above for the following recording parameters and
comment on the perceptual quality.
The experiment results can be recorded in the following format.
Sampling rate = 11.025 kHz
Effect of sampling rate and quantization resolution on media quality

Type of audio    Bits per sample    Perceptual quality
Music            8 bits/sample
Music            12 bits/sample
Music            16 bits/sample
Speech           8 bits/sample
Speech           12 bits/sample
Speech           16 bits/sample

Sampling rate = 44.1 kHz
Effect of sampling rate and quantization resolution on media quality

Type of audio    Bits per sample    Perceptual quality
Music            8 bits/sample
Music            12 bits/sample
Music            16 bits/sample
Speech           8 bits/sample
Speech           12 bits/sample
Speech           16 bits/sample
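As a rough guide to why these settings matter for compression, the uncompressed
size of each recording can be estimated as sampling rate x bits per sample x
duration. The sketch below assumes mono PCM with no file header and the
30-second duration from step d; the variable names are illustrative only.

```matlab
% Estimate the raw PCM size of each recording in the tables above.
% Assumptions: mono audio, no WAV header, 30 s duration.
duration = 30;                          % seconds recorded in step d
fs   = [11025 44100];                   % sampling rates used in the tables (Hz)
bits = [8 12 16];                       % bits per sample used in the tables
for f = fs
    for b = bits
        kbytes = f*b*duration/8/1000;   % bits -> bytes -> kilobytes
        fprintf('%6d Hz, %2d bits: %8.1f kB\n', f, b, kbytes);
    end
end
```

Comparing these sizes with your perceptual quality ratings shows how much data
each step in sampling rate or resolution saves.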
Part 2) Sampling rate conversion using different prefilters and interpolation
filters.
You are given two MATLAB programs, down4up4_nofilt() and down4up4_filt(). The
“down4up4_nofilt” program downsamples a given sound file by a factor of 4,
without prefiltering, and then interpolates the resulting sequence by a factor
of 4 by repeating each sample 4 times. It also computes and displays the spectra
of the original, down-sampled, and interpolated sequences. The “down4up4_filt”
program uses a good low-pass filter for prefiltering before down-sampling, and
uses a good interpolation filter for up-sampling.
You can apply the two programs to a sound file you have recorded earlier using a
sampling frequency of 44.1 kHz and 16 bits/sample.
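To make the no-filter scheme concrete, the following minimal sketch downsamples
by 4 by discarding samples and upsamples by repeating each kept sample 4 times.
It is not the supplied down4up4_nofilt program; the 440 Hz test tone and all
variable names are assumptions made for illustration.

```matlab
% Hypothetical sketch of down/up-sampling by 4 without any filtering.
fs = 44100;                        % original sampling rate (Hz)
t  = (0:fs-1)'/fs;                 % one second of sample times
x  = sin(2*pi*440*t);              % synthetic 440 Hz test tone (assumption)

M  = 4;                            % down/up-sampling factor
xd = x(1:M:end);                   % keep every 4th sample, no prefilter
xr = repmat(xd', M, 1);            % stack 4 copies of each kept sample
xr = xr(:);                        % read out column-wise: each sample repeated
xr = xr(1:numel(x));               % trim to the original length

mse = mean((x - xr).^2);           % mean square reconstruction error
fprintf('MSE without prefilter: %g\n', mse);
```

A pure low-frequency tone suffers little from the missing prefilter; recorded
speech or music with energy above one eighth of the sampling rate will alias.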
a. Start the MATLAB program. Change directory to your own working
directory. Copy all the program files and sample audio files to your
directory.
b. Use the “down4up4_nofilt” program on a sound file recorded at a
sampling frequency of 44.1 kHz. For example, to use an input file named
“myinput.wav” and save the interpolated signal in an output file named
“myoutput.wav”, type
down4up4_nofilt('myinput.wav','myoutput.wav')
in the MATLAB command window.
c. The program also computes the mean square error between the
original and the interpolated signal. Compare the original sound and
the one after down-sampling and up-sampling, in terms of perceptual
sound quality, waveform, frequency spectrum and mean square error.
Tabulate the result as shown in Table 1 below and analyze the result.
d. Use the “down4up4_filt” program on the same input sound file
you used in step b. Compare the original sound and the one after
down-sampling and up-sampling, in terms of perceptual sound quality,
waveform, and spectrum. Also compare the resulting sequence from
step b (no filter) with the resulting sequence from this step (with filter),
in terms of perceptual sound quality, waveform, spectrum, and reconstruction
error (mean square error). Which program gives better results? Tabulate the
result as shown in Table 1 below and analyze the result.
Table 1: Comparison of Mean Square Error and Perceptual Quality of the
Original and Reconstructed Signal Based on Two Down-sampling/Up-sampling
Schemes (original sampling rate = 44.1 kHz)

                                Down-sampling/Up-sampling   Down-sampling/Up-sampling
                                (no filter used)            (with filter)
Mean square error
Perceptual quality
Comment on waveforms
Comment on frequency spectrum
Part 3: Comparison of Audio Signal Quantizers
In this experiment you will investigate two quantization methods: uniform
quantization and mu-law quantization. Mu-law quantization is designed to
compress speech signals. It exploits the fact that a speech signal commonly has
low or moderate magnitude.
You are given two MATLAB programs, quant_uniform.m and quant_mulaw.m.
The “quant_uniform” program quantizes an input sequence using a uniform
quantizer with a user-specified number of levels. The “quant_mulaw” program
uses the mu-law quantizer. Both programs reduce the number of quantization
levels of the audio signal in order to compress it. By using fewer quantization
levels to represent the whole dynamic range of the signal, we can reduce the
number of bits per sample and the data size.
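The difference between the two quantizers can be sketched in a few lines. The
code below is a hypothetical illustration and may differ from the supplied
quant_uniform.m and quant_mulaw.m: it assumes the signal is normalized to
[-1, 1], uses a mid-rise uniform quantizer, and uses the standard mu-law
compression and expansion formulas with mu = 16. The test signal and variable
names are assumptions.

```matlab
% Hypothetical sketch; the supplied quant_uniform.m / quant_mulaw.m may differ.
N  = 16;                                    % number of quantization levels
mu = 16;                                    % mu-law parameter used in this lab
x  = 0.1*sin(2*pi*(0:999)'/100);            % low-amplitude test signal in [-1, 1]

% Uniform mid-rise quantizer over [-1, 1]
step = 2/N;
uq   = @(v) -1 + (min(floor((v + 1)/step), N - 1) + 0.5)*step;

xu = uq(x);                                 % uniform quantization

yc = sign(x).*log(1 + mu*abs(x))/log(1 + mu);   % mu-law compression
yq = uq(yc);                                    % quantize the compressed signal
xm = sign(yq).*((1 + mu).^abs(yq) - 1)/mu;      % mu-law expansion

mse_uniform = mean((x - xu).^2);
mse_mulaw   = mean((x - xm).^2);
fprintf('uniform MSE = %g, mu-law MSE = %g\n', mse_uniform, mse_mulaw);
```

Because the compression step stretches small magnitudes before quantizing, more
of the N levels fall near zero, which is where a low-level signal such as this
one spends its time.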
a. Apply the “quant_uniform” function to a sound file recorded with 16
bits/sample. Use the program to quantize it to a smaller number of
quantization levels. For example, to use 256 quantization levels (8 bits per
sample), set N=256. If the input file is named “myinput.wav” and the
quantized sequence is to be saved in an output file named “myoutput.wav”,
type
quant_uniform('myinput.wav','myoutput.wav',256)
in the MATLAB command window.
Compare the original sequence and the quantized sequences with different
quantization levels, in terms of perceptual sound quality and waveform.
Use Table 2 as a guideline for the experiment settings. Also record and
compare the quantization error (mean square error between the original
and quantized samples). What is the minimum N needed to obtain good sound
quality?
b. Apply the “quant_mulaw” function to a sound file recorded with 16
bits/sample. Set the number of quantization levels to the various settings
indicated in Table 2, with mu=16.
Repeat the experiment for both music and speech audio data. Use different
quantization levels N. What is the minimum N needed to obtain acceptable
sound quality? Use the table below as a guideline to tabulate the result.
Table 2: Comparison of Mean Square Error and Perceptual Quality of Two
Quantization Schemes (original recorded sound: sampling rate = 44.1 kHz,
16 bits/sample, type of audio is speech)

N is the number of quantization levels, N = 2^b, where b is the number of bits
used per sample. Mu = 16 for mu-law quantization.

                     Uniform quantization            Mu-law quantization (Mu = 16)
Quantization         Mean square  Perceptual quality  Mean square  Perceptual quality
levels               error (mse)  (poor/acceptable/   error (mse)  (poor/acceptable/
                                  good)                            good)
N = 2^12 = 4096
N = 2^10 = 1024
N = 2^8  = 256
N = 2^6  = 64
N = 2^4  = 16
Table 2: Comparison of Mean Square Error and Perceptual Quality of Two
Quantization Schemes (original recorded sound: sampling rate = 44.1 kHz,
16 bits/sample, type of audio is music)

N is the number of quantization levels, N = 2^b, where b is the number of bits
used per sample. Mu = 16 for mu-law quantization.

                     Uniform quantization            Mu-law quantization (Mu = 16)
Quantization         Mean square  Perceptual quality  Mean square  Perceptual quality
levels               error (mse)  (poor/acceptable/   error (mse)  (poor/acceptable/
                                  good)                            good)
N = 2^12 = 4096
N = 2^10 = 1024
N = 2^8  = 256
N = 2^6  = 64
N = 2^4  = 16
In your report, you should provide the full analysis of the experiment result that
you have obtained. Write your conclusion based on this analysis.
3.0 Report
An experiment is conducted to test a new idea or a new solution to a
problem. It is also used to find information in a systematic manner.
Write your report for the experiment by addressing all the questions posed in
the lab sheet. You should also address the experiment objectives by answering
the following research questions. Give evidence based on your experiment results
and your own reasoning. Here are some useful questions to answer in your report.
1) What do you want to find out from the experiment?
2) What have you learned from the experiment?
3) What is the conclusion? Show evidence to support your conclusion.
Reference:
Prof. Yao Wang (Polytechnic University, US), Multimedia Communications 1 lab
experiment.
Experiment 2: Image Compression Using JPEG
Objectives of Experiment
a) Investigate the effectiveness of the JPEG scheme for compressing
photographic images
b) Represent a compressed image in the frequency domain using the DCT
c) Investigate the importance of different DCT coefficients
d) Investigate the tradeoff in the selection and quantization of DCT
coefficients and its effect on compression ratio and image quality.
1.0 Introduction:
This experiment deals with image compression using the Discrete Cosine
Transform (DCT). The DCT is at the heart of the most popular lossy image
compression standard on the internet, the JPEG compression standard. In
this experiment we will see how to transform an image into a series of
8 x 8 blocks of DCT coefficients, how to quantize the DCT coefficients,
and how to reconstruct the image from the quantized DCT coefficients.
The original image and the decompressed image can then be compared.
Discrete Cosine Transform
The dct2 function in the Image Processing Toolbox computes the
two-dimensional discrete cosine transform (DCT) of an image. The DCT has the
property that, for a typical image, most of the visually significant information
about the image is concentrated in just a few coefficients of the DCT. For this
reason, the DCT is often used in image compression applications.
The DCT Transform Matrix
The Image Processing Toolbox offers two different ways to compute the DCT.
The first method is to use the dct2 function. dct2 uses an FFT-based
algorithm for speedy computation with large inputs. For small square inputs,
such as 8-by-8 or 16-by-16, it may be more efficient to use the DCT transform
matrix, which is returned by the function dctmtx. The M-by-M transform
matrix T is given by:
$$
T_{pq} =
\begin{cases}
\dfrac{1}{\sqrt{M}}, & p = 0,\ 0 \le q \le M-1 \\[2ex]
\sqrt{\dfrac{2}{M}} \cos\dfrac{\pi (2q+1) p}{2M}, & 1 \le p \le M-1,\ 0 \le q \le M-1
\end{cases}
$$
For an M-by-M matrix A, T*A is an M-by-M matrix whose columns contain
the one-dimensional DCT of the columns of A. The two-dimensional DCT of A
can be computed as:
B = T*A*T'
where T' is the transpose of T. Since T is a real orthonormal matrix, its
inverse is the same as its transpose. Therefore, the inverse two-dimensional
DCT of B is given by T'*B*T.
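The definition of T and the two identities above can be checked numerically.
The sketch below builds T directly from the formula using base MATLAB only, so
it does not need dctmtx (with the toolbox installed, the result should agree
with dctmtx(M)). The test block A = magic(M) is an arbitrary choice made for
illustration.

```matlab
% Build the M-by-M DCT matrix from its definition and verify its properties.
M = 8;
[q, p] = meshgrid(0:M-1, 0:M-1);            % p indexes rows, q indexes columns
T = sqrt(2/M)*cos(pi*(2*q + 1).*p/(2*M));   % rows p = 1 .. M-1
T(1, :) = 1/sqrt(M);                        % row p = 0

A  = magic(M);                              % arbitrary 8-by-8 test block
B  = T*A*T';                                % forward two-dimensional DCT
A2 = T'*B*T;                                % inverse two-dimensional DCT

fprintf('max |T*T'' - I| = %g\n', max(max(abs(T*T' - eye(M)))));
fprintf('max |A2 - A|   = %g\n', max(abs(A2(:) - A(:))));
```

Both printed values should be at the level of floating-point rounding error,
which confirms that T is orthonormal and that the inverse recovers A.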
2.0 The DCT and Image Compression
In the JPEG image compression algorithm, the input image is divided into
8-by-8 or 16-by-16 blocks, and the two-dimensional DCT is computed for each
block. The DCT coefficients are then quantized, coded, and transmitted. The
JPEG receiver (or JPEG file reader) decodes the quantized DCT coefficients,
computes the inverse two-dimensional DCT of each block, and then puts the
blocks back together into a single image. For typical images, many of the
DCT coefficients have values close to zero. These coefficients can be discarded
without seriously affecting the quality of the reconstructed image.
2.1 Image Compression Experiment Procedures
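The following program performs image compression and decompression
on the image ‘cameraman.tif’ using an 8x8 DCT matrix.

T = dctmtx(8);                            % 8x8 DCT transform matrix
% Mask keeps the 10 lowest-frequency coefficients of each 8x8 block
Mask = [ 1 1 1 1 0 0 0 0
         1 1 1 0 0 0 0 0
         1 1 0 0 0 0 0 0
         1 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0
         0 0 0 0 0 0 0 0 ];
A = imread('cameraman.tif');              % read the grayscale test image
A = double(A)/255;                        % convert to double in [0, 1]
% Note: later MATLAB releases replace blkproc with blockproc
B = blkproc(A, [8 8], 'P1*x*P2', T, T');  % forward DCT of each 8x8 block
C = blkproc(B, [8 8], 'P1.*x', Mask);     % zero the masked-out coefficients
D = blkproc(C, [8 8], 'P1*x*P2', T', T);  % inverse DCT of each 8x8 block
imshow(A), figure, imshow(D);             % original, then reconstruction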
In the mini-program above, imread reads the image into Matlab, double
converts the data to type double so that arithmetic operations are possible,
and dctmtx creates the DCT matrix. blkproc applies an operation block by
block: here it splits the image into 8x8 blocks, transforms each block into
its DCT coefficients (or back again), and reassembles the blocks. Finally,
imshow displays the image.
The variable Mask plays the role of the quantization matrix. It controls the
compression ratio of the algorithm and the quality of the decompressed image
by multiplying the high-frequency DCT components by 0 (hence removing
them). The more zeros there are in the Mask matrix, the larger the
compression ratio, but at the expense of lower image quality.
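A crude way to relate the mask to the compression ratio is to count how many of
the 64 coefficients per block survive. The sketch below does this for the mask
used in the mini-program. It is only a proxy under a stated assumption: every
retained coefficient costs the same number of bits, whereas a real JPEG encoder
also entropy-codes the coefficients.

```matlab
% Rebuild the example mask and count the retained coefficients per block.
Mask = zeros(8);
Mask(1, 1:4) = 1;                  % row 1 keeps 4 coefficients
Mask(2, 1:3) = 1;                  % row 2 keeps 3
Mask(3, 1:2) = 1;                  % row 3 keeps 2
Mask(4, 1)   = 1;                  % row 4 keeps 1

kept  = nnz(Mask);                 % number of coefficients kept per block
ratio = numel(Mask)/kept;          % coefficient-count compression ratio
fprintf('%d of %d coefficients kept, ratio about %.1f:1\n', ...
        kept, numel(Mask), ratio);
```

For this mask the script reports 10 of 64 coefficients kept, a ratio of about
6.4:1.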
Note that in the lecture notes, we divide the DCT coefficients by the
quantization table coefficients and then round the result. Here, however,
to show the effect of removing some of the DCT coefficients, we simply
multiply each DCT coefficient by 1 if we want to preserve it and by 0 if we
want to remove it. Do not confuse the two operations!
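For comparison, the divide-and-round quantization from the lecture notes can be
sketched as follows. The table Q here is a made-up example that grows with
frequency; it is not the standard JPEG table, and the test block A is arbitrary.
The DCT matrix is built from its definition so the sketch runs without the
toolbox.

```matlab
% Hypothetical illustration of quantization by division and rounding.
M = 8;
[q, p] = meshgrid(0:M-1, 0:M-1);            % p = row (vertical frequency)
T = sqrt(2/M)*cos(pi*(2*q + 1).*p/(2*M));
T(1, :) = 1/sqrt(M);                        % DCT matrix from its definition

A  = 4*magic(M) - 1;                        % arbitrary block, values 3..255
B  = T*A*T';                                % forward DCT
Q  = 16 + 8*(p + q);                        % made-up table: coarser at high freq.
Bq = round(B./Q);                           % quantize: divide, then round
Br = Bq.*Q;                                 % dequantize: multiply back
Ar = T'*Br*T;                               % reconstruct the block

fprintf('nonzero quantized coefficients: %d of %d\n', nnz(Bq), M*M);
fprintf('max abs pixel error: %g\n', max(abs(Ar(:) - A(:))));
```

Unlike the 0/1 mask, this approach keeps a coarse version of every coefficient
large enough to survive rounding, and each dequantized coefficient differs from
the original by at most half its table entry.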
Observe and understand the output of the mini program above, and proceed
with the following experiments:
1. For the compression and decompression of the grayscale image
‘cameraman.tif’ in the above example:
a. Observe the difference between the original image and the
reconstructed image. Are there any noticeable differences?
b. Try several different quantization matrices, and observe the
quality of the reconstructed image. What happens when all the
elements in the quantization matrix are set to 1?
c. Can we remove all of the DCT coefficients except for the DC value
without affecting the quality of the reconstructed image too much?
Explain your answer.
d. Repeat the above experiments on the ‘rice.tif’ image.
2. Load the color image ‘peppers.png’ into Matlab. To compress a color
image, it is first converted to the YCbCr color format, and the
chrominance (Cb & Cr) channels are then sub-sampled. To convert to the
YCbCr color space and back, use the functions rgb2ycbcr and
ycbcr2rgb. To sub-sample the chrominance channels, use the function
imresize. Use the same mask as in the example above for both the
luminance and chrominance channels.
a. Perform the DCT and quantization on all the channels without any
sub-sampling. Reconstruct the image and observe the quality of
the image.
b. Perform the DCT and quantization on every channel using the
4:2:0 chroma sub-sampling. Reconstruct the image and observe the
quality of the image.
c. Are there any significant differences between the two
reconstructed images above? Explain your answer.
3. Write a simple function that takes an image and a mask as input,
performs compression and decompression on the image (color image
compression if the input image is color), and displays the original and
reconstructed images side by side. The function should also
compute the SNR of the reconstructed image.
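As a starting point for the SNR part of item 3, the sketch below uses one common
definition: the ratio of the original image energy to the error energy, in
decibels. This definition is an assumption; check the lecture notes in case a
different measure (for example PSNR) is expected. The small matrices A and D are
stand-ins for the original and reconstructed images.

```matlab
% SNR in dB: original signal energy over reconstruction error energy.
snr_db = @(orig, recon) ...
    10*log10(sum(orig(:).^2) / sum((orig(:) - recon(:)).^2));

A = magic(8)/64;                   % stand-in "image" with values in (0, 1]
D = A + 0.01;                      % stand-in reconstruction, constant error
fprintf('SNR = %.2f dB\n', snr_db(A, D));
```

In your own function, pass the double-valued original image and the
reconstructed image to snr_db after the inverse DCT step.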
3.0 Report
An experiment is conducted to test a new idea or a new solution to a
problem. It is also used to find information in a systematic manner. Write your
report for the experiment by addressing all the questions posed in the lab sheet.
You should also address the experiment objectives by answering the following
research questions. Give evidence based on your experiment results and your own
reasoning. Here are some questions that you should try to answer in order to
produce a good technical report.
1. What do you want to find out from the experiment?
2. Is JPEG an effective scheme for compressing photographic images? Give
supporting evidence.
3. How do you measure compression efficiency?
4. A compressed image is stored as quantized DCT coefficients. Is this a
suitable approach? Give justification.
5. There are low, medium and high frequency DCT coefficients. Which types of
coefficients are perceptually important and thus require fine quantization?
6. What is the effect of reducing the number of DCT coefficients on the
compression ratio and the image quality? What methods can be used to
increase the compression ratio?
7. Suggest a JPEG compression scheme to achieve each of the following
objectives:
a. Reduce the size of the image moderately but preserve the quality of the
reconstructed image. In other words, we would like to ensure that the
reconstructed image is perceptually similar to the original image.
b. Maximize the compression ratio while tolerating a moderate distortion
of the reconstructed image.
8. What have you learned from the experiment?