
Post-processing of JPEG image using MLP
Fall 2003
ECE539
Final Project Report
Department of Electrical and Computer Engineering
University of Wisconsin-Madison
_______________
Data Fok
Submitted to: Professor Hu
Dec 19th 2003
1. Introduction
“A picture is worth a thousand words.” Most of us would agree. The demand for
graphics in the computer industry keeps increasing, but large file sizes are
always a concern to users. Although the widely adopted JPEG compression standard
can pack an image into a rather small file, the image quality is not always acceptable.
Especially at low bit rates (measured in bits per pixel, bpp), blocking artifacts
degrade the image quality noticeably. Other standards such as JPEG 2000 handle low bit
rates very well, yet they are far less popular than JPEG. This project therefore aims to
implement a method that improves the quality of JPEG images by eliminating the blocking artifacts.
2. Approach
Blocking artifacts are inherent in the JPEG coding scheme, because the image is divided
into blocks (typically 8x8) before coding and each block is processed independently.
As a result, the intensity of the pixels on either side of a block border can differ by a
large gradient, as illustrated in Figure 1.
Figure 1: the intensity gradient between blocks.
Two approaches were tested in this project to restore the gradient of the block border
pixels to the original gradient. The improvement in image quality was then assessed by the
peak signal-to-noise ratio (PSNR) and by visual inspection.
$$\mathrm{PSNR} = 10 \log_{10}\left( \frac{255^2}{\mathrm{MSE}} \right)$$
Mean Square Error,
$$\mathrm{MSE} = \frac{1}{MN} \sum_{x,y} \left( I(x,y) - \hat{I}(x,y) \right)^2$$
where M and N are the height and width of the image respectively.
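As a minimal sketch of how the PSNR measure can be computed, the following Python functions assume the standard definition with a peak value of 255 and the squared error averaged over all M x N pixels. The function names `mse` and `psnr` and the list-of-rows image representation are illustrative assumptions, not part of jpegPost.m.

```python
import math


def mse(original, processed):
    """Mean square error between two equally sized grayscale images.

    Each image is a list of rows, each row a list of intensities (assumed layout).
    """
    height = len(original)
    width = len(original[0])
    total = 0.0
    for row_o, row_p in zip(original, processed):
        for a, b in zip(row_o, row_p):
            total += (a - b) ** 2
    return total / (height * width)


def psnr(original, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, processed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)
```

For a color image, one option is to apply the same computation to each of the R, G and B planes and average the squared error over all three; the report does not state which convention was used.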
Multi-Layer Perceptron
The first approach uses a multi-layer perceptron (MLP) model with 15 inputs and 6 outputs.
The 15 inputs are the gradients between neighboring pixels near the block border in the
R, G and B planes, i.e. (refer to Figure 1),
input 1 is the difference between $\hat{I}_1$ and $\hat{I}_2$ in the R plane.
input 2 is the difference between $\hat{I}_1$ and $\hat{I}_2$ in the G plane.
input 3 is the difference between $\hat{I}_1$ and $\hat{I}_2$ in the B plane.
input 4 is the difference between $\hat{I}_2$ and $\hat{I}_3$ in the R plane.
input 5 is the difference between $\hat{I}_2$ and $\hat{I}_3$ in the G plane.
...
input 15 is the difference between $\hat{I}_5$ and $\hat{I}_6$ in the B plane.
The 6 outputs are the intensity differences of the border pixels between the original image
and the JPEG image, i.e. (refer to Figure 1),
output 1 is the difference between $I_3$ and $\hat{I}_3$ in the R plane.
output 2 is the difference between $I_3$ and $\hat{I}_3$ in the G plane.
output 3 is the difference between $I_3$ and $\hat{I}_3$ in the B plane.
output 4 is the difference between $I_4$ and $\hat{I}_4$ in the R plane.
output 5 is the difference between $I_4$ and $\hat{I}_4$ in the G plane.
output 6 is the difference between $I_4$ and $\hat{I}_4$ in the B plane.
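The input and output definitions above can be sketched in Python as follows. This is only an illustration under stated assumptions: the six pixels are taken in order across one block border (pixels 3 and 4 being the two border pixels), each pixel is an (R, G, B) tuple, the sign convention is "first minus second", and the helper names `border_features`, `border_targets` and `apply_correction` are hypothetical. The last helper assumes the network output is added back to the JPEG border pixels and clamped to the 0..255 range.

```python
def border_features(jpeg_pixels):
    """Return the 15 MLP inputs from six (R, G, B) JPEG pixels across a border.

    Ordering: adjacent pair (1,2), (2,3), ..., (5,6); within each pair R, G, B.
    """
    if len(jpeg_pixels) != 6:
        raise ValueError("expected six pixels across the block border")
    inputs = []
    for k in range(5):
        for plane in range(3):
            inputs.append(jpeg_pixels[k][plane] - jpeg_pixels[k + 1][plane])
    return inputs


def border_targets(original_pixels, jpeg_pixels):
    """Return the 6 training targets: original minus JPEG for pixels 3 and 4."""
    targets = []
    for k in (2, 3):  # zero-based indices of pixels 3 and 4
        for plane in range(3):
            targets.append(original_pixels[k][plane] - jpeg_pixels[k][plane])
    return targets


def apply_correction(jpeg_pixels, outputs):
    """Add the 6 predicted differences to pixels 3 and 4, clamped to 0..255."""
    corrected = [list(p) for p in jpeg_pixels]
    for i, k in enumerate((2, 3)):
        for plane in range(3):
            value = jpeg_pixels[k][plane] + outputs[3 * i + plane]
            corrected[k][plane] = min(255, max(0, round(value)))
    return [tuple(p) for p in corrected]
```

With a perfectly trained network the outputs equal the targets, so applying the correction would restore the original border pixels exactly.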
Experiments were performed on various images with different numbers of layers, numbers
of hidden nodes, learning rates and momentum values.
Polynomial curve fitting
The second approach applies first-order polynomial curve fitting to the pixels
$\hat{I}_2$, $\hat{I}_3$, $\hat{I}_4$, $\hat{I}_5$ around the block border of the JPEG image,
and then replaces $\hat{I}_3$, $\hat{I}_4$ with the estimates $\tilde{I}_3$, $\tilde{I}_4$
obtained from the fitted polynomial. This was performed as a control
experiment for the MLP postprocessing.
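A minimal sketch of the first-order fit, assuming the four pixels sit at equally spaced positions 2, 3, 4 and 5 and that the line is fitted by ordinary least squares (the report does not state the fitting criterion). The names `fit_line` and `smooth_border` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def smooth_border(i2, i3, i4, i5):
    """Fit a line through pixels 2..5 and return the estimates for pixels 3 and 4."""
    slope, intercept = fit_line([2, 3, 4, 5], [i2, i3, i4, i5])
    return slope * 3 + intercept, slope * 4 + intercept
```

For example, intensities 10, 12, 20, 22 have a jump of 8 across the border; the fitted line has slope 4.4, so the two border pixels are replaced by about 13.8 and 18.2, which spreads the jump more evenly over the four pixels.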
3. Experiments, Results & Discussions
A program named jpegPost.m was written in MATLAB to model a back-propagation MLP
and to implement the polynomial curve fitting. The MLP part of the program is based on
bp.m and bpconfig.m, written by Professor Hu.
Setting up the MLP
MLPs with different numbers of layers were tested. The results showed that a three-layer
model is sufficient, as neither the PSNR nor the visual quality improves much when more
layers are added.
Structure: 15-5-6
Structure: 15-5-5-5-6
Different numbers of hidden nodes were tested. The results showed that 5 hidden nodes
are enough to fool the human eye.
Structure: 15-5-6
Structure: 15-8-6
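To make the structure notation concrete, here is a rough sketch of a forward pass through an MLP described by a list such as [15, 5, 6] or [15, 5, 5, 5, 6]. The tanh hidden activation, linear output layer, small random initial weights and the names `init_mlp` and `forward` are all assumptions made for illustration; they are not taken from bp.m.

```python
import math
import random


def init_mlp(structure, seed=0):
    """Create random weights for an MLP such as [15, 5, 6].

    Each layer is a list of rows, one row per node; the last entry is the bias.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(structure[:-1], structure[1:]):
        layers.append(
            [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]
        )
    return layers


def forward(layers, inputs):
    """Propagate inputs through the network (tanh hidden, linear output)."""
    activations = list(inputs)
    for index, weights in enumerate(layers):
        is_output = index == len(layers) - 1
        next_activations = []
        for row in weights:
            s = row[-1] + sum(w * a for w, a in zip(row[:-1], activations))
            next_activations.append(s if is_output else math.tanh(s))
        activations = next_activations
    return activations
```

Under this layout a 15-5-6 network has 5 x 16 + 6 x 6 = 116 adjustable weights, while a 15-8-6 network has 8 x 16 + 6 x 9 = 182.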
Different learning rates were tested. The results showed that a learning rate of 0.01 is sufficient.
Learning rate = 0.01
Learning rate = 0.05
Different momentum values were tested. The results showed that momentum = 0.5 yields the
best result.
Momentum = 0.7
Momentum = 0
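The roles of the learning rate and the momentum can be illustrated with the common gradient-descent-with-momentum update, sketched below. This assumes the usual form in which the previous weight change is scaled by the momentum and added to the new gradient step; the exact update used in bp.m may differ, and `momentum_step` is a hypothetical name. The default values 0.01 and 0.5 are the settings reported above.

```python
def momentum_step(weights, gradients, velocity, learning_rate=0.01, momentum=0.5):
    """One weight update: new change = momentum * old change - learning_rate * gradient."""
    new_velocity = [
        momentum * v - learning_rate * g for v, g in zip(velocity, gradients)
    ]
    new_weights = [w + dv for w, dv in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

With momentum = 0 this reduces to plain gradient descent; larger momentum values carry more of the previous step forward, which can speed up training but may also overshoot.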
Training sets were extracted from different images, one set of training data per
image. Each set was then fed to the MLP with the best structure obtained
above. The results are compared by PSNR and by visual inspection.
Experiment #1: grayscale image (train and test with the same image)
JPEG (0.14 bpp): PSNR = 41.2044 dB
MLP postprocessed: PSNR = 40.2514 dB
Polynomial fitting: PSNR = 40.1896 dB
Both the PSNR and visual inspection show that the image quality decreases after
MLP postprocessing and after polynomial fitting.
Experiment #2: color image (train and test with the same image)
JPEG (0.18 bpp): PSNR = 38.2464 dB
MLP postprocessed: PSNR = 37.9718 dB
Polynomial fitting: PSNR = 37.6817 dB
The blocking artifacts are slightly reduced in the MLP image, yet its PSNR is lower
than that of the JPEG image. This shows that PSNR and human visual judgment do not
always agree on image quality.
Experiment #3: grayscale image (train with a high bpp image, test with a low bpp image)
Training JPEG bit rate = 0.255 bpp
JPEG (0.085 bpp): PSNR = 39.5696 dB
MLP postprocessed: PSNR = 39.6552 dB
Polynomial fitting: PSNR = 39.2868 dB
Visually, MLP postprocessing reduces the blocking artifacts, and the PSNR also
increases. Polynomial fitting reduces the blocking artifacts as well, but it
blurs the image, which leads to a decrease in PSNR.
Experiment #4: color image (train with a high bpp image, test with a low bpp image)
Training JPEG image bit rate = 0.374 bpp
JPEG (0.065 bpp): PSNR = 37.4064 dB
MLP postprocessed: PSNR = 37.3664 dB
Polynomial fitting: PSNR = 37.1856 dB
The result for the color image is not as good as for the grayscale image. Both the
visual quality and the PSNR decreased after MLP postprocessing. Polynomial fitting,
on the other hand, did a good job of reducing the blocking artifacts, yet its PSNR
still decreased.
Experiment #5: train with a high bpp grayscale image, test with a low bpp color image
Training JPEG image bit rate = 0.255 bpp
JPEG (0.065 bpp): PSNR = 37.4064 dB
MLP postprocessed: PSNR = 37.4312 dB
Although the PSNR increases in the MLP postprocessed image, the blocking artifacts
are clearly still present.
Experiment #6: train with a high bpp color image, test with a low bpp grayscale image
Training JPEG image bit rate = 0.374 bpp
JPEG (0.085 bpp): PSNR = 39.5696 dB
MLP postprocessed: PSNR = 39.125 dB
Not only were the blocking artifacts not reduced, but the MLP postprocessing also
introduced more noise into the image, lowering the PSNR. Moreover, as the MLP was
trained on a color image, there was some color shift in the MLP postprocessed image.
5. Conclusion
Experiment #3 shows an increase in both visual quality and PSNR, which suggests that
training the MLP on a high bit rate image can yield a set of weights that eliminates
the blocking artifacts. It also suggests that further fine adjustment of the MLP
structure might improve the image quality even for high bit rate images.
As blocking artifacts mainly appear in low bit rate images, experiment #3 demonstrates
the ability of MLP postprocessing to reduce the blocking artifacts, which fulfills our
main goal. However, experiments #4, #5 and #6 show that MLP postprocessing only works
well if the training data comes from a grayscale image. This implies that the current
MLP structure may not fit training data from color images.
Further study could investigate a suitable MLP structure for color training
data.
6. References
[1] W. B. Pennebaker and J. L. Mitchell, (1992) JPEG Still Image Compression Standard. New York:
Van Nostrand Reinhold.
[2] Martin Boliek, Charilaos Christopoulos, Eric Majani, (2000) JPEG 2000 Image Coding System,
ISO/IEC JTC1/SC29 WG1, http://www.jpeg.org/CDs15444.html
[3] Guoping Qiu, (2000) MLP for Adaptive Postprocessing Block-Coded Images. IEEE Transactions
On Circuits And Systems For Video Technology, Vol. 10, No. 8, December 2000
Appendix
A selected part of the test results is attached.