International Journal of Application or Innovation in Engineering & Management (IJAIEM)
Web Site: www.ijaiem.org Email: editor@ijaiem.org, editorijaiem@gmail.com
Volume 3, Issue 2, February 2014
ISSN 2319 - 4847
Image Compression using Resilient-Propagation
Neural Network
B. Das1, Dr. S. S. Nayak2
1 Associate Professor, Gandhi Institute of Engineering and Technology, Gunupur, Odisha, India
2 Dean (R & D), Centurion University of Technology and Management, Paralakhemundi, Odisha, India
Abstract
A feed-forward network trained with the back-propagation algorithm can be applied directly to image compression. Several compression schemes have been proposed to achieve quick convergence of the network with no significant loss of information. However, when images are compressed using a back-propagation network, the network takes a long time to converge. The Resilient-Propagation Neural Network (RPROP) performs a local adaptation of the weight updates according to the behavior of the error function. The effect of the RPROP adaptation process is not blurred by the unforeseeable influence of the size of the derivative, but depends only on the temporal behavior of its sign.
Keywords: Resilient-Propagation Neural Network, Convergence.
1. INTRODUCTION
Image compression research aims at reducing the number of bits needed to represent an image. Image compression algorithms remove redundancy present in the data in a way that makes image reconstruction possible. However, the usefulness of an algorithm depends largely on the information contained in the images. A practical compression algorithm for image data should preserve most of the characteristics of the data while working in a lossy manner, and should have low algorithmic complexity. Artificial Neural Networks have been applied to image compression problems [1] because of their advantages over traditional methods, and they seem well suited to the task. They have the ability to preprocess input patterns to produce simpler patterns with fewer components, and network-based techniques provide sufficient compression of the data in question. Many different training algorithms and architectures have been used: Feed-Forward Neural Networks, Self-Organizing Feature Maps, and Learning Vector Quantizer networks have all been applied to image compression. The Back-Propagation algorithm performs a gradient descent in the parameter space, minimizing an appropriate error function; the weight-update equations minimize this error. However, the Back-Propagation network takes a long time to converge, and the compression ratio achieved is not high. To overcome these drawbacks, a new approach using the Resilient-Propagation Neural Network (RPROP) is proposed in this paper.
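For concreteness, the following is a minimal sketch (not the authors' implementation) of the bottleneck feed-forward network commonly used for block-based image compression: each pixel block feeds an input layer, a smaller hidden layer holds the compressed code, and an output layer of the same size as the input reconstructs the block. The layer sizes, class name, and random initialization below are illustrative assumptions.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class BlockCompressionNet:
        """16-4-16 bottleneck network: 4x4 pixel blocks, 4 hidden neurons."""
        def __init__(self, block_pixels=16, hidden_units=4, seed=0):
            rng = np.random.default_rng(seed)
            # Small random initial weights; hidden_units < block_pixels gives
            # a compression ratio of block_pixels : hidden_units (here 16:4).
            self.W1 = rng.normal(0.0, 0.1, (hidden_units, block_pixels))
            self.W2 = rng.normal(0.0, 0.1, (block_pixels, hidden_units))

        def compress(self, block):
            # Hidden activations are the compressed code for one block.
            return sigmoid(self.W1 @ block)

        def reconstruct(self, code):
            return sigmoid(self.W2 @ code)

    # One normalized 4x4 block, flattened to a 16-component vector.
    net = BlockCompressionNet()
    block = np.random.default_rng(1).random(16)
    code = net.compress(block)        # 4 values to store or transmit
    rebuilt = net.reconstruct(code)   # 16 reconstructed pixel values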
2. RESILIENT-PROPAGATION NEURAL NETWORK
2.1 Back propagation learning
Back propagation is the most widely used algorithm for supervised learning with multi-layered feed-forward networks. The basic idea of the back propagation learning algorithm [2] is the repeated application of the chain rule to compute the influence of each weight in the network with respect to an arbitrary error function E:

\[ \frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial s_i}\,\frac{\partial s_i}{\partial net_i}\,\frac{\partial net_i}{\partial w_{ij}} \]

where \(w_{ij}\) is the weight from neuron j to neuron i, \(s_i\) is the output, and \(net_i\) is the weighted sum of the inputs of neuron i. Once the partial derivative for each weight is known, the aim of minimizing the error function is achieved by performing a simple gradient descent:

\[ w_{ij}(t+1) = w_{ij}(t) - \epsilon\,\frac{\partial E}{\partial w_{ij}}(t) \]
Obviously, the choice of the learning rate ϵ, which scales the derivative, has an important effect on the time needed until convergence is reached. If it is set too small, too many steps are needed to reach an acceptable solution; on the contrary, a large learning rate can lead to oscillation, preventing the error from falling below a certain value.
An early way proposed to get rid of the above problem is to introduce a momentum term:

\[ \Delta w_{ij}(t) = -\epsilon\,\frac{\partial E}{\partial w_{ij}}(t) + \mu\,\Delta w_{ij}(t-1) \]
where the momentum parameter µ scales the influence of the previous step on the current one. The momentum term is believed to render the learning procedure more stable and to accelerate convergence in shallow regions of the error function.
However, as practical experience has shown, this is not always true. It turns out, in fact, that the optimal value of the momentum parameter µ is as problem dependent as the learning rate ϵ, and that no general improvement can be accomplished.
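For concreteness, the two update rules above can be written in a few lines of Python; the learning rate and momentum values below are illustrative only and are not values used in this paper.

    import numpy as np

    def momentum_step(w, grad, prev_dw, epsilon=0.05, mu=0.9):
        """Gradient descent with a momentum term:
        dw(t) = -epsilon * dE/dw(t) + mu * dw(t-1)."""
        dw = -epsilon * grad + mu * prev_dw
        return w + dw, dw

    # Example: one update of a 3-component weight vector.
    w = np.zeros(3)
    prev_dw = np.zeros(3)
    grad = np.array([0.2, -0.1, 0.4])   # stand-in for dE/dw from backprop
    w, prev_dw = momentum_step(w, grad, prev_dw)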
3. RPROP
3.1 Description
RPROP stands for ‘resilient propagation’ and is an efficient learning scheme that performs a direct adaptation of the weight step based on local gradient information; the effort of adaptation is not blurred by the magnitude of the gradient at all. To achieve this, each weight is given its individual update-value Δij, which solely determines the size of the weight-update. This adaptive update-value evolves during the learning process based on its local sight of the error function E, according to the following learning rule:

\[ \Delta_{ij}(t) = \begin{cases} \eta^{+}\cdot\Delta_{ij}(t-1), & \text{if } \frac{\partial E}{\partial w_{ij}}(t-1)\cdot\frac{\partial E}{\partial w_{ij}}(t) > 0 \\[4pt] \eta^{-}\cdot\Delta_{ij}(t-1), & \text{if } \frac{\partial E}{\partial w_{ij}}(t-1)\cdot\frac{\partial E}{\partial w_{ij}}(t) < 0 \\[4pt] \Delta_{ij}(t-1), & \text{otherwise} \end{cases} \qquad \text{where } 0 < \eta^{-} < 1 < \eta^{+} \]
The adaptation-rule works as follows: Every time the partial derivative of the corresponding weight wij changes its sign,
which indicates that the last update was too big and the algorithm has jumped over a local minimum, the update-value Δij
is decreased by the factor η-. If the derivative retains its sign, the update-value is slightly increased in order to accelerate
convergence in shallow regions.
Once the update-value for each weight is adapted, the weight-update itself follows a very simple rule: if the derivative is positive (increasing error), the weight is decreased by its update-value; if the derivative is negative, the update-value is added:

\[ \Delta w_{ij}(t) = \begin{cases} -\Delta_{ij}(t), & \text{if } \frac{\partial E}{\partial w_{ij}}(t) > 0 \\[4pt] +\Delta_{ij}(t), & \text{if } \frac{\partial E}{\partial w_{ij}}(t) < 0 \\[4pt] 0, & \text{otherwise} \end{cases} \qquad w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) \]
However, there is one exception: if the partial derivative changes sign, i.e. the previous step was too large and the
minimum was missed, the previous weight-update is reverted:

\[ \Delta w_{ij}(t) = -\Delta w_{ij}(t-1), \quad \text{if } \frac{\partial E}{\partial w_{ij}}(t-1)\cdot\frac{\partial E}{\partial w_{ij}}(t) < 0 \]
Due to that ‘backtracking’ weight-step, the derivative is supposed to change its sign once again in the following step. There should be no adaptation of the update-value in the succeeding step; in practice this can be done by setting \(\frac{\partial E}{\partial w_{ij}}(t-1) = 0\) in the Δij adaptation rule above. The update-values and the weights are changed every time the whole pattern set has been presented once to the network (learning by epoch).
3.2 Algorithm
The following pseudo-code fragment shows the kernel of the RPROP adaptation and learning process. The minimum
(maximum) operator is supposed to deliver the minimum (maximum) of two numbers; the sign operator returns +1 if the argument is positive, -1 if the argument is negative, and 0 otherwise.
For all weights and biases {
    if (∂E/∂wij(t-1) * ∂E/∂wij(t) > 0) then {
        Δij(t) = minimum (Δij(t-1) * η+, Δmax)
        Δwij(t) = - sign (∂E/∂wij(t)) * Δij(t)
        wij(t+1) = wij(t) + Δwij(t)
    }
    else if (∂E/∂wij(t-1) * ∂E/∂wij(t) < 0) then {
        Δij(t) = maximum (Δij(t-1) * η-, Δmin)
        wij(t+1) = wij(t) - Δwij(t-1)
        ∂E/∂wij(t) = 0
    }
    else if (∂E/∂wij(t-1) * ∂E/∂wij(t) = 0) then {
        Δwij(t) = - sign (∂E/∂wij(t)) * Δij(t)
        wij(t+1) = wij(t) + Δwij(t)
    }
}
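The pseudo-code above translates almost line for line into the following Python/NumPy sketch. The parameter defaults follow Section 3.3; the gradient arrays are assumed to come from an ordinary back-propagation pass over the whole pattern set, and the function name is illustrative.

    import numpy as np

    def rprop_step(w, grad, prev_grad, delta, prev_dw,
                   eta_plus=1.2, eta_minus=0.5, delta_max=50.0, delta_min=1e-6):
        """One RPROP epoch update for a flat array of weights and biases."""
        grad = grad.copy()
        sign_change = prev_grad * grad

        # Same sign: grow the update-value (capped at delta_max).
        grow = sign_change > 0
        delta[grow] = np.minimum(delta[grow] * eta_plus, delta_max)

        # Sign flip: shrink the update-value, revert the previous weight-step,
        # and zero the stored gradient so the next epoch skips adaptation.
        flip = sign_change < 0
        delta[flip] = np.maximum(delta[flip] * eta_minus, delta_min)
        w[flip] -= prev_dw[flip]
        grad[flip] = 0.0

        # Ordinary weight-step for all weights whose gradient did not flip sign.
        dw = np.where(flip, 0.0, -np.sign(grad) * delta)
        w += dw
        return w, grad, delta, dw

    # Example: one epoch over eight weights.
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.1, 8)
    delta = np.full(8, 0.1)                 # Delta_0 = 0.1 (Section 3.3)
    prev_grad = np.zeros(8)
    prev_dw = np.zeros(8)
    grad = rng.normal(0.0, 1.0, 8)          # stand-in for a backprop gradient
    w, prev_grad, delta, prev_dw = rprop_step(w, grad, prev_grad, delta, prev_dw)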
3.3 Parameters
At the beginning, all update-values Δij are set to an initial value Δ0. Since Δ0 directly determines the size of the first weight-step, it is preferably chosen in reasonable proportion to the size of the initial weights; a good choice may be Δ0 = 0.1. However, as the results in the next section show, the choice of this parameter is not critical at all. Even for much larger or much smaller values of Δ0, fast convergence is reached.
The range of the update-values was restricted to an upper limit of Δmax = 50.0 and a lower limit of Δmin = 1e-6 to avoid overflow/underflow problems with floating-point variables. In several experiments we observed that setting the maximum update-value to a considerably smaller value (e.g. Δmax = 1.0) yields a smoother decrease of the error.
The choice of the decrease factor η- and increase factor η+ was guided by the following considerations: if a jump over a minimum occurred, the previous update-value was too large. Since it is not known from the gradient information by how much the minimum was missed, on average it is a good guess to halve the update-value, i.e. η- = 0.5. The increase factor η+ has to be large enough to allow fast growth of the update-value in shallow regions of the error function; on the other hand, the learning process can be considerably disturbed if a too-large increase factor leads to persistent changes of the direction of the weight-step. In all our experiments, the choice of η+ = 1.2 gave very good results, independent of the examined problem. Slight variations of this value neither improved nor deteriorated convergence time, so in order to keep parameter choice simple, we fixed the increase/decrease parameters to η+ = 1.2 and η- = 0.5.
One of the main advantages of RPROP lies in the fact that for many problems no choice of parameters is needed at all to obtain optimal or at least nearly optimal convergence times.
3.4 Discussion
The main reason for the success of the new algorithm is the concept of ‘direct adaptation’ of the size of the weight-update. In contrast to all other algorithms, only the sign of the partial derivative is used to perform both learning and adaptation. This leads to a transparent and yet powerful adaptation process that can be computed straightforwardly and very efficiently with respect to both time and storage consumption.
Another often-discussed aspect of common gradient descent is that the size of the derivative decreases exponentially with the distance between the weight and the output layer, due to the limiting influence of the slope of the sigmoid activation function. Consequently, weights far away from the output layer are modified less and learn much more slowly. Using RPROP, the size of the weight-step depends only on the sequence of signs, not on the magnitude of the derivative.
For that reason, learning is spread equally over the entire network: weights near the input layer have the same chance to grow and learn as weights near the output layer.
4. EXPERIMENTAL RESULTS
The performance of the Resilient-Propagation Neural Network for image compression has been tested on various types of images. Experiments were conducted by varying the sub-image size, namely 4x4, 8x8, 16x16 and 32x32 pixels. The network was also tested by varying the number of neurons in the hidden layer, resulting in compression ratios ranging from 4 to 64, and the PSNR values for the various combinations were obtained. The results for sub-images of size 4x4 pixels and an error tolerance of 0.01, for different compression ratios and image types, together with the PSNR achieved by RPROP, are listed in Table 1; the sketch below illustrates how these quantities are computed.
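The following sketch shows, under stated assumptions, how the MSE and PSNR figures of Table 1 would be computed for 8-bit images and how the compression ratio follows from the block size and the number of hidden neurons; the image data here is synthetic and purely illustrative.

    import numpy as np

    def mse(original, reconstructed):
        diff = original.astype(float) - reconstructed.astype(float)
        return np.mean(diff ** 2)

    def psnr(original, reconstructed, peak=255.0):
        # PSNR in dB for an 8-bit grayscale image: 10*log10(peak^2 / MSE).
        return 10.0 * np.log10(peak ** 2 / mse(original, reconstructed))

    def compression_ratio(block_pixels, hidden_units):
        # 4x4 blocks (16 pixels) coded by 2 hidden neurons -> CR of 16:2.
        return f"{block_pixels}:{hidden_units}"

    # Synthetic example in place of an actual reconstructed image.
    rng = np.random.default_rng(0)
    original = rng.integers(0, 256, (64, 64)).astype(np.uint8)
    noise = rng.normal(0.0, 10.0, (64, 64))
    reconstructed = np.clip(original + noise, 0, 255).astype(np.uint8)
    print(compression_ratio(16, 2), round(mse(original, reconstructed), 1),
          round(psnr(original, reconstructed), 2), "dB")

Applying the PSNR formula above to the MSE values reported in Table 1 reproduces the listed PSNR figures (e.g. MSE = 1390.0 gives approximately 16.7 dB).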
5. CONCLUSION
The Back-Propagation Neural Network has the simplest architecture of the various Artificial Neural Networks that have been developed for image compression; direct pixel-based manipulation, which is also available in the field of image compression, is simpler still. However, the back-propagation network suffers from very slow convergence and pattern dependency. Much research has been carried out to improve the speed of convergence, but these methods are computationally complex in nature and can be applied only to limited patterns. The proposed approach is a simple method of pre-processing any type of image.
Figure 1: Lena images (a)-(d)
Figure 2: Einstein images (a)-(d)
Table 1: Experimental results of the proposed approach

Sl. No.   Image          CR     MSE      PSNR (dB)
1         Lena (b)       16:2   1390.0   16.700
2         Lena (c)       16:4   2115.0   14.877
3         Lena (d)       16:8   1742.0   15.720
4         Einstein (b)   16:2   308.0    23.245
5         Einstein (c)   16:4   777.0    19.226
6         Einstein (d)   16:8   169.0    25.851
References
[1] M. Egmont-Petersen, D. de Ridder, and H. Handels, "Image processing with neural networks - a review", Pattern Recognition, Vol. 35, 2002, pp. 2279-2301.
[2] D. E. Rumelhart and J. McClelland, Parallel Distributed Processing, 1986.
[3] W. Schiffmann, M. Joost, and R. Werner, "Optimization of the back propagation algorithm for training multilayer perceptrons", Technical Report, University of Koblenz, Institute of Physics, 1993.
[4] R. Jacobs, "Increased rates of convergence through learning rate adaptation", Neural Networks, 1(4), 1988.
[5] T. Tollenaere, "SuperSAB: fast adaptive back propagation with good scaling properties", Neural Networks, 3(5), 1990.
[6] S. E. Fahlman, "An empirical study of learning speed in back propagation networks", Technical Report CMU-CS-88-162, 1988.
[7] H. Braun, J. Feulner, and V. Ullrich, "Learning strategies for solving the problem of planning using back propagation", in Proceedings of NEURO Nimes, 1991.
[8] B. M. Wilamowski, S. Iplikci, O. Kaynak, and M. O. Efe, "An Algorithm for Fast Convergence in Training Neural Networks".
[9] F. Belkhouche, I. Gokcen, and U. Qidwai, "Chaotic gray-level image transformation", Journal of Electronic Imaging, Vol. 14, Issue 4, 043001, October-December 2005.
[10] H.-M. Lee, C.-M. Chen, and T.-C. Huang, "Learning improvement of back propagation algorithm by error saturation prevention method", Neurocomputing, November 2001.
[11] M. A. Otair and W. A. Salameh, "Speeding up Back-propagation Neural Networks", Proceedings of the 2005 Informing Science and IT Education Joint Conference.
[12] M. A. Otair and W. A. Salameh, "An Improved Back-Propagation Neural Network using a Modified Non-linear Function", The IASTED Conference on Artificial Intelligence and Applications, Innsbruck, Austria, February 2006.
AUTHORS
B. Das received his B.Tech (EEE) and M.Tech (ECE) degrees in 2004 and 2008, respectively, from Biju Patnaik University of Technology, Odisha. He is presently pursuing a Ph.D. (ECE) at Centurion University of Technology and Management, Paralakhemundi, Odisha, and working as an Associate Professor in the EE Department of GIET, Gunupur, Odisha, India.
Dr. S. S. Nayak received his M.Sc. (Physics), MCA, and Ph.D. degrees in 1973, 2006 and 2001, respectively. At present he is working as Dean (R & D) at Centurion University of Technology and Management, Paralakhemundi, Odisha, India.