International Journal of Application or Innovation in Engineering & Management (IJAIEM)
Web Site: www.ijaiem.org   Email: editor@ijaiem.org, editorijaiem@gmail.com
Volume 3, Issue 2, February 2014                                         ISSN 2319-4847

Image Compression using Resilient-Propagation Neural Network

B. Das (1), Dr. S. S. Nayak (2)
(1) Associate Professor, Gandhi Institute of Engineering and Technology, Gunupur, Odisha, India
(2) Dean (R & D), Centurion University of Technology and Management, Paralakhemundi, Odisha, India

Abstract
Feed-forward networks trained with the back-propagation algorithm can be applied directly to image compression, and several compression schemes have been proposed to achieve quick convergence of the network with no significant loss of information. In practice, however, a back-propagation network takes a long time to converge on image data. The Resilient-Propagation Neural Network (RPROP) instead performs a local adaptation of each weight update according to the behaviour of the error function: the adaptation is not blurred by the unforeseeable size of the derivative, but depends only on the temporal behaviour of its sign.

Keywords: Resilient-Propagation Neural Network, Convergence.

1. INTRODUCTION
Image compression research aims at reducing the number of bits needed to represent an image. Image compression algorithms remove redundancy present in the data in a way that makes reconstruction of the image possible; how well they do so depends largely on the information contained in the image. A practical lossy compression algorithm for image data should preserve most of the characteristics of the data while keeping the algorithmic complexity low.

Artificial Neural Networks have been applied to image compression problems [1] because of the advantages they offer over traditional methods. They are well suited to the task: they can pre-process input patterns to produce simpler patterns with fewer components, and network-based techniques provide sufficient compression of the data in question. Many different training algorithms and architectures have been used; feed-forward neural networks, self-organizing feature maps and learning vector quantizer networks have all been applied to image compression.

The back-propagation algorithm performs a gradient descent in parameter space, minimizing an appropriate error function; the weight-update equations minimize this error. However, a back-propagation network takes a long time to converge, and the compression ratio achieved is not high. To overcome these drawbacks, an approach using the Resilient-Propagation Neural Network (RPROP) is proposed in this paper.

2. RESILIENT-PROPAGATION NEURAL NETWORK
2.1 Back-propagation learning
Back-propagation is the most widely used algorithm for supervised learning with multi-layered feed-forward networks. The basic idea of the back-propagation learning algorithm [2] is the repeated application of the chain rule to compute the influence of each weight in the network on an arbitrary error function E:

    \frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial s_i} \cdot \frac{\partial s_i}{\partial net_i} \cdot \frac{\partial net_i}{\partial w_{ij}}

where w_{ij} is the weight from neuron j to neuron i, s_i is the output of neuron i, and net_i is the weighted sum of the inputs of neuron i.
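As a minimal illustration of this chain rule (assuming, purely for the example, a sigmoid activation and a squared-error term E = 0.5*(s_i - t_i)^2, neither of which is fixed by the text), the gradient of a single weight can be computed in Python as follows:

    # Illustrative sketch: chain rule for one sigmoid neuron i fed by neuron j.
    # The sigmoid activation and squared-error term are assumptions for the example.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    s_j, w_ij, target = 0.8, 0.3, 1.0       # output of neuron j, weight, desired output
    net_i = w_ij * s_j                      # weighted sum of the inputs of neuron i
    s_i = sigmoid(net_i)                    # output of neuron i

    dE_dsi   = s_i - target                 # dE/ds_i      for E = 0.5*(s_i - target)**2
    dsi_dnet = s_i * (1.0 - s_i)            # ds_i/dnet_i  for the sigmoid
    dnet_dw  = s_j                          # dnet_i/dw_ij
    dE_dw    = dE_dsi * dsi_dnet * dnet_dw  # chain rule: dE/dw_ij
    print(dE_dw)

In a multi-layer network the factor dE/ds_i is itself accumulated from the neurons of the following layer, which is what the repeated application of the chain rule refers to.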
Once the partial derivative for each weight is known, the aim of minimizing the error function is achieved by performing a simple gradient descent:

    w_{ij}(t+1) = w_{ij}(t) - \epsilon \cdot \frac{\partial E}{\partial w_{ij}}(t)

Obviously, the choice of the learning rate \epsilon, which scales the derivative, has an important effect on the time needed until convergence is reached. If it is set too small, too many steps are needed to reach an acceptable solution; on the contrary, a large learning rate may lead to oscillation, preventing the error from falling below a certain value. An early remedy proposed for this problem is the introduction of a momentum term:

    \Delta w_{ij}(t) = -\epsilon \cdot \frac{\partial E}{\partial w_{ij}}(t) + \mu \cdot \Delta w_{ij}(t-1)

where the momentum parameter \mu scales the influence of the previous weight step on the current one. The momentum term is believed to render the learning procedure more stable and to accelerate convergence in shallow regions of the error function. However, as practical experience has shown, this is not always true: the optimal value of the momentum parameter \mu turns out to be just as problem dependent as the learning rate \epsilon, and no general improvement can be accomplished.

3. RPROP
3.1 Description
RPROP stands for 'resilient propagation'. It is an efficient learning scheme that performs a direct adaptation of the weight step based on local gradient information, so that the adaptation is not blurred by the magnitude of the gradient. To achieve this, each weight is given an individual update-value \Delta_{ij}, which solely determines the size of the weight update. This adaptive update-value evolves during the learning process, based on its local view of the error function E, according to the following learning rule:

    \Delta_{ij}(t) = \begin{cases}
        \eta^{+} \cdot \Delta_{ij}(t-1), & \text{if } \frac{\partial E}{\partial w_{ij}}(t-1) \cdot \frac{\partial E}{\partial w_{ij}}(t) > 0 \\
        \eta^{-} \cdot \Delta_{ij}(t-1), & \text{if } \frac{\partial E}{\partial w_{ij}}(t-1) \cdot \frac{\partial E}{\partial w_{ij}}(t) < 0 \\
        \Delta_{ij}(t-1), & \text{otherwise}
    \end{cases}
    \quad \text{with } 0 < \eta^{-} < 1 < \eta^{+}

The adaptation rule works as follows. Every time the partial derivative of the corresponding weight w_{ij} changes its sign, which indicates that the last update was too big and the algorithm has jumped over a local minimum, the update-value \Delta_{ij} is decreased by the factor \eta^{-}. If the derivative retains its sign, the update-value is slightly increased in order to accelerate convergence in shallow regions. Once the update-value for each weight is adapted, the weight update itself follows a very simple rule: if the derivative is positive (increasing error), the weight is decreased by its update-value; if the derivative is negative, the update-value is added:

    \Delta w_{ij}(t) = \begin{cases}
        -\Delta_{ij}(t), & \text{if } \frac{\partial E}{\partial w_{ij}}(t) > 0 \\
        +\Delta_{ij}(t), & \text{if } \frac{\partial E}{\partial w_{ij}}(t) < 0 \\
        0, & \text{otherwise}
    \end{cases}

    w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)

However, there is one exception: if the partial derivative changes sign, i.e. the previous step was too large and the minimum was missed, the previous weight update is reverted:

    \Delta w_{ij}(t) = -\Delta w_{ij}(t-1), \quad \text{if } \frac{\partial E}{\partial w_{ij}}(t-1) \cdot \frac{\partial E}{\partial w_{ij}}(t) < 0

Due to this 'backtracking' weight step, the derivative is expected to change its sign once again in the following step. To avoid a second adaptation of the update-value in the succeeding step, this is handled in practice by setting \frac{\partial E}{\partial w_{ij}}(t-1) = 0 in the \Delta_{ij} adaptation rule above. The update-values and the weights are changed every time the whole pattern set has been presented once to the network (learning by epoch).
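To illustrate the behaviour of this rule before stating the full algorithm, the following short Python trace applies it to a single weight of a toy one-dimensional error function E(w) = w^2. The toy error, starting values and printed trace are illustrative assumptions and not part of the proposed method; only the update rule itself follows this section.

    # Single-weight RPROP trace on a toy error E(w) = w**2 (gradient 2w).
    eta_plus, eta_minus = 1.2, 0.5
    delta, prev_grad, prev_dw = 0.1, 0.0, 0.0   # update-value, previous gradient, previous step
    w = 1.0

    for t in range(10):
        grad = 2.0 * w                           # dE/dw for the toy error
        if prev_grad * grad > 0:                 # same sign: grow the update-value
            delta = min(delta * eta_plus, 50.0)
            dw = -delta if grad > 0 else delta
            w += dw
        elif prev_grad * grad < 0:               # sign change: shrink and backtrack
            delta = max(delta * eta_minus, 1e-6)
            w -= prev_dw                         # revert the previous weight step
            dw = 0.0
            grad = 0.0                           # suppress adaptation in the next step
        else:                                    # first step, or the step after backtracking
            dw = -delta if grad > 0 else (delta if grad < 0 else 0.0)
            w += dw
        prev_grad, prev_dw = grad, dw
        print(f"t={t}  w={w:+.4f}  delta={delta:.4f}")

The trace shows the update-value growing while the gradient keeps its sign, the overshoot across w = 0, and the subsequent backtracking step with a halved update-value.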
3.2 Algorithm
The following pseudo-code fragment shows the kernel of the RPROP adaptation and learning process. The minimum (maximum) operator delivers the minimum (maximum) of two numbers; the sign operator returns +1 if its argument is positive, -1 if it is negative, and 0 otherwise.

    for all weights and biases {
        if (dE/dw_ij(t-1) * dE/dw_ij(t) > 0) then {
            Delta_ij(t)  = minimum(Delta_ij(t-1) * eta+, Delta_max)
            Dw_ij(t)     = -sign(dE/dw_ij(t)) * Delta_ij(t)
            w_ij(t+1)    = w_ij(t) + Dw_ij(t)
        }
        else if (dE/dw_ij(t-1) * dE/dw_ij(t) < 0) then {
            Delta_ij(t)  = maximum(Delta_ij(t-1) * eta-, Delta_min)
            w_ij(t+1)    = w_ij(t) - Dw_ij(t-1)
            dE/dw_ij(t)  = 0
        }
        else if (dE/dw_ij(t-1) * dE/dw_ij(t) = 0) then {
            Dw_ij(t)     = -sign(dE/dw_ij(t)) * Delta_ij(t)
            w_ij(t+1)    = w_ij(t) + Dw_ij(t)
        }
    }

3.3 Parameters
At the beginning, all update-values Delta_ij are set to an initial value Delta_0. Since Delta_0 directly determines the size of the first weight step, it is preferably chosen in reasonable proportion to the size of the initial weights; a good choice is Delta_0 = 0.1. However, as the results in the next section show, the choice of this parameter is not critical at all: fast convergence is reached even for much larger or much smaller values of Delta_0. The range of the update-values is restricted to an upper limit of Delta_max = 50.0 and a lower limit of Delta_min = 1e-6 to avoid overflow/underflow problems with floating-point variables. In several experiments we observed that setting the maximum update-value to a considerably smaller value (e.g. Delta_max = 1.0) gives a smoother decrease of the error.

The choice of the decrease factor eta- and the increase factor eta+ was guided by the following considerations. If a jump over a minimum occurred, the previous update-value was too large; since the gradient information does not tell by how much the minimum was missed, on average it is a good guess to halve the update-value, i.e. eta- = 0.5. The increase factor eta+ has to be large enough to allow fast growth of the update-value in shallow regions of the error function; on the other hand, a too large increase factor leads to persistent changes of the direction of the weight step and can considerably disturb the learning process. In all our experiments the choice eta+ = 1.2 gave very good results, independent of the examined problem, and slight variations of this value neither improved nor deteriorated convergence time. So, in order to make the choice of parameters simpler, we decided to fix the increase/decrease parameters to eta+ = 1.2 and eta- = 0.5. One of the main advantages of RPROP is that for many problems no choice of parameters is needed at all to obtain optimal, or at least nearly optimal, convergence times.

3.4 Discussion
The main reason for the success of the algorithm lies in the concept of 'direct adaptation' of the size of the weight update. In contrast to other algorithms, only the sign of the partial derivative is used to perform both learning and adaptation. This leads to a transparent yet powerful adaptation process that can be computed straightforwardly and very efficiently with respect to both time and storage consumption.

Another often-discussed aspect of common gradient descent is that the size of the derivative decreases exponentially with the distance between the weight and the output layer, owing to the limiting influence of the slope of the sigmoid activation function. Consequently, weights far away from the output layer are modified less and learn much more slowly. With RPROP, the size of the weight step depends only on the sequence of signs, not on the magnitude of the derivative. For that reason, learning is spread equally over the entire network: weights near the input layer have the same chance to grow and learn as weights near the output layer.
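The pseudo-code above translates almost directly into vectorised NumPy code operating on a whole weight matrix at once. The sketch below is a minimal illustration using the parameter values recommended in Section 3.3 (Delta_0 = 0.1, eta+ = 1.2, eta- = 0.5, Delta_max = 50.0, Delta_min = 1e-6); the function name rprop_update, the state tuple and the random placeholder gradient are our own assumptions, and the epoch gradient dE/dW is assumed to come from an ordinary back-propagation pass over the whole pattern set.

    import numpy as np

    # Batch RPROP update for one weight matrix W, given the epoch gradient dE/dW.
    ETA_PLUS, ETA_MINUS = 1.2, 0.5
    DELTA_MAX, DELTA_MIN, DELTA_0 = 50.0, 1e-6, 0.1

    def rprop_update(W, grad, state):
        """state holds per-weight update-values, previous gradient and previous step."""
        delta, prev_grad, prev_dW = state
        sign_change = prev_grad * grad

        # adapt the update-values according to the sign of the gradient product
        delta = np.where(sign_change > 0, np.minimum(delta * ETA_PLUS, DELTA_MAX), delta)
        delta = np.where(sign_change < 0, np.maximum(delta * ETA_MINUS, DELTA_MIN), delta)

        # weight step: only the sign of the gradient decides the direction
        dW = -np.sign(grad) * delta
        # backtracking: where the sign changed, revert the previous step instead
        dW = np.where(sign_change < 0, -prev_dW, dW)
        # suppress adaptation in the step that follows a backtracking step
        grad = np.where(sign_change < 0, 0.0, grad)

        return W + dW, (delta, grad, dW)

    # usage: initialise once, then call once after every presentation of the pattern set
    W = np.random.uniform(-0.5, 0.5, size=(4, 16))    # e.g. 16 inputs coded by 4 hidden units
    state = (np.full_like(W, DELTA_0), np.zeros_like(W), np.zeros_like(W))
    grad = np.random.randn(*W.shape)                  # placeholder for dE/dW of one epoch
    W, state = rprop_update(W, grad, state)

Keeping the update-values, previous gradient and previous step in an explicit state tuple mirrors the learning-by-epoch scheme described above: the function is called exactly once per epoch.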
4. EXPERIMENTAL RESULTS
The performance of the Resilient-Propagation Neural Network for image compression has been tested on various types of images. Experiments were conducted by varying the sub-image size (4x4, 8x8, 16x16 and 32x32 pixels) and the number of neurons in the hidden layer, giving compression ratios ranging from 4 to 64, and the PSNR was obtained for each combination. The comparative results for sub-images of 4x4 pixels with an error tolerance of 0.01, at different compression ratios on the Lena and Einstein test images (Figure 1 and Figure 2), together with the PSNR achieved by RPROP, are listed in Table 1.

Figure 1: Lena images (a)-(d)

Figure 2: Einstein images (a)-(d)

Table 1: Experimental results of the proposed approach

    Sl. No.   Image          CR      MSE       PSNR (dB)
    1         Lena (b)       16:2    1390.0    16.700
    2         Lena (c)       16:4    2115.0    14.877
    3         Lena (d)       16:8    1742.0    15.720
    4         Einstein (b)   16:2     308.0    23.245
    5         Einstein (c)   16:4     777.0    19.226
    6         Einstein (d)   16:8     169.0    25.851
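For reference, the quantities reported in Table 1 can be computed as sketched below. The snippet assumes 8-bit grey-scale images (peak value 255) and writes the compression ratio the way the CR column does, as pixels-per-sub-image : hidden neurons; the function names and the random stand-in images are illustrative.

    import numpy as np

    # MSE/PSNR of a reconstructed 8-bit grey-scale image and the block compression
    # ratio as used in Table 1 (a 4x4 sub-image coded by 2 hidden neurons is "16:2").
    def mse(original, reconstructed):
        diff = original.astype(np.float64) - reconstructed.astype(np.float64)
        return np.mean(diff ** 2)

    def psnr(original, reconstructed, peak=255.0):
        return 10.0 * np.log10(peak ** 2 / mse(original, reconstructed))

    def compression_ratio(block_size, hidden_neurons):
        return (block_size * block_size, hidden_neurons)   # e.g. (16, 2) for "16:2"

    # example with random data standing in for a real original/reconstruction pair
    orig = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
    recon = np.clip(orig + np.random.normal(0, 20, orig.shape), 0, 255).astype(np.uint8)
    print("MSE  =", round(mse(orig, recon), 1))
    print("PSNR =", round(psnr(orig, recon), 3), "dB")
    print("CR   = %d:%d" % compression_ratio(4, 2))

With MSE = 308.0, for instance, the PSNR formula gives 10*log10(255^2/308.0) = 23.245 dB, which matches row 4 of Table 1.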
5. CONCLUSION
The back-propagation neural network has the simplest architecture of the various artificial neural networks that have been developed for image compression, and direct pixel-based manipulation, as used elsewhere in image compression, is simpler still. Its drawbacks, however, are very slow convergence and pattern dependency. Much research has been carried out to improve the speed of convergence, but these methods are computationally complex and can be applied only to limited classes of patterns. The proposed approach is a simple method of pre-processing any type of image.

References
[1] M. Egmont-Petersen, D. de Ridder, and H. Handels, "Image processing with neural networks - a review", Pattern Recognition, 35, 2002, pp. 2279-2301.
[2] D. E. Rumelhart and J. McClelland, Parallel Distributed Processing, 1986.
[3] W. Schiffmann, M. Joost, and R. Werner, "Optimization of the backpropagation algorithm for training multilayer perceptrons", Technical report, University of Koblenz, Institute of Physics, 1993.
[4] R. Jacobs, "Increased rates of convergence through learning rate adaptation", Neural Networks, 1(4), 1988.
[5] T. Tollenaere, "SuperSAB: fast adaptive back propagation with good scaling properties", Neural Networks, 3(5), 1990.
[6] S. E. Fahlman, "An empirical study of learning speed in back-propagation networks", Technical report CMU-CS-88-162, 1988.
[7] H. Braun, J. Feulner, and V. Ullrich, "Learning strategies for solving the problem of planning using backpropagation", in Proceedings of NEURO-Nimes, 1991.
[8] B. M. Wilamowski, S. Iplikci, O. Kaynak, and M. O. Efe, "An algorithm for fast convergence in training neural networks".
[9] F. Belkhouche, I. Gokcen, and U. Qidwai, "Chaotic gray-level image transformation", Journal of Electronic Imaging, 14(4), 043001, October-December 2005.
[10] Hahn-Ming Lee, Chih-Ming Chen, and Tzong-Ching Huang, "Learning improvement of back propagation algorithm by error saturation prevention method", Neurocomputing, November 2001.
[11] M. A. Otair and W. A. Salameh, "Speeding up back-propagation neural networks", in Proceedings of the 2005 Informing Science and IT Education Joint Conference.
[12] M. A. Otair and W. A. Salameh, "An improved back-propagation neural network using a modified non-linear function", in Proceedings of the IASTED Conference on Artificial Intelligence and Applications, Innsbruck, Austria, February 2006.

AUTHORS
B. Das received his B.Tech (EEE) and M.Tech (ECE) degrees in 2004 and 2008 respectively from Biju Patnaik University of Technology, Odisha. He is presently pursuing a Ph.D (ECE) at Centurion University of Technology and Management, Paralakhemundi, Odisha, and working as an Associate Professor in the EE Department of GIET, Gunupur, Odisha, India.

Dr. S. S. Nayak received M.Sc. (Physics), MCA and Ph.D degrees in 1973, 2006 and 2001 respectively. At present he is working as Dean (R & D) at Centurion University of Technology and Management, Paralakhemundi, Odisha, India.