Multiplierless Algorithms for High-Speed Real-Time Onboard Image Processing¹

Kelvin Rocha, Anand Venkatachalam, Tamal Bose and Randy L. Haupt
Electrical and Computer Engineering, Utah State University, Logan, Utah 84322-4120, USA

Abstract: This paper presents a new approach for restoring noisy images with a substantial number of missing samples. The proposed system is based on linear prediction theory. The filters used are multiplierless, since they have power-of-2 coefficients; this makes the algorithms fast and inexpensive for VLSI implementation. The system is composed of two stages. In the first, the lost samples are recovered with an LMS-like algorithm in which the missing samples are replaced by their estimates. In the second, noise is removed from the image by a linear predictor based on a genetic algorithm, which yields the power-of-2 coefficients of the filter. The results are very promising and illustrate the performance of the multiplierless system.

TABLE OF CONTENTS

1. INTRODUCTION
2. LINEAR PREDICTION
3. LOST SAMPLE RECOVERY
4. ADDITIVE NOISE REDUCTION
5. RESULTS
6. CONCLUSION

¹This work was supported in part by NASA contract #NAG5-3997. 0-7803-7231-X/01/$10.00 © 2002 IEEE. IEEEAC paper #391, updated Oct 1, 2001.

1. INTRODUCTION

Images acquired by electronic means are likely to be degraded by the sensing environment. The degradations may take the form of sensor noise, camera misfocus, random atmospheric turbulence, sample loss, and so on. An adaptive filter offers an attractive solution for image restoration. A wide variety of algorithms for this problem has been devised over the years; the choice among them is determined by factors such as rate of convergence, robustness, computational requirements, numerical properties, and algorithmic structure.

Multiplierless filters have received widespread attention in the literature because of their high computational speed and low implementation cost, and several design methods for these filters have been developed [1][2]. In this paper we propose a system (Figure 1) that uses multiplierless filters for recovering the lost samples and removing additive noise. Multiplierless filters use power-of-2 coefficients, which reduce the filtering operations to simple shifts instead of multiplications. The results obtained using the new algorithms are comparable to those obtained using algorithms with multipliers [3].

Figure 1. Block diagram of the image recovery system: lost sample recovery followed by noise reduction.

The paper is organized as follows. The concept of linear prediction is explained in Section 2. The lost sample recovery algorithm, based on the algorithm in [3], is presented in Section 3. Section 4 deals with additive noise reduction using a genetic algorithm. The experimental results and the concluding remarks are given in Sections 5 and 6, respectively.

2. LINEAR PREDICTION

Figure 2. Adaptive linear prediction model for 2-D signals.

Linear prediction plays a prominent role in many theoretical, computational, and practical areas of signal processing, including noise removal and lost sample recovery [5][6]. It deals with the problem of estimating or predicting the value of a signal x(i,j) by using a set of other samples from the same signal. The most frequently used structure is a transversal filter whose coefficients are adjusted to minimize some cost function based on the error between the output and the desired signal. The classical model of the adaptive linear predictor for 2-dimensional signals is given in Figure 2 and described as follows. Given the signal d(i,j), the sample at the filter input is of the form

    x(i,j) = d(i − i₀, j − j₀)                                        (1)

where (i₀, j₀) is the delay. The desired signal is

    d(i,j) = s(i,j) + w(i,j)                                          (2)

where {w(i,j)} is additive uncorrelated noise. The signal estimate ŝ(i,j) is calculated as

    ŝ(i,j) = Σ_{(m,n)∈Φ} c(m,n) x(i−m, j−n)                           (3)

where the c(m,n) are the coefficients of a Finite Impulse Response (FIR) filter and Φ is the set of index pairs of the pixels taking part in the prediction. In our case we use Non-Symmetric Half-Plane (NSHP) causality (see Figure 3), for which Φ is given as

    Φ = Φ₁ ∪ Φ₂                                                       (4)

where Φ₁ and Φ₂ are defined as

    Φ₁ = {(i,j) : 0 ≤ i ≤ M−1, 0 ≤ j ≤ N−1, i+j > 0}
    Φ₂ = {(i,j) : 1 ≤ i ≤ M−1, −N+1 ≤ j ≤ −1},   M ≥ 2, N ≥ 2.        (5)

Figure 3. Pixels used in Non-Symmetric Half-Plane causality for M = N = 3.

The coefficients are adapted to minimize the squared prediction error

    J(i,j) = E[e²(i,j)]                                               (6)

where the output error sample is

    e(i,j) = d(i,j) − ŝ(i,j).                                         (7)

The simplest way of updating the coefficients for adaptive linear prediction is the Least Mean Square (LMS) algorithm. Let C be the M×(2N−1) matrix of coefficients. At sample t we have

    C = [ c(M−1,N−1)  …  c(M−1,0)  …  c(M−1,−N+1) ]
        [      ⋮              ⋮              ⋮     ]
        [ c(0,N−1)    …  c(0,1)    0   …     0    ]                   (8)

The update equation for the coefficient matrix, under the criterion of minimizing the squared error [3], can be approximated as

    C_{t+1} = C_t + 2μ e(i,j) X_{i,j}                                 (9)

where μ is the step size and X_{i,j} is the matrix containing the input pixels, given by

    X_{i,j} = [ x(i−M+1, j−N+1)  …  x(i−M+1, j)  …  x(i−M+1, j+N−1) ]
              [        ⋮                 ⋮                 ⋮        ]
              [ x(i, j−N+1)      …  x(i, j−1)    0   …     0        ] (10)

3. LOST SAMPLE RECOVERY

As mentioned before, the concept of linear prediction can be used to recover lost samples in a given set of data. Each lost sample in the image is replaced by its estimate obtained from the linear prediction (3). When one or more lost samples take part in the prediction, we use their estimates as the input samples [3]. Rewriting (3) to satisfy these conditions, we have

    ŝ(i,j) = Σ_{(m,n)∈Φ} c(m,n) h(i−m, j−n)                           (11)

where h(k,l) is defined by

    h(k,l) = x(k,l)  if the sample is available
             ŝ(k,l)  if the sample is lost.                           (12)

To make the algorithm multiplierless, we simply quantize the error e(i,j) and the elements of the coefficient matrix C to the closest power of 2. The step size μ is also a power-of-2 value, so that all the multiplications involved in the update equation reduce to simple shifts. Thus we get

    ē(i,j) = Q₂[e(i,j)]                                               (13)
    C̄ = Q₂[C]                                                         (14)

where Q₂ represents the quantization operator. We can now rewrite (11) and (9) as

    ŝ(i,j) = Σ_{(m,n)∈Φ} c̄(m,n) h(i−m, j−n)                           (15)
    C̄_{t+1} = C̄_t + 2μ ē(i,j) X_{i,j}                                 (16)

where c̄(k,l) represents the (k,l)-th element of the quantized coefficient matrix C̄. The algorithm can be summarized as follows:

(1) For any input sample, its estimate is computed using (15).
(2) If the sample is lost, it is replaced by its estimate.
(3) If the sample is available, then:
    - the error is computed using (7) and then quantized using (13);
    - the coefficient matrix is updated by means of (16) and then quantized using (14).

4. ADDITIVE NOISE REDUCTION

Noise reduction is concerned with filtering the observed image to minimize the additive noise. The effectiveness of additive noise filters depends on the extent and accuracy of our knowledge of the degradation process, as well as on the filter design criterion. The algorithm proposed below uses the concept of linear prediction given in Section 2, with the inputs defined as in (1). For simplicity of calculation we represent the coefficients of the matrix in (8) in vector form as

    c = {c(M−1,N−1), c(M−1,N−2), …, 0, 0}.                            (17)

Also, we define Q as the total number of pixels used in the filtering process. For a multiplierless filter, all the elements of the vector c have to be power-of-2 values. In [4], a new algorithm based on the genetic algorithm was proposed for adaptive linear prediction in 1-D filtering. Applying the concepts of [4] to 2-D filtering, we use the genetic algorithm to search for an optimal power-of-2 coefficient vector. This section briefly explains the idea of using the genetic algorithm for designing multiplierless filters.
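Before turning to the genetic algorithm, the recovery stage can be made concrete. The following Python sketch follows the three-step summary at the end of Section 3; it is an illustration, not the authors' implementation. The nearest-power-of-2 rounding rule inside Q₂, the zero initialization of C̄, and the skipping of out-of-image terms at the borders are our assumptions.

```python
import numpy as np

def q2(a):
    """Q2 operator sketch: round to the nearest power of two, keeping the sign
    (0 maps to 0). The exact rounding rule is an assumption."""
    a = np.asarray(a, dtype=float)
    mag = np.where(a == 0, 1.0, np.abs(a))        # placeholder 1 avoids log2(0)
    p = np.sign(a) * 2.0 ** np.round(np.log2(mag))
    return np.where(a == 0, 0.0, p)

def nshp_support(M, N):
    """Index pairs Phi = Phi1 U Phi2 of eqs. (4)-(5)."""
    phi1 = [(m, n) for m in range(M) for n in range(N) if m + n > 0]
    phi2 = [(m, n) for m in range(1, M) for n in range(-N + 1, 0)]
    return phi1 + phi2

def recover(d, lost, M=3, N=3, mu=2.0 ** -4):
    """Quantized-LMS lost-sample recovery, following Section 3's summary.

    d    : observed image (2-D array)
    lost : boolean mask, True where a pixel is missing
    Returns a copy of d with lost pixels replaced by their predictions.
    """
    phi = nshp_support(M, N)
    c = {mn: 0.0 for mn in phi}            # quantized coefficients, zero-initialized
    h = np.asarray(d, dtype=float).copy()  # h(k,l): sample if available, estimate if lost
    rows, cols = h.shape
    for i in range(rows):
        for j in range(cols):
            # Estimate via eq. (15); terms falling outside the image are skipped
            # (border handling is our assumption).
            s_hat = sum(c[m, n] * h[i - m, j - n] for (m, n) in phi
                        if 0 <= i - m < rows and 0 <= j - n < cols)
            if lost[i, j]:
                h[i, j] = s_hat                      # step (2): replace by estimate
            else:
                e_bar = float(q2(h[i, j] - s_hat))   # eqs. (7), (13)
                for (m, n) in phi:                   # eqs. (16), (14): shift-only update
                    if 0 <= i - m < rows and 0 <= j - n < cols:
                        c[m, n] = float(q2(c[m, n] + 2 * mu * e_bar * h[i - m, j - n]))
    return h
```

With the default M = N = 3 the NSHP support contains Q = 12 pixels, the filter size used in Section 5; since every coefficient, error, and the step size are powers of two, each multiply in the loop corresponds to a shift in hardware.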
4.1 Genetic Algorithm

The genetic algorithm (GA) emulates the evolutionary behavior of biological systems based on the mechanics of natural selection [7][8][9]. The GA can be represented by a 5-tuple as

    GA = (λ, F, S, R, M)                                              (18)

where

    λ : population size
    F : fitness / objective function
    S : selection operation
    R : crossover operation
    M : mutation operation.

The population of the GA comprises sets of chromosomes, where each chromosome contains the powers of two used to represent the filter coefficients. For example, the set of filter coefficients can be represented as c̄_q = {s₀·2^{e₀}, s₁·2^{e₁}, s₂·2^{e₂}, …, 0}, where {s_q} stands for the signs and {e_q} is the set of integers representing the powers. The corresponding chromosome used in the GA is A_q = {g₀, g₁, g₂, …, g_{Q−1}}, where each {g_q}, called a gene of the chromosome, contains the sign bit (s_q) and the power-of-2 exponent (e_q), as shown in Figure 4.

Figure 4. Representation of a gene.

The GA produces new chromosomes (offspring) by means of the crossover operator {R}. The parameters defining the crossover operation are the probability of crossover (p_c) and the crossover point. Two chromosomes, drawn from the population according to the selection parameter S, are subjected to the crossover operation to produce new offspring. The new population, comprising the offspring produced by the crossover operation, is used for further genetic operations. There can be one or more crossover points; experimental results have shown that multiple crossover points are beneficial to the search process [10]. Mutation is a process by which a gene in the chromosome is replaced with a randomly generated gene, with mutation probability p_m.

Each chromosome is assigned a fitness value {F}. The fitness value of the q-th chromosome in the g-th generation, A_{q,g}, is calculated using the fitness function

    f_{q,g} = Σ_{(i,j)∈B} | d(i,j) − ŝ_{q,g}(i,j) |                   (19)

where B is the block of pixels over which we evaluate the filter and calculate the fitness, and ŝ_{q,g}(i,j) is the output filtered with c̄_{q,g}, the quantized filter coefficient vector obtained from the chromosome A_{q,g}.

4.2 Algorithm Description

The basic idea of this algorithm is to embed the evolution concept of the GA into the adaptation period, so that the GA provides the search during adaptation.
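The chromosome representation above can be illustrated with a small sketch. The paper specifies only that each gene g_q carries a sign bit s_q and an integer exponent e_q (Figure 4); the 4-bit packing, the exponent range 0-7, and the reserved all-ones code for a zero coefficient below are illustrative assumptions.

```python
import random

ZERO_CODE = 0b1111  # hypothetical reserved 4-bit code meaning "coefficient = 0"

def decode_gene(code):
    """Map a 4-bit gene to a power-of-2 coefficient s_q * 2**(-e_q)."""
    if code == ZERO_CODE:
        return 0.0
    sign = -1.0 if code & 0b1000 else 1.0  # top bit: sign s_q
    e = code & 0b0111                      # low three bits: exponent e_q
    return sign * 2.0 ** (-e)

def decode_chromosome(genes):
    """A chromosome A_q = {g_0, ..., g_{Q-1}} decodes to the vector c-bar_q."""
    return [decode_gene(g) for g in genes]

def random_chromosome(Q=12):
    """Random member of the initial population (Q = 12 as in Section 5)."""
    return [random.randrange(16) for _ in range(Q)]
```

Filtering with such a vector needs no multiplications: multiplying a pixel by ±2^(−e) is a shift plus, at most, a sign change.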
The algorithm comprises the following steps.

Step 1: The GA forms its initial population of size λ.

Step 2: The chromosomes of the population are evaluated for their fitness values so that we can find the best chromosome in the population. To evaluate the fitness, we take into consideration the absolute value of the difference between the desired sample and the corresponding filtered output over a block B of pixels. The block B consists of b_r rows and b_c columns. Consider the set of output pixels and blocks 1, 2 and 3 in Figure 5; the values of b_r and b_c are 2 in this case. For Block 1, we find the fitness of each chromosome using (19): the coefficient vector c̄_{q,g} corresponding to the chromosome A_{q,g} is used to filter the output at the pixels (i−b_r+1, j−b_c+1), …, (i, j). The sum of the absolute values of the differences between the filtered outputs and the desired samples over the block B then gives the fitness value f_{q,g}. Block 2 in Figure 5 represents the data points considered for evaluating the fitness of the chromosomes in the (g+1)-th generation; thus, for every generation we slide our block by one column of pixels. Since the coefficients of the vector c̄_{q,g} are power-of-2 values, the computations in (19) reduce to simple shifts and additions.

Figure 5. Representation of the block scheme. Blocks 1, 2 and 3 represent the pixels considered for fitness evaluation in the g-th, (g+1)-th and (g+2)-th generations, respectively.

After the fitness evaluation, the chromosomes are arranged in ascending order of their fitness values f_{q,g}, i.e., the first chromosome in the population has the lowest fitness value. Note that the lower the fitness value, the better the chromosome. The flow chart of the entire algorithm is given in Figure 6.

Step 3: The best chromosome A_{b,g} is extracted from the population and stored in memory. This chromosome remains stored until it is replaced by another one whose fitness value is better.

Step 4: The coefficient vector c̄ used for filtering the signal takes its values from the best chromosome A_{b,g} stored in memory and filters the signal according to (3). Note that for every filtered sample we have one generation.

Step 5: In the crossover operation, two chromosomes are selected according to the selection parameter S and subjected to multiple crossover. The multiple cross points are random integers between 0 and Q. The chromosomes produced by the crossover are called the offspring.

Step 6: The offspring produced are subjected to mutation as follows. A random number is generated between 0 and 1; if this number is less than the mutation probability (p_m), the offspring are mutated. After mutation we go back to Step 2, where we evaluate the fitness of the offspring. After determining the fitness values of these offspring, we select our new population of size λ by picking the best chromosomes available in the old population and the offspring.

Figure 6. Flowchart for the genetic algorithm.

5. RESULTS

The system used for the simulation is given in Figure 1. The images "lena", "sf-b", "skylab" and "Viking" are used as the input images. After adding white noise, certain samples of the resulting image are assumed to be lost. Each input sample first goes through the lost sample recovery algorithm: if the sample is lost, the algorithm given in Section 3 recovers it.
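The noise-removal stage then runs one GA generation per filtered output (Steps 1-6 of Section 4). One such generation can be sketched as follows; this is an illustration, not the authors' code. The truncation selection rule, the two crossover points, and the elitist replacement policy are assumptions where the paper leaves details open, and `decode`/`predict` stand in for the chromosome decoding and NSHP filtering.

```python
import random

def fitness(chrom, block, decode, predict):
    """Eq. (19) sketch: sum of |desired - filtered| over the block B of pixels."""
    coeffs = decode(chrom)
    return sum(abs(d - predict(coeffs, x)) for d, x in block)

def multipoint_crossover(a, b, points):
    """Swap parent segments at each cut point (multiple crossover points, [10])."""
    child1, child2 = a[:], b[:]
    swap = False
    for i in range(len(a)):
        if i in points:
            swap = not swap
        if swap:
            child1[i], child2[i] = b[i], a[i]
    return child1, child2

def ga_generation(pop, block, decode, predict, pm=0.1, gene_values=16):
    """One generation: rank, keep the best, breed two offspring, mutate, reselect."""
    pop.sort(key=lambda c: fitness(c, block, decode, predict))  # ascending: best first
    best = pop[0]                                   # Step 3: store the best chromosome
    p1, p2 = random.sample(pop[:max(2, len(pop) // 2)], 2)  # Step 5 selection (assumed)
    points = set(random.sample(range(len(p1)), 2))          # two random cross points
    off1, off2 = multipoint_crossover(p1, p2, points)
    for child in (off1, off2):                      # Step 6: mutate with probability pm
        if random.random() < pm:
            child[random.randrange(len(child))] = random.randrange(gene_values)
    # New population: the best len(pop) chromosomes among parents and offspring.
    new_pop = sorted(pop + [off1, off2],
                     key=lambda c: fitness(c, block, decode, predict))[:len(pop)]
    return best, new_pop
```

Each call advances the search by one generation; between calls the block B slides by one column, as described in Step 2.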
The output of the lost sample recovery stage is then subjected to noise removal using the algorithm described in Section 4. Table 1 gives the power of the additive white noise, the percentage of missing samples, and the Signal-to-Noise Ratio Improvement (SNRI) of the entire multiplierless system.

Table 1. Experimental results.

The results are given in Figures 7, 8, 9 and 10, along with the results obtained by processing the images using the LMS algorithm with multipliers. The step size for the lost sample recovery algorithm is μ = 2^(−4). The filter used in the image recovery and noise removal is the noncausal filter given in Figure 2. The filter coefficients for the lost sample recovery algorithm are initially zero. The total number of pixels involved in the filtering process is Q = 12. The GA parameters are λ = 12, p_m = 0.1, p_c = 1, and B with b_r = 2 and b_c = 2. The genes of the chromosomes are represented simply by integers encoding the power-of-2 coefficients: the multiplierless filter coefficients lie in a set of positive and negative powers of two together with zero, and the genes are encoded in four bits as {0, 1, 2, …, 15}, each code corresponding to one power-of-2 value. The selection operator S is a random number between 0 and λ. Note that the number of computations in the lost sample recovery reduces to (Q−1) additions in the case of a lost sample and 2Q additions otherwise; in the restoration algorithm, approximately 2·b_r·b_c·Q additions are required per filtered output.

Figures 7-10. For each test image: (a) original, (b) noise + lost samples, (c) processed image (multipliers), (d) processed image (multiplierless).

6. CONCLUSION

In this paper we presented a method for restoring images using multiplierless filters. The method uses a power-of-2 quantized version of LMS to recover the lost samples, and a genetic algorithm to find and update the power-of-2 coefficients of the adaptive filter for noise removal. The experimental results given in Section 5 are comparable to those obtained with filters involving multiplications. It can be proved theoretically that the error in the quantized LMS algorithm converges in the mean; the details will appear in a future paper. The rate of convergence of the lost sample recovery algorithm is dictated by the step size μ, whereas that of the restoration algorithm is dictated by the population size λ and the block size B.

REFERENCES

[1] R. Thamvichai, T. Bose, and D.M. Etter, "Design of multiplierless 2-D filters," Proc. IEEE Intl. Symposium on Intelligent Signal Processing and Communication Systems, Nov. 2000.
[2] P. Xue, "Adaptive equalizer using finite-bit power-of-two quantizer," IEEE Trans. on ASSP, vol. ASSP-34, no. 6, Dec. 1986.
[3] K. Assawad, E. Lahalle, and J. Oksman, "Real-time reconstruction of 2D signals with missing observations," European Signal Processing Conference, vol. IV, pp. 2073-2076, Tampere, Finland, 5-8 September 2000.
[4] T. Bose, A. Venkatachalam, and R. Thamvichai, "Multiplierless adaptive filtering," Digital Signal Processing, October 2001.
[5] B. Widrow and S.D. Stearns, Adaptive Signal Processing, Englewood Cliffs, NJ: Prentice Hall, 1985.
[6] S.S. Haykin, Adaptive Filter Theory, Englewood Cliffs, NJ: Prentice Hall, 1986.
[7] J.H. Holland, Adaptation in Natural and Artificial Systems, Ann Arbor: The University of Michigan Press, 1975.
[8] D.E. Goldberg, "Genetic and evolutionary algorithms come of age," Communications of the ACM, vol. 37, no. 3, Mar. 1994, pp. 113-119.
[9] R.L. Haupt and S.E. Haupt, Practical Genetic Algorithms, John Wiley & Sons, Inc., 1998.
[10] G. Syswerda, "Uniform crossover in genetic algorithms," Proc. of the 3rd Intl. Conf. on Genetic Algorithms, Morgan Kaufmann, San Mateo, California, 1989.

Kelvin Rocha received the B.S. degree in Electronics Engineering from the Pontificia Universidad Catolica Madre y Maestra, Santiago, Dominican Republic, in 1998. He is currently a research assistant with the Electrical and Computer Engineering Department at Utah State University, where he is pursuing the M.S. degree in Electrical Engineering. His main research interests are in the areas of adaptive signal processing, adaptive control, and numerical analysis. His current work focuses on adaptive processing of nonuniformly sampled signals with applications to lost sample recovery.

Anand Venkatachalam is currently a Masters student in Electrical & Computer Engineering at Utah State University. He received his Bachelors degree in Electrical and Electronics Engineering from the Birla Institute of Technology and Science, Pilani, India, in 1998. He worked at AB, Bangalore, India, from 1998 to 2000. His research interests are in the areas of adaptive filtering algorithms and image processing.

Tamal Bose received the Ph.D. degree in electrical engineering from Southern Illinois University in 1988. He is currently an Associate Professor at Utah State University in Logan. Dr. Bose served as an Associate Editor for the IEEE Transactions on Signal Processing from 1992 to 1996.
He is currently on the editorial board of the IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Japan. Dr. Bose received the Researcher of the Year and Service Person of the Year awards at the University of Colorado at Denver and the Outstanding Achievement award at The Citadel. He also received two Exemplary Researcher awards from the Colorado Advanced Software Institute. Dr. Bose has published over 70 papers in technical journals and conference proceedings in the areas of signal processing and communications. He is a Senior Member of the IEEE.

Randy Haupt is Department Head and Professor of Electrical and Computer Engineering, Director of the Utah Center of Excellence for Smart Sensors, and Co-director of the Anderson Wireless Laboratory at Utah State University. He was previously Department Chair and Professor of Electrical Engineering at the University of Nevada, Reno; Professor of Electrical Engineering at the USAF Academy; research engineer at RADC; and project engineer for the OTH-B Radar Program. He holds a PhD from the University of Michigan, an MS from Northeastern University, an MS from Western New England College, and a BS from the USAF Academy. He is co-author of the book Practical Genetic Algorithms, holds 8 antenna patents, is an IEEE Fellow, and is a recipient of the 1993 Federal Engineer of the Year Award.