Optical Flow Using Phase Information for Deblurring

by Cheryl Texin

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology

May 2007

© 2007 Massachusetts Institute of Technology. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, May 2007

Certified by: Michael Matranga, Group Leader, Charles Stark Draper Laboratory, VI-A Company Thesis Supervisor

Certified by: Jae S. Lim, Professor, Thesis Supervisor

Accepted by: Arthur C. Smith, Professor of Electrical Engineering, Chairman, Department Committee on Graduate Theses

Optical Flow Using Phase Information for Image Deblurring

by Cheryl Texin

Submitted to the Department of Electrical Engineering and Computer Science on May 30, 2007, in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering

Abstract

This thesis presents a method for reconstructing motion-degraded images by using velocity information generated with a phase-based optical flow calculation. The optical flow method applies a set of frequency-tuned Gabor filters to an image sequence in order to determine the component velocities for each pixel by tracking temporally separated phase contours. The resulting set of component velocities is normalized and averaged to generate a single 2D velocity at each pixel in the image. The 2D optical flow velocity is used to estimate the motion-blur PSF for the image reconstruction process, which applies a regularization filter to each pixel.

The 2D velocities generally had small angular and magnitude errors. Image sequences where the motion varied from frame to frame had poorer results than image sequences where the motion was constant across all frames. The quality of the deblurred image is directly affected by the quality of the velocity vectors generated with the optical flow calculations. When accurate 2D velocities are provided, the deblurring process generates sharp results for most types of motion. The magnitude error proved to be a larger problem than the angular error, due to the averaging process involved in creating the 2D velocity vectors from the component velocities. Both the optical flow and deblurring components had difficulty handling rotational motion, where the linearized model of the motion vector is inappropriate. Retaining the component velocities may solve the problem of linearization.

Thesis Supervisor: Jae S. Lim
Title: Professor

Thesis Supervisor: Michael Matranga
Title: Group Leader, Charles Stark Draper Laboratory

Acknowledgments

This work would not have been possible without the generous support and guidance of many people from both Draper Labs and the Electrical Engineering and Computer Science Department of MIT. First and foremost I would like to thank Mike Matranga of Draper Labs. I could not have asked for a better coach. Mike suggested this thesis topic among others and connected me with others at Draper Labs who have expertise in this field. Having launched me down this path, his enthusiasm, encouragement, and abundant assistance over the past two years propelled me to this target.
Paul DeBitteto, Rich Madison, and Greg Andrews also provided much-needed advice and guidance that helped shape this thesis. Rich in particular gave freely of his extensive knowledge, based on his own thesis research, which was invaluable to me. I would like to thank Jae Lim of MIT for stimulating my interest in image processing while I attended his course, and for his support as my faculty advisor. Also, to Anne Hunter, who is always knowledgeable and supportive, thank you for your kind attention to me and for genuinely caring for each student undergoing the entire thesis process. And finally, I would like to thank my family and friends, without whom I would not have survived this past year. For their unwavering support and encouragement, I am extremely grateful.

Contents

1 Introduction
1.1 Motion Estimation
1.2 Image Restoration

2 Relevant Work on Optical Flow
2.1 Brightness-Constancy
2.1.1 Horn and Schunck
2.1.2 Lucas and Kanade
2.2 Phase-Based Methods: Fleet and Jepson
2.3 Method Analysis and Evaluation
2.3.1 Image Sequences
2.3.2 Parameter Choice

3 Relevant Work on Deblurring
3.1 Motion Blur as Convolution
3.2 Noise
3.3 Blind Deconvolution

4 Approach
4.1 Optical Flow
4.1.1 Gabor Filters
4.1.2 Component Velocities
4.1.3 2D Velocity
4.1.4 Error Calculation
4.2 Deblurring
4.2.1 Regularization Term r(w)
4.2.2 Generating the Filter H(w)
4.2.3 Applying the Deblurring Filter H(w)
4.2.4 Measuring the Deblurring Error

5 Results
5.1 Optical Flow
5.1.1 Optical Flow Test Sequences
5.1.2 Parameter Selection
5.1.3 Evaluating Error
5.2 Deblurring
5.2.1 Deblurring Test Sequences
5.2.2 Determining r(w)
5.2.3 Interpolation
5.2.4 Evaluation of Deblurring
5.3 Combining Optical Flow and Deblurring
5.3.1 Artificially Blurred Images
5.3.2 Real Images

6 Summary and Future Work
6.1 Optical Flow Summary
6.2 Improving the Optical Flow Method
6.3 Deblurring Summary
6.4 Improving the Deblurring Method
6.5 Future Work and Applications

A Calculating Component Velocities

B Phase Identity Derivation

C Fleet and Jepson: Computing 2D Velocity

D MATLAB Code
D.1 Optical Flow Code
D.1.1 function OF
D.1.2 function OF3Dsep
D.1.3 function gabor3Dsep
D.1.4 function compvel
D.1.5 function g3D
D.1.6 function validvel
D.1.7 function stdize_dim
D.2 Deblur Code

List of Figures

2-1 Yosemite: Frame #9 and Ground-Truth Optical Flow Field
3-1 Example PSF and its Fourier Transform
3-2 Inverse Filter and Restoring Function
3-3 Inverse Filter with Threshold and Corresponding Deconvolution Kernel
4-1 Real Parts of Gabor Filter and its Fourier Transform
5-1 Yosemite
5-2 Sphere
5-3 Office
5-4 Street
5-5 Blocks
5-6 Vcbox
5-7 Simple
5-8 Medium
5-9 Complex
5-10 Mean Angular Error vs. Frequency
5-11 Mean Magnitude Error vs. Frequency
5-12 Mean Angular Error vs. Frequency; Difference between Original and Interpolated Tests
5-13 Yosemite Mean Angular Error, Clouds Masked
5-14 Yosemite: Angular Error Map
5-15 Mean Angular Error vs. Temporal Support Size, part 1
5-16 Mean Angular Error vs. Temporal Support Size, part 2
5-17 Mean Angular Error vs. β
5-18 Density vs. Frequency
5-19 Discontinuity Error
5-20 Magnitude Comparison
5-21 Simple Shadow Error
5-22 Images for Deblurring Tests
5-23 Effects of Different r Values
5-24 Office: Effect of Interpolation
5-25 Deblurring Test Results
5-26 Office: Zoom on Picture Frame
5-27 Street: Zoom on Car
5-28 Vcbox: Zoom on Box Label
5-29 Medium: Zoom on Car
5-30 Deblurring Results Using Calculated Optical Flow
5-31 Effect of Magnitude Error on Deblurring
5-32 Effect of Angular Error on Deblurring
5-33 Real Sequence 1
5-34 Real Sequence 2
5-35 Real Sequence 3
5-36 Real Sequence 4
5-37 Real Sequence Results
5-38 Real Sequence Results Using Scaled Velocity Vectors

List of Tables

5.1 Mean Error

Chapter 1

Introduction

This thesis proposes a methodology to correct images that are blurred by motion. Many applications require sharp images, but motion of either the camera or objects in the field of view can cause the image to be fuzzy or blurred. The basic premise of the thesis is to estimate the motion vectors and use this information to restore the photo. The motion estimate is determined using a phase-based optical flow method. The image restoration process applies current deblurring techniques, utilizing a point spread function generated from the optical flow field.

This chapter gives a broad overview of motion estimation and image restoration. The following chapters provide detailed descriptions of relevant work done in the fields of optical flow and deblurring. The two concepts, motion estimation and image restoration, are discussed separately, as historically they developed independently.

1.1 Motion Estimation

Motion estimation is the process of determining the direction of motion within an image. The resulting motion estimates are useful in many image processing and computer vision (CV) applications.
The level of accuracy and resolution necessary in the motion estimates depends on the application for which the estimates are being used. This section describes some basic motion estimation methods and applications.

One use of motion estimates is to compress video sequences, as in the MPEG encoding system [6], [24]. Where possible, MPEG-encoded video streams transmit only the motion estimates and the prediction error, rather than the entire frame. This limited transmission decreases the amount of data that is transmitted and therefore allows for lower bit rates [24]. The encoding scheme transmits either the motion information or the intensity values based on which method requires the smaller number of bits to adequately represent the frame. Since the goal of using the motion information is to reduce the number of bits transmitted, and because increased resolution in the motion information does not necessarily improve the result, pixel or subpixel accuracy in the motion estimation is not required. For video, the frame rate is fast enough that small distortions between frames are not noticeable to the naked eye, and coarse motion estimates are acceptable [19].

Motion estimation can also be used to interpolate frames between the given frames of a video. Temporally interpolating an image in this manner can be useful for converting between frame rates, as in going from a 30 frames per second format to a 60 frames per second format for TV [24], as well as for generating slow-motion effects in video [16].

Many motion estimation methods are rather simple, including, for example, region or block matching. The region matching method involves searching for the best match to a section of the image [10], [24]. This method of motion estimation, and many other similarly simplified methods, makes broad assumptions concerning the constancy in shape and size of the objects in the sequence [19]. These assumptions reduce the effectiveness of the restoration when objects in the image change over time due to reflections, deformations, or occlusions. This type of estimation also works only with a relatively slowly varying image where each region has identifying characteristics. However, for many cases these assumptions are adequate for the application. Again, in video, the frame rate is fast enough that the artifacts from the processing are generally unnoticeable to the human eye.

Motion estimation is also useful in Artificial Intelligence (AI) applications. These applications often require a higher degree of resolution. Within a given frame, this type of processing can help with segmenting and tracking objects, which is beneficial in computer vision for automatic recognition and tracking of visual input. Object tracking can help to reduce noise and enhance the image at occlusion boundaries [16]. This technique can be applied to machine recognition of facial expressions and aid in machine learning [5]. Another use of motion estimation is to reconstruct three-dimensional descriptions from a flat 2D image [7], [16], [6]. In these kinds of applications, pixel or subpixel accuracy can be important, as the image will be interpreted at each frame and should be visually correct as a still. Due to the more stringent accuracy requirements, robustness with respect to variations in lighting and deformation of objects is important. Therefore the coarse motion estimation algorithms mentioned above are poor choices for these applications.
The optical flow method of motion estimation can produce very dense, highly accurate velocity flow fields on the order of subpixel resolution [13]. This capability makes optical flow useful for those applications in which high resolution is necessary, in contrast to the low-resolution estimators previously mentioned. Optical flow methods have also been used to create special effects in movies such as What Dreams May Come and The Matrix [26], [16]. Chapter 2 provides details on methods of generating optical flow.

1.2 Image Restoration

The purpose of image restoration is to enhance or correct a degraded image. Image degradation occurs for many reasons, including noise, quantization through digitization, or blurring. The correction process depends on the type of degradation that occurred. This thesis focuses on restoring images that were degraded by blurring due to motion, and this section describes the process through which the motion blur occurs.

Blurring can be caused by many factors, including lens misfocus, atmospheric distortion, or movement during image capture [24], [10]. Blurring smooths high-frequency components, reducing detail. This blurring is essentially a low-pass filter effect, and so some details may be unrecoverable.

Motion blur is captured in a given frame due to the finite exposure time necessary for the light to travel to the film or photosensors. This phenomenon is a consequence of the physics involved in taking a picture, regardless of whether the camera is film or digital. For simplicity, the concepts will be discussed in terms of digital cameras (pixels versus film grains), although the concepts apply to film cameras as well. A lens focuses light onto the photosensors, which are activated by the light photons. The amount of light received at each sensor determines the degree of that sensor's contribution to the final image. The longer the sensor is exposed to light, the greater the degree of its activation. A large sensor can capture more light over the same time period than a small sensor can. However, the resolution of the final image will be lower, because there are fewer independent pixels.

Motion during the exposure time causes the incoming photons to shift relative to the receiving photosensors. Therefore, neighboring sensors can receive similar visual information, and the involvement of multiple pixels causes the blurring effect in the resulting image. The amount of motion captured, due to the spread of light across the pixels, depends on the length of the exposure time and the velocity of the relative motion between the camera and the scene being captured. Exposure times are constrained by the light available as well as by the mechanical limits of the shutter. The exposure time cannot be shorter than the fastest speed at which the shutter can operate. The required exposure time is a function of the sensitivity and size of the imaging sensors, as well as the external light conditions. Low-light conditions require a longer exposure time in order to achieve acceptable contrast in the image.

The motion blur is a relative effect, and therefore can be caused by movement of either the subject or the camera. When the camera is the source of movement, the blur field is generally consistent throughout the image [10]. However, motion in front of the camera may also contribute to the blurring effect.
When the motion lies in the field of view of the camera, the motion contributing to the blurring may vary at points across the image, making this situation more difficult to analyze.

Many cameras come with stabilization hardware to reduce handling motion on the part of the photographer. Hardware solutions have also been developed to track camera motion during the image capture and provide a motion estimate [3]. Sensors on the subject of the image can be used to track motion in front of the lens as well, but this sensor tracking requires cooperation from the subject, as well as foreknowledge of the moving objects to be photographed. This thesis assumes that there is no prior knowledge of the direction of motion from either the camera or the subject.

In many cases the motion is undesirable from both a visual and a practical standpoint. In addition to being more visually appealing, unblurred images are important for the future processing the image may undergo. Images with sharper edges and more distinct features and textures improve the ability of many AI applications, such as those mentioned in Section 1.1, to track and segment objects. Furthermore, identification processes such as iris recognition require clear, high-resolution images to perform the identification. Acquiring adequately detailed images without postprocessing involves cooperative interaction between the subject and the imager, which can be difficult in many cases. Estimating the motion blur is critical for accurately restoring the image to remove the motion. More details on the restoration process appear in Chapter 3.

Chapter 2

Relevant Work on Optical Flow

Over the past 20 years, optical flow has become a popular technique for motion estimation in many AI applications as well as in visual special effects. However, optical flow methods are far from perfect. There is a fundamental limitation in that the method requires the velocity to be calculated in two dimensions, while there is only a single intensity value at each spatial position in an image. This results in a single equation with two unknowns. This deficiency is commonly referred to as the aperture problem. Assumptions must be made to generate a uniquely solvable system of equations. These assumptions place limits on the quality of the results and typically take the form of brightness constancy or phase constancy. This chapter discusses these common approaches to solving the aperture problem. The first two methods illustrate brightness-constancy approaches, while the third method described here represents a phase-based approach. The chapter concludes with a discussion of the analysis and evaluation of optical flow methods.

2.1 Brightness-Constancy

The common optical flow method of brightness-constancy assumes that the intensity of a given pixel does not change as the pixel translates from one frame to the next. This assumption is violated by changing light sources, rotation, and specular reflection, among other things [15]. However, in many cases this assumption is accurate over the time frame of the image sequence and can be used to generate an optical flow field. Brightness constancy can be written as in Equation 2.1 below, with x = (x, y) and v = (dx/dt, dy/dt) = (u, v):

I(x, 0) = I(x + vt, t)   (2.1)

This is equivalent to stating that the brightness is conserved over time, which can be expressed as:

I(x, t) = c   (2.2)

Solving the above equation therefore becomes the fundamental problem in brightness-constancy optical flow.
Taking the first-order derivative of Equation 2.2, we get:

dI(x, t)/dt = 0   (2.3)

Using the chain rule for differentiation, Equation 2.3 becomes:

(∂I/∂x)(dx/dt) + (∂I/∂y)(dy/dt) + ∂I/∂t = I_x u + I_y v + I_t = 0   (2.4)

Equation 2.4, called the gradient constraint equation in the literature [2], is the main equation for first-derivative gradient-based optical flow methods. Solutions to the brightness-constancy equation have also been developed using second-order derivatives. However, these methods have been demonstrated to be less effective in practice [2], and therefore will not be discussed in this thesis. The deviation of this equation from zero will be called the error in brightness constancy, ε_b. Minimizing the error ε_b is the main objective in choosing values for the velocity vector. However, the aperture problem still needs to be solved, since there is a single equation in I with two variables u and v.

Two methods provide the basis for much of the work done in optical flow using brightness-constancy methods: the first-order differential methods developed by Horn and Schunck and by Lucas and Kanade. Sections 2.1.1 and 2.1.2 describe these methods, respectively.

2.1.1 Horn and Schunck

Horn and Schunck proposed a method in which they applied a global constraint to the brightness-constancy equation to arrive at a solution. They assume that the brightness is smoothly varying over the image, and use a least-squares minimization to solve for the flow velocity. This generates a system of equations involving ε_b from Equation 2.4 above and the smoothness constraint ε_s given in Equation 2.5 below, which is then solved for the velocity vector. The smoothness constraint is:

ε_s² = (∂u/∂x)² + (∂u/∂y)² + (∂v/∂x)² + (∂v/∂y)²   (2.5)

This leads to the total error

ε² = ∫∫ (α² ε_s² + ε_b²) dx dy   (2.6)

where α is a weighting factor which helps mitigate noise effects, especially in areas where the intensity gradient is small. This total error ε² is minimized to solve for the velocity components u and v. The solution can be found by using iteration techniques in order to simplify the computation.

The combination of the global smoothness constraint and the iteration techniques used in implementation generates a 'fill-in' effect in the resulting optical flow output [21]. In regions where the brightness gradient is small, the local velocities are propagated into the interior of the constant region. This can be useful for recovering motion vectors on a smooth, translating object. However, this filling-in effect also illustrates some of the restrictions of the global smoothness constraint, as rapidly changing velocities will be absorbed into the smoothing function as well and possibly lost in the final output. Still, for input images satisfying the assumptions used in this method, the optical flow output can be very accurate. Please see [21] for more specifics on the Horn and Schunck optical flow method.
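As an illustration of this iterative solution, the sketch below implements the classic Horn and Schunck update equations in MATLAB. It is a minimal sketch rather than the implementation of [21]: the input frames I1 and I2, the smoothness weight alpha, the fixed iteration count, and the simple derivative estimates are all assumptions made for illustration.

    % Minimal sketch of the classic Horn and Schunck iteration.
    % Assumed inputs: double-valued grayscale frames I1, I2 and weight alpha.
    [Ix, Iy] = gradient(I1);                      % spatial derivatives
    It = I2 - I1;                                 % temporal derivative
    u = zeros(size(I1));  v = zeros(size(I1));
    avgK = [0 1 0; 1 0 1; 0 1 0] / 4;             % 4-neighbour averaging kernel
    for iter = 1:100
        ubar = conv2(u, avgK, 'same');            % local average of u
        vbar = conv2(v, avgK, 'same');            % local average of v
        num  = Ix.*ubar + Iy.*vbar + It;          % brightness-constancy residual
        den  = alpha^2 + Ix.^2 + Iy.^2;
        u = ubar - Ix .* num ./ den;              % standard Horn-Schunck updates
        v = vbar - Iy .* num ./ den;
    end

Each pass replaces the velocity at a pixel by its neighbourhood average, corrected by the local brightness-constancy residual; this averaging step is the mechanism behind the fill-in effect described above.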
2.1.2 Lucas and Kanade

Lucas and Kanade used a similar approach to that of Horn and Schunck to calculate optical flow velocities. However, instead of applying a global constraint and minimizing the error over the whole image, Lucas and Kanade chose to apply their constraint to local regions Ω, using a window function W(x). The assumption used here to solve the aperture problem is that within the region Ω, the velocity vector is constant [2], [8]. Therefore, minimizing with a weighted least-squares solution, the Lucas and Kanade equation can be written:

Σ_{x∈Ω} W²(x) [I_x u + I_y v + I_t]²   (2.7)

In most cases, the localized method of Lucas and Kanade is considerably more accurate than the global method of Horn and Schunck [2], [17]. Additionally, Lucas and Kanade provide a method for evaluating the accuracy of the velocity flow at each estimate. Eliminating the poor estimates contributes to greater overall accuracy, although it also decreases the density of the optical flow output [2]. Also, due to the windowing process of Lucas and Kanade, there is no gradient information in areas where the brightness is constant, and the velocity flow at those locations cannot be determined. In contrast, Horn and Schunck's global smoothness approach succeeds at providing a flow vector in those regions due to the fill-in effect. Therefore, both methods have practical uses depending on the image sequence. Work has also been done to combine these methods to exploit the strengths of each. Such combination methods, like the one described in [8], can mitigate the adverse effects of the various differential systems by integrating components of both to generate a dense, accurate optical flow field.

Many other methods have been derived starting from the brightness-constancy assumption. These include affine-motion models, several second-order differential approaches, as well as region-based matching (described in Section 1.1) and energy-based methods. However, these methods are generally less effective than those previously described [2], [17]. Additionally, the basic methods have been used much more frequently in comparison studies of new methods, and therefore provide a better basis for evaluation.

2.2 Phase-Based Methods: Fleet and Jepson

Another approach to the optical flow calculation, based on phase gradients, was developed by Fleet and Jepson [13]. This approach assumes phase constancy. The method is similar in derivation to those using the brightness assumption, and the approach is also similar in concept to frequency-based methods [13]. However, Fleet and Jepson demonstrated the robustness of phase gradients with respect to noise and variations in brightness [14]. This stability indicates that phase-based methods should perform better than local differential ones when the basic assumption of constant brightness is invalid.

The phase gradients used in Fleet and Jepson's approach are developed by convolving the image with a set of complex Gabor filters. Each filter within the set is tuned to a specific frequency and orientation. A component velocity in the direction normal to the filter orientation can be determined for each filter tuning. The use of the normal direction solves the aperture problem for this approach. The combination of the component velocities over all filters generates a final 2D velocity. The calculation for Fleet and Jepson's phase-based optical flow method is explained below.

Component Velocity

Generating a component velocity for a given filter starts by convolving the image with a complex Gabor filter to generate a complex output R(x, t). The complex filter Gabor(x, t) is composed of a sinusoid modified by a Gaussian G(x, t), such that

Gabor(x, t) = e^{j(x,t)·(f_0, ω_0)} G(x, t; C)   (2.8)

where the values f_0 and ω_0 represent the frequency and orientation of the filter, respectively. Section 4.1.1 will describe the Gabor filter in greater detail.
The output is the result of the convolution of the image with the complex filter given above:

R(x, t) = I(x, t) * Gabor(x, t)   (2.9)

The complex output R(x, t) can be written in terms of its magnitude and phase as R(x, t) = ρ(x, t)e^{jφ(x,t)}. The magnitude component ρ is subject to fluctuations in brightness, and therefore the phase φ is used to track motion over time in the image. The assumption of phase constancy leads to the initial equation:

φ(x, t) = c   (2.10)

where c ∈ R. Taking the derivative with respect to time, Equation 2.10 becomes:

φ_x u + φ_y v + φ_t = 0   (2.11)

which is analogous to the gradient constraint equation 2.4 given above. Appendix B demonstrates one way to calculate the phase gradient from a discrete input. The unit normal to the spatial phase gradient is defined as:

n(x, t) = ∇_x φ(x, t) / ||∇_x φ(x, t)||   (2.12)

Writing the velocity in the direction of the normal as

v_n = α n(x, t)   (2.13)

solves the aperture problem. The value of the speed coefficient α can be derived from the phase gradient constraint. The results given in [13] indicate that the component velocities v_n provide an accurate measure of the flow in the direction of the corresponding filter for a variety of image sequences, with the error measured relative to the normal. The use of component velocities also allows multiple motions to be estimated at a given spatial location [12]. Multiple velocities can occur at places like an occlusion boundary, or where transparent objects are present [9], and accounting for these conditions can greatly increase overall accuracy.

2D Velocity

To construct the 2D velocity from the component velocities, Fleet and Jepson applied a least-squares solution to local estimates, which is further explained in Appendix C. However, the use of local estimates assumes smoothness of the optical flow field, which is one issue inherent in brightness-constancy methods that the use of phase was meant to circumvent. This implicit assumption of smoothness leads to increased error around regions such as occlusion boundaries. Using this 2D velocity field also eliminates the capability of representing multiple motions [12]. Despite these inefficiencies, the 2D velocity derived from the phase-based approach still performed well, achieving results similar to those of Lucas and Kanade [2]. Like Lucas and Kanade, Fleet and Jepson provide accuracy measurements that can eliminate invalid velocity estimates to improve performance. Additionally, Fleet and Jepson demonstrated the robustness of phase-based methods to variations in brightness in [14], indicating that an improved method for developing the 2D velocity from the component velocities could further increase the accuracy of the optical flow output.

After Fleet and Jepson's work on phase-based optical flow, various other phase-based methods were implemented. One of these implementations uses neural networks to combine the component velocities into a single resulting velocity flow [18]. Another implementation uses phase techniques to generate edge maps, which are then tracked to determine velocity estimates at those edge points [2]. Work has also been done to increase the computational efficiency of the phase-based method, which is generally quite computationally expensive [6]. Many of these methods claim accuracy similar to that of the Fleet and Jepson method but have not been included in large objective studies, so it is difficult to make an objective comparison.
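The appeal of phase as the quantity to track can be checked with a small numerical experiment. The sketch below is a hypothetical 1D MATLAB example (not taken from the thesis code): it filters a signal with a complex Gabor kernel, scales the signal to simulate a global illumination change, and confirms that the magnitude of the response follows the brightness change while the phase does not.

    % Hypothetical 1D check of phase robustness to brightness scaling.
    x  = -20:20;
    gb = exp(1j*0.5*x) .* exp(-x.^2 / (2*4^2));   % complex Gabor kernel, 0.5 rad/sample
    I1 = 1 + sin(0.5*(0:99));                     % hypothetical signal
    I2 = 1.7 * I1;                                % same scene under brighter illumination
    R1 = conv(I1, gb, 'same');
    R2 = conv(I2, gb, 'same');
    max(abs(abs(R2) - abs(R1)))       % large: magnitude tracks the brightness change
    max(abs(angle(R2) - angle(R1)))   % essentially zero: phase is unaffected

A global contrast change rescales the complex response without rotating it, so the phase contours remain fixed; this is the robustness referred to in [14].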
2.3 Method Analysis and Evaluation

Evaluating each optical flow method presents a problem, as there are no formally standardized methods of comparison. A few studies have compared different optical flow methods, such as [2] and [17]. These studies have focused more on the brightness-constancy methods than on the phase-based methods. Additionally, even within these studies, there are some difficulties with the analysis. This section describes a few of the issues that arise when testing the performance of a given optical flow method.

2.3.1 Image Sequences

One issue that arises is the generation of the "true" flow field. Natural images do not come with a corresponding flow field, and it is difficult to create a field to match a natural image for any but the simplest motions. Ray-tracing programs are used to approximate the ground-truth optical flow, but their capabilities can limit the complexity of the scene. For example, the ray-tracing program used in [11] does not take into account the motion of the shadows. Shadow effects are a cause of both illumination changes and multiple motions [12]. Neglecting these effects can lead to variation between the ground-truth and the calculated motion fields.

Synthetic images are therefore frequently used to evaluate the success of a given optical flow method. The synthetic generation of the image allows for the creation of a flow field to match the motion in the image. Even these synthetic images tend to be simplified, containing broad regions of similar motion patterns. For example, in the comparison study by Barron [2], each test sequence analyzed a specific type of motion, such as translational or dilating motion. These broad motion fields often contribute to better optical flow results, as assumptions such as brightness constancy are more likely to hold than in a more complex motion pattern.

However, synthetically created images are not entirely reliable, either. The Yosemite sequence, one frame of which is shown below alongside its ground-truth flow field, has in particular been the subject of much debate, due to the motion in the cloud region.

Figure 2-1: Yosemite: Frame #9 and Ground-Truth Optical Flow Field

Professor Black of Brown University claims, "There is no ground truth for the cloud motion" [4]. If this is the case, there is no way to measure optical flow accuracy in that region, and therefore the region should not be used to evaluate error. Removing the cloud motion from the scene simplifies the flow field significantly by eliminating the occlusion boundary between the sky and the land [1]. However, since the image sequence is commonly used in the field, it remains useful for comparison among optical flow techniques. Hopefully more intricate synthetic sequences will be designed and standardized in the future.

2.3.2 Parameter Choice

The parameter choice for each algorithm provides another point of contention in reviewing optical flow results. For most optical flow methods, there are adjustable parameters in the implementation. These parameters include the size or shape of a window kernel or filter, as well as the thresholds for discarding invalid velocity estimates.
However, since there is no predetermined optimal choice for those parameters, the selection of parameters for a given test can have a large impact on the results. In many papers, the reasons for the parameters chosen are not given, and it is unclear whether the parameter choice was optimal. Experimentally, it can be shown that the optimal parameter choices also differ depending on the image sequence. Work done in [7] attempted to define a set of guidelines to assist in parameter selection for gradient-based optical flow methods, and demonstrated that results can be significantly improved with different parameter settings. This thesis uses experimentally determined optimal parameters tested over a variety of image sequences.

Chapter 3

Relevant Work on Deblurring

Deblurring is a method of image restoration aimed at sharpening edges and bringing out details in the image. A point spread function (PSF) is a mathematical description of the way the blurring occurred. The accuracy of the modelled PSF is highly correlated with the ability to restore the image. The following sections give a mathematical description of the blurring problem for motion and propose solutions for restoring the image assuming the PSF is known.

In this chapter, as in the rest of the thesis, the common notation a(x) denotes a spatial function and A(w) denotes its corresponding frequency-domain function. The Fourier transform property of convolution will be used. This property states that a(x) * b(x) in the spatial domain becomes A(w)B(w) in the frequency domain, and a(x)b(x) in the spatial domain becomes A(w) * B(w) in the frequency domain.

3.1 Motion Blur as Convolution

To model motion blur in one dimension, under the assumption that the sensors are linearly receptive to light, we can write the response to a specific point of light with no motion at point x as

u(x) = μ ∫_0^T I dt   (3.1)

where I is the intensity of the light, T is the exposure time of the pixel to the light, and μ is the sensitivity of the sensor to light. This equation shows that the response of the sensor to a static light source is simply the total amount of light received at the sensor, scaled by the sensitivity of the sensor. Under the assumption that the intensity of the light is constant during the exposure time T, this equation becomes

u(x) = μIT   (3.2)

Extending this concept to moving light sources, the response of a pixel to a moving light source depends on the amount of time the pixel receives light from that source. If ℓ is the size of the sensor, and the velocity is constant with v = L/T, the light will travel across N pixels, such that N = L/ℓ. Therefore, for a single point of light on one sensor n, the effective exposure time of the pixel is t_n = T × ℓ/L. This relationship leads to the equation below:

k_n = μ ∫_0^{t_n} I dt   (3.3)

This equation describes the total amount of light received at pixel n from one point of light. Assuming uniform light intensity during the exposure time, it also states that the light received at pixel n is the initial response u(x) scaled by the fraction of time that the light source affected the pixel, the value ℓ/L:

k_n = μI × t_n = (μI × T) × ℓ/L = u(x) × ℓ/L   (3.4)

Therefore the total response u(x) is equivalent to the sum of the responses k_n over the N sensors along the light path, Σ k_n [10], [24].
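A quick numerical check of this 1D model (with arbitrary, hypothetical values for μ, I, T, and N) confirms that the per-sensor responses k_n sum back to the static response u(x), and that spreading u(x) evenly over the N sensors along the path is the same as convolving the static scene with a length-N box kernel.

    % Hypothetical numbers: one point of light sweeping across N sensors.
    mu = 1;  I = 2;  T = 0.1;          % sensitivity, intensity, exposure time
    u  = mu * I * T;                   % static response u(x), Eq. 3.2
    N  = 5;                            % pixels crossed during the exposure
    kn = repmat(u/N, 1, N);            % per-sensor responses k_n, Eq. 3.4
    fprintf('sum of k_n = %g, u(x) = %g\n', sum(kn), u);   % the totals agree
    % For a whole 1D scene this is: blurred = conv(scene, ones(1,N)/N, 'same')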
This response can be extended into two dimensions and written as a convolution of the ideal image y(x) with the blurring kernel k(x) [22], [10], [24]:

b(x) = y(x) * k(x)   (3.5)

Using a block diagram to indicate the convolution process, the blurring function can be shown as:

y(x) -> [ k(x) ] -> b(x)

The process of deconvolution is therefore determining the estimate ŷ(x) by filtering the blurred image with some filter h(x):

b(x) -> [ h(x) ] -> ŷ(x)

The estimated image is then:

ŷ(x) = b(x) * h(x)   (3.6)

Converting into the frequency domain, this becomes Ŷ(w) = B(w)H(w). After substituting B(w) = Y(w)K(w) from Equation 3.5, the estimated image in frequency is:

Ŷ(w) = Y(w)K(w)H(w)   (3.7)

To recover Y(w) from Ŷ(w), a simple solution to this deconvolution problem sets the filter H(w) equal to 1/K(w).

3.2 Noise

This method of inverse filtering for deconvolution is highly noise-sensitive. The sensitivity arises because the blurring function tends to exhibit a low-pass filter effect, with small values at high frequencies. When the inverse is taken, the small values become large, amplifying any error that exists at those frequencies [10], [24]. The frequency regions where the signal is small are also regions where the signal-to-noise ratio is very low, and therefore these regions will tend to amplify mostly noise [22]. Figure 3-1 shows an example low-pass PSF and its Fourier transform K. Figure 3-2 shows the inverse filter H and its inverse Fourier transform, the deconvolution kernel h. Note that the scale for H shoots up to the order of 10^11.

Figure 3-1: Example PSF and its Fourier Transform

Figure 3-2: Inverse Filter and Restoring Function

This noise amplification effect can be demonstrated mathematically by including the noise in the system model. Clearly, the type of noise modeled will affect the deconvolution solution. When the system includes some additive noise v(x), which could be contributed by quantization [23], for example, the block diagram gains an additive noise input, and the blurring function b(x) and the recovered image Ŷ(w) become:

b(x) = y(x) * k(x) + v(x)   (3.8)

Ŷ(w) = Y(w)K(w)H(w) + V(w)H(w)   (3.9)

After inserting the new expression for B(w) corresponding to Equation 3.8, the result of applying the inverse filter H(w) = 1/K(w) becomes:

Ŷ(w) = Y(w) + V(w)/K(w)   (3.10)

This expression mathematically explains the noise amplification described above. One method of mitigating the noise problem is to apply a threshold to cut off the large high-frequency values. The effect of a simple threshold is shown in Figure 3-3.

Figure 3-3: Inverse Filter with Threshold and Corresponding Deconvolution Kernel

It is easy to see that the kernel resulting from the thresholded H is much smoother, and therefore less likely to amplify high-frequency noise, although this smoothing also affects the accuracy of the resulting deconvolution kernel. One form of threshold can be written using a regularization term r(w), such that

H(w) = K*(w) / (|K(w)|² + r(w))   (3.11)

For r(w) << |K(w)|², this filter approaches the ideal inverse filter. However, when |K(w)|² is small, the regularization factor r(w) helps to reduce the magnitude of H(w) and prevents the noise term from dominating. Choosing the regularization factor is important. The Wiener filter for deconvolution is a special case of regularization, in which r(w) is the squared noise-to-signal ratio [10].
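The behaviour of Equation 3.11 is easy to demonstrate numerically. The sketch below is a hypothetical 1D MATLAB example (the blur length, noise level, and constant r are arbitrary choices): it blurs a signal, adds noise, and compares the pure inverse filter with the regularized filter, whose gain stays bounded where |K(w)| is small.

    % Hypothetical 1D comparison of inverse vs. regularized deconvolution.
    n = 256;
    y = double(rand(1, n) > 0.5);                        % sharp test signal
    k = ones(1, 9) / 9;                                  % length-9 motion blur (low-pass)
    K = fft(k, n);
    b = real(ifft(fft(y, n) .* K)) + 0.01*randn(1, n);   % blurred signal plus additive noise
    B = fft(b, n);
    y_inv = real(ifft(B ./ K));              % pure inverse: noise amplified where |K| is small
    r = 0.01;                                % constant regularization term
    H = conj(K) ./ (abs(K).^2 + r);          % regularized filter, Eq. 3.11
    y_reg = real(ifft(B .* H));              % estimate with bounded gain

With r = 0 the filter reduces to the pure inverse; increasing r trades residual blur for noise suppression, which is the trade-off examined experimentally in Section 5.2.2.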
However, calculating the noise-to-signal ratio required by the Wiener filter can present a problem, as the power spectrum is not always easy to estimate. There are many variations on regularization filtering that improve its functionality. One major concept, used in many image processing applications in addition to Wiener filtering, is the idea of making the filter adaptive. Adaptive processes alter the filter according to local signal properties to generate a more accurate result [24]. For example, the Wiener filter requires a spectral estimate of the signal and its noise properties. In solid background regions, the signal-to-noise characteristics will differ greatly from the characteristics in densely detailed regions of the image, and generating separate statistics for various portions of the image can help generate a more accurate filter at each point. Additionally, deblurring operations can be iterated to increase the accuracy of the output. Other methods, such as the Lucy-Richardson algorithm, use iterated solutions to generate the deblurred image.

3.3 Blind Deconvolution

Blind deconvolution is a method of deblurring the image when the blurring kernel is unknown. In contrast, all of the methods described above require that the blurring kernel be known. The blind deconvolution process involves applying constraints to the system in order to solve for the original image y(x). Information about the imaging system, as well as properties of the signal itself, can be used to provide constraints [23]. Assumptions concerning the original signal or the PSF can also be used to solve the blind deconvolution problem [24]. Once the constraints have been determined, the system is minimized to find the solution that most closely matches the blurred result. The MATLAB deconvblind function uses a maximum-likelihood approach to minimize the error between the restored image convolved with the predicted PSF and the initial blurred image. This function assumes a set PSF size, which is entered by the user when calling the function. This thesis uses the deconvblind function for comparison with the deblurring results.

Chapter 4

Approach

This thesis focuses on correcting motion blur using optical flow velocity fields. All of the image sequences used for testing were in grayscale. The optical flow and deblurring portions were developed separately and then combined at the end.

4.1 Optical Flow

The optical flow method used closely follows that of Fleet and Jepson described in [13]. Phase-based methods were chosen because of their robustness with respect to changing illumination and deformations [14], [6]. This section describes the specific implementation developed in this thesis.

4.1.1 Gabor Filters

In Section 2.2, the Gabor filter used for processing the image was briefly discussed. To review, this filter is a sinusoid modified by a 3-dimensional Gaussian envelope with covariance C:

Gabor(x, t) = e^{j(x,t)·(f_0, ω_0)} G(x, t; C)   (4.1)

The values f_0 and ω_0 represent the frequency and orientation of the filter, respectively. For separability, which speeds computation, it is convenient to use a zero-mean Gaussian that is separable in 3 dimensions, each dimension with a standard deviation of σ, so that the covariance matrix becomes

C = diag(σ², σ², σ²)

The covariance argument to the Gaussian will now be referred to by the corresponding σ. In the frequency domain, G(x, t; σ) transforms into G(x, t; σ_k), with σ_k equal to 1/σ. The exponential component represents a frequency shift, and the resulting filter is a Gaussian with standard deviation σ_k located at (f_0, ω_0) in the frequency plane. The value of σ was determined as in [13], with an octave bandwidth β such that

σ_k = f_0 (2^β − 1) / (2^β + 1)   (4.2)

in frequency, and σ = 1/σ_k. Figure 4-1 presents an example Gabor filter, showing the space and frequency domains on the left and right, respectively. For this filter, f_0 = 0.15, β = 0.8, and ω_0 = 45°. For both domains, the center of the image is set to the coordinates (0, 0). The Fourier transform is shown from −π to π in both the x and y directions.

Figure 4-1: Real Parts of Gabor Filter and its Fourier Transform
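A minimal sketch of constructing such a kernel is given below. The function name, the explicit temporal-frequency argument ft, and the support radius argument r are assumptions made for illustration; the filter construction actually used in this thesis is function gabor3Dsep in Appendix D.1.3.

    function g = gaborSketch(f0, theta, ft, beta, r)
    % Minimal sketch: complex space-time Gabor kernel tuned to spatial
    % frequency f0 (radians/pixel) at orientation theta, with temporal
    % frequency ft; r is the half-width of the kernel support in pixels.
    sigma_k = f0 * (2^beta - 1) / (2^beta + 1);   % frequency-domain std dev, Eq. 4.2
    sigma   = 1 / sigma_k;                        % space-domain std dev
    [x, y, t] = ndgrid(-r:r, -r:r, -r:r);
    fx = f0 * cos(theta);                         % spatial frequency components
    fy = f0 * sin(theta);
    G = exp(-(x.^2 + y.^2 + t.^2) / (2*sigma^2)); % separable Gaussian envelope
    g = exp(1j * (fx*x + fy*y + ft*t)) .* G;      % complex sinusoid times the Gaussian
    end

Because both the complex sinusoid and the Gaussian factor over x, y, and t, the same kernel can also be applied as three 1D convolutions, which is the speedup that motivates the separable form above.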
4.1.2 Component Velocities

Using the basic equations from Section 2.2, we can derive a simple method for explicitly calculating the component velocities from the phase φ(x, t) of the filtered image R(x, t). The basic equations are given again below:

φ_x u + φ_y v + φ_t = 0   (4.3)

n(x, t) = ∇_x φ(x, t) / ||∇_x φ(x, t)||   (4.4)

v_n = α n(x, t)   (4.5)

After simplification, shown explicitly in Appendix A, the final equations for the velocity components u_n and v_n are:

u_n = −φ_t φ_x / (φ_x² + φ_y²)   (4.6)

v_n = −φ_t φ_y / (φ_x² + φ_y²)   (4.7)

The derivatives of φ, namely φ_x, φ_y, and φ_t, can be determined using the phase identity

∇φ(x, t) = Im[R*(x, t) ∇R(x, t)] / ρ²(x, t)   (4.8)

derived in Appendix B, and therefore the component velocities u_n and v_n can be determined. A short numerical sketch of this calculation appears at the end of Section 4.1.3.

4.1.3 2D Velocity

The 2D velocity was calculated differently from the method of Fleet and Jepson in [13]. As mentioned in Section 2.2, Fleet and Jepson used a least-squares fit over a local region, which implies an assumption of a smooth surface. They acknowledged that this solution required further work [12], as the assumption subjects the 2D velocity to the same constraints as the brightness-constancy methods. In this thesis, the component velocities were combined using a normalization process across all filters, eliminating those velocities that fell outside a normalized distance of 2 from the mean, using a standard deviation of 1.5. The values of the standard deviation and the distance from the mean were determined experimentally, to generate accurate results while maintaining adequate density in the final velocity field.

In order to completely cover the frequency space, the filter set was designed to cover 360 degrees. The filters were tuned to frequencies 2σ apart, so that each filter spanned a radius of σ. The frequency extent of each filter was limited to one σ. Due to the inability to simultaneously localize in the space and frequency domains [20], [25], this frequency restriction causes spreading in the space domain. However, limiting the frequency range of the filter reduces the error in the corresponding component velocity and is used as one of the measures for eliminating invalid velocity values in [13]. This filter spacing was chosen because it sets a narrow frequency band while limiting the spreading in the space domain.
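As a concrete illustration of Sections 4.1.2 and 4.1.3, the sketch below computes the phase gradients and the component velocity for a single complex filter output R(x, y, t), using the phase identity of Equation 4.8. The variable names are assumptions made for illustration; the corresponding thesis code is function compvel in Appendix D.1.4.

    % Minimal sketch: component velocity from one complex filter output R.
    [Rx, Ry, Rt] = gradient(R);              % finite-difference gradients of R(x, y, t)
    p2   = max(abs(R).^2, eps);              % rho^2, guarded against division by zero
    phix = imag(conj(R) .* Rx) ./ p2;        % phase gradients via the identity of Eq. 4.8
    phiy = imag(conj(R) .* Ry) ./ p2;
    phit = imag(conj(R) .* Rt) ./ p2;
    g2   = max(phix.^2 + phiy.^2, eps);      % squared norm of the spatial phase gradient
    un   = -phit .* phix ./ g2;              % component velocity, Eqs. 4.6 and 4.7
    vn   = -phit .* phiy ./ g2;

Repeating this for every filter in the set yields one component velocity per filter at each pixel; these estimates are then screened and averaged as described in Section 4.1.3 to form the 2D velocity.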
4.1.4 Error Calculation

Assuming the ground-truth flow field is known, there are multiple ways to calculate the error between the ground-truth flow field and the output field of the optical flow method being tested. One of the error measures used in this thesis is the angular error described in [12], given below in Equation 4.9. This error measure was also used in the comparison study performed by Barron et al. [2]. Since many newer methods reference the Barron study frequently, using the angular error measure allows the results to be analyzed in reference to the previous work. Using the notation c for the correct velocity vector and e for the velocity estimate, with ĉ and ê the corresponding unit vectors, the angular error measure is given by:

ψ_E = arccos(ĉ · ê)   (4.9)

The second error measure is a simple magnitude measure, also used in [17]:

E_mag = ||c − e||   (4.10)

Together these two error measures completely specify the vector, since they include both the direction and the magnitude. Additionally, these error measures are useful because of their compatibility with the fspecial function used in the deblurring process, which will be described in Section 4.2.

4.2 Deblurring

A regularization filter, as described in Section 3.2, was used at each point in an image to deblur it. The filter is based on the estimated motion blur and an experimentally determined value that adjusts the extent of its effect on the image. The motion-blur estimate is produced by the MATLAB fspecial function with the 'motion' argument. This section provides further details on the implementation of the filter and its application to the image.

4.2.1 Regularization Term r(w)

The regularization term r(w) is a constant that was experimentally determined to achieve adequate image sharpening while maintaining reasonable magnitudes. The use of a constant effectively sets a bound on the magnitude of the filter at all frequencies. The term r(w) is frequency-independent because the signal and noise properties are assumed to be unknown in this thesis. A frequency-dependent term, such as the one used in Wiener filtering, could be applied to images where information on the signal and noise properties is available.

4.2.2 Generating the Filter H(w)

To create the regularization filter H(w), MATLAB's fspecial function was used with the 'motion' argument, which simulates the effect of camera motion. The other two arguments to this function are an angle Z and a length L. The values of these arguments were determined for each point in the image from the velocity vector v at that point. For a given point, the angle Z was determined using the four-quadrant arctangent of v, while the length L corresponded to the magnitude of v. The output of the fspecial function using Z and L is the motion-blur kernel k(x) for the input image. The filter H(w) was then computed directly from K(w), the Fourier transform of the PSF, and r(w), using the regularization filter equation 3.11, repeated below:

H(w) = K*(w) / (|K(w)|² + r(w))   (4.11)

Discretization affects the calculation and processing of the deblurring filter. The conversion of the PSF into the deblurring filter occurs in the frequency domain, and the resolution of the discrete Fourier transform is limited by the size of the function in either dimension. In addition, the calculation of the PSF requires an angle and a length. The discretization of the pixels means that the angles are fit to the closest approximation and that the lengths are rounded to integer values. As the lengths of the vectors increase, there are more pixels available to achieve a more accurate angle approximation, as well as to reduce the effect of rounding to the nearest integer. Therefore the value of L passed to the fspecial function was a scaled version of the original magnitude, and the image was interpolated by a corresponding amount prior to filtering to maintain the relative velocity-per-pixel ratio. Interpolation increases the computation required but helps to minimize error due to discretization.
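A minimal sketch of this construction for a single pixel's velocity estimate is shown below. The interpolation factor, the padded DFT size, and the value of r are placeholders; the deblurring code actually used in this thesis is listed in Appendix D.2.

    % Minimal sketch: regularized deblurring filter from a 2D velocity
    % estimate (vx, vy) at one pixel (assumed variable names).
    scale = 4;                                 % interpolation factor (see text)
    L     = max(scale * norm([vx, vy]), 1);    % blur length in interpolated pixels
    Z     = atan2(vy, vx) * 180 / pi;          % 4-quadrant angle, in degrees for fspecial
    k     = fspecial('motion', L, Z);          % motion-blur PSF k(x)
    K     = fft2(k, 64, 64);                   % zero-padded DFT of the PSF
    r     = 0.01;                              % constant regularization term r(w)
    H     = conj(K) ./ (abs(K).^2 + r);        % regularization filter, Eq. 4.11
    h     = real(ifft2(H));                    % space-domain kernel, applied as in Section 4.2.3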
4.2.3 Applying the Deblurring Filter H(w)

Filtering the image occurred in the space domain, due to the space/frequency simultaneous-localization problem. To maintain localization in the space domain, the degraded image was processed with h(x), the inverse Fourier transform of H(w). Because convolution is the counterpart of multiplication when switching between domains, this would normally increase the computation significantly. However, since the convolution is only being evaluated at a single point for each velocity vector (and hence each PSF), the computational increase is small. The signal h(x) is flipped over both axes, aligned with the interpolated image, and the two signals are multiplied. The sum of the results of this multiplication is equivalent to the convolution of the two signals at the point in question, and this value becomes the value of the output image at that point. Conveniently, the error measures applied in the optical flow component (see Section 4.1.4) are also in the form of an angle and a length, and therefore the amount of error generated in the optical flow portion of the processing should correlate with the error in the deblurring portion of the processing.

4.2.4 Measuring the Deblurring Error

Calculating a specific error value for the deblurring process is difficult. Therefore, the margin of error is determined subjectively, by visually comparing the reconstructed image to the original. The specific points of interest in the comparison are image sharpness and detail. Additionally, the output image was compared to the results of a blind deconvolution, as well as to the results of a Lucy-Richardson filter using the average PSF of the image.

Chapter 5

Results

This chapter evaluates the performance of the system. An explanation of the testing procedures is given first, and the results of the tests follow. The optical flow and deblurring components are analyzed separately at first, and the combined results are given at the end.

5.1 Optical Flow

This section gives the experimentation and analysis for the optical flow component of this thesis.

5.1.1 Optical Flow Test Sequences

The test sequences in this thesis include the Yosemite sequence mentioned earlier, as well as some other synthetic images. These synthetic sequences were retrieved from [11], where the ground truth was calculated using a ray-tracing program. For areas where a section of an image remains constant in intensity despite the motion, the ray tracer returns a value of approximately zero for the velocity. This is noticeable where large, opaque, smooth objects translate across the image; the motion of the object's interior is not captured. Additionally, in regions where the magnitude of the velocity approaches zero, it is possible to get large angular errors, as the angle of a vector with zero magnitude has no practical meaning. Therefore, these regions with near-zero velocities are masked out for the error calculations.

Another important point to note is that the given ground truth is only relevant for a specific frame of the test sequence. In cases where the direction of motion changed between frames, succeeding frames of the same sequence have different ground truths. The values used to calculate error consisted of an average over the ground-truth values from all of the frames used in the test. In the series of figures below, a frame of each test sequence is presented along with its corresponding ground-truth optical flow field and a description of the motion.
The optical flow images shown are subsampled for ease of viewing and represent the relative, rather than the absolute, magnitudes of the vectors.

The Yosemite sequence is the one described in Section 2.3.1. It contains a landscape with dynamically changing clouds along the top. The motion in the sequence is that of a camera approaching the scene, with the focus of expansion towards the middle of the right side. The sky component presents motion generally to the right, but with other motions occurring within the clouds themselves.

Figure 5-1: Yosemite

The Sphere sequence consists of a checkered sphere rotating in place over a constant background. The rotation axis of the sphere is tilted forward slightly, so that the rotation of the far side can be seen.

Figure 5-2: Sphere

The Office sequence represents a situation in which the camera is approaching the scene. The focus of the dilation is in the center of the image.

Figure 5-3: Office

The Street sequence contains a large constant background, with a car translating across it. The motion of the car varies between the wheel rotation and the translation. The camera pans slowly towards the right as the scene progresses. The car is partially occluded in the initial frames.

Figure 5-4: Street

The Blocks sequence illustrates rotational and dilational motion of the camera simultaneously. A group of blocks sits on a textured background, while the camera moves in and around the scene. Unfortunately, the motion for this sequence varies drastically between frames. This variation can be seen in Figure 5-5, where the three flow fields shown (corresponding to the first, second, and third frames of the sequence) are considerably different. The effects of this variation in motion will be discussed with the results.

Figure 5-5: Blocks

In the Vcbox sequence, the camera approaches a labeled Visual C++ box positioned on a textured background. However, the focus of the expansion shifts between frames. This can be seen by observing the flow fields (again corresponding to the first, second, and third frames of the sequence) in Figure 5-6. The area containing the short velocity vectors represents the focus of expansion.
Figure 5-6: Vcbox

The Simple sequence contains two objects, each rotating in different directions. One object is a checkered sphere, as in the Sphere sequence, but with a vertical axis of rotation. The second object is a solid square close in color to that of the background. Additionally, the objects move into occluding positions during the sequence. One drawback to this sequence is that the motion of the shadows is not given in the ground truth, although the shadows clearly follow the translation of the objects. Figure 5-7 shows the first and last frames along with their corresponding velocity flow fields. Observe the object shadows in the images on the left, and the lack of a corresponding velocity vector in the flow fields on the right.

Figure 5-7: Simple

In the Medium sequence, a car sits on a checkered floor. As the camera approaches the car, the camera swings around to view the car side-on, combining dilational and translational motion. This sequence raises a point of contention about whether the car is in motion, since the location of the car on the checkered pattern changes.
This change in location causes the car to appear to move, as can be seen in the frames shown in Figure 5-8. However, the ground truth provided by the ray-tracing program indicates that the floor beneath the car is in motion.

Figure 5-8: Medium

The Complex scene is viewed from the windshield of a car driving along a road, passing buildings and trees on either side. A car pulls out of a side street, and another car approaches from the front.

Figure 5-9: Complex

5.1.2 Parameter Selection

Using the test sequences described above, a series of experiments was run to determine the optimal choice of parameters for a given type of sequence. For the plots in this section, frequency is given in radians, angular error is given in degrees, and density is given in percent coverage of the image. The values of β and the magnitude error are scalars and have no units.

Frequency

The optimal operating frequency was determined experimentally. Frequencies between .02 and .15 were tested with β = .8 and using 3 temporally separated frames of a given sequence. The results for the average angular error at each frequency are given in Figure 5-10.

Figure 5-10: Mean Angular Error vs. Frequency

The average angular error is mostly constant over the range of frequencies, trending higher at the extreme low and high frequencies. Most of the power in the image is concentrated at low frequencies, and therefore it is difficult to separate out the direction at those frequencies. At higher frequencies, noise becomes more prominent among the values selected by the filter. Additionally, at extreme high frequencies, implementation in a discrete environment becomes more difficult. Aliasing becomes an issue in the frequency domain, and the effect of the aliasing in the time domain is that the image must be interpolated in order to represent an adequate number of pixels. In general, frequencies around .06 seem to be successful.

Meanwhile, the mean magnitude error remains constant across all frequencies. This effect is shown in Figure 5-11. For the majority of images, the magnitude error is less than approximately 3. However, for the Medium and Complex sequences, the magnitude error blows up. This effect can be explained by irregularities in the magnitudes provided for these test sequences. In the Complex test sequence, the velocities approach a magnitude of 150 pixels per frame. The image dimensions are 300 x 400, and such high velocities would completely distort the image within very few frames. Therefore, the magnitudes provided appear to be in error. In the Medium sequence, the motion between the car and the ground could be disputed, as explained in Section 5.1.1.

Figure 5-11: Mean Magnitude Error vs. Frequency

Effect of Interpolation

For the frequencies tested here, the results from interpolating the image in preprocessing are similar to the results without any interpolation. At higher frequencies, interpolation would be necessary to avoid aliasing. The difference in average angular error between the interpolated image and the original is plotted in Figure 5-12.
Figure 5-12: Mean Angular Error vs. Frequency; Difference between Original and Interpolated Tests

A positive difference indicates that the interpolated test for the corresponding sequence performed worse than the original. A negative difference indicates that the interpolated test was preferred over the original. However, while the negative differences are within a negligible two degrees of the original results, the positive differences can contribute up to almost 18 degrees of additional error. The conclusion is therefore that overall, for these low frequencies, interpolation is unnecessary and may be detrimental to the accuracy of the results. This outcome can be explained by the side effects of interpolating. Interpolation assumes smooth changes between points in order to estimate the value between the points. The interpolation smoothes the gradients that the optical flow process uses to determine the flow, and therefore can degrade the quality of the optical flow results.

Yosemite Cloud Region

As discussed in Section 2.3.1, there is conflict over the validity of the cloud region in the Yosemite sequence. If this region is masked, the angular error drops approximately 10 degrees for all frequencies. The dashed line in Figure 5-13 represents the masked version.

Figure 5-13: Yosemite Mean Angular Error, Clouds Masked

This effect can be observed qualitatively by viewing the angular error map of the Yosemite sequence, presented below (Figure 5-14) for f0 = .06 and β = .8. Bright regions indicate higher angular errors; note the prevalence of these bright areas in the sky.

Figure 5-14: Yosemite: Angular Error Map

Temporal Support

The testing above was run using 3 frames. Further testing was done to compare the average error when 5 and 7 frames were used. In the plots in Figures 5-15 and 5-16, the solid, dashed, and circled lines represent 3, 5, and 7 frames, respectively. The horizontal axis for each plot is f0, and the vertical axis is the mean angular error in degrees.

Figure 5-15: Mean Angular Error vs. Temporal Support Size, part 1 (Sphere, Office, Street, Blocks, Vcbox, Simple)

Figure 5-16: Mean Angular Error vs. Temporal Support Size, part 2 (Medium, Complex, Yosemite)

As can be seen from the results in Figures 5-15 and 5-16, the average angular error generally remains constant regardless of the number of temporal images used to generate the optical flow output. However, in most of these cases the flow was approximately constant across all images in the sequence. In the sequences where the direction of motion changed within the time period tested, such as the Blocks sequence, variation in the angular error for different numbers of images is observed. For these rapidly varying motion patterns, the use of more temporally separated images allows for greater averaging of the flow in order to settle on the overall trend of the motion and to determine the flow path. For the test cases, the correct flow fields were also averaged, and therefore this result demonstrates that the optical flow implementation determines the average motion over the image sequence provided. However, for use in deblurring, the shorter image sequences should provide a better representation of the specific motion from frame to frame.
This result indicates that for most cases, where the motion is relatively consistent over the time period during which the image sequence was captured, a small number of images can be used. Using fewer images is beneficial in conserving operation time. In cases where there is inconsistent motion between frames, the assumptions of linearity made in the temporal-constancy motion model are violated, and optical flow is a poor solution.

Beta

The value of β is the final parameter to be determined. Values of β between .4 and 1.2 were tested, with f0 = .06 and 3 temporal images. Figure 5-17 plots the results.

Figure 5-17: Mean Angular Error vs. β

These results show that β has little effect on the average angular error; the error is approximately constant across the plot for all images. The original value used for testing, β = .8, will be the final choice for this parameter. This value was also used in [13], which is convenient for comparison.

Densities

In all of the tests, the density of the optical flow result was greater than 99%. This result is shown in Figure 5-18; the scale on this figure is from 99.75% to 100%. The individual component velocities had lower densities, but the invalid values were generally at different locations for each filter tuning. Therefore, the combination into the 2D velocity successfully produced a value for nearly all locations, only rejecting values where fewer than 5/6 of the component velocities had valid results. Varying β generated similar density results.

Figure 5-18: Density vs. Frequency

5.1.3 Evaluating Error

This section presents the error data and analysis for the selected parameters. These parameters are f0 = .06, β = .8, and a temporal support of 3 images. Table 5.1 gives the mean errors for these parameters.

Image Sequence        MAE       MME
Sphere                5.4606    0.2937
Office                15.0868   0.3049
Street                10.0560   0.4921
Blocks                65.9089   0.7165
Vcbox                 47.3316   3.1848
Simple                8.3123    1.6156
Medium                21.2902   14.5334
Complex               21.8977   15.3410
Yosemite              32.6134   1.4687
Yosemite (masked)     20.1809   1.7264

Table 5.1: Mean Error

The average angular error is largest in the Blocks and Vcbox sequences. The error for these sequences is significantly higher than for any of the other sequences. These two sequences were also the most at fault for variation in the flow field from image to image. This result implies that the accuracy of the optical flow calculation reflects the quality of the input sequence.

Flow Discontinuities

Viewing these results, the primary observation is that the regions around flow discontinuities contain the most error. This effect is noticeable in the Sphere and Office sequences.

Figure 5-19: Discontinuity Error

In the former sequence, shown on the left of Figure 5-19, the largest error occurs near the axis of rotation, where the neighboring velocity vectors point in nearly opposite directions. The Office sequence, on the right of Figure 5-19, illustrates this effect near the focus of dilation, where the velocity vectors point outwards from the center. This effect can be partially explained by the combination of the component velocities into a single 2D velocity. Each component velocity selects a flow tuned to a specific frequency and orientation and covers a certain spatial area.
Averaging the flows from all orientations allows neighboring vectors of different orientations to influence each other and generate error at the boundary.

Magnitude Error

The process of combining the component velocities into a single 2D velocity can account for much of the magnitude error as well. The magnitudes calculated by the optical flow routine are generally flat across the image, and therefore the magnitude error is approximately proportional to the magnitude of the flow field. Averaging the component velocities meant that the filter orientations with weak responses mitigated the responses of the filter orientations with strong responses. Figure 5-20 illustrates this effect for the Sphere, Office, Street, and Yosemite sequences.

Figure 5-20: Magnitude Comparison (calculated magnitude, ground-truth magnitude, magnitude error)

Shadow Regions

As expected, the shadowed regions of the Simple sequence are picked up by the optical flow process. These regions were ignored by the ray tracer that created the ground-truth flow field and therefore appear as error in the result. Figure 5-21 shows the magnitude of the flow fields, as well as the angular error. The magnitude of the calculated flow is scaled so that the magnitude of the shadow is visible. The calculated optical flow is reasonable, because the shadow does create a noticeable moving gradient, which can be seen in the frames from the sequence shown in Figure 5-7. Aside from the conflict with the ray-tracer flow field, the shadows are significant because of their transparency. Gradients covered by the shadows can still be computed in the optical flow calculations, and these can lead to multiple motions. Although this effect was not noticeable here, as the region beneath the shadows was static, the generation of multiple motions from transparent shadowing could cause considerable error when combining the component velocities into the 2D velocity.

Figure 5-21: Simple: Shadow Error (ground-truth flow, calculated flow, angular error map)

Handling of Different Motion Types

Overall, areas of unchanging gradient are handled well. This situation occurs in static regions such as the background of the Sphere sequence, and in regions of the Office sequence such as the window in the upper left corner. In the latter example, the optical flow calculation agrees with the ray tracer in ignoring the internal motion. For the purpose of deblurring, this version of the motion field should be acceptable as long as the motion vector is shorter than the edge boundary created by the optical flow. Essentially, as long as the motion of the constant-gradient region is smaller than the region captured in the optical flow calculation, the deblurring process should be unaffected by ignoring the motion in that region.

This implementation handles dilation and translation well, as illustrated by the good performance on the images containing these motions. The Yosemite sequence demonstrates an approach into the scene, while the Office sequence demonstrates an expansion out of the scene. The Street scene captures the translational motion to the right, although there is some difficulty measuring the motion of the car. Low average error is observed in all of these sequences.

This optical flow implementation has the most difficulty handling rotational motions. The difficulty with rotation is evident in the Simple sequence. The rapid rotation of the square object contains a lot of error.
The fact that the slower rotations, both towards the inside of the square object in this sequence and on the broad side of the sphere in the Sphere sequence, are handled correctly indicates that the speed of the rotation affects the accuracy of the results. This effect is a consequence of the assumption of linear motion made in the optical flow implementation. For a short velocity vector, the motion approximates the model. However, as the magnitude of the vector increases, the model fails, since it cannot account for the curvature of the motion.

5.2 Deblurring

5.2.1 Deblurring Test Sequences

For the deblurring portion of the testing, many of the sequences from the optical flow experiments were used, in modified form. Three frames of a given test sequence were averaged to generate the blur, and the given ground-truth flow fields for those frames were averaged to generate the ground truth for that artificial image. For image sequences with larger velocities, the blurred image was also smoothed by a Gaussian to reduce the discontinuities from the image averaging and generate a closer approximation to an image smoothed by motion. Applying the Gaussian blur may adversely affect the results by violating the assumption that the blur was strictly caused by motion, but the artifacts generated by the averaging process also reduced the quality of the motion approximation. The Gaussian smoothing was applied to the Vcbox, Simple, and Medium sequences, which have maximum velocities of 10.1, 19.8, and 22.0 pixels per frame, respectively. The Sphere, Office, Street, and Yosemite sequences have maximum velocities of only 2.6, 1.4, 5.1, and 5.5 pixels per frame, respectively, and therefore need no additional smoothing.

The Blocks and Complex sequences were not used for the deblurring tests. The extreme motion variation between frames in the Blocks sequence made it a poor representation of motion blur, since the motion direction was changing. Vcbox had this problem to a lesser degree but was kept in the tests to show results for both the textured background and the irregular motion field. The large velocity magnitudes of the Complex sequence, up to 150 pixels per frame, made it an impractical sequence to deblur due to the size of the PSF that would be required for reconstruction of the image. The images in Figure 5-22 (page 67) show the center frame of the original image set next to the constructed image simulating the motion blur for each of the test sequences used in the deblurring experiments.

Figure 5-22: Images for Deblurring Tests (original, blurred)

5.2.2 Determining r(w)

The value of the regularization term r(w) was determined experimentally. A constant value was chosen to provide consistency across the entire image. Since this term will be a constant, from now on it will be referred to simply as r. The constant effectively sets an upper bound of 1/r on the magnitude of the inverse filter in the frequency domain. Various values of this constant were used to deblur the test images described above, and the resulting images were compared to select a value that would provide adequate sharpness enhancement without enhancing the noise too much. Using the Office sequence as an example, the effects of varying r can be shown. Figure 5-23 illustrates very clearly how r influences the contrast in the result.

Figure 5-23: Effects of Different r Values (r = .1, r = .01, r = .001)

For small values of r, the regularization filter approximates the ideal inverse filter, which enhances high frequencies, as explained in Section 3.2.
Increasing r smoothes the filter response and limits the high-frequency enhancement. In practice, the deblurring process could be run for many values of r and the best result selected. For this thesis, the value of r will be fixed to simplify the analysis. The value chosen for r is .02, which retains enough contrast for the sharpening properties of the deblurring to be easily viewed in the small pictures shown, without losing all of the detail in the image. For other applications, a higher value of r, providing more subtle results, may be desired.

5.2.3 Interpolation

As discussed in Section 4.2.2, interpolation is necessary to reduce the negative effects of discretization. These effects can be seen clearly in the Office sequence. The focus of the dilation at the center of the image corresponds to velocity vectors with very small magnitudes. These magnitudes are scaled with the degree of interpolation to maintain the correct ratio of velocity per pixel. Figure 5-24 illustrates how the size of the distortion decreases as the degree of interpolation increases and the magnitudes of the velocity vectors scale upward proportionately. To show the distortion more clearly, the images are cropped and the value of r used is 0.

Figure 5-24: Office: Effect of Interpolation (interpolation = 4x, interpolation = 5x)

Unfortunately, as the degree of interpolation increases, so does the computational intensity of the process. The computation uses more memory and takes considerably longer to evaluate the output image for each additional interpolation step. A degree of 4 was chosen for the interpolation to achieve accurate deblurring while maintaining a reasonable processing time and manageable memory usage.

5.2.4 Evaluation of Deblurring

This section discusses the results from the optical flow based deblurring process. Additionally, this section provides a comparison against two other standard methods, blind deconvolution and Lucy-Richardson deconvolution. The blind deconvolution results were generated from the deconvblind function using a square input PSF, the size of which was chosen to provide the perceived optimal results. The Lucy-Richardson results were generated from the deconvlucy function using the average PSF calculated for the image. For both of these functions, the optional input parameters are set to the MATLAB defaults.

Figure 5-25: Deblurring Test Results (blurred, thesis output, blind, Lucy-Richardson)

Figure 5-25 shows the results for all of the image sequences. From left to right, the images displayed are the original, the thesis result, the blind deconvolution, and the Lucy-Richardson deconvolution. Viewing these results, it is apparent that the blind deconvolution method is the least effective. For all of the test images, the blind deconvolution generates the most distortion in the output. This result agrees with the previous work, described in Chapter 3, which concludes that knowledge of the PSF is essential for an accurate derivation of the original from a blurred image. The deblurring output will now be discussed for each test image.

Sphere

The Sphere image contained very little blurring, due to the motion having very low velocities. The pattern on the object in motion, the sphere, also has very soft transitions in color. This makes the analysis of the deblurring somewhat difficult, since the visual difference between the original and blurred image (see Figure 5-22) is small.
In general, the performance of the optical flow deblurring for this image is poor in comparison to that of the Lucy-Richardson method. Distortion is evident across the smoothly varying surface of the sphere in the optical flow result, but not in the Lucy-Richardson result, or even the blind deconvolution result. However, the optical flow result does not exhibit the ringing that appears around the high-frequency components for both the blind and Lucy-Richardson deconvolution methods. The ringing occurs because the edges of the image contribute artificial high frequencies that get processed in the MATLAB deconvolution functions; operating in the space domain, as was done for the optical flow method, prevents this ringing effect.

Office

At a broad glance, and recalling that the value of r was chosen to generate high contrast in the optical flow deblurring output, the results from the Lucy-Richardson and optical flow methods seem comparable. The Lucy-Richardson method again exhibits ringing around object edges. Focusing on certain regions of the image, however, illustrates how the optical flow method can generate a sharper result than that of the Lucy-Richardson method. The region shown in Figure 5-26 is the picture frame at the top of the Office image. This close-up view shows that the shapes within the picture frame are resolved more clearly in the optical flow result than in the Lucy-Richardson result. Neither method was able to recover the extreme high-frequency components entirely.

Figure 5-26: Office: Zoom on Picture Frame (original, thesis output, Lucy-Richardson)

Street

In the Street sequence, it is useful to focus on the car and the man on the bench, as in Figure 5-27. The sharp edges of the man's shoulder are captured clearly in the optical flow deblurring result, while the edges remain smooth in the Lucy-Richardson result. The car is similar in color to the background, which makes it difficult to see, especially after the blurring blends the colors. Both the Lucy-Richardson and optical flow methods recover the general shape of the car, although the former method handles the wheel rotation better. Rotation is not linear, and therefore the linear model of the motion vector fails to appropriately recover the initial rotating image when the optical flow method is applied. Another point to note about the optical flow method is that the image recovery is somewhat limited by the motion itself. The motion, and correspondingly the blurring, of the car is horizontally oriented. When the image brightness varies little, as in this case, the blurring leaves little information to be extracted by the deblurring process. The result is that the recovered image of the car still appears stretched horizontally.

Figure 5-27: Street: Zoom on Car (original, thesis output, Lucy-Richardson)

Vcbox

The Vcbox image was deblurred surprisingly well by the optical flow method. The issues with the motion assumption caused by the shifting focus of expansion, discussed in Section 5.1.1, were expected to adversely affect the results of the deblurring. Not only does the optical flow based deblurring handle the textured background with reasonable results, it also generates a sharper label on the front of the box, which can be seen in Figure 5-28.

Figure 5-28: Vcbox: Zoom on Box Label (original, thesis output, Lucy-Richardson)

The deblurring result for the Vcbox image does contain distortion, in the form of high-frequency stripes. The problem with handling small velocity values is also evident towards the center section.
However, the edges of the logo, especially near the upper left, are considerably clearer than in the Lucy-Richardson deconvolution result. The edges of the box are cleaner as well.

Simple

The motion of the checkered sphere in this image was significantly faster than that of the Sphere image, and here it was also translating in the direction of rotation, increasing the velocity further. Unfortunately, most of the detail on the surface of the sphere was lost in the optical flow recovery of this image. The Lucy-Richardson method manages to enhance the surface detail that was evident in the blurred image (see Figure 5-22), but does not recover the original pattern either. Therefore, the optical flow method may be preferred over the Lucy-Richardson method, depending on the application, since it provides a closer approximation to the original image despite having a blurrier appearance.

Observations of the square in this image indicate that the rotating motion is difficult to process. This difficulty was also encountered in the wheel rotation of the Street image. As with the optical flow calculations, this type of motion violates the linearity assumptions used to develop the motion model. The image recovered with the optical flow vectors presents rounded edges on the square. The recovery with the Lucy-Richardson algorithm separates the square into the original three squares used to generate the blur. This result indicates that the Gaussian smoothing was inadequate to eliminate the artifacts created by simulating the motion through frame averaging.

The checkered surface beneath the objects represents a region in which the optical flow method clearly stands out as the better method. The optical flow is zero in this static area, and therefore the blurred image is identical to the original in this portion of the image. The optical flow method leaves this region unchanged, while the Lucy-Richardson method attempts to deconvolve the static region in the same manner as the rest of the image, introducing distortion.

Medium

None of the deblurring methods handled the Medium image well. This image contained large motion vectors that were not averaged smoothly, even after the additional Gaussian smoothing, an effect that is noticeable in the checkered floor surface (see Figure 5-22). Because of this poorly simulated blur, the floor is poorly recovered in all of the deblurred images, as can be seen in Figure 5-25. However, taking a closer look at the car, as in Figure 5-29, the optical flow method clearly performs better than the Lucy-Richardson method. The shape of the car has been retained, and sharpening has occurred around edges such as the window borders and the outside of the wheels.

Figure 5-29: Medium: Zoom on Car (original, thesis output, Lucy-Richardson)

Yosemite

The Yosemite deblurring results for the optical flow and Lucy-Richardson methods are comparable. Both outputs sharpen the hills in the background, as well as the river on the right. For the mountain surface in front on the left, the Lucy-Richardson result alters the blurred image very little. For the same region, the optical flow method does provide increased contrast.

5.3 Combining Optical Flow and Deblurring

This section discusses the results from applying the optical flow output to the deblurring process. Results from processing the artificial image sequences are discussed in Section 5.3.1. Section 5.3.2 discusses the results from processing a set of real images.
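Before presenting those results, the overall test procedure can be summarized in a short script. The sketch below is illustrative only: OF is the optical flow routine listed in Appendix D, deblur_reg is a hypothetical name standing in for the regularization-filter deblurring routine of Section 4.2, and the variable names and parameter values simply repeat the choices made earlier in this chapter.

% Illustrative outline of the combined optical flow / deblurring tests.
% OF is listed in Appendix D; deblur_reg is a placeholder name for the
% regularization-filter deblurring routine of Section 4.2.
seq = double(blurred_sequence);                  % three temporally separated frames
[I, Vx, Vy] = OF(seq, 0.06, 0, 0.8);             % f0 = .06, no interpolation, beta = .8
center = seq(:,:,2);                             % the frame to be restored
restored = deblur_reg(center, Vx, Vy, 0.02, 4);  % r = .02, 4x interpolation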
5.3.1 Artificially Blurred Images

The first round of tests used a set of the blurred images from Section 5.2.1. The optical flow was calculated for these images, and the resulting velocity vectors were used in the deblurring process. The Sphere sequence was not used because its velocity vectors were so small that the deblurring has little effect. The Medium sequence was not used because the artificial blurring poorly approximated the desired motion-blur effect. Figure 5-30 (page 77) shows the results of deblurring with the calculated optical flow values.

Figure 5-30: Deblurring Results Using Calculated Optical Flow (blurred image, deblurred image)

The images deblurred with the calculated optical flow vectors appear visually identical to the blurred images. This is due to the error in the magnitude of the calculated velocities. As was shown in Section 5.1.3, the magnitudes returned from the optical flow process are generally constant across the image, making the error in magnitude approximately proportional to the magnitude of the flow field. For the deblurring process, this means that regions containing the largest velocity magnitudes will have deblurring PSFs far shorter than the actual blur function. The short PSFs do not cover enough area to capture the information from the original image, and therefore the unblurred image is not recovered well. Figure 5-31 (page 78) demonstrates how deblurring with a corrected magnitude generates results similar to deblurring with the ground-truth vectors. The images shown are, from left to right, the blurred image, the image deblurred with the calculated optical flow, the image deblurred with the angle of the calculated optical flow and the ground-truth magnitude, and finally the image deblurred with the complete ground-truth flow field.

Figure 5-31: Effect of Magnitude Error on Deblurring (blurred image, calculated velocity, corrected magnitude, ground truth)

This result indicates that the magnitude error, rather than the angular error, caused the poor deblurring results in Figure 5-30. The angular error accounts for the remaining differences in the deblurring results. This can be seen most clearly in the images from the Vcbox and Simple tests. Viewing the Vcbox images in Figure 5-32, it is clear that the corners of the deblurred images vary the most between the image deblurred with the corrected magnitude and the image deblurred with the ground-truth values. This difference corresponds to the regions of large error in the angular error map on the right. Similarly, a large portion of the differences between the deblurred Simple images appears around the outside of the rotating square, as the rotation was difficult to capture in the optical flow calculations. These results support the intuition that the quality of the optical flow calculation, in terms of both magnitude and angular error, strongly corresponds to the quality of the subsequent deblurring.

Figure 5-32: Effect of Angular Error on Deblurring (corrected magnitude, ground truth, angular error)

5.3.2 Real Images

Finally, the full process of calculating the optical flow and using the velocity vectors for deblurring was applied to real image sequences. For these sequences, there are no ground-truth values available to correct for the error in magnitudes. Therefore, the error in the optical flow component is determined by observing the motion in the image sequence and comparing that motion to the calculated 2D velocities.
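One way to carry out that visual comparison is to overlay the calculated flow on the frame being deblurred, as is done in Figure 5-37 below. The following is a minimal sketch of such an overlay using MATLAB's imshow and quiver; the subsampling step and the variable names are illustrative and are not taken from the code in Appendix D.

% Overlay the calculated 2D velocities on the blurred frame (illustrative).
step = 10;                                     % subsample the flow field for visibility
[X, Y] = meshgrid(1:step:size(Vx,2), 1:step:size(Vx,1));
imshow(center, []); hold on;
quiver(X, Y, Vx(1:step:end,1:step:end), Vy(1:step:end,1:step:end), 'r');
hold off;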
Real Image Test Sequences

The real images were captured by placing a camera in a hallway. Pictures were taken continuously over a long period of time, and appropriate sets were selected for testing. The resulting image sequences are of people in motion, walking around the hallway. The majority of the motion occurs around the arms and legs, as well as the head. The reflective floor picks up the shadows, which are also affected by the movements of the subjects. All of the real image sequences contain 3 frames, and the center frame is the one that will be deblurred. Because of the large image sizes in these sequences, the interpolation factor for the deblurring component was lowered to 3.

The first image sequence, shown in Figure 5-33, shows two people walking towards the camera. Both of the individuals in these images exhibit separate motions. The man in front has drastic motion of the legs as he walks forward, and his right arm shows movement. The badge hanging around his neck also moves. The woman in back has smaller perceived motions, due to her increased distance from the camera. She also shows motion in the arms and legs, as well as rotation of her head.

Figure 5-33: Real Sequence 1

The second image sequence, shown in Figure 5-34, shows a stationary man zipping his jacket. The motion in this sequence is near the zipper, including the hands, and in his head, which shifts during the sequence.

Figure 5-34: Real Sequence 2

The third sequence, shown in Figure 5-35, is a man walking towards the camera. His motion is mostly around his arms, with some slight turning of the head.

Figure 5-35: Real Sequence 3

Figure 5-36 shows the fourth and final image sequence. These images again depict two people walking towards the camera. The two subjects here are walking closer together than in the first image sequence. Again, there is a lot of motion around the arms and legs. Both of the heads turn slightly, and the positions of the badges shift over the image sequence.

Figure 5-36: Real Sequence 4

Analysis of Results for Real Image Sequences

Figure 5-37 shows the results of the optical flow based deblurring for the real images. The original blurred frame is shown on the left. The image on the right is the blurred image overlaid with the optical flow velocity vectors. The center image is the deblurred image. The edges of the image were not deblurred, to ensure that the PSF fit inside the image dimensions for all points and also to eliminate regions affected by convolution error from the optical flow process. The edges are therefore not shown in the deblurred image.

These results indicate that the optical flow process performed well. The velocity vectors are most prominent around the edges of the subjects, as expected based on the observation of their motion. The motion of the subject's arm swinging forward in sequence 3 is captured in the velocity vectors pointing down and to the right of the image. The slow rocking backwards and the slight dip of the head of the man zipping his jacket in sequence 2 are shown by the direction of the velocity vectors. The vectors around the hands in this sequence represent the zipping motion well, showing both hands moving vertically in opposite directions. Image sequences 1 and 4 illustrate again that shadows are captured in the optical flow, as evidenced by the large velocity vectors on the floor in front of the subjects' feet. The window in the background of sequences 1 and 3 shows some error, as the velocity vectors there should be zero.
Figure 5-37: Real Sequence Results (blurred, deblurred, velocity vectors)

Possibly, shadowing effects caused by changing light conditions outside the window affected this portion of the result. The accuracy of the leg motions in sequences 1 and 4 is difficult to determine, because the leg positions change over the time period of the sequences.

Distinguishing between the original image on the left and the deblurred image in the center is difficult. This result is a consequence of restrictions imposed by the limitations of the camera and of error in the calculated velocity vectors. In order to get large blurs on the image, the subject had to be moving quickly. However, the camera that was used had a significant time delay following each frame to write the image data to memory. The time delay between frames meant that the subject tended to be displaced in the succeeding frame. This effect causes the calculated motion vectors to represent the temporal displacement rather than the blur-generating movement. Also, in the real images, the subjects vary in their motion from frame to frame. This variation was shown in Section 5.1.2 to generate error in the optical flow calculations. Image sequences capturing slower motions were chosen to reduce the discontinuities between the frames, and therefore the observable blur is small. This makes the analysis by observation more difficult. Compounding this problem is the issue of the small-magnitude vectors returned by the optical flow process. The magnitude error was shown in Section 5.3.1 to cause minimal deblurring effects, due to shortened velocity vectors.

To increase the difference between the original and deblurred images, the tests were rerun with velocity vectors scaled by a factor of ten. Figure 5-38 shows these results for the second and third sequences. Scaling the velocity vectors leads to higher contrast in the output image. The boundary between the subjects and their backgrounds is more defined, for example. Also, in sequence 2, the inside of the jacket, the buttons on the man's shirt, and the separation between the fingers are sharper. However, since the scaling factor was an arbitrary choice, the resulting image also exhibits increased distortion. The distortion occurs because the longer PSF now covers too much area, and the resulting pixel value includes extra information from surrounding pixels. This result again demonstrates the need for an optical flow method that generates accurate velocity magnitudes in addition to accurate directions.

Figure 5-38: Real Sequence Results Using Scaled Velocity Vectors (blurred, deblurred, scaled deblurred)

Chapter 6

Summary and Future Work

The work presented in this thesis provides a method for reconstructing images degraded by motion using velocity information generated with a phase-based optical flow calculation. The thesis demonstrates that the quality of the velocity vectors from the optical flow computations directly affects the quality of the deblurring. Overall, the method works well when the input image sequence is appropriate. Appropriate image sequences have continuous, consistent motion for the duration of the sequence, without discontinuities between frames. These restrictions can be limiting in terms of the camera capabilities required and the speed and direction of the subject motion, as was demonstrated in Section 5.3.2. This chapter will review the implementation and results for both the optical flow and deblurring processes.
6.1 Optical Flow Summary

The optical flow method implemented for this thesis closely followed that of Fleet and Jepson, given in [13] and based on their efforts in [14] demonstrating the robustness of phase with respect to variations in image brightness. The method applies a set of frequency-tuned Gabor filters to an image sequence in order to determine the component velocities for each pixel by tracking temporally separated phase contours. The method implemented here differed from Fleet and Jepson's mainly in the generation of a 2D velocity from the component velocities. The 2D velocity in this thesis did not require values from neighboring pixels, thereby bypassing the assumption of smoothness that was implicit in Fleet and Jepson's combination method. Fleet and Jepson also did not specify their parameter choices, so the values used in this thesis were determined through experimentation, explained in detail in Section 5.1.2.

The resulting optical flow implementation was largely successful in determining the direction of the motion. The largest angular error was observed in regions of rotational motion, where the deviation of the linear motion model from the actual motion was greatest. Angular error was also observed where the motion varied over the duration of the image sequence, another situation in which the linear motion model is a poor representation of the true motion. The magnitude error was significantly more important than the angular error when the optical flow velocities were used in the deblurring process. Although the average magnitude error was generally small, the location of the errors caused a dramatic reduction in the quality of the image restoration that followed. This effect exposes the weakness of relying on a single error measure, as many studies do, when analyzing results. The Barron study, for example, uses only the angular error measure [2]. Many new methods that use the results of the Barron study for comparison also use only that one error measure, and the problem of inadequate evaluation of new optical flow methods is propagated.

6.2 Improving the Optical Flow Method

A simple improvement to the optical flow calculation would be to include more filters, tuned to a variety of frequencies. This extension would capture information from a larger range of the frequencies in the original image. Additionally, the computational efficiency of the optical flow component could be improved by using methods such as the Complex Discrete Wavelet Transform in place of the Gabor filters, as described in [6].

Many of the problems with the optical flow method derive from the combination of the component velocities into the final 2D velocity. Keeping the component velocities separate could maintain more information about non-linear motions. This idea follows from the fact that the component velocities can express multiple motions at a single pixel, as with transparent objects or effects such as shadows [12]. Non-linear motions such as rotation could similarly be expressed as a sum of smaller linear motions. Reducing the effect of the linearization in this manner may improve both the angular and magnitude error in the optical flow output, which would carry over into the deblurring stage of the processing.

6.3 Deblurring Summary

The deblurring method implemented for this thesis applied a regularization filter at each pixel of the image. The velocities generated by the optical flow calculations were used to create the PSF that represented the blur-inducing motion vector.
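For a single pixel with 2D velocity (vx, vy), a linear motion PSF of the corresponding length and direction is needed. The sketch below is only an approximation of the construction described in Section 4.2.2: it uses MATLAB's fspecial rather than the thesis code, and the variable names are illustrative.

% Approximate construction of a linear motion-blur PSF from one 2D velocity.
% (Illustrative sketch; the actual PSF generation is described in Section 4.2.2.)
vx = Vx(row, col);
vy = Vy(row, col);
len   = max(1, sqrt(vx^2 + vy^2));        % blur length in pixels
theta = atan2(vy, vx)*180/pi;             % blur direction in degrees
psf   = fspecial('motion', len, theta);   % linear motion PSF for this pixel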
The regularization term was an experimentally determined constant. Figure 5-23 (page 68) illustrates how different values of the regularization constant have a dramatic effect on the contrast in the restored image.

Overall, when accurate velocities were provided, the deblurring process worked well in comparison to the blind and Lucy-Richardson deconvolution methods. The deblurring method implemented here had trouble with areas containing rotational motion, as did the optical flow implementation. Again, rotational motion violates the linear motion model, which accounts for the error seen in the output. This implementation also had difficulty handling small motions, due to the discretization effects described in Section 4.2.2. However, this version of an adaptable deblurring technique was still successful in sharpening the appearance of edges and other high-frequency regions. The results from the Vcbox test demonstrate this success very well in Figure 5-28. One significant issue with the deblurring process implemented here is the large amount of time and memory necessary for the computation of the restored image.

6.4 Improving the Deblurring Method

The method of deblurring implemented in this thesis uses a simple regularization filter to generate the deblurring kernel. Applying a more sophisticated technique, such as the Lucy-Richardson method, to each PSF could yield more accurate results. However, the process should still remain in the space domain to avoid including information from neighboring pixels. Altering the filter in this manner could significantly lengthen the already considerably time-consuming process of pixel-by-pixel deblurring, even while providing better results. A simpler way to increase the effectiveness of the deblurring kernel would be to improve upon the constant regularization term. This term could be improved either by selecting a more appropriate constant or by generating a dynamic term based on extractable properties of the image.

Additionally, improvement in the analysis process would be extremely helpful in evaluating the deblurring results. An analytic method for comparing the blurred and unblurred images would provide a quantitative result for comparison and allow a larger range of test sequences to be tested.

Exploiting the image sequence further is perhaps the most effective way to improve the image restoration. Currently, only the optical flow component utilizes the multiple frames available for each image sequence. Applying information from previous and future frames in the sequence should greatly enhance the results of the restoration in the deblurring component. This improvement should be seen most noticeably at places such as occlusion boundaries, where separate objects may be blurred together. Information obtained from preceding or succeeding frames could help to define the borders of each object and recover each separately.

6.5 Future Work and Applications

In addition to the improvements suggested above, future work could extend the methods used in this thesis. Combining the technique presented here with others could improve the results significantly. Superresolution, for example, could be used to generate more highly resolved images in the initial sequence. The increase in resolution would help to mitigate the effects of discretization, in addition to providing greater accuracy in the results. This benefit would affect both the optical flow and deblurring portions of the process.
This process also has the possibility of providing benefit to many other applications. Clearer imaging would significantly benefit many fields, in addition to the AI applications mentioned previously in Section 1.2. These potential applications include medical imaging, such as Magnetic Resonance Imaging, as mentioned in [22].

Appendix A

Calculating Component Velocities

Using the basic equations from Section 2.2, we can derive a simple method for explicitly calculating the component velocities. The basic equations are given again below:

φ_x u + φ_y v + φ_t = 0    (A.1)
n(x, t) = ∇_x φ(x, t) / ||∇_x φ(x, t)||    (A.2)
v_n = α n(x, t)    (A.3)

Expand (A.3) into its components, u_n and v_n:

u_n = α n_x
v_n = α n_y

Now plug these values into (A.1):

φ_x α n_x + φ_y α n_y + φ_t = 0    (A.4)

Expanding (A.2) into its components, n_x and n_y:

n_x = φ_x / ||∇_x φ(x, t)||
n_y = φ_y / ||∇_x φ(x, t)||

Equation (A.4) then becomes:

α φ_x^2 / ||∇_x φ(x, t)|| + α φ_y^2 / ||∇_x φ(x, t)|| + φ_t = 0

However, ||∇_x φ(x, t)|| is the 2-norm of ∇_x φ, which is equal to sqrt(φ_x^2 + φ_y^2). Therefore we get:

α (φ_x^2 + φ_y^2) / sqrt(φ_x^2 + φ_y^2) + φ_t = 0
α sqrt(φ_x^2 + φ_y^2) + φ_t = 0    (A.5)

Solving for α, equation (A.5) becomes:

α = -φ_t / sqrt(φ_x^2 + φ_y^2)    (A.6)

Returning to our definitions of u_n and v_n, we end up with:

u_n = [-φ_t / sqrt(φ_x^2 + φ_y^2)] n_x
v_n = [-φ_t / sqrt(φ_x^2 + φ_y^2)] n_y

The term on the left, -φ_t / sqrt(φ_x^2 + φ_y^2), is the magnitude of the velocity component. Now substituting n_x and n_y and simplifying, the final equations for the velocity components u_n and v_n are:

u_n = -φ_t φ_x / (φ_x^2 + φ_y^2)    (A.7)
v_n = -φ_t φ_y / (φ_x^2 + φ_y^2)    (A.8)

The derivatives φ_x, φ_y, and φ_t can be determined using the phase identity

∇φ(x, t) = Im[R*(x, t) ∇R(x, t)] / ρ^2(x, t)    (A.9)

derived in Appendix B, and therefore the component velocities u_n and v_n can be determined.

Appendix B

Phase Identity Derivation

Deriving the phase relation from Fleet and Jepson [13]:

∇φ(x, t) = Im[R*(x, t) ∇R(x, t)] / ρ^2(x, t)    (B.1)

Letting φ_x = ∂φ(x, t)/∂x, and with x = (x, y), we can write:

∇φ(x, t) = (φ_x, φ_y, φ_t)

If we only look in the x dimension, this phase identity becomes:

φ_x = Im[R*(x, t) R_x] / ρ^2(x, t)    (B.2)

R and R* are defined as below, for ρ, φ ∈ R:

R(x, t) = ρ(x, t) e^{jφ(x, t)}    (B.3)
R*(x, t) = ρ(x, t) e^{-jφ(x, t)}    (B.4)

Start by solving for the derivative of R using the chain rule:

R_x = ∂/∂x [ρ(x, t) e^{jφ(x, t)}]
    = ρ(x, t) ∂/∂x[e^{jφ(x, t)}] + e^{jφ(x, t)} ∂/∂x[ρ(x, t)]

We can rewrite this equation as:

R_x = j ρ(x, t) e^{jφ(x, t)} φ_x + e^{jφ(x, t)} ρ_x    (B.5)

Find R*(x, t) R_x by plugging in equations (B.4) and (B.5):

R*(x, t) R_x = ρ(x, t) e^{-jφ(x, t)} [j ρ(x, t) e^{jφ(x, t)} φ_x + e^{jφ(x, t)} ρ_x]
             = j ρ^2(x, t) φ_x + ρ(x, t) ρ_x    (B.6)

Since ρ, φ ∈ R, using equation (B.6) we get:

Im[R*(x, t) R_x] = ρ^2(x, t) φ_x    (B.7)

Therefore,

φ_x = Im[R*(x, t) R_x] / ρ^2(x, t)    (B.8)

which is equivalent to (B.2). This can be extended to all three dimensions to obtain the phase identity given in (B.1).

Appendix C

Fleet and Jepson: Computing 2D Velocity

Fleet and Jepson suggest a least-squares solution to resolve a complete 2D velocity field from the component velocities. This solution unfortunately assumes some smoothness over a local region, since it utilizes velocity neighborhoods to reach a final velocity. Let a be the vector of unknowns, (a0, a1, a2, b0, b1, b2)^T. The vector (x, y) denotes the spatial offset x - x0 between the location of the current estimate x and that of the pixel in question, x0. Also recall that a velocity estimate v_n can be written as α_n n, where α_n is the speed and n is the direction of the estimate. For each component velocity in the neighborhood, equation (C.1) is solved.
(n_x, n_x x, n_x y, n_y, n_y x, n_y y) a = α_n    (C.1)

If s is an M x 1 column vector of the speeds α_n for each local component velocity, we reach a system of equations Na = s, where N is an M x 6 matrix. Minimization by least squares is then performed on ||Na - s||. M must be greater than 6 to fully solve the system of equations.

Appendix D

MATLAB Code

D.1 Optical Flow Code

D.1.1 function OF

% This function calculates a 2D velocity for each pixel in an image
% sequence.
function [I Vx Vy] = OF(Iinit, f0, int_flag, beta)
% INPUTS
%   Iinit    = input image sequence, 3D array
%   f0       = tuning frequency for filters
%   int_flag = boolean to indicate if image should be interpolated prior to
%              processing
%   beta     = parameter used to determine coverage in frequency space
% OUTPUTS
%   I      = image sequence at end
%   Vx, Vy = 2D velocities in x, y directions

Iinit = double(Iinit);   % typecast -> double

%% remove portions that don't move
[d1 d2 dT] = gradient(Iinit);
clear d1 d2;
dT2D = sum(abs(dT),3);
dT2D(abs(dT2D)>0) = 1;
dTin3 = repmat(dT2D,[1,1,size(Iinit,3)]);
Iinit = Iinit.*dTin3;
clear dTin dTin3 dT;

%% interpolate if int_flag
[s1 s2 s3] = size(Iinit);
if int_flag
    I = zeros(2*s1,2*s2,s3);
    for i = 1:s3
        I(1:2*s1-1,1:2*s2-1,i) = interp2(Iinit(:,:,i));
    end;
else
    I = Iinit;
end;
clear Iinit s1 s2 s3;

%% filter params
tSTART = clock;
if int_flag   % scale frequency if interpolated
    f0 = f0/2;
end;
% calculate sigma and sigmak
sigmak = f0*(2^beta-1)/(2^beta+1);
sigma = 1/sigmak;

%% set thresholds
thresh = .00001;   % min val of velocity
pmin = 0;          % min val of p (if small, negligible gradient)
vmin = 0;          % min val of phase normal gradient

%% set up filterbank
% space s.t. each filter ctr is at least 2*sigmak away in freq;
% dTHETA = tan(sigmak/f0); wstep = 2*dTHETA; % spacing 1/(sigma*f0) = (2^beta-1)/(2^beta+1)
wstepmin = 2*atand(1/(sigma*f0));

%% run OF3Dsep
t1 = clock;
[Vxn Vyn lfilt] = OF3Dsep(I, sigma, f0, wstepmin);   % get component vels
t2 = clock;
tRUN = etime(t2,t1);
fprintf('\n TIME OF3Dsep: %g\n',tRUN);
clear t1 t2 tRUN;

if int_flag   % resize if image was interpolated
    Vxnout = Vxn(1:2:size(Vxn,1),1:2:size(Vxn,2),:,:);
    Vynout = Vyn(1:2:size(Vyn,1),1:2:size(Vyn,2),:,:);
    Iout = I(1:2:size(I,1),1:2:size(I,2),:);
    clear I Vxn Vyn;
    I = Iout; Vxn = Vxnout; Vyn = Vynout;
    clear Iout Vxnout Vynout;
    lfilt = 1/2*lfilt;   % scale back
end;
dTin4 = repmat(dT2D,[1,1,size(Vxn,3),size(Vxn,4)]);
Vxn = Vxn.*dTin4;
Vyn = Vyn.*dTin4;
clear dT dT1 dTin dTin4;

%% final output (from Vxn, Vyn)
t1 = clock;
% set normalization parameters
dlmt = 2;
stdlmt = 5;
lmt = 10/12;
clear beta dT2D f0 i int_flag lfilt;
clear pmin sigma sigmak st thresh vmin wstepmin;
[Vx Vy] = valid_vel(Vxn, Vyn, dlmt, stdlmt, lmt);   % get 2D velocity
t2 = clock;
tRUN = etime(t2,t1);
fprintf('\n TIME valid_vel: %g\n',tRUN);
clear dlmt stdlmt;
clear t1 t2 tRUN;
tEND = clock;
tTOT = etime(tEND,tSTART);
fprintf('\n TIME testOF3D: %g secs, (%g min)\n',tTOT,tTOT/60);
clear tEND tTOT;
clear img imgE Enan ctr ss;
clear Vxc Vyc Cxc Cyc Vmag Cmag;
tEND = clock;
tTOT = etime(tEND,tSTART);
fprintf('\n TOTAL TIME running: %g secs, (%g min)\n',tTOT,tTOT/60);
clear tSTART tEND tTOT;

D.1.2 function OF3Dsep

% This function returns the component velocities based on the input
% parameters.
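% (Illustrative note, not part of the original listing: OF above calls this
%  routine with sigma = 1/sigmak and the filter spacing wstepmin computed
%  from f0 and beta.)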
function [Vxn Vyn lfilt] = OF3Dsep(I, sigma, f0, wstepmin)
% INPUTS
%   I        = image sequence
%   sigma    = standard deviation of the Gaussian
%   f0       = tuning frequency
%   wstepmin = spacing between filters
% OUTPUTS
%   Vxn, Vyn = 4D velocity outputs
%   lfilt    = length of the filter used

warning off MATLAB:divideByZero;
nfilts = floor(360/wstepmin);   % number of filters that will fit
w0deg = 0:360/nfilts:359;
fprintf('number of filters: %d\n',nfilts);
fprintf('filter spacing: %3.2f\n',360/nfilts);

%% define output vectors
[srow scol st] = size(I);
Vxn = zeros(srow,scol,st,nfilts);
Vyn = zeros(srow,scol,st,nfilts);
R = zeros(srow,scol,st,nfilts);

%% run compvel code
for i = 1:nfilts
    w0 = w0deg(i)*2*pi/360;
    fprintf('\nfilter #%d: w0deg = %3.2f\n',i,w0deg(i));
    % run Gabor function
    [R(:,:,:,i) lfilt] = gabor3Dsep(I,sigma,f0,w0);
    %% run compute velocity code
    [Vxn(:,:,:,i) Vyn(:,:,:,i)] = compvel(R(:,:,:,i));
end;

D.1.3 function gabor3Dsep

% This function filters the image with the Gabor filters
function [R ssigmak] = gabor3Dsep(I,sigma,f0,w0)
% INPUTS
%   I     = image sequence
%   sigma = standard deviation of Gaussian
%   f0    = tuning frequency
%   w0    = filter orientation
% OUTPUTS
%   R       = filtered image
%   ssigmak = size of the filter

t1 = clock;

%% generate Gaussian
[sy sx st] = size(I);
ct = ceil(st/2);
cy = sy/2+1;
cx = sx/2;
[G X Y T] = g3D(sx, sy, st, sigma);   % create 3D
G3 = G(:,:,ct);
sv = round(sigma);
Glog = zeros(sy,sx,st);   % mask values outside stddev
for i = 1:sy
    for ji = 1:sx
        for k = 1:st
            yd = abs(cy-i);
            xd = abs(cx-ji);
            td = abs(ct-k);
            Glog(i,ji,k) = (sqrt(xd^2+yd^2+td^2)<sv);
        end;
    end;
end;
clear yd xd td;
clear i ji k sv;
G3 = G3.*Glog(:,:,ct);

%% create sinusoid component
sinmodx = exp(j*(2*pi*f0.*X*cos(w0)));
sinmody = exp(j*(2*pi*f0.*Y'*sin(w0)));

%% make gabor filters
gx = normpdf(X,0,sigma);
gy = normpdf(Y',0,sigma);
gt = normpdf(T,0,sigma);
gt = gt/(sum(gt));
gabx = sinmodx.*gx;
gaby = sinmody.*gy;
g = gaby*gabx;
clear gx gy gabx gaby sinmodx sinmody;

%% convert to final 2D signal
g = g.*Glog(:,:,ct);
g = g/(sum(g(:)));   % normalize, so sum = 1
ft = fft2(g);
fts = fftshift(ft);
cX = floor(sx/2+1)+round(sx*f0*cos(w0));
cY = ceil(sy/2+1)-round(sy*f0*sin(w0));
ssigmak = ceil(sx/sigma);
% split into x, y
gmask = zeros(size(g));
for i = cY-ssigmak:cY+ssigmak;       % ROW / Y direction
    for k = cX-ssigmak:cX+ssigmak;   % COL / X direction
        gmask(i,k) = (sqrt((i-cY)^2+(k-cX)^2)<ssigmak);   % 1 if dist < sigmak away from ctr
    end;
end;
clear i k;
ftsmask = fts.*gmask;
imask = ifftshift(ftsmask);
g = ifft2(imask);
clear cY cX srow scol txt;
clear ft fts ftsmask gmask r c;

%% 2D multiplication in FOURIER
sr = 2*sy;
sc = 2*sx;
dr = (sr-sy);
dc = (sc-sx);
st = size(I,3);
gft = fft2(g,sr,sc);
I2D = zeros(size(I));
for i = 1:st;
    Ift = fft2(I(:,:,i),sr,sc);
    I2Dmult = ifft2(Ift.*gft);
    I2D(:,:,i) = I2Dmult(sy-(sy/2-1):sy+sy/2,sx-(sx/2-1):sx+sx/2);
end;
clear i sr sc st dr dc;
clear gpad gft Ipad Ift I2Dmult;

%% convolve -> R
R = I2D;
for irow = 1:size(I,1)
    It = I2D(irow,:,:);
    It2D(:,:) = It(1,:,:);
    R(irow,:,:) = conv2(It2D,gt,'same');
end;
clear irow;
clear It I2D It2D;
t2 = clock;
tRUN = etime(t2,t1);
fprintf('TIME gabor3Dsep: %g\n',tRUN);
clear t1 t2 tRUN;

D.1.4 function compvel

% This function finds the component velocities from the output of the image
% convolved with Gabor filters.
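% (Illustrative note, not part of the original listing: this routine
%  implements the component-velocity equations of Appendix A,
%  u_n = -phi_t*phi_x/(phi_x^2 + phi_y^2) and v_n = -phi_t*phi_y/(phi_x^2 + phi_y^2),
%  with the phase gradients obtained through the identity of Appendix B.)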
function [Vx Vy] = compvel(R)
% INPUTS
%   R = output of the image convolved with Gabor filters
% OUTPUTS
%   Vx, Vy = component velocities

t1 = clock;

%% find phase gradient: calc gradient w/ method from Fleet
[srow scol st] = size(R);
flt = 1/18*[-1 8 0 -8 1]; % derivative filter
for i = 1:st
    dX(:,:,i) = conv2(R(:,:,i),flt,'same');
    dY(:,:,i) = conv2(R(:,:,i),flt','same');
end;
for irow = 1:srow
    Rt = R(irow,:,:);
    Rt2D(:,:) = Rt(1,:,:);
    dT(irow,:,:) = conv2(Rt2D,flt,'same');
end;
clear flt;

%% calc phi_x,y,t
phiX = imag(conj(R).*dX)./(abs(R).^2);
phiY = imag(conj(R).*dY)./(abs(R).^2);
phiT = imag(conj(R).*dT)./(abs(R).^2);

%% find velocities
Pnorm = phiX.^2+phiY.^2;
Vx = -phiT.*(phiX./Pnorm);
Vy = -phiT.*(phiY./Pnorm);

t2 = clock;
fprintf(' TIME compvel: %g secs\n',etime(t2,t1));
clear t1 t2;

D.1.5 function g3D

% This function creates a centered 3D Gaussian with mu = 0.
function [g x y t] = g3D(lx, ly, lt, sigma)
% INPUTS
%   lx, ly, lt = lengths in x, y and time
%   sigma      = standard deviation of the Gaussian
% OUTPUTS
%   g       = 3D Gaussian
%   x, y, t = corresponding vectors

%% set up vectors
if (mod(lx,2)) % lx is odd
    xmin = -floor(lx/2);
    xmax = floor(lx/2);
else
    xmin = -lx/2+1;
    xmax = lx/2;
end;
if (mod(ly,2)) % ly is odd
    ymin = -floor(ly/2);
    ymax = floor(ly/2);
else
    ymin = -ly/2+1;
    ymax = ly/2;
end;
if (mod(lt,2)) % lt is odd
    tmin = -floor(lt/2);
    tmax = floor(lt/2);
else
    tmin = -lt/2+1;
    tmax = lt/2;
end;

dx = 1;
dy = 1;
dt = 1;
x = xmin:dx:xmax;
y = ymin:dy:ymax;
t = tmin:dt:tmax;
y = -y+1;

[X Y] = meshgrid(x,y);
T = t;
inp = sqrt(X.^2+Y.^2);
g2D = normpdf(inp,0,sigma)/(sqrt(2*pi)*sigma);
gt = normpdf(T,0,sigma);
g = zeros(size(X,1),size(X,2),length(T));
for i = 1:length(T)
    g(:,:,i) = g2D*(gt(i));
end;
clear gt g2D;

D.1.6 function valid_vel

% This function computes the normalized 2D velocities from a set of
% component velocities.
function [Vx Vy] = valid_vel(Vxn, Vyn, dlmt, stdlmt, lmt)
% INPUTS
%   Vxn, Vyn = component velocities (4D arrays: x, y, frame, filter)
%   dlmt     = max distance from mean
%   stdlmt   = max std deviation
%   lmt      = fraction of vals needed to form sample
% OUTPUTS
%   Vx, Vy = 2D velocities

[s1 s2 s3 s4] = size(Vxn);

%% find valid velocities by taking mean (over small std)
% standardize over filters
[Vxd4 Vyd4] = stdize_dimSUB(Vxn,Vyn,4,dlmt,stdlmt);
fprintf('\nvel4');
% outputs -> nans from Vxn, or outliers
Vxnan4 = sum(~isnan(Vxd4),4); % sum # non-nans over 4th dim (filters)
Vynan4 = sum(~isnan(Vyd4),4);

% sum velocity values over filters, frames
% scale by # non-nans at each 2D point
Vx0 = Vxd4;
Vx0(isnan(Vxd4)) = 0;
Vy0 = Vyd4;
Vy0(isnan(Vyd4)) = 0;

Vxns = sum(Vx0,4)./Vxnan4; % generates 3-D array, SCALED
Vyns = sum(Vy0,4)./Vynan4; % to scale, divide by # non-nan filters (Vxnan4, Vynan4)

Vxns(Vxnan4<s4*lmt) = nan; % locations where too many of the values were nan
Vyns(Vynan4<s4*lmt) = nan;
% Vxnan4/Vynan4 indicate # valid pts (NOT nan -> want high for good avg, sml std dev)
clear Vxnan4 Vynan4;
clear Vxd4 Vyd4;
clear Vx0 Vy0;

% standardize over frames
[Vxd3 Vyd3] = stdize_dimSUB(Vxns,Vyns,3,dlmt,stdlmt);
fprintf('\nvel3\n');
Vxnan3 = sum(~isnan(Vxd3),3); % sum # non-nans over 3rd dim (time frames)
Vynan3 = sum(~isnan(Vyd3),3);

Vx0 = Vxd3;
Vx0(isnan(Vxd3)) = 0;
Vy0 = Vyd3;
Vy0(isnan(Vyd3)) = 0;

Vxavg = sum(Vx0,3)./Vxnan3; % generates 2-D array, SCALED
Vyavg = sum(Vy0,3)./Vynan3; % scaling by # of non-nan frames

Vx = Vxavg;
Vy = Vyavg;
Vx(Vxnan3<s3*lmt) = nan;
Vy(Vynan3<s3*lmt) = nan;

clear Vxnan3 Vxd3;
clear Vynan3 Vyd3;
clear Vx0 Vy0;
clear Vxavg Vyavg;
clear Vxns Vyns;

D.1.7 function stdize_dim

% This function performs the standardization
function [Vxd Vyd] = stdize_dimSUB(Vx,Vy,dim,dlmt,stdlmt)
% INPUTS
%   Vx, Vy = component velocities
%   dim    = dimension over which to standardize (3 or 4)
%   dlmt   = max distance from the mean
%   stdlmt = max std dev
% OUTPUTS
%   Vxd, Vyd = standardized velocities

Vx0 = Vx;
Vx0(isnan(Vx)) = 0;
Vy0 = Vy;
Vy0(isnan(Vy)) = 0;

%% DIM = 3
if dim == 3
    Vxstd = std(Vx0,0,dim);
    for i = 1:size(Vx,dim)
        vx = Vx(:,:,i);
        dx = (Vx(:,:,i)-mean(Vx0,dim))./Vxstd;
        vx((abs(dx)>dlmt)&(abs(Vxstd)>stdlmt)) = nan;
        Vxd(:,:,i) = vx;
        clear vx dx;
    end;
    clear Vxstd Vx0;
    Vystd = std(Vy0,0,dim);
    for i = 1:size(Vy,dim)
        vy = Vy(:,:,i);
        dy = (Vy(:,:,i)-mean(Vy0,dim))./Vystd;
        vy((abs(dy)>dlmt)&(abs(Vystd)>stdlmt)) = nan;
        Vyd(:,:,i) = vy;
        clear vy dy;
    end;
    clear Vystd Vy0;
%% DIM = 4
elseif dim == 4
    Vxstd = std(Vx0,0,dim);
    for i = 1:size(Vx,dim)
        vx = Vx(:,:,:,i);
        dx = (Vx(:,:,:,i)-mean(Vx0,dim))./Vxstd;
        vx((abs(dx)>dlmt)&(abs(Vxstd)>stdlmt)) = nan;
        Vxd(:,:,:,i) = vx;
        clear vx dx;
    end;
    clear Vxstd Vx0;
    pack;
    Vystd = std(Vy0,0,dim);
    for i = 1:size(Vy,dim)
        vy = Vy(:,:,:,i);
        dy = (Vy(:,:,:,i)-mean(Vy0,dim))./Vystd;
        vy((abs(dy)>dlmt)&(abs(Vystd)>stdlmt)) = nan;
        Vyd(:,:,:,i) = vy;
        clear vy dy;
    end;
    clear Vystd Vy0;
end

D.2 Deblur Code

% This function takes an image and deblurs it using a regularization
% filter. PSFs are created using velocities which are provided.
function [I] = OFdeblur(I,n,r,Vx,Vy)
% INPUTS
%   I      = 2D image that needs to be deblurred
%   Vx, Vy = velocity vectors
%   n      = degree of interpolation
%   r      = value of regularization term (constant)
% OUTPUTS
%   I = deblurred image

%% pre-allocate arrays
[sr sc] = size(I);
I4 = zeros(sr*n+1-n,sc*n+1-n);
Vx4 = zeros(sr*n+1-n,sc*n+1-n);
Vy4 = zeros(sr*n+1-n,sc*n+1-n);
Vmag4 = zeros(sr*n+1-n,sc*n+1-n);
Vang = zeros(sr,sc);

%% interpolate to degree n
n2 = 2^(n-1);
img = interp2(I,n-1);
Vx4 = 2^(n-1)*interp2(Vx,(n-1));
Vy4 = 2^(n-1)*interp2(Vy,(n-1));
Vmag4 = sqrt(Vx4.^2+Vy4.^2);
Vang = atan2(-Vy,Vx)*180/pi;
[sr4 sc4] = size(img);
clear Vx Vy;

RE = 10; % remove edges
Iout = zeros(sr,sc);
h2 = zeros(size(I4));

t1 = clock;
for i = RE:sr-RE % for each row
    t1i = clock;
    for j = RE:sc-RE % for each column
        fprintf('.');
        % extract corresponding location in interpolated image
        indi = n2*i-(n2-1);
        indj = n2*j-(n2-1);
        % determine length, angle parameters
        len = Vmag4(indi,indj);
        theta = Vang(i,j);
        rlen = round(len);
        % throw out invalid values: rlen = 0 or nan
        if (isnan(rlen) || rlen == 0)
            Iout(i,j) = img(indi,indj);
        else % valid values
            PSF = fspecial('motion',rlen,theta); % create PSF
            [p1 p2] = size(PSF);
            if length(PSF)==1 % do not deblur if PSF size = 1
                Iout(i,j) = img(indi,indj);
            else
                K = fft2(PSF);                     % fourier transform
                H = conj(K)./(K.*conj(K) + r);     % regularization filter
                h = ifft2(H);                      % convert to space domain
                h2 = 0*h2;                         % clear array
                h2(1:p1,1:p2) = h(p1:-1:1,p2:-1:1); % flip over axes
                ctrr = floor((p1+1)/2);
                ctrc = floor((p2+1)/2);
                h2 = circshift(h2,[indi-ctrr,indj-ctrc]); % shift array
                Imult = img.*h2;                   % perform multiplication
                Iout(i,j) = sum(Imult(:));         % perform sum
            end % length(PSF)
        end % rlen
    end % j
    t2i = clock;
    % fprintf('\nT(i=%d) = %g\n',i,etime(t2i,t1i));
end % i
clear t1i t2i;
t2 = clock;
% fprintf('\nT[min] = %g\n',etime(t2,t1)/60);
I = Iout; % return the deblurred image
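As with the optical flow listing, the following sketch is illustrative only. The stand-in image, the constant velocity field, the interpolation degree n = 2, and the regularization constant r = 0.01 are assumptions made for the example, and equating the assumed velocity magnitude with the blur length reflects the linearized motion-blur model rather than any particular result in this thesis.

% Minimal usage sketch for OFdeblur (assumed, illustrative values only).
Isharp = rand(64,80);                    % stand-in sharp frame
PSF = fspecial('motion',5,0);            % five-pixel horizontal motion blur
Iblur = conv2(Isharp,PSF,'same');        % artificially blurred frame
Vx = 5*ones(size(Iblur));                % assumed 2D velocity field:
Vy = zeros(size(Iblur));                 %   five pixels/frame in x, none in y
n = 2;                                   % interpolation degree (assumed)
r = 0.01;                                % regularization constant (assumed)
Ideblur = OFdeblur(Iblur,n,r,Vx,Vy);
% Pixels within RE = 10 of the border are left untouched by the routine.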
Bibliography

[1] Ivar Austvoll. A study of the Yosemite sequence used as a test sequence for estimation of optical flow. In Image Analysis, volume 3540 of Lecture Notes in Computer Science, pages 659-668, 2005.

[2] J.L. Barron, D.J. Fleet, and S.S. Beauchemin. Performance of optical flow techniques. International Journal of Computer Vision, 12(1):43-77, 1994.

[3] Moshe Ben-Ezra and Shree K. Nayar. Motion-based motion deblurring. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):689-698, 2004.

[4] Michael J. Black. Frequently Asked Questions. Department of Computer Science, Brown University, 2005. http://www.cs.brown.edu/people/black.

[5] Michael J. Black and Yaser Yacoob. Recognizing facial expressions in image sequences using local parameterized models of image motion. International Journal of Computer Vision, 25(1):23-48, 1997.

[6] Atanas Boev, Chavdar Kalchev, Atanas Gotchev, Tapio Saramäki, and Karen Egiazarian. Efficient motion estimation utilizing quadrature filters. unk, unk(unk):unk, unk.

[7] Jonathan W. Brandt. Improved accuracy in gradient-based optical flow estimation. International Journal of Computer Vision, 25(1):5-22, 1997.

[8] Andrés Bruhn, Joachim Weickert, and Christoph Schnörr. Combining the advantages of local and global optic flow methods. In Pattern Recognition: 24th DAGM Symposium, Zurich, Switzerland, Proceedings, volume 2449 of Lecture Notes in Computer Science, pages 454-462, 2002.

[9] Jesús Chamorro-Martínez, Javier Martínez-Baena, Elena Galán-Perales, and Belén Prados-Suárez. Dealing with multiple motions in optical flow estimation. In Pattern Recognition and Image Analysis, volume 3522 of Lecture Notes in Computer Science, pages 52-59, 2005.

[10] Tony F. Chan and Jianhong Shen. Image Processing and Analysis. Society for Industrial and Applied Mathematics, 2005.

[11] Computer Vision Research Group, Department of Computer Science, University of Otago, Dunedin, New Zealand. http://www.cs.otago.ac.nz/research/vision/Resources/index.html.

[12] David J. Fleet. Measurement of Image Velocity. Kluwer Academic Publishers, 1992.

[13] David J. Fleet and Allan D. Jepson. Computation of component image velocity from local phase information. International Journal of Computer Vision, 5(1):77-104, 1990.

[14] David J. Fleet and Allan D. Jepson. Stability of phase information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(12):1253-1268, 1993.

[15] David J. Fleet and Yair Weiss. Optical flow estimation. unk, unk.

[16] fxguide.com. Art of Optical Flow, Feb 2006. http://www.fxguide.com/article333.html.

[17] B. Galvin, B. McCane, K. Novins, D. Mason, and S. Mills. Recovering motion fields: An evaluation of eight optical flow algorithms. In British Machine Vision Conference, 1998.

[18] Temujin Gautama and Marc M. Van Hulle. A phase-based approach to the estimation of the optical flow field using spatial filtering. IEEE Transactions on Neural Networks, 13(5):1127-1136, 2002.

[19] Hsueh-Ming Hang, Yung-Ming Chou, and Sheu-Chih Cheng. Motion estimation for video coding standards. Journal of VLSI Signal Processing, 17:113-136, 1997.

[20] Fredric J. Harris. On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE, 66(1):51-83, 1978.

[21] Berthold K.P. Horn and Brian G. Schunck. Determining optical flow. Artificial Intelligence, 17:185-203, 1981.

[22] Jiří Jan. Medical Image Processing, Reconstruction and Restoration. Taylor & Francis Group, 2006.

[23] Deepa Kundur and Dimitrios Hatzinakos. Blind image deconvolution. IEEE Signal Processing Magazine, 13(3):43-64, 1996.

[24] Jae S. Lim. Two-Dimensional Signal and Image Processing. Prentice Hall PTR, 1990.

[25] Alan V. Oppenheim and Alan S. Willsky. Signals & Systems. Prentice Hall, 2nd edition, 1997.

[26] RE:Vision Effects, Inc. RE:Vision Effects founders to receive Academy Award, Jan 2007. http://www.revisionfx.com/company/press-releases/01222007.