Optical Flow Using Phase Information for Deblurring
by
Cheryl Texin
Submitted to the Department of Electrical Engineering and Computer Science
in Partial Fulfillment of the Requirements for the Degree of
Master of Engineering in Electrical Engineering and Computer Science
at the Massachusetts Institute of Technology
May, 2007
©2007 Massachusetts Institute of Technology
All rights reserved.
Author
Department of Electrical Engineering and Computer Science
May 2007

Certified by
Michael Matranga
Group Leader, Charles Stark Draper Laboratory
VI-A Company Thesis Supervisor

Certified by
Jae S. Lim
Professor
Thesis Supervisor

Accepted by
Arthur C. Smith
Professor of Electrical Engineering
Chairman, Department Committee on Graduate Theses
Optical Flow Using Phase Information for Image Deblurring
by
Cheryl Texin
Submitted to the Department of Electrical Engineering and Computer Science
on May 30, 2007, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Electrical Engineering
Abstract
This thesis presents a method for reconstructing motion-degraded images by using
velocity information generated with a phase-based optical flow calculation. The optical flow method applies a set of frequency-tuned Gabor filters to an image sequence
in order to determine the component velocities for each pixel by tracking temporally
separated phase contours. The resulting set of component velocities is normalized and
averaged to generate a single 2D velocity at each pixel in the image. The 2D optical
flow velocity is used to estimate the motion-blur PSF for the image reconstruction
process, which applies a regularization filter to each pixel.
The 2D velocities generally had small angular and magnitude errors. Image sequences where the motion varied from frame to frame had poorer results than image
sequences where the motion was constant across all frames. The quality of the deblurred image is directly affected by the quality of the velocity vectors generated
with the optical flow calculations. When accurate 2D velocities are provided, the
deblurring process generates sharp results for most types of motion. The magnitude
error proved to be a larger problem than the angular error, due to the averaging
process involved in creating the 2D velocity vectors from the component velocities.
Both the optical flow and deblurring components had difficulty handling rotational
motion, where the linearized model of the motion vector is inappropriate. Retaining
the component velocities may solve the problem of linearization.
Thesis Supervisor: Jae S. Lim
Title: Professor
Thesis Supervisor: Michael Matranga
Title: Group Leader, Charles Stark Draper Laboratory
Acknowledgments
This work would not have been possible without the generous support and guidance
of many people from both Draper Labs and the Electrical Engineering and Computer
Science Department of MIT.
First and foremost I would like to thank Mike Matranga of Draper Labs. I could
not have asked for a better coach. Mike suggested this thesis topic among others and
connected me with others at Draper Labs who have expertise in this field. Having
launched me down this path, his enthusiasm, encouragement, and abundant assistance
over the past two years propelled me to this target.
Paul DeBitteto, Rich Madison, and Greg Andrews also provided much needed
advice and guidance that helped shape this thesis. Rich in particular freely shared his
extensive knowledge, based on his own thesis research, which was invaluable to me.
I would like to thank Jae Lim of MIT for sparking and stimulating my interest in
image processing while I attended his course, and for his support as my
faculty advisor. Also, to Anne Hunter, who is always knowledgeable and supportive,
thank you for your kind attention to me and for genuinely caring for each student
undergoing the entire thesis process.
And finally, I would like to thank my family and friends, without whom I would
not have survived this past year. For their unwavering support and encouragement,
I am extremely grateful.
Contents

1 Introduction
  1.1 Motion Estimation
  1.2 Image Restoration

2 Relevant Work on Optical Flow
  2.1 Brightness-Constancy
    2.1.1 Horn and Schunck
    2.1.2 Lucas and Kanade
  2.2 Phase-Based Methods: Fleet and Jepson
  2.3 Method Analysis and Evaluation
    2.3.1 Image Sequences
    2.3.2 Parameter Choice

3 Relevant Work on Deblurring
  3.1 Motion Blur as Convolution
  3.2 Noise
  3.3 Blind Deconvolution

4 Approach
  4.1 Optical Flow
    4.1.1 Gabor Filters
    4.1.2 Component Velocities
    4.1.3 2D Velocity
    4.1.4 Error Calculation
  4.2 Deblurring
    4.2.1 Regularization Term r(w)
    4.2.2 Generating the Filter H(w)
    4.2.3 Applying the Deblurring Filter H(w)
    4.2.4 Measuring the Deblurring Error

5 Results
  5.1 Optical Flow
    5.1.1 Optical Flow Test Sequences
    5.1.2 Parameter Selection
    5.1.3 Evaluating Error
  5.2 Deblurring
    5.2.1 Deblurring Test Sequences
    5.2.2 Determining r(w)
    5.2.3 Interpolation
    5.2.4 Evaluation of Deblurring
  5.3 Combining Optical Flow and Deblurring
    5.3.1 Artificially Blurred Images
    5.3.2 Real Images

6 Summary and Future Work
  6.1 Optical Flow Summary
  6.2 Improving the Optical Flow Method
  6.3 Deblurring Summary
  6.4 Improving the Deblurring Method
  6.5 Future Work and Applications

A Calculating Component Velocities

B Phase Identity Derivation

C Fleet and Jepson: Computing 2D Velocity

D MATLAB Code
  D.1 Optical Flow Code
    D.1.1 function OF
    D.1.2 function OF3Dsep
    D.1.3 function gabor3Dsep
    D.1.4 function compvel
    D.1.5 function g3D
    D.1.6 function validvel
    D.1.7 function stdize_dim
  D.2 Deblur Code
List of Figures

2-1 Yosemite: Frame #9 and Ground-Truth Optical Flow Field
3-1 Example PSF and its Fourier Transform
3-2 Inverse Filter and Restoring Function
3-3 Inverse Filter with Threshold and Corresponding Deconvolution Kernel
4-1 Real Parts of Gabor Filter and its Fourier Transform
5-1 Yosemite
5-2 Sphere
5-3 Office
5-4 Street
5-5 Blocks
5-6 Vcbox
5-7 Simple
5-8 Medium
5-9 Complex
5-10 Mean Angular Error vs. Frequency
5-11 Mean Magnitude Error vs. Frequency
5-12 Mean Angular Error vs. Frequency; Difference between Original and Interpolated Tests
5-13 Yosemite Mean Angular Error, Clouds Masked
5-14 Yosemite: Angular Error Map
5-15 Mean Angular Error vs. Temporal Support Size, part 1
5-16 Mean Angular Error vs. Temporal Support Size, part 2
5-17 Mean Angular Error vs. σ
5-18 Density vs. Frequency
5-19 Discontinuity Error
5-20 Magnitude Comparison
5-21 Simple Sequence Error
5-22 Images for Deblurring Tests
5-23 Effects of Different r Values
5-24 Office: Effect of Interpolation
5-25 Deblurring Test Results
5-26 Office: Zoom on Picture Frame
5-27 Street: Zoom on Car
5-28 Vcbox: Zoom on Box Label
5-29 Medium: Zoom on Car
5-30 Deblurring Results Using Calculated Optical Flow
5-31 Effect of Magnitude Error on Deblurring
5-32 Effect of Angular Error on Deblurring
5-33 Real Sequence 1
5-34 Real Sequence 2
5-35 Real Sequence 3
5-36 Real Sequence 4
5-37 Real Sequence Results
5-38 Real Sequence Results Using Scaled Velocity Vectors

List of Tables

5.1 Mean Error
Chapter 1
Introduction
This thesis proposes a methodology to correct images that are blurred by motion.
Many applications require sharp images, but motion of either the camera or objects
in the field of view can cause the image to be fuzzy or blurred. The basic premise of
the thesis is to estimate the motion vectors and use this information to restore the
photo. The motion estimate is determined using a phase-based optical flow method.
The image restoration process applies current deblurring techniques, utilizing a point
spread function generated from the optical flow field.
This chapter gives a broad overview of motion estimation and image restoration.
The following chapters provide detailed descriptions of relevant work done in the
fields of optical flow and deblurring. The two concepts, motion estimation and image
restoration, are discussed separately as they developed independently historically.
1.1
Motion Estimation
Motion estimation is the process of determining the direction of motion within an
image. The resulting motion estimates are useful in many image processing and
computer vision (CV) applications.
The level of accuracy and resolution that is
necessary in the motion estimates depends on the application for which the estimates
are being used. This section describes some basic motion estimation methods and
applications.
One use of motion estimates is to compress video sequences, as in the MPEG encoding system [6], [24]. Where possible, MPEG encoded video streams transmit only
the motion estimates and the prediction error, rather than the entire frame. This
limited transmission decreases the amount of data that is transmitted and therefore
allows for lower bit rates [24]. The encoding scheme transmits either the motion information or the intensity values based on which method requires the smaller number
of bits to adequately represent the frame. Since the goal of using the motion information is to reduce the number of bits transmitted, and because increased resolution
in the motion information does not necessarily improve the result, pixel or subpixel
accuracy in the motion estimation is not desired. For video, the frame rate is fast
enough that small distortions between frames are not noticeable to the naked eye and
coarse motion estimates are acceptable [19]. Motion estimation can also be used for
video to interpolate additional frames between given frames. Temporally interpolating an image in this manner can be useful to adjust between frame rates, as when going
from a 30 frames per second format to a 60 frames per second format in TV [24], as
well as to generate slow-motion effects in video [16].
Many motion estimation methods are rather simple, including, for example, region or block matching. The region matching method involves searching for
the best match to a section of the image [10], [24] . This method of motion estimation
and many other similarly simplified methods make broad assumptions concerning the
constancy in shape and size of the objects in the sequence [19]. These assumptions
reduce the effectiveness of the restoration when objects in the image change over time
due to reflections, deformations or occlusions. This type of estimation also works only
with a relatively slow-varying image where each region has identifying characteristics.
However, for many cases these assumptions are adequate for the application. Again,
in video, the frame rate is fast enough that the artifacts from the processing are
generally unnoticeable to the human eye.
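To make the block-matching idea concrete, the sketch below finds the displacement of a single block by minimizing a sum-of-absolute-differences criterion; the function name, inputs, and the SAD measure are illustrative assumptions rather than the method used in this thesis.

    % Minimal SAD block matching for one block (illustrative only; frame1,
    % frame2, blockSize, and searchRange are assumed inputs).
    function [dx, dy] = match_block(frame1, frame2, r0, c0, blockSize, searchRange)
        block = double(frame1(r0:r0+blockSize-1, c0:c0+blockSize-1));
        bestErr = inf; dx = 0; dy = 0;
        for dr = -searchRange:searchRange
            for dc = -searchRange:searchRange
                r = r0 + dr; c = c0 + dc;
                if r < 1 || c < 1 || r+blockSize-1 > size(frame2,1) || ...
                        c+blockSize-1 > size(frame2,2)
                    continue;                         % candidate falls off the frame
                end
                cand = double(frame2(r:r+blockSize-1, c:c+blockSize-1));
                err = sum(abs(block(:) - cand(:)));   % SAD matching criterion
                if err < bestErr
                    bestErr = err; dx = dc; dy = dr;  % best displacement so far
                end
            end
        end
    end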
Motion estimation is also useful in Artificial Intelligence (AI) applications. These
applications often require a higher degree of resolution. Within a given frame, this
type of processing can help with segmenting and tracking objects, which is beneficial
in computer vision for automatic recognition and tracking of visual input. Object
tracking can help to reduce noise and enhance the image at occlusion boundaries [16].
This technique can be applied to machine recognition of facial expressions and aid
in machine learning [5]. Another use of motion estimation is to reconstruct three-dimensional descriptions from a flat 2D image [7], [16], [6]. In these varieties of applications, pixel or subpixel accuracy can be important, as the image will be interpreted
at each frame and should be visually correct as a still. Due to the more stringent
accuracy requirements, robustness with respect to variations in lighting and deformation of objects is important. Therefore the coarse motion estimation algorithms
mentioned above are poor choices for these applications. The optical flow method of
motion estimation can result in very dense, highly accurate velocity flow fields on the
order of subpixel resolution [13]. This capability makes optical flow useful for those
applications in which high resolution is necessary, in contrast to the low resolution
estimators previously mentioned. Optical flow methods have also been used to create
special effects in movies such as What Dreams May Come and The Matrix [26],[16].
Chapter 2 provides details on methods of generating optical flow.
1.2
Image Restoration
The purpose of image restoration is to enhance or correct a degraded image. Image
degradation occurs for many reasons including noise, quantization through digitization, or blurring. The correction process depends on the type of degradation that
occurred. This thesis focuses on restoring images that were degraded by blurring
due to motion, and this section describes the process through which the motion blur
occurs.
Blurring can be caused by many factors, including lens misfocus, atmospheric
distortion, or movement during the image capture [24], [10]. Blurring smooths high
frequency components, reducing detail. This blurring is essentially a low-pass filter
effect, and so some details may be unrecoverable.
Motion blur is captured in a given frame due to the finite exposure time necessary
for the light to travel to the film or photosensors. This phenomenon is a consequence
of the physics involved in taking a picture, regardless of whether the camera is film
or digital . For simplicity, the concepts will be discussed in terms of digital cameras
(pixels versus film grains), although the concepts apply to film cameras as well. A
lens focuses light to the photosensors, which are activated by the light photons. The
amount of light received at each sensor determines the degree of that sensor's contribution to the final image. The longer the sensor is exposed to light, the greater the
degree of its activation. A large sensor can capture more light over the same time
period than a small sensor can. However, the resolution of the final image will be
lower, because there are fewer independent pixels. Motion during the exposure time
causes the incoming photons to shift relative to the receiving photosensors. Therefore, neighboring sensors can receive similar visual information, and involvement of
multiple pixels causes the blurring effect in the resulting image.
The amount of motion captured, due to the spread of light across the pixels, depends on the length of the exposure time of the image and the velocity of relative
motion between the camera and the image being captured. Exposure times are constrained by the light available as well as the mechanical limits of the shutter. The
exposure time cannot be shorter than the speed at which the shutter can operate.
The required exposure time is a factor of the sensitivity and size of the imaging sensors, as well as the external light conditions. Low light conditions require a longer
exposure time in order to achieve acceptable contrast in the image.
The motion blur is a relative effect, and therefore can be caused by movement
in either the subject or the camera. When the camera is the source of movement,
the blur field is generally consistent throughout the image [10]. However, motion in
front of the camera may also contribute to the blurring effect. When the motion lies
in the field of view of the camera, the motion contributing to the blurring may vary
at points across the image, making this situation more difficult to analyze. Many
cameras come with stabilization hardware to reduce handling motion on the part
of the photographer. Hardware solutions have also been developed to track camera
motion during the image capture and provide a motion estimate [3]. Sensors on the
subject of the image can be used to track motion in front of the lens as well, but
this sensor tracking requires cooperation from the subject, as well as foreknowledge
of the moving objects to be photographed. This thesis assumes that there is no prior
knowledge of the direction of motion from either the camera or the subject.
In many cases the motion is undesirable from both a visual and practical standpoint. In addition to being more visually appealing, unblurred images are important
for the future processing the image may undergo. Images with sharper edges and
more distinct features and textures improve the ability of many AI applications,
such as those mentioned in section 1.1, to track and segment objects. Furthermore,
identification processes such as iris recognition require clear, high-resolution images
to perform the identification. Acquiring adequately detailed images without post-processing involves cooperative interaction between the subject and the imager, which
can be difficult in many cases. Estimating the motion blur is critical for accurately
restoring the image to remove the motion. More details on the restoration process
appear in Chapter 3.
Chapter 2
Relevant Work on Optical Flow
Over the past 20 years, optical flow has become a popular technique for motion estimation in many AI applications as well as visual special effects. However, optical flow
methods are far from perfect. There is a fundamental limitation in that the method
requires the velocity to be calculated in two dimensions, while there is only a single
intensity value at each spatial position in an image.
This results in a single equation with two variables. This deficiency is commonly
referred to as the aperture problem. Assumptions must be made to generate a uniquely
solvable system of equations. The assumptions generate limits on the quality of the
results and often focus on brightness-constancy or phase-constancy. This chapter
discusses these common approaches to solving this aperture problem. The first two
methods illustrate brightness-constancy approaches, while the third method described
here represents a phase-based approach. The chapter concludes with a discussion on
the analysis and evaluation of optical flow methods.
2.1
Brightness-Constancy
The common optical flow method of brightness-constancy assumes that the intensity
of a given pixel does not change as the pixel translates from one frame to the next.
This assumption is violated by changing light sources, rotation, and specular reflection, among other things [15]. However, in many cases this assumption is accurate
over the time frame of the image sequence and can be used to generate an optical flow
field. Brightness-constancy can be written as in Equation 2.1 below, with x = (x, y)
and v = (dx/dt, dy/dt) = (u, v).
I(x, 0) = I(x + vt, t)
(2.1)
This is equivalent to stating that the brightness is conserved over time, which can be
expressed as:
I(x, t) = c
(2.2)
Solving the above equation therefore becomes the fundamental problem in brightness-constancy optical flow. Taking the first-order derivative of Equation 2.2, we get:

dI(x, t)/dt = 0    (2.3)

Using the chain rule for differentiation, Equation 2.3 becomes:

∂I(x, t)/∂x · dx/dt + ∂I(x, t)/∂y · dy/dt + ∂I(x, t)/∂t = I_x u + I_y v + I_t = 0    (2.4)
Equation 2.4, called the gradient constraint equation in the literature [2], is the
main equation for first-derivative gradient-based optical flow methods.
Solutions
to the brightness-constancy equation have also been developed using second order
derivatives. However, these methods have been demonstrated to be less effective in
practice [2], and therefore will not be discussed in this thesis.
The deviation of this equation from zero will be called the error in brightness
constancy ε_b. Minimizing the error ε_b is the main objective in choosing values for the
velocity vector. However, the aperture problem still needs to be solved, since there is
a single equation in I with two variables u and v.
Two methods provide the basis for much of the work done in optical flow using
brightness-constancy methods. These are the first-order differential methods developed by the pairs Horn and Schunck and Lucas and Kanade. Sections 2.1.1 and 2.1.2
will describe these methods, respectively.
2.1.1
Horn and Schunck
Horn and Schunck proposed a method in which they applied a global constraint to
the brightness-constancy equation to arrive at a solution. They assume that the
brightness is smoothly varying over the image, and use a least-squares minimization to solve for the flow velocity. This generates a system of two equations involving ε_b from Equation 2.4 above and the smoothness constraint ε_s given in Equation 2.5 below. This system of equations is then solved for the velocity vector.
Smoothness constraint:

ε_s² = (∂u/∂x)² + (∂u/∂y)² + (∂v/∂x)² + (∂v/∂y)²    (2.5)

This leads to the total error

ε² = ∫∫ (ε_b² + α² ε_s²) dx dy    (2.6)

where α is a weighting factor which helps mitigate noise effects, especially in areas
where the intensity gradient is small. This total error ε is minimized to solve for
the velocity components u and v. The solution can be found by utilizing iteration
techniques in order to simplify the computation. The combination of the global constraint on smoothness and the iteration techniques used in implementation generate a
'fill-in' effect for the resulting optical flow output [21]. In regions where the brightness
gradient is small, the local velocities are propagated into the interior of the constant
region. This can be useful to recover motion vectors for a smooth, translating object.
However, this filling-in effect also illustrates some of the restrictions on using the
global smoothness constraint, as rapidly changing velocities will be absorbed into the
smoothing function as well and possibly lost in the final output. Still, for appropriate
input images satisfying the assumptions used in this method, the optical flow output
can be very accurate. Please see [21] for more specifics on the Horn and Schunck
optical flow method.
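For illustration only, a minimal sketch of the classical iterative update that results from minimizing Equation 2.6 is given below; it is not the implementation referenced in [21], and the derivative kernels, alpha, and the iteration count are assumed example choices.

    % Minimal Horn-Schunck sketch for two grayscale frames I1, I2 (same size).
    % alpha and numIter are illustrative parameters.
    function [u, v] = horn_schunck(I1, I2, alpha, numIter)
        I1 = double(I1); I2 = double(I2);
        % Simple derivative estimates averaged over the two frames
        Ix = conv2(I1, [-1 1; -1 1]/4, 'same') + conv2(I2, [-1 1; -1 1]/4, 'same');
        Iy = conv2(I1, [-1 -1; 1 1]/4, 'same') + conv2(I2, [-1 -1; 1 1]/4, 'same');
        It = conv2(I2 - I1, ones(2)/4, 'same');
        avgKer = [1 2 1; 2 0 2; 1 2 1]/12;            % local-average kernel
        u = zeros(size(I1)); v = zeros(size(I1));
        for k = 1:numIter
            uAvg = conv2(u, avgKer, 'same');
            vAvg = conv2(v, avgKer, 'same');
            % Update derived from minimizing eb^2 + alpha^2*es^2 (Equation 2.6)
            num = Ix.*uAvg + Iy.*vAvg + It;
            den = alpha^2 + Ix.^2 + Iy.^2;
            u = uAvg - Ix.*num./den;
            v = vAvg - Iy.*num./den;
        end
    end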
2.1.2
Lucas and Kanade
Lucas and Kanade used a similar approach to that of Horn and Schunck to calculate
optical flow velocities. However, instead of applying a global constraint and minimizing the error over the whole image, Lucas and Kanade chose to apply their constraint
to local regions Ω, using a window function W(x). The assumption used here to
solve the aperture problem is that within the region Ω, the velocity vector is constant
[2], [8]. Therefore, minimizing with a weighted least-squares solution, the Lucas and
Kanade equation can be written:
Σ_{x∈Ω} W²(x) [I_x u + I_y v + I_t]²    (2.7)
In most cases, the localized method of Lucas and Kanade is considerably more
accurate than the global method of Horn and Schunck [2], [17]. Additionally, Lucas
and Kanade provide a method for evaluating the accuracy of the velocity flow at each
estimate. Eliminating the poor estimates contributes to greater overall accuracy,
although it also decreases the density of the optical flow output [2]. Also, due to the
windowing process of Lucas and Kanade, there is no gradient information for areas
where the brightness is constant, and the velocity flow at that location cannot be
determined. In contrast, Horn and Schunck's global smoothness approach is successful
at providing a flow vector in those regions due to the fill-in effect. Therefore, both
methods have practical usage depending on the image sequence. Work has also been
done to combine these methods to exploit the strengths of each. These combination
methods, such as the one described in [8], can mitigate the adverse effects from
various differential systems by integrating components of both, to generate a dense,
accurate optical flow field.
Many other methods have been derived starting from
the brightness-constancy assumption. These methods include affine-motion models,
several 2nd order differential approaches, as well as region-based matching (described
in section 1.1) and energy-based methods. However, these methods are generally less
effective than those previously described [2], [17]. Additionally, the basic methods
have been used much more frequently in comparison studies for new methods, and
therefore provide a better basis for evaluation.
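For reference, a sketch of the weighted least-squares solve of Equation 2.7 for a single window is shown below; the variable names and window handling are assumptions for illustration, not the exact formulation of Lucas and Kanade.

    % Solve Equation 2.7 for one window (illustrative). Ix, Iy, It hold the
    % derivative values inside the window and W holds the window weights.
    function vel = lk_window(Ix, Iy, It, W)
        w2 = W(:).^2;
        A  = [Ix(:), Iy(:)];                % spatial gradient per pixel
        b  = -It(:);                        % negative temporal derivative
        AtWA = A' * (A .* [w2 w2]);         % 2x2 normal-equation matrix
        AtWb = A' * (b .* w2);
        vel  = AtWA \ AtWb;                 % [u; v] for this window
    end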
2.2
Phase-Based Methods: Fleet and Jepson
Another approach to the optical flow calculations, based on phase gradients, was
developed by Fleet and Jepson [13]. This approach assumes phase-constancy. This
method is similar in derivation to those using the brightness assumption, and the
approach is also similar in concept to frequency-based methods [13]. However, Fleet
and Jepson demonstrated the robustness of phase gradients with respect to noise and
variations in brightness [14]. This stability indicates that phase-based methods should
perform better than local differential ones when the basic assumption of constant
brightness is invalid.
The phase gradients used in Fleet and Jepson's approach are developed by convolving the image with a set of complex Gabor filters. Each filter within the set is
tuned to a specific frequency and orientation. A component velocity in the direction
normal to the filter orientation can be determined for each filter tuning. The use of
the normal direction solves the aperture problem for this approach. The combination
of the component velocities for all filters generates a final 2D velocity. The calculation
for Fleet and Jepson's phase-based optical flow method is explained below.
Component Velocity
Generating a component velocity corresponding to a given filter starts by convolving the
image with a complex Gabor filter to generate a complex output R(x, t). This complex
filter Gabor(x,t) is composed of a sinusoid modified by a Gaussian G(x, t), such that
Gabor(x, t) = e^{j(x,t)·(f₀, w₀)} G(x, t; C)    (2.8)
where the values fo and wo represent the frequency and orientation of the filter,
respectively. Section 4.1.1 will describe the Gabor filter in greater detail.
The output is the result of the convolution of the image with the complex filter
given above.
R(x, t) = I(x, t) * Gabor(x,t)
(2.9)
The complex output R(x, t) can be written in terms of its magnitude and phase,
R(x, t) = ρ(x, t)e^{jφ(x,t)}. The magnitude component ρ is subject to fluctuations in
brightness, and therefore the phase φ is used to track motion over time in the image.
The assumption of phase-constancy leads to the initial equation:
φ(x, t) = c    (2.10)

where c ∈ ℝ.
Taking the derivative with respect to time, Equation 2.10 becomes:
φ_x u + φ_y v + φ_t = 0    (2.11)
which is analogous to the gradient constraint equation 2.4 given above. Appendix B
demonstrates one way to calculate the phase gradient from a discrete input.
The normal to the gradient is defined as:

n(x, t) = ∇φ(x, t) / ‖∇φ(x, t)‖    (2.12)

where ∇φ(x, t) = (φ_x, φ_y). Writing the velocity in the direction of the normal as:

v_n = α n(x, t)    (2.13)

solves the aperture problem. The value of the speed coefficient α can be derived from
the phase gradient constraint: substituting Equation 2.13 into Equation 2.11 gives α‖∇φ(x, t)‖ + φ_t = 0, so α = −φ_t / ‖∇φ(x, t)‖.
The results given in [13] indicate that the component velocities vn provide an
accurate measure of the flow in the direction of the corresponding filter for a variety
of image sequences, with the error measure being relative to the normal. The use of
component velocities also allows multiple motions to be estimated at a given spatial
location [12]. Multiple velocities can occur at places like an occlusion boundary, or
where there are transparent objects present [9], and accounting for these conditions
can greatly increase overall accuracy.
2D Velocity
To construct the 2D velocity from the component velocities, Fleet and Jepson applied
a least-squares solution to local estimates, which is further explained in Appendix C.
However, the use of local estimates assumes smoothness of the optical flow field, which
is one issue inherent in brightness-constancy methods that the use of phase was meant
to circumvent. This implicit assumption of smoothness led to increased error around
regions such as occlusion boundaries. Using this 2D velocity field also eliminates the
capability of representing multiple motions [12].
Despite these inefficiencies, the 2D velocity derived from the phase-based approach
still performed well, achieving similar results to those of Lucas and Kanade [2]. Like
Lucas and Kanade, Fleet and Jepson provide accuracy measurements that can eliminate invalid velocity estimates to improve the performance. Additionally, Fleet and
Jepson did work to demonstrate the robustness of phase-based methods to variations
in brightness in [14], indicating that an improved method for developing the 2D velocity from the component velocities could further increase the accuracy of the optical
flow output. After Fleet and Jepson's work on phase-based optical flow, various other
phase-based methods were implemented. One of these implementations uses neural
networks to combine the component velocities into a single resulting velocity flow
[18]. Another implementation uses phase techniques to generate edge maps, which
are then tracked to determine velocity estimates at those edge points [2]. Work has
also been done to increase the computational efficiency of the phase-based method,
which is generally quite computationally expensive [6]. Many of these methods claim
similar accuracy to that of the Fleet and Jepson method but have not been included
in large objective studies, so it is difficult to make an objective comparison.
2.3
Method Analysis and Evaluation
Evaluating each optical flow method presents a problem, as there are no formally standardized methods of comparison. A few studies have compared different optical flow
methods, such as in [2] and [17]. These studies have focused more on the brightnessconstancy methods than the phase-based methods. Additionally, even within these
studies, there are some difficulties with the analysis. This section describes a few of
the issues that arise when testing the performance of a given optical flow method.
2.3.1
Image Sequences
One issue that arises is in the generation of the "true" flow field. Natural images do
not provide a corresponding flow field, and it is difficult to create a field to match a
natural image for any but the simplest of motions. Ray-tracing programs are used
to approximate the ground-truth optical flow, but their capabilities can limit the
complexity of the scene. For example, the ray-tracing program used in [11] does not
take into account the motion of the shadows. Shadow effects are a cause for both
illumination changes and multiple motions [12]. Neglecting these effects can lead to
variation between the ground-truth and the calculated motion fields.
Synthetic images are therefore frequently used to evaluate the success of a given
optical flow method. The synthetic generation of the image allows for the creation
of a flow field to match the motion in the image. Even these synthetic images tend
to be simplified, containing broad regions of similar motion patterns. For example,
in the comparison study by Barron [2], each test sequence analyzed a specific type
of motion, such as translational or dilating motion. These broad motion fields often
contribute to better optical flow results, as assumptions such as brightness-constancy
are more likely to hold than in a more complex motion pattern.
However, synthetically created images are not entirely reliable, either. The Yosemite
sequence, one frame of which is shown below alongside its ground-truth flow field, in
particular has been the subject of much debate, due to the motion in the cloud region.

Figure 2-1: Yosemite: Frame #9 and Ground-Truth Optical Flow Field

Professor Black from Brown University claims, "There is no ground truth for the cloud motion" [4]. If this is the case, there is no way to measure optical flow accuracy in that region, and therefore the region should not be used to evaluate error.
Removing the cloud motion from the scene simplifies the flow field significantly by
eliminating the occlusion boundary between the sky and the land [1]. However, since
the image sequence is commonly used in the field, it remains useful for comparison
among optical flow techniques. Hopefully more intricate synthetic sequences will be
designed and standardized in the future.
2.3.2
Parameter Choice
The parameter choice for each algorithm provides another point of contention in
reviewing optical flow results. For most optical flow methods, there are adjustable
parameter choices made in the implementation. These parameters include the size
or shape of a window kernel or filter, as well as the thresholds for discarded invalid
velocity values. However, since there is no predetermined optimal choice for those
parameters, the selection of parameters for a given test can have a large impact on
the results. In many papers, the reasons for using the parameters chosen are not
given, and it is unclear that the parameter choice was optimal. Experimentally, it
can be shown that there are different optimal parameter choices depending on the
image sequence, as well. Work done in [7] attempted to define a set of guidelines
to assist in parameter selection for gradient-based optical flow methods, as well as
demonstrating that results can be significantly improved with different parameter
settings. However, this thesis will use experimentally determined optimal parameters
tested over a variety of image sequences.
Chapter 3
Relevant Work on Deblurring
Deblurring is a method of image restoration aimed at sharpening edges and bringing
out details in the image. A point spread function (PSF) is a mathematical descriptor of the way the blurring occurred. The accuracy of the modelled PSF is highly
correlated to the capability of restoring the image. The following sections give a
mathematical description of the blurring problem for motion and propose solutions
for restoring the image assuming the PSF is known. In this chapter, as for the rest of
the thesis, the common notation a(x) denotes the spatial function and A(w) denotes
its corresponding frequency domain function. The Fourier transform property of convolution will be used. This property states that a(x) * b(x) in the spatial domain
becomes A(w)B(w) in the frequency domain and a(x)b(x) in the spatial domain
becomes A(w) * B(w) in the frequency domain.
3.1
Motion Blur as Convolution
To model motion blur in one dimension, under the assumption that the sensors are
linearly receptive to light, we can write that the response to a specific point of light
with no motion at point x is
u(x) = μ ∫₀^T I dt    (3.1)

where I is the intensity of the light, T is the exposure time of the pixel to the light,
and μ is the sensitivity of the sensor to light. This equation shows that the response
of the sensor to a static light source is simply the total amount of light received at
the sensor, scaled by the sensitivity of the sensor. Under the assumption that the
intensity of the light is constant during the exposure time T, this equation becomes
u(x) = μIT    (3.2)
Extending this concept for moving light sources, the response of a pixel to the
moving light source is dependent on the amount of time the pixel is receiving light
from said source. If ℓ is the size of the sensor, and the velocity is constant with
v = L/T, the light will travel across N pixels, such that N = L/ℓ. Therefore,
for a single point of light on one sensor n, the effective exposure time of the pixel is
t_n = T × ℓ/L. This relationship leads to the equation below:

k_n = μ ∫₀^{t_n} I dt    (3.3)
This equation describes the total amount of light received at pixel n from one point
of light. Assuming uniform light intensity during the exposure time, this equation is
also the same as stating that the light received at the pixel n is the initial response
u(x) scaled by the fraction of time that the light source affected the pixel, or the
value ℓ/L.

k_n = μI × t_n = (μI × T) × ℓ/L = u(x) × ℓ/L    (3.4)

Therefore the total response u(x) is equivalent to the sum of the responses at each
sensor along the light path, Σ_{n=1}^{N} k_n [10], [24].
This response can be extended into two dimensions and be written as a convolution
of the ideal image y(x) with the blurring kernel k(x) [22], [10], [24].
b(x) = y(x) * k(x)
(3.5)
Using a block diagram to indicate the convolution process, the blurring function can
be shown as:
y(x) → k(x) → b(x)

The process of deconvolution is therefore determining the estimate ŷ(x) by filtering
the blurred image with some filter h(x):

b(x) → h(x) → ŷ(x)

The estimated image is then:

ŷ(x) = b(x) * h(x)    (3.6)

Converting into the frequency domain, this becomes

Ŷ(w) = B(w)H(w)

After substituting B(w) = Y(w)K(w) from Equation 3.5, the estimated image in
frequency is:

Ŷ(w) = Y(w)K(w)H(w)    (3.7)

To recover Y(w) from Ŷ(w), a simple solution to this deconvolution problem sets the
filter H(w) equivalent to 1/K(w).
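A small sketch of this forward model is given below, assuming MATLAB's Image Processing Toolbox; the test image and the 21-pixel, 30-degree kernel are arbitrary example choices, not values from this thesis.

    % Simulate motion blur and verify the convolution property B(w) = Y(w)K(w).
    y = im2double(imread('cameraman.tif'));    % any grayscale test image
    k = fspecial('motion', 21, 30);            % example motion-blur PSF k(x)
    b = imfilter(y, k, 'conv', 'circular');    % b(x) = y(x) * k(x)
    K = psf2otf(k, size(y));                   % K(w) on the image grid
    residual = max(max(abs(fft2(b) - fft2(y).*K)));  % ~0 up to round-off

Note that K(w) may be very small or zero at some frequencies, which is exactly the noise-sensitivity problem addressed in the next section.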
3.2
Noise
This method of inverse filtering for deconvolution is highly noise-sensitive. This sensitivity is because the blurring function tends to exhibit a low-pass filter effect, with
small values at high frequencies. When the inverse is taken, the small values become
large, amplifying error that exists at those frequencies [10], [24]. These frequency regions where the signal is small indicate that the signal-to-noise ratio is also very low,
and therefore these regions will tend to amplify mostly noise [22]. Figure 3-1 shows
an example low-pass filter PSF and its Fourier transform K. Figure 3-2 shows the
Figure 3-1: Example PSF and its Fourier Transform.
inverse filter H and its inverse Fourier transform, the deconvolution kernel h. Note
that the scale for H shoots up to 10^11.
Figure 3-2: Inverse Filter and Restoring Function.
This noise amplification effect can be demonstrated mathematically by including
the noise in the system model. Clearly, the type of noise modeled will affect the
deconvolution solution.
When the system includes some additive noise v(x), which could be contributed
by quantization [23], for example, the block diagram becomes
y(x) → k(x) → (+) → b(x)
               ↑
              v(x)

where the blurring function b(x) and the recovered image Ŷ(w), given below, include
the additive noise.

b(x) = y(x) * k(x) + v(x)    (3.8)

Ŷ(w) = Y(w)K(w)H(w) + V(w)H(w)    (3.9)

The result after applying the inverse filter H(w) = 1/K(w) becomes

Ŷ(w) = Y(w) + V(w)/K(w)    (3.10)
after inserting the new equation for B(w), corresponding to Equation 3.8. This
expression mathematically explains the noise amplification described above.
One method of mitigating this problem of noise is to apply a threshold to cut off
the large high-frequency values. The effect of a simple threshold is shown in Figure
3-3.
I
Im 1400
-low
as
13
4
-4
-2
0
2
4
s
Figure 3-3: Inverse Filter with Threshold and Corresponding Deconvolution Kernel.
It is easy to see that the kernel resulting from the thresholded H is much smoother,
and therefore less likely to amplify high-frequency noise, although this smoothing also
affects the accuracy of the resulting deconvolution kernel.
One form of a threshold can be written using a regularization term r(w), such
that
H(w) = K*(w) / (|K(w)|² + r(w))    (3.11)
For r(w) << |K(w)|², this filter approaches the ideal inverse filter. However,
when |K(w)|² is small, the regularization factor r(w) helps to reduce the magnitude
of H(w) and prevent the domination of the noise term. Choosing the regularization
factor is important. The Wiener filter for deconvolution is a special case of regularization, in which the filter r(w) is the squared noise-to-signal ratio [10]. However,
calculating the noise-to-signal ratio can also present a problem, as the power spectrum
is not always easy to estimate.
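A minimal sketch of Equation 3.11 applied in the frequency domain, continuing the blur example above, is shown below; the constant value of r is an illustrative choice only.

    % Regularized inverse filtering (Equation 3.11) with a constant r(w).
    r = 1e-2;                                  % illustrative regularization term
    K = psf2otf(k, size(b));                   % K(w) for the known PSF k
    H = conj(K) ./ (abs(K).^2 + r);            % H(w) = K*(w) / (|K(w)|^2 + r(w))
    yhat = real(ifft2(fft2(b) .* H));          % deblurred estimate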
There are many variations on regularization filtering to improve its functionality.
One major concept, used in many image processing applications in addition to Wiener
filtering, is the idea of making the filter adaptable. Adaptive processes alter the filter
for local signal properties to generate a more accurate result [24]. For example, the
Wiener filter requires a spectral estimate of the signal and its noise properties. In
solid background regions, the signal-to-noise characteristics will differ greatly from
the characteristics in densely detailed regions of the image, and generating separate
statistics for various portions of the image can help generate a more accurate PSF
at each point. Additionally, deblurring operations can be iterated to increase the
accuracy of the output. Other methods, such as the Lucy-Richardson algorithm, use
iterated solutions to generate the deblurred image.
3.3
Blind Deconvolution
Blind deconvolution is a method of deblurring the image when the blurring kernel is
unknown. In contrast, all of the methods described above require that the blurring
kernel is known. The blind deconvolution process involves applying constraints to the
system in order to solve for the original image y(x). Information about the imaging
system, as well as properties of the signal itself, can be used to provide constraints
[23]. Assumptions concerning the original signal or the PSF can also be used to solve
the blind deconvolution problem [24]. Once the constraints have been determined,
the system is minimized to find the solution that most closely matches the blurred
result. The MATLAB deconvblind function uses a maximum-likelihood approach to
minimize the error between the restored image convolved with the predicted PSF and
the initial blurred image. This function assumes a set PSF size, which is entered
by the user in calling the function. This thesis uses the deconvblind function for
comparison with the deblurring results.
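A typical call looks like the sketch below, using the blurred image b from the earlier example; the 15-by-15 uniform initial PSF is only a guess at the support size, not the value used in this thesis.

    % Blind deconvolution with MATLAB's deconvblind (Image Processing Toolbox).
    initPSF = ones(15, 15) / 225;               % assumed PSF support, uniform guess
    [yBlind, psfEst] = deconvblind(b, initPSF); % jointly estimates image and PSF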
Chapter 4
Approach
This thesis focuses on correcting motion blur using optical flow velocity fields. All
of the image sequences used for testing were in grayscale. The optical flow and
deblurring portions were developed separately and then combined at the end.
4.1
Optical Flow
The optical flow method used closely follows that of Fleet and Jepson described in
[13]. Phase-based methods were chosen because of their robustness with respect to
changing illumination and deformations [14], [6]. This section describes the specific
implementation developed in this thesis.
4.1.1
Gabor Filters
In Section 2.2, the Gabor filter used for processing the image was briefly discussed.
To review, this filter is a sinusoid modified by a 3-dimensional Gaussian envelope with
covariance C.
Gabor(x, t) = e^{j(x,t)·(f₀, w₀)} G(x, t; C)    (4.1)
The values fo and w0 represent the frequency and orientation of the filter, respectively.
For separability, which will speed computation, it is convenient to use a zero-mean
Gaussian that is separable in 3 dimensions, each dimension with a standard deviation
of σ, so that the covariance matrix C becomes

C = | σ²  0   0  |
    | 0   σ²  0  |
    | 0   0   σ² |
The covariance argument to the Gaussian will now be referred to as the corresponding
σ. In the frequency domain, G(x, t; σ) transforms into a Gaussian with standard deviation σ_k equal to
1/σ. The exponential component represents a frequency shift, and the resulting filter
is a Gaussian of width σ_k located at (f₀, w₀) in the frequency plane. The value of σ was
determined as in [13], with an octave bandwidth β such that

σ_k = f₀ (2^β − 1) / (2^β + 1)    (4.2)
in frequency, and σ = 1/σ_k. Figure 4-1 presents an example Gabor filter, showing
both the time and frequency domains on the left and right, respectively. For this
filter, f₀ = 0.15, β = 0.8, and w₀ = 45°. For both domains, the center of the image is
set to the coordinates (0, 0). The Fourier transform is shown from −π to π in both
the x and y directions.

Figure 4-1: Real Parts of Gabor Filter and its Fourier Transform
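A minimal 2D spatial sketch of such a filter is given below; the thesis uses a separable 3D spatiotemporal version, and f0 (treated here as radians per pixel), theta, and beta are just the example values quoted for Figure 4-1.

    % Construct a 2D complex Gabor filter (spatial slice, illustrative only).
    f0 = 0.15; theta = 45*pi/180; beta = 0.8;   % example tuning from Figure 4-1
    sigk  = f0*(2^beta - 1)/(2^beta + 1);       % frequency-domain std (Eq. 4.2)
    sigma = 1/sigk;                             % space-domain std
    [x, y] = meshgrid(-ceil(3*sigma):ceil(3*sigma));
    G  = exp(-(x.^2 + y.^2)/(2*sigma^2)) / (2*pi*sigma^2);  % Gaussian envelope
    fx = f0*cos(theta); fy = f0*sin(theta);     % spatial frequency components
    gab = exp(1j*(fx*x + fy*y)) .* G;           % complex Gabor (Eq. 4.1, 2D)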
4.1.2
Component Velocities
Using the basic equations from Section 2.2, we can derive a simple method for explicitly calculating the component velocities from the phase φ(x, t) of the filtered image
R(x, t). The basic equations are given again below:

φ_x u + φ_y v + φ_t = 0    (4.3)

n(x, t) = ∇φ(x, t) / ‖∇φ(x, t)‖    (4.4)

v_n = α n(x, t)    (4.5)

After simplification, shown explicitly in Appendix A, the final equations for the
velocity components u_n and v_n are:

u_n = −φ_t φ_x / (φ_x² + φ_y²)    (4.6)

v_n = −φ_t φ_y / (φ_x² + φ_y²)    (4.7)

The derivatives of φ, namely φ_x, φ_y, and φ_t, can be determined using the phase identity

∇φ(x, t) = Im[R*(x, t) ∇R(x, t)] / ρ²(x, t)    (4.8)

derived in Appendix B, and therefore the component velocities u_n and v_n can be
determined.
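A sketch of these steps for a single filter output is given below; it assumes R is the complex spatiotemporal filter response and uses MATLAB's gradient in place of the thesis's specific derivative calculation, so the result is illustrative only.

    % Component velocities from one complex filter output R(y, x, t).
    [Rx, Ry, Rt] = gradient(R);                 % discrete derivatives of R
    p2   = abs(R).^2;                           % rho^2(x, t)
    phix = imag(conj(R).*Rx) ./ p2;             % phase identity (Eq. 4.8)
    phiy = imag(conj(R).*Ry) ./ p2;
    phit = imag(conj(R).*Rt) ./ p2;
    den  = phix.^2 + phiy.^2;
    un   = -phit.*phix ./ den;                  % Eq. 4.6
    vn   = -phit.*phiy ./ den;                  % Eq. 4.7
    % In flat regions p2 and den approach zero; such estimates must be
    % screened out, as discussed for invalid velocities in the text.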
4.1.3
2D Velocity
The 2D velocity was calculated differently than the method of Fleet and Jepson in
[13]. As was mentioned in Section 2.2, Fleet and Jepson used a method of leastsquares over the local region, which implied an assumption of a smooth surface.
They acknowledged that this solution required further work [12], as this assumption
subjects the 2D velocity to the same constraints as the brightness-constancy method.
In this thesis, the component velocities were combined using a normalization process across all filters, eliminating those velocities that fell outside a normalized distance of 2 from the mean, assuming a standard deviation of 1.5. The values
of standard deviation and distance from the mean were determined experimentally,
to generate accurate results while maintaining adequate density in the final velocity
field.
In order to completely cover the frequency space, the filter set was designed to
cover 360 degrees. The filters were tuned to frequencies 2σ apart, so that each filter
spanned a radius of σ. The frequency extent of the filter was limited to one σ. Due
to the inability to simultaneously localize in the space and time domains [20], [25],
this frequency restriction causes spreading in the space domain. However, limiting
the frequency range of the filter reduces the error for the corresponding component
velocity and is used as one of the measures for eliminating invalid velocity values in
[13]. The choice of a distance of σ between filters was used because it sets a narrow
frequency band while limiting the spreading in the space domain.
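One way to read the combination step described above is sketched below for a single pixel; the outlier cutoff and assumed standard deviation mirror the values quoted, but the exact normalization used in the thesis code (Appendix D) may differ.

    % Combine component velocities at one pixel (illustrative reading).
    % compvels is an N-by-2 list of [un vn] values from the N filters.
    mu   = mean(compvels, 1);                   % mean component velocity
    d    = sqrt(sum((compvels - repmat(mu, size(compvels,1), 1)).^2, 2));
    keep = (d / 1.5) <= 2;                      % assumed std of 1.5, cutoff of 2
    v2d  = mean(compvels(keep, :), 1);          % final 2D velocity estimate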
4.1.4
Error Calculation
Assuming the ground-truth flow field is known, there are multiple methods of calculating the error between the ground-truth flow field and the output field of the
optical flow method being tested. One of the error calculations used in this thesis is
the angular error measure described in [12], given below in Equation 4.9. This error
measure was also used in the comparison study performed by Barron et al. [2]. Since
many newer methods reference the Barron study frequently, using the angular error
measure allows the results to be analyzed in reference to the previous work.
Using the notation c for the correct velocity vector and e for the velocity estimate,
the angular error measure is given by:
ψ_E = arccos(ĉ · ê)    (4.9)
The second error measure is a simple magnitude measure, also used in [17].
E_mag = ‖c − e‖    (4.10)
These two error measures completely specify the vector by including the direction and magnitude. Additionally, these error measures are useful because of their
compatibility with the fspecial function used in the deblurring process that will be
described in Section 4.2.
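In code, the two measures reduce to a few lines; c and e below are 2D velocity vectors, and the clamp on the dot product is only there to guard against round-off.

    % Angular (Eq. 4.9) and magnitude (Eq. 4.10) error between c and e.
    chat = c / norm(c);  ehat = e / norm(e);
    angErr = acos(min(max(dot(chat, ehat), -1), 1));  % radians
    magErr = norm(c - e);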
4.2
Deblurring
A regularization filter, as described in Section 3.2, was used on each point in an image
to deblur it. The filter is based on the estimated motion blur and an experimentally
determined value to adjust the extent of its effect on the image. The motion blur
estimate is produced by the MATLAB fspecial function with the 'motion' argument. This section provides further details on the implementation of the filter and
its application to the image.
4.2.1
Regularization Term r(w)
The regularization term r(w) is a constant that was experimentally determined to
achieve adequate image sharpening while maintaining reasonable magnitudes. The
use of a constant effectively sets a bound on the magnitude of the filter at all frequencies. The term r(w) is frequency-independent because signal and noise properties are
assumed to be unknown in this thesis. A frequency-dependent term, such as the one
used in Wiener filtering, could be applied to images where information on signal and
noise properties is available.
4.2.2
Generating the Filter H(w)
To create the regularization filter H(w), MATLAB's fspecial function was used with
the 'motion' argument, which simulates the effect of camera motion.
The other two arguments to this function are an angle θ and a length L. The values of these
arguments were determined for each point on the image using the velocity vector v for
that point. For a given point, the angle value θ was determined using the four-quadrant
arctangent of v, while the length L corresponded to the magnitude of v. The output
of the fspecial function using θ and L corresponds to the motion blur kernel k(x) for
the input image.
The filter H(w) was then computed directly from K(w), the Fourier transform of
the PSF, and r(w), using the regularization filter equation 3.11 repeated below.
H(w) = K*(w) / (|K(w)|² + r(w))    (4.11)
Discretization affects the calculation and processing of the deblurring filter. The
conversion of the PSF into the deblurring filter occurs in the frequency domain and
the resolution of the discrete Fourier transform is limited to the size of the function
in either dimension. In addition, the calculation of the PSF requires an angle and a
length. The discretization of the pixels means that the angles are fit to the closest
approximation, and that the lengths are rounded to integer values. As the lengths of
the vectors increase, there are more pixels available to achieve a more accurate angle
approximation, as well as to reduce the effect of rounding to the nearest integer.
Therefore the value of L passed into the fspecial function was a scaled value of the
original magnitude. The image was interpolated a corresponding amount prior to
filtering to maintain the relative velocity-per-pixel ratio. Interpolation increases the
computational power needed but helps to minimize error due to discretization.
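A sketch of the per-pixel filter construction is given below; the working window size, the interpolation scale, and r are illustrative assumptions, and fspecial/psf2otf are Image Processing Toolbox functions.

    % Build the deblurring filter for one pixel from its velocity v = [vx vy].
    scale = 4; r = 1e-2; winSize = [31 31];      % illustrative choices
    len = max(1, round(scale * norm(v)));        % scaled, integer length L
    ang = atan2(v(2), v(1)) * 180/pi;            % four-quadrant angle in degrees
    k   = fspecial('motion', len, ang);          % motion PSF (assumed to fit winSize)
    K   = psf2otf(k, winSize);                   % K(w) on the working window
    H   = conj(K) ./ (abs(K).^2 + r);            % Eq. 4.11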
4.2.3
Applying the Deblurring Filter H(w)
Filtering the image occurred in the space domain, due to the space/frequency simultaneous localization problem.
To maintain localization in the space domain, the
degraded image was processed by the inverse Fourier transform signal h(x) of H(w).
Because convolution is the inverse process of multiplication when switching between
domains, this would normally increase the computation significantly. However, since
the convolution is only being calculated at a single point for each velocity vector
[PSF], the computational increase is small. The signal h(x) is flipped over both axes,
aligned with the interpolated image, and the two signals are multiplied. The sum of
the results of this multiplication is equivalent to the convolution of the two signals at
the point in question, and this value becomes the value of the output image at that
point.
Conveniently, the error measures applied in the optical flow component (see Section 4.1.4) are also in the form of an angle and a length, and therefore the amount of
error generated in the optical flow portion of the processing should correlate to the
error in the deblurring portion of the processing.
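Continuing the previous sketch, evaluating the output at a single pixel can be written as below; bInterp is the interpolated blurred image, out is the preallocated output image, and (r0, c0) is the pixel of interest, all assumed names for illustration.

    % Convolution of bInterp with h evaluated only at pixel (r0, c0).
    h  = real(otf2psf(H, winSize));              % spatial kernel for this pixel
    hw = (winSize - 1) / 2;                      % half-width (winSize is odd)
    patch = bInterp(r0-hw(1):r0+hw(1), c0-hw(2):c0+hw(2));
    out(r0, c0) = sum(sum(rot90(h, 2) .* patch));% flipped kernel times neighborhood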
4.2.4
Measuring the Deblurring Error
Specifically calculating an error value for the deblurring process is difficult. Therefore, the margin of error is determined subjectively, by visually comparing the reconstructed image to its original. The specific points of interest in the comparison were
image sharpness and detail.
Additionally, the output image was compared to the results of a blind deconvolution, as well as the results of a Lucy-Richardson (LR) filter using the average PSF of the image.
Chapter 5
Results
This section evaluates the performance of the system. An explanation of the testing
procedures applied is given first, and the results of the tests follow. The optical
flow and deblurring components are initially analyzed separately, and at the end the
combined results are given.
5.1
Optical Flow
This section gives the experimentation and analysis for the optical flow component
of this thesis.
5.1.1
Optical Flow Test Sequences
The test sequences in this thesis include the Yosemite sequence mentioned earlier, as
well as some other synthetic images. These synthetic sequences were retrieved from
[11], where the ground-truth was calculated using a ray tracing program. For areas
where a section of an image remains constant in intensity despite the motion, the
ray tracer returns a value of approximately zero for the velocity. This is noticeable
where large, opaque, and smooth objects translate across the image; the motion of
the object's interior is not captured. Additionally, in regions where the magnitude of
the velocity approaches zero, it is possible to get large angular errors, as the angle
of a vector with zero magnitude has no practical meaning. Therefore, these regions
with near zero velocities will be masked out for error calculations.
Another important point to note is that the given ground truth is only relevant for
that specific frame of the test sequence. In cases where the direction of motion changed
between frames, succeeding frames of the same sequence have different ground-truths.
The values used to calculate error consisted of an average over the ground-truth values
from all of the frames used in the test.
In the series of figures below, a frame of a given test sequence will be presented along with its corresponding ground-truth optical flow field and a description of the motion. The optical flow image shown is subsampled for ease of viewing and represents the relative, rather than the absolute, magnitudes of each vector.
The Yosemite sequence is the one described in Section 2.3.1. This sequence contains
a landscape which includes dynamically changing clouds on the top. The motion in
the sequence is that of a camera approaching the scene, with the focus of expansion
towards the middle on the right side. The sky component presents motion generally
to the right, but with other motions occurring within the clouds themselves.
Figure 5-1: Yosemite
The Sphere sequence consists of a checkered sphere rotating in place over a constant background. The rotation axis of the sphere is tilted forward slightly, so that
the rotation of the far side can be seen.
Figure 5-2: Sphere
The Office sequence represents a situation in which the camera is approaching the
scene. The focus of the dilation is in the center of the image.
Figure 5-3: Office
The Street sequence contains a large constant background, with a car translating
across it. The motion of the car varies between the wheel rotation and the translation. The camera pans slowly towards the right as the scene progresses. The car is
partially occluded in the initial frames.
Figure 5-4: Street
The Blocks sequence illustrates rotational and dilational motion of the camera simultaneously. A group of blocks sit on a textured background, while the camera
moves in and around the scene. Unfortunately, the motion for this sequence varies
drastically between frames. This variation can be seen in Figure 5-5, where the three
flow fields shown (corresponding to the first, second, and third frames of the sequence)
are considerably different. The effects of this variation in motion will be discussed
with the results.
Figure 5-5: Blocks
In the Vcbox sequence, the camera approaches a labeled Visual C++ box, positioned on a textured background. However, the focus of the expansion shifts between
frames. This can be seen by observing the flow fields (again corresponding to the
first, second, and third frames of the sequence) in Figure 5-6. The area containing
the short velocity vectors represents the focus of expansion.
Figure 5-6: Vcbox
The Simple sequence contains two objects, each rotating in different directions. One
object is a checkered sphere, as in the Sphere sequence, but with a vertical axis of
rotation. The second object is a solid square close in color to that of the background. Additionally, the objects move into occluding positions during the sequence.
One drawback to this sequence is that the motion of the shadows is not given in
the ground truth, although the shadows clearly follow the translation of the objects.
Figure 5-7 shows the first and last frames along with their corresponding velocity
flow fields. Observe the object shadows in the images on the left, and the lack of a
corresponding velocity vector in the flow fields on the right.
Figure 5-7: Simple
In the Medium sequence, a car sits on a checkered floor. As the camera approaches
the car, the camera swings around to view the car side-on, combining dilational and
translational motion. This sequence raises a point of contention about whether the
car is in motion, since the location of the car on the checkered pattern changes. This
change in location causes the car to appear to move, as can be seen in the frames
shown in Figure 5-8. However, the ground-truth provided by the ray-tracing program
indicates that the floor beneath the car is in motion.
Figure 5-8: Medium
The Complex scene is viewed from the windshield of a car driving along a road,
passing buildings and trees on either side. A car pulls out of a side street, and another car approaches from the front.
Figure 5-9: Complex
5.1.2
Parameter Selection
Using the test sequences described above, a series of experiments was run to determine
the optimal choice of parameters for a given type of sequence. For the plots in this
section, frequency is given in radians, angular error is given in degrees, and the density
is given in percent coverage of the image. The values for β and the magnitude error
are scalars and have no units.
Frequency
The optimal operating frequency was determined experimentally. Frequencies between .02 and .15 were tested with β = .8 and using 3 temporally separated frames
of a given sequence. The results for the average angular error at each frequency are
given in the plot below, Figure 5-10.
Figure 5-10: Mean Angular Error vs. Frequency
The average angular error is mostly constant over the range of frequencies, trending higher at the extreme low and high frequencies. Most of the power in the image
is concentrated at low frequencies, and therefore it is difficult to separate out the
direction at those frequencies. At higher frequencies, noise becomes more prominent
among the values selected by the filter. Additionally, at extreme high frequencies,
implementation in a discrete environment becomes more difficult. Aliasing becomes
an issue in the frequency domain, and the effect of the aliasing in the time domain
is that the image must be interpolated in order to represent an adequate number of
pixels. In general, frequencies around fo = .06 appear to work well.
Meanwhile, the mean magnitude error remains constant across all frequencies.
This effect is shown in Figure 5-11. For the majority of images, the magnitude error
is less than approximately 3. However, for the Medium and Complex sequences, the magnitude error grows dramatically. This effect can be explained by inconsistencies in the magnitudes provided for these test sequences. In the Complex test sequence,
the velocities approach a magnitude of 150 pixels per frame. The image dimensions
are 300 x 400, and such high velocities would completely distort the image within
very few frames. Therefore, the magnitudes provided appear to be in error. In the
Medium sequence, the motion between the car and the ground could be disputed, as
explained in Section 5.1.1.
Figure 5-11: Mean Magnitude Error vs. Frequency
Effect of Interpolation
For the frequencies tested here, the results from interpolating the image in preprocessing are similar to the results without any interpolation. At higher frequencies,
interpolation would be necessary to avoid aliasing. The difference in average angular
error between the interpolated image and the original is plotted in Figure 5-12.

Figure 5-12: Mean Angular Error vs. Frequency; Difference between Original and Interpolated Tests

A positive difference indicates that the interpolated test for the corresponding sequence
performed worse than the original. A negative difference indicates that the interpolated test was preferred over the original. However, while the negative differences
are within a negligible two degrees of the original results, the positive differences
can contribute up to almost 18 degrees of additional error. The conclusion is therefore that overall, for these low frequencies, interpolation is unnecessary and may be
detrimental to the accuracy of the results. This outcome can be explained by the
side effects of interpolating. Interpolation assumes smooth changes between points
in order to estimate the value between the points. The interpolation smoothes the
gradients that the optical flow process uses to determine the flow, and therefore can
degrade the quality of the optical flow results.
Yosemite Cloud Region
As discussed in Section 2.3.1, there is conflict over the validity of the cloud region in
the Yosemite sequence. If this region is masked, the angular error drops approximately
10 degrees for all frequencies. The dashed line in Figure 5-13 represents the masked
version.
Figure 5-13: Yosemite Mean Angular Error, Clouds Masked
This effect can be observed qualitatively by viewing the angular error map of the
Yosemite sequence, presented below (Figure 5-14) for fo = .06 and β = .8. Bright
regions indicate higher angular errors; note the prevalence of these bright areas in the
sky.
Figure 5-14: Yosemite: Angular Error Map
Temporal Support
The testing above was run using 3 frames. Further testing was done to compare
the average error when 5 and 7 frames were used. In the plots in Figures 5-15 and
5-16, the solid, dashed and circled lines represent 3, 5 and 7 frames, respectively. The
horizontal axis for each of these images is fo, and the vertical axis is the mean angular
error in degrees.
As one can see from the results in Figures 5-15 and 5-16, the average angular
error generally remains constant regardless of the number of temporal images used
to generate the optical flow output. However, in most of these cases the flow was
approximately constant across all images in the sequence. In the images where the
direction of motion changed within the time period of the sequence tested, such as
the Blocks image, variation in the angular error for different numbers of images is
observed.
For these rapidly varying motion patterns, the use of more temporally separated
images allows for greater averaging of the flow in order to settle on the overall trend
of the motion and to determine the flow path. For the test cases, the correct flow
fields were also averaged, and therefore this result demonstrates that the optical flow
implementation determines the average motion over the image sequence provided.
However, for use in deblurring, the shorter image sequences should provide a better
representation of the specific motion from frame to frame.

Figure 5-15: Mean Angular Error vs. Temporal Support Size part 1 (Sphere, Office, Street, Blocks, Vcbox, Simple)

Figure 5-16: Mean Angular Error vs. Temporal Support Size part 2 (Medium, Complex, Yosemite)
This result indicates that for most cases, where the motion is relatively consistent
over the time period throughout which the image sequence was captured, a small number of images can be used. Using fewer images is beneficial in conserving operation
time. In cases where there is inconsistent motion between frames, the assumptions
of linearity made in the temporal constancy motion model are violated, and optical
flow is a poor solution.
Beta
The value of β is the final parameter to be determined. Values of β between .4 and 1.2 were tested, with fo = .06 and 3 temporal images. Figure 5-17 plots the results.
Figure 5-17: Mean Angular Error vs. β
These results show that β has little effect on the average angular error; the error is approximately constant across the plot for all images. The original value used for testing, β = .8, will be the final choice for this parameter. This value was also used in [13], which is convenient for comparison.
Densities
In all of the tests, the density of the optical flow result was greater than 99 %. This
result is shown in Figure 5-18. The scale on this figure is from 99.75 to 100 %. The
individual component velocities had lower densities, but the invalid values were generally at different locations for each filter tuning. Therefore, the combination into
the 2D velocity successfully produced a value for nearly all locations, only rejecting
values where less than 5/6 of the component velocities had valid results. Varying β generated similar density results.
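The combination step itself (the valid_vel routine referenced in Appendix D) is not listed in the appendix. The following is a minimal sketch, under the assumption that component velocities are averaged per pixel and rejected where fewer than 5/6 of the filter orientations returned a valid value; the function name and the omission of the normalization limits are illustrative only.

function [Vx, Vy] = combine_components(Vxn, Vyn, lmt)
% COMBINE_COMPONENTS  Sketch of averaging 4D component velocities
% (rows x cols x frames x filters) into a single 2D velocity per pixel.
% lmt is the minimum fraction of filters that must return a valid value
% (e.g. 5/6); pixels below this fraction are set to zero.
valid  = ~isnan(Vxn) & ~isnan(Vyn);     % which filter outputs are usable
nvalid = sum(valid, 4);                 % count of valid filters per pixel/frame
nfilts = size(Vxn, 4);

Vxn(~valid) = 0;  Vyn(~valid) = 0;      % so invalid entries do not bias the sum
Vx = sum(Vxn, 4) ./ max(nvalid, 1);     % average over valid filters
Vy = sum(Vyn, 4) ./ max(nvalid, 1);

reject = nvalid < lmt * nfilts;         % too few valid component velocities
Vx(reject) = 0;  Vy(reject) = 0;
end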
Figure 5-18: Density vs. Frequency
5.1.3
Evaluating Error
This section presents the error data and analysis for the selected parameters. These
parameters are fo = .06, β = .8, and a temporal support of 3 images. Table 5.1 gives
the mean errors for these parameters.
Image Sequence        MAE       MME
Sphere                5.4606    0.2937
Office               15.0868    0.3049
Street               10.0560    0.4921
Blocks               65.9089    0.7165
Vcbox                47.3316    3.1848
Simple                8.3123    1.6156
Medium               21.2902   14.5334
Complex              21.8977   15.3410
Yosemite             32.6134    1.4687
Yosemite (masked)    20.1809    1.7264

Table 5.1: Mean Error

The average angular error is largest in the Blocks and Vcbox images. The error
for these sequences is significantly higher than for any of the other sequences. These two sequences also exhibited the greatest frame-to-frame variation in their flow fields. This result implies that the accuracy of the optical flow calculation reflects the quality of the input sequence.
Flow Discontinuities
Viewing these results, the primary observation is that the regions around flow discontinuities contain the most error. This effect is noticeable in the Sphere and Office
sequences.
Figure 5-19: Discontinuity Error
In the former sequence, shown on the left of Figure 5-19, the largest error occurs
near the axis of rotation, where the neighboring velocity vectors are in nearly opposite directions. The Office sequence, on the right of Figure 5-19, illustrates this effect
near the focus of dilation where the velocity vectors point outwards from the center.
This effect can be partially explained by the combination of the component velocities
into a single 2D velocity. Each component velocity selects a flow tuned to a specific
frequency and orientation and covers a certain spatial area. Averaging the flows from
all orientations allows neighboring vectors of different orientations to influence each
other and generate error at the boundary.
Magnitude Error
The process of combining the component velocities into a single 2D velocity can
account for much of the magnitude error, as well. The magnitudes calculated by the
optical flow routine are generally flat across the image, and therefore the magnitude
error is approximately proportional to the magnitude of the flow field. Averaging the
component velocities meant that the filter orientations with weak responses mitigated
the responses of the filter orientations with strong responses. Figure 5-20 illustrates
this effect for the Sphere, Office, Street, and Yosemite sequences.
Shadow Regions
As expected, the shadowed regions of the Simple sequence are picked up by the optical
flow process. These regions were ignored by the ray tracer that created the groundtruth flow field and therefore appear as error in the result. Figure 5-21 shows the
magnitude of the flow fields, as well as that of the angular error. The magnitude
of the calculated flow is scaled so that the magnitude of the shadow is visible. The
calculated optical flow is reasonable because the shadow does create a noticeable
moving gradient, which can be seen in frames from the sequence shown in Figure 5-7.
Aside from the conflict with the ray tracer flow field, the shadows are significant
because of their transparency. Gradients covered by the shadows can still be computed
in the optical flow calculations, and these can lead to multiple motions. Although
this effect was not noticeable here, as the region beneath the shadows was static, the
generation of multiple motions from transparent shadowing could cause a lot of error
when combining the component velocities into the 2D velocity.
Figure 5-20: Magnitude Comparison (calculated magnitude, ground-truth magnitude, magnitude error)
Figure 5-21: Simple: Shadow Error (ground-truth flow, calculated flow, angular error map)
Handling of Different Motion Types
Overall, areas of unchanging gradient are handled well. This situation occurs in static
regions such as in the background of the Sphere sequence, and in regions of the Office
sequence such as in the window in the upper left corner. In the latter example, the
optical flow calculation agrees with the ray tracer in ignoring the internal motion.
For the purpose of deblurring, this version of the motion field should be acceptable
as long as the motion vector is shorter than the edge boundary created by the optical
flow. Essentially, as long as the motion of the constant gradient region is smaller than
the region captured in the optical flow calculation, the deblurring process should be
unaffected by ignoring the motion in that region.
This implementation handles dilation and translation well. This effect is illustrated by the good performance on the images containing these motions. The Yosemite
sequence demonstrates an approach into the scene, while the Office sequence demonstrates an expansion out of the scene. The Street scene captures the translational
motion to the right, although there is some difficulty measuring the motion of the
car. Low average error is observed in all of these sequences.
This optical flow implementation has the most difficulty handling rotational motions. The difficulty with rotation is evident in the Simple sequence. The rapid
rotation of the square object produces a lot of error. The fact that the slower rotations, both towards the inside of the square object in this sequence and on the broad
side of the sphere in the Sphere sequence are generated correctly indicates that the
speed of the rotation affects the accuracy of the results. This effect is a consequence of
the assumption of linear motion made in the optical flow implementation. For a short
velocity vector, the motion approximates the model. However, as the magnitude of
the vector increases, the model fails since it cannot account for the curvature of the
motion.
5.2
Deblurring
5.2.1
Deblurring Test Sequences
For the deblurring portion of the testing, many of the sequences from the optical
flow experiments were used, in modified form. Three frames of a given test sequence
were averaged to generate the blur and the given ground-truth flow fields for those
frames were averaged to generate the ground-truth for that artificial image. For
image sequences with larger velocities, the blurred image was also smoothed by a
Gaussian to reduce the discontinuities from the image averaging and generate a closer
approximation to an image smoothed by motion. Applying the Gaussian blur may
adversely affect the results by violating the assumption that the blur was strictly
caused by motion, but the artifacts generated by the averaging process also reduced
the quality of the motion approximation. The Gaussian smoothing was applied to
the Vcbox, Simple, and Medium sequences, which have maximum velocities of 10.1,
19.8, and 22.0 pixels per frame, respectively. The Sphere, Office, Street, and Yosemite
sequences have maximum velocities of only 2.6, 1.4, 5.1, and 5.5 pixels per frame, respectively, and
therefore need no additional smoothing.
The Blocks and Complex sequences were not used for the deblurring tests. The
extreme motion variation between frames in the Blocks sequence made it a poor
representation of motion blur, since the motion direction was changing. Vcbox had
this problem to a lesser degree but was kept in the tests to show results for both the
textured background and the irregular motion field. The large velocity magnitudes of
the Complex sequence, up to 150 pixels per frame, made it an impractical sequence
to deblur due to the size of the PSF that would be required for reconstruction of the
image.
The images in Figure 5-22 (page 67) show the center frame of the original image
set next to the constructed image simulating the motion blur for each of the test
sequences used in the deblurring experiments.
Figure 5-22: Images for Deblurring Tests (original and blurred images)
5.2.2
Determining r(w)
The value of the regularization term r(w) was determined experimentally. A constant
value was chosen to provide consistency across the entire image. Since this term will
be a constant, from now on it will be referred to simply as r. The constant effectively
sets an upper bound of 1/r on the magnitude of the inverse filter in the frequency
domain. Various values of this constant were used to deblur the test images described
above, and the resulting images were compared to select a value that would provide
adequate sharpness enhancement without enhancing the noise too much. Using the
Office sequence as an example, the effects of varying r can be shown. Figure 5-23
Figure 5-23: Effects of Different r Values (r = .1, .01, .001)
illustrates very clearly how r influences the contrast in the result. For small values
of r, the regularization filter approximates the ideal inverse filter, which enhances
high-frequencies as explained in Section 3.2. Increasing r smoothes the filter response
and limits the high-frequency enhancement.
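For illustration only, here is a short frequency-domain sketch of a regularized inverse filter. The Tikhonov-style form below is an assumption; the exact filter defined in Section 3.2, which is applied per pixel in the space domain, may differ in detail.

% Sketch: regularized inverse filter for a known blur H (illustrative only).
% Small r behaves like the ideal inverse 1/H and amplifies high frequencies;
% larger r limits the gain (bounded by 1/(2r) for this form, i.e. on the
% order of the 1/r bound described above).
r    = 0.02;                                      % constant used in the thesis
H    = fft2(fspecial('motion', 9, 0), 256, 256);  % example motion-blur response
Hreg = conj(H) ./ (abs(H).^2 + r^2);              % regularized inverse filter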
In practice, the deblurring process could be run for many values of r and the best
result selected. For this thesis, the value of r will be set to simplify the analysis. The
value chosen for r is .02, which retains enough contrast for the sharpening properties
of the deblurring to be easily viewed in the small pictures shown without losing all
details in the image. For other applications, a higher value for r providing more subtle
results may be desired.
5.2.3
Interpolation
As discussed in Section 4.2.2, interpolation is necessary to reduce the negative effects
of discretization. These effects can be clearly seen in the Office sequence. The focus
of the dilation at the center of the image corresponds to velocity vectors with very
small magnitudes. These magnitudes are scaled with the degree of interpolation to
maintain the correct ratio of velocity per pixel. Figure 5-24 illustrates how the size of
the distortion decreases as the degree of interpolation increases and the magnitudes
of the velocity vectors scale upward proportionately. To show the distortion more
clearly, the images are cropped and the value of r used is 0.
Figure 5-24: Office: Effect of Interpolation (interpolation = 4x and 5x)
Unfortunately, as the degree of interpolation increases, so does the computational
intensity of the process. The computation uses more memory and takes a considerably
longer time to evaluate the output image for each additional interpolation step. A
degree of 4 for the interpolation was chosen to achieve accurate deblurring while
maintaining a reasonable processing time and manageable memory usage.
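A minimal sketch of the preprocessing implied here, using placeholder data: the image is upsampled by the interpolation factor and the velocity magnitudes are scaled by the same factor so that the velocity-per-pixel ratio is preserved. The use of imresize and the dummy arrays are illustrative choices, not the thesis code.

% Self-contained illustration with placeholder data.
I  = rand(64, 64);                      % stand-in for a blurred frame
Vx = ones(64, 64);                      % stand-in velocity field (pixels/frame)
Vy = zeros(64, 64);

k   = 4;                                % interpolation degree chosen in the thesis
Ik  = imresize(I, k, 'bilinear');       % upsample the blurred image
Vxk = k * imresize(Vx, k, 'nearest');   % resample velocities to the new grid and
Vyk = k * imresize(Vy, k, 'nearest');   % scale magnitudes to keep velocity per pixel consistent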
5.2.4
Evaluation of Deblurring
This section discusses the results from the optical flow based deblurring process. Additionally, this section provides a comparison against two other standard methods,
blind deconvolution and Lucy-Richardson deconvolution. The blind deconvolution results were generated from the deconvblind function using a square input PSF, the size
of which was chosen to provide the perceived optimal results.

Figure 5-25: Deblurring Test Results (blurred, thesis output, blind, Lucy-Richardson)

The Lucy-Richardson results were generated from the deconvlucy function using the average PSF calculated
for the image. For both of these functions, the optional input parameters are set to
the MATLAB defaults. Figure 5-25 shows the results for all of the image sequences.
From the left to the right, the images displayed are the original, the thesis result, the
blind deconvolution and the Lucy-Richardson deconvolution.
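As a sketch of how the two baseline results can be generated with MATLAB's Image Processing Toolbox; the test image, PSF length, and blur simulation below are placeholders rather than the values used in the thesis.

I   = im2double(imread('cameraman.tif'));     % placeholder test image shipped with MATLAB
PSF = fspecial('motion', 7, 0);               % stand-in for the average motion-blur PSF
Ib  = imfilter(I, PSF, 'conv', 'replicate');  % simulated blurred input

Iblind = deconvblind(Ib, ones(7)/49);         % blind deconvolution with a square initial PSF
Ilucy  = deconvlucy(Ib, PSF);                 % Lucy-Richardson using the average PSF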
Viewing these results, it is apparent that the blind deconvolution method is least
effective.
For all of the test images, the blind deconvolution generates the most
distortion in the output. This result agrees with the previous work, described in
Section 3, which concludes that knowledge of the PSF is essential for an accurate
derivation of the original from a blurred image. The deblurring output will now be
discussed for each test image.
Sphere
The Sphere image contained very little blurring, due to the motion having very low
velocities. The pattern on the object in motion, the sphere, also has very soft transitions in color. This makes the analysis of the deblurring somewhat difficult, since the
visual difference between the original and blurred image (see Figure 5-22) is small.
In general, the performance of the optical flow deblurring for this image is poor
in comparison to that of the Lucy-Richardson method. Distortion is evident across
the smooth-varying surface of the sphere in the optical flow result, and not in the
Lucy-Richardson result, or even the blind deconvolution result. However, the optical
flow result does not exhibit the ringing that appears around the high-frequency components for both the blind and Lucy-Richardson deconvolution methods. The ringing
occurs because the edges of the image contribute artificial high-frequencies that get
processed in the MATLAB deconvolution functions; operating in the space domain,
as was done for the optical flow method, prevents this ringing effect.
Office
At broad glance, and recalling that a value of r was chosen to generate high contrast in
the optical flow deblurring output, the results from the Lucy-Richardson and optical
flow methods seem comparable. The Lucy-Richardson method again exhibits ringing
around object edges. Focusing on certain regions of the image, however, illustrates
how the optical flow method can generate a sharper result than that of the Lucy-Richardson method. The region shown below in Figure 5-26 is the picture frame at the top of the Office image.

Figure 5-26: Office: Zoom on Picture Frame (original, thesis output)

This close-up view shows that the shapes within the picture frame are resolved more clearly in the optical flow result than in the Lucy-Richardson result. Neither method was able to recover the extreme high-frequency
components entirely.
Street
In the Street sequence, it is useful to focus on the car and the man on the bench,
as in Figure 5-27.

Figure 5-27: Street: Zoom on Car (original, thesis output, Lucy-Richardson)

The sharp edges of the man's shoulder are captured clearly in the optical flow deblurring result, while the edges remain smooth in the Lucy-Richardson
result. The car is similar in color to the background, which makes it difficult to see,
especially after the blurring blends the color. Both the Lucy-Richardson and optical
flow methods recover the general shape of the car, although the former method handles
the wheel rotation better. Rotation is not linear, and therefore the linear model of
the motion vector fails to appropriately recover the initial rotating image when the
optical flow method is applied.
Another point to note on the optical flow method is that the image recovery is
somewhat limited by the motion itself. The motion, and correspondingly the blurring,
of the car is horizontally oriented. When the image brightness varies little, as in this
case, the blurring leaves little information to be extracted by the deblurring process.
The result is that the recovered image of the car still appears stretched horizontally.
Vcbox
The Vcbox image was deblurred surprisingly well by the optical flow method. The
issues with the motion assumption caused by the shifting focus of expansion, discussed
in Section 5.1.1, were expected to adversely affect the results of the deblurring. Not
only does the optical flow based deblurring handle the textured background with
reasonable results, it also generates a sharper label on the front of the box, which
can be seen in Figure 5-28.

Figure 5-28: Vcbox: Zoom on Box Label (original, thesis output, Lucy-Richardson)

The deblurring result of the Vcbox image does contain
distortion, in the form of high-frequency stripes. The problem with handling small
velocity values is also evident towards the center section. However, the edges of
the logo, especially near the upper left, are considerably clearer than in the Lucy-Richardson deconvolution result. The edges of the box are cleaner, as well.
Simple
The motion of the checkered sphere in this image was significantly faster than that of
the Sphere image, and here it was translating in the direction of rotation to increase
the velocity further. Unfortunately, most of the detail was lost on the surface of
the sphere in the optical flow recovery of this image. The Lucy-Richardson method
manages to enhance the surface detail that was evident in the blurred image (see
Figure 5-22), but does not recover the original pattern either. Therefore, the optical
flow method may be preferred over the Lucy-Richardson method, depending on the
application, since it provides a closer approximation to the original image despite
having a blurrier appearance.
Observations of the square in this image indicate that the rotating motion is
difficult to process. This difficulty was also encountered in the wheel rotation of the
Street image. As with the optical flow calculations, this type of motion violates the
linearity assumptions used to develop the motion model. The image recovered with
the optical flow vectors presents rounded edges on the square. The recovery with the
Lucy-Richardson algorithm isolates the square images into the original three squares
used to generate the blur. This result indicates that the Gaussian smoothing was
inadequate to eliminate the artifacts created by simulating the motion through frame
averaging.
The checkered surface beneath the objects represents a region in which the optical
flow method clearly stands out as the better method. The optical flow is zero in this
static area, and therefore the blurred image is identical to the original for this portion
of the image. The optical flow method leaves this region unchanged, while the Lucy-Richardson method attempts to deconvolve the static region in the same manner as
the rest of the image, introducing distortion.
Medium
None of the deblurring methods handled the Medium image well. This image contained large motion vectors that were not averaged smoothly even after the additional
Gaussian smoothing, an effect which is noticeable in the checkered floor surface (see
Figure 5-22). Because of this poorly simulated blur, the floor is poorly recovered in
all of the deblurred images, which can be seen in Figure 5-25. However, taking a
closer look at the car, as in Figure 5-29, the optical flow method clearly performs better than the Lucy-Richardson method.

Figure 5-29: Medium: Zoom on Car (original, thesis output, Lucy-Richardson)

The shape of the car has been retained,
and sharpening has occurred around edges, such as, for example, the window borders
and the outside of the wheels.
Yosemite
The Yosemite deblurring results for the optical flow and Lucy-Richardson methods
are comparable. Both outputs sharpen the hills in the background, as well as the
river on the right. For the mountain surface in front on the left, the Lucy-Richardson
result alters the image little from the blurred image. For the same region, the optical
flow method does provide increased contrast.
5.3
Combining Optical Flow and Deblurring
This section discusses the results from applying the optical flow output to the deblurring process. Results from processing the artificial image sequences are discussed
in Section 5.3.1. Section 5.3.2 will discuss the results from processing a set of real
images.
5.3.1
Artificially Blurred Images
The first round of tests used a set of the blurred images from Section 5.2.1. The optical
flow was calculated for these images, and the resulting velocity vectors were used in
the deblurring process. The Sphere sequence was not used because the velocity vectors
were so small that the deblurring had little effect. The Medium sequence was not used
because the artificial blurring poorly approximated the desired motion blur effect.
Figure 5-30 (page 77) shows the results of deblurring with the calculated optical flow
values.
The images deblurred with the calculated optical flow vectors appear visually
identical to the blurred image. This is due to the error in the magnitude of the
calculated velocities. As was shown in Section 5.1.3, the magnitudes returned from
the optical flow process are generally constant across the image, making the error in
magnitude approximately proportional to the magnitude of the flow field. For the
deblurring process, this means that regions containing the largest velocity magnitudes
will have deblurring PSFs far shorter than the actual blur function. The short PSFs
will not cover enough area to be able to capture the information from the original
image, and therefore the unblurred image will not be recovered well. Figure 5-31
(page 78) demonstrates how deblurring with a corrected magnitude generates similar
results to deblurring with the ground-truth vectors. The images shown are, from left
to right, the blurred image, the image deblurred with the calculated optical flow, the
image deblurred with the angle of the calculated optical flow and the ground-truth
magnitude, and finally the image deblurred with the complete ground-truth flow field.
This result indicates that the magnitude error, rather than the angular error, caused
the poor deblurring results in Figure 5-30.
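A minimal sketch of the magnitude-correction step behind Figure 5-31, under the assumption that it simply rescales each calculated vector to the ground-truth length while keeping its direction; the function and variable names are illustrative.

function [Vxc, Vyc] = correct_magnitude(Vx, Vy, Gx, Gy)
% Keep the direction of the calculated flow (Vx, Vy) but replace each
% vector's length with the ground-truth magnitude from (Gx, Gy).
mag   = sqrt(Vx.^2 + Vy.^2);
gmag  = sqrt(Gx.^2 + Gy.^2);
scale = gmag ./ max(mag, eps);   % avoid division by zero where the flow is zero
Vxc = Vx .* scale;
Vyc = Vy .* scale;
end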
The angular error accounts for the remaining differences in the deblurring results.
This can be seen most clearly in the images from the Vcbox and Simple tests. Viewing
the Vcbox images in Figure 5-32, it is clear that the corners of the deblurred images
vary the most between the image deblurred with the corrected magnitude and the
image deblurred with the ground-truth values. This difference corresponds to the
Figure 5-30: Deblurring Results Using Calculated Optical Flow (blurred and deblurred images)
Figure 5-31: Effect of Magnitude Error on Deblurring (blurred image, calculated velocity, corrected magnitude, ground-truth)
Figure 5-32: Effect of Angular Error on Deblurring (corrected magnitude, ground-truth, angular error map)
regions of large error in the angular error map on the right. Similarly, a large portion
of the deblurred image differences appears around the outside of the rotating square,
as the rotation was difficult to capture in the optical flow calculations.
These results support the intuition that the quality of the optical flow calculation,
in terms of both magnitude and angular error, strongly corresponds to the quality of
the subsequent deblurring.
5.3.2
Real Images
Finally, the full process of calculating the optical flow and using the velocity vectors
for deblurring was applied to real image sequences. For these sequences, there are no
ground-truth values available to correct for the error in magnitudes. Therefore, the
error in the optical flow component will be determined by observing the motion in
the image sequence and comparing that motion to the calculated 2D velocities.
Real Image Test Sequences
The real images were captured by placing a camera in a hallway. Pictures were
continuously taken over a long period of time and appropriate sets were selected for
testing. The resulting image sequences are of people in motion, walking around the
hallway. The majority of the motion occurs around arms and legs, as well as the head.
The reflective floor picks up the shadows, which are also affected by the movements of
the subjects. All of the real image sequences contain 3 frames, and the center frame
is the one that will be deblurred. Because of the large image sizes in these sequences,
the interpolation factor for the deblurring component was lowered to 3.
The first image sequence, shown in Figure 5-33, shows two people walking towards
the camera. Both of the individuals in these images exhibit separate motions. The
man in the front has drastic motion of the legs as he walks forward, and his right arm
shows movement. The badge hanging around his neck also moves. The woman in
back has smaller perceived motions, due to her increased distance from the camera.
She also shows motion in the arms and legs, as well as rotation of her head.
Figure 5-33: Real Sequence 1
The second image sequence, shown in Figure 5-34, shows a stationary man zipping
his jacket. The motion for this sequence is near the zipper, including the hands, and
his head which shifts during the sequence.
Figure 5-34: Real Sequence 2
The third sequence, shown in Figure 5-35, is a man walking towards the camera.
His motion is mostly around his arms, with some slight turning of the head.
Figure 5-35: Real Sequence 3
Figure 5-36 shows the fourth and final image sequence. These images again depict
two people walking towards the camera. The two subjects here are walking closer
together than in the first image sequence. Again, there is a lot of motion around the
arms and legs. Both of the heads turn slightly, and the position of the badges shift
over the image sequence.
Figure 5-36: Real Sequence 4
Analysis of Results for Real Image Sequences
Figure 5-37 shows the results of the optical flow based deblurring for the real images.
The original blurred frame is shown on the left. The image on the right is the
blurred image overlaid with the optical flow velocity vectors. The center image is the
deblurred image. The edges of the image were not deblurred to ensure that the PSF
fit inside the image dimensions for all points and also to eliminate regions affected by
convolution error from the optical flow process. The edges are therefore not shown in
the deblurred image.
These results indicate that the optical flow process performed well. The velocity
vectors are most prominent around the edges of the subjects, as expected based on
the observation of their motion. The motion of the subject's arm swinging forward in
sequence 3 is captured in the velocity vectors pointing down and to the right of the
image. The slow rocking backwards and the slight dip of the head of the man zipping
his jacket in sequence 2 is shown with the direction of the velocity vectors. The vectors
around the hands in this sequence represent the zipping motion well, showing both
hands moving vertically in opposite directions. Image sequences 1 and 4 illustrate
again that shadows are captured in the optical flow, as evidenced by the large velocity
vectors on the floor in front of the subjects' feet. The window in the background of
sequences 1 and 3 shows some error, as the velocity vectors should be zero.

Figure 5-37: Real Sequence Results (blurred, deblurred, velocity vectors)

Possibly
shadowing effects, caused by changing light conditions outside the window, affected
this portion of the result. The accuracy of the leg motions in sequences 1 and 4 is
difficult to determine because the leg position changes over the time period of the
sequences.
Distinguishing between the original image on the left and the deblurred image
in the center is difficult. This result is a consequence of restrictions imposed by the
limitations of the camera and error in the calculated velocity vectors. In order to get
large blurs on the image, the subject had to be moving quickly. However, the camera
that was used had a significant time delay following each frame to write the image
data to memory. The time delay between frames meant that the subject tended to be
displaced in the succeeding frame. This effect causes the calculated motion vectors
to represent the temporal displacement rather than the blur-generating movement.
Also, in the real images, the subjects vary in their motion from frame to frame. This
variation was shown in Section 5.1.2 to generate error in the optical flow calculations.
Image sequences capturing slower motions were chosen to reduce the discontinuities
between the frames and therefore the observable blur is small. This effect makes
the analysis by observation more difficult. Compounding this problem is the issue
of the small magnitude vectors returned by the optical flow process. The magnitude
error was shown in Section 5.3.1 to cause minimal deblurring effects, due to shortened
velocity vectors.
To increase the difference between the original and deblurred images, the tests were
rerun with velocity vectors scaled by a factor of ten. Figure 5-38 shows these results for
the second and third sequences. Scaling the velocity vectors leads to higher contrast
in the output image. The boundary between the subjects and their backgrounds is
more defined, for example. Also, in sequence 2, the inside of the jacket, the buttons
on the man's shirt and the separation between the fingers are sharper. However, since
the scaling factor was an arbitrary choice, the resulting image also exhibits increased
distortion. The distortion occurs because the longer PSF now covers too much area,
and the resulting pixel value includes extra information from surrounding pixels. This
result again demonstrates the need for an optical flow method that generates accurate
velocity magnitudes in addition to accurate directions.

Figure 5-38: Real Sequence Results Using Scaled Velocity Vectors (blurred, deblurred, scaled deblurred)
Chapter 6
Summary and Future Work
The work presented in this thesis provides a method for reconstructing images degraded by motion using velocity information generated with a phase-based optical flow
calculation. The thesis demonstrates that the quality of the velocity vectors from the
optical flow computations directly affects the quality of the deblurring. Overall, the
method works well when the input image sequence is appropriate. Appropriate image sequences have continuous, consistent motion for the duration of the sequence,
without discontinuities between frames. These restrictions can be limiting in terms of
the camera capabilities required and the speed and direction of the subject motion,
as was demonstrated in Section 5.3.2. This chapter will review the implementation
and results for both the optical flow and deblurring processes.
6.1
Optical Flow Summary
The optical flow method implemented for this thesis closely followed that of Fleet
and Jepson given in [13] based on their efforts in [14] to demonstrate the robustness
of phase with respect to variations in image brightness. The method applies a set of
frequency-tuned Gabor filters to an image sequence in order to determine the component velocities for each pixel by tracking temporally separated phase contours. The
method implemented here differed from Fleet and Jepson's mainly in the generation
of a 2D velocity from the component velocities. The 2D velocity for this thesis did not
require values from local pixels, therefore bypassing the assumption of smoothness
that was implicit in Fleet and Jepson's combination method. Fleet and Jepson also
failed to specify their choices in parameters, so the values used in this thesis were
determined through experimentation, explained in detail in Section 5.1.2.
The resulting optical flow implementation was largely successful in determining
the direction of the motion.
The most angular error was observed in regions of
rotational motion, where the deviation of the model of linear motion from the actual
motion was greatest. Angular error was also observed where the motion varied over
the duration of the image sequence, another place where the linear motion model was
a poor representation of the true motion.
The magnitude error was significantly more important than the angular error
when the optical flow velocities were used in the deblurring process. Although the
average magnitude error was generally small, the location of the errors caused a
dramatic reduction in the quality of the image restoration that followed. This effect
demonstrates the weakness of relying on a single error measure when analyzing optical flow results. The Barron study, for example, uses only the angular error measure [2]. Many new methods that use the results of the Barron study for comparison also report only that one error measure, and the problem of inadequate evaluation of new optical flow methods is propagated.
6.2
Improving the Optical Flow Method
A simple improvement to the optical flow calculation would include more filters,
tuned to a variety of frequencies. This extension would capture information from
a larger range of the frequencies in the original image. Additionally, the computational efficiency of the optical flow component could be improved by using methods
such as the Complex Discrete Wavelet Transform in place of the Gabor filters, as
described in [6].
Many of the problems with the optical flow method are derived from the combination of the component velocities into the final 2D velocity. Keeping the component
velocities separate could maintain more information about non-linear motions. This
idea follows from the fact that the component velocities can express multiple motions
at a single pixel, as with transparent objects or effects such as shadows [12]. Nonlinear motions such as rotation could similarly be expressed as a sum of smaller linear
motions. Reducing the linearity in this manner may improve both the angular and
magnitude error in the optical flow output which would carry over into the deblurring
stage of the processing.
6.3
Deblurring Summary
The deblurring method implemented for this thesis applied a regularization filter at
each pixel on the image. The velocities generated by the optical flow calculations
were used to create the PSF that represented the blur-inducing motion vector. The
regularization term was an experimentally determined constant. Figure 5-23 (page
68) illustrates how different values of the regularization constant have a dramatic
effect on the contrast in the restored image.
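As a rough illustration of this step (not the thesis's actual PSF construction, which is defined in Chapter 4), a per-pixel motion PSF can be approximated with MATLAB's fspecial, using the local velocity's length and angle; the function name and the minimum-length guard are assumptions.

function PSF = velocity_psf(vx, vy)
% VELOCITY_PSF  Stand-in for a per-pixel motion PSF: a line whose length is
% the speed (in pixels per frame) and whose angle is the motion direction.
len   = max(hypot(vx, vy), 1);          % PSF length; at least one pixel
theta = atan2d(vy, vx);                 % motion direction in degrees
PSF   = fspecial('motion', len, theta); % line PSF, normalized to sum to 1
end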
Overall, the deblurring process worked well in comparison to the blind and Lucy-Richardson deconvolution methods when accurate velocities were provided. The deblurring method implemented here had trouble with areas containing rotational motion, as did the optical flow implementation. Again, rotational motion violates the
linear motion model, which accounts for the error seen in the output. This implementation also had difficulty handling small motions, due to the discretization effects
described in Section 4.2.2. However, this version of an adaptable deblurring technique
was still successful in sharpening the appearance of edges and other high-frequency
regions. The results from the Vcbox test demonstrate this success very well in Figure
5-28. One significant issue with the deblurring process implemented here is the large
amount of time and memory necessary for the computation of the restored image.
6.4
Improving the Deblurring Method
The method of deblurring implemented in this thesis uses a simple regularization filter
to generate the deblurring kernel. Applying a more sophisticated technique, such as
the Lucy-Richardson method, to the process for each PSF could yield more accurate
results. However, the process should still remain in the space domain to avoid including information from neighboring pixels. Altering the filter in this manner could
significantly lengthen the already considerably time-consuming process of pixel-by-pixel deblurring, even while providing better results. A simpler way to increase the
effectiveness of the deblurring kernel would be to improve upon the constant regularization term. This term could be improved either by selecting a more appropriate
constant or by generating a dynamic term that was based on extractable properties
of the image. Additionally, improvement in the analysis process would be extremely
helpful in evaluating the deblurring results. An analytic method for comparing the
blurred and unblurred images would provide a quantitative result for comparison and
allow a larger range of test sequences to be tested.
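One possible quantitative measure, offered here only as an assumption about what such an analytic comparison might look like, is the peak signal-to-noise ratio between the original and restored images:

function val = restoration_psnr(Iorig, Irestored)
% RESTORATION_PSNR  Peak signal-to-noise ratio (dB) between an original image
% and a restored one, assuming both are double images scaled to [0, 1].
mse = mean((Iorig(:) - Irestored(:)).^2);
val = 10 * log10(1 / mse);
end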
Exploiting the image sequence further is perhaps the most effective way to improve
the image restoration. Currently, only the optical flow component utilizes the multiple
frames available for each image sequence. Applying information from previous and
future frames in the sequence should greatly enhance the results of the restoration
in the deblurring component. This improvement should be seen most noticeably at
places such as occlusion boundaries, where separate objects may be blurred together.
Information obtained from preceding or succeeding frames could help to define the
borders of each object and recover each separately.
6.5
Future Work and Applications
In addition to the improvements suggested above, future work could be done to
improve the methods used in this thesis. Combining the technique presented here
with others could improve the results significantly. Superresolution, for example,
could be used to generate more highly resolved images in the initial sequence. The
increase in resolution would help to mitigate the effects of discretization in addition to
providing greater accuracy in the results. This benefit would affect both the optical
flow and deblurring portions of the process.
This process also has the possibility of providing benefit to many other applications. Clearer imaging would significantly benefit many fields, in addition to the AI
applications mentioned previously in Section 1.2. These potential applications include
medical imaging such as Magnetic Resonance Imaging, as mentioned in [22].
Appendix A
Calculating Component Velocities
Using the basic equations from section 2.2, we can derive a simple method for explicitly calculating the component velocities.
The basic equations are given again below:

    \phi_x u + \phi_y v + \phi_t = 0                                            (A.1)

    n(x, t) = \frac{\nabla\phi(x, t)}{\|\nabla\phi(x, t)\|}                     (A.2)

    v_n = a\, n(x, t)                                                           (A.3)

Expand (A.3) into its components, u_n and v_n:

    u_n = a\, n_x, \qquad v_n = a\, n_y

Now plug these values into (A.1):

    \phi_x a\, n_x + \phi_y a\, n_y + \phi_t = 0                                (A.4)

Expanding (A.2) into its components, n_x and n_y:

    n_x = \frac{\phi_x}{\|\nabla\phi(x, t)\|}, \qquad n_y = \frac{\phi_y}{\|\nabla\phi(x, t)\|}

Equation (A.4) then becomes:

    \frac{a\,\phi_x^2}{\|\nabla\phi(x, t)\|} + \frac{a\,\phi_y^2}{\|\nabla\phi(x, t)\|} + \phi_t = 0

However, \|\nabla\phi(x, t)\| is the 2-norm of \nabla\phi, which is equal to \sqrt{\phi_x^2 + \phi_y^2}. Therefore, we get:

    a\,\frac{\phi_x^2 + \phi_y^2}{\sqrt{\phi_x^2 + \phi_y^2}} + \phi_t = a\sqrt{\phi_x^2 + \phi_y^2} + \phi_t = 0    (A.5)

Solving for a, equation (A.5) becomes:

    a = \frac{-\phi_t}{\sqrt{\phi_x^2 + \phi_y^2}}                              (A.6)

Returning to our definitions of u_n and v_n, we end up with:

    u_n = \frac{-\phi_t}{\sqrt{\phi_x^2 + \phi_y^2}}\, n_x, \qquad v_n = \frac{-\phi_t}{\sqrt{\phi_x^2 + \phi_y^2}}\, n_y

The term -\phi_t / \sqrt{\phi_x^2 + \phi_y^2} is the magnitude of the velocity component. Now substituting n_x and n_y:

    u_n = \frac{-\phi_t}{\sqrt{\phi_x^2 + \phi_y^2}} \cdot \frac{\phi_x}{\sqrt{\phi_x^2 + \phi_y^2}}, \qquad
    v_n = \frac{-\phi_t}{\sqrt{\phi_x^2 + \phi_y^2}} \cdot \frac{\phi_y}{\sqrt{\phi_x^2 + \phi_y^2}}

After simplification, the final equations for the velocity components u_n and v_n are:

    u_n = \frac{-\phi_t\,\phi_x}{\phi_x^2 + \phi_y^2}                           (A.7)

    v_n = \frac{-\phi_t\,\phi_y}{\phi_x^2 + \phi_y^2}                           (A.8)

The derivatives \phi_x, \phi_y, and \phi_t can be determined using the phase identity

    \nabla\phi(x, t) = \frac{\Im[R^*(x, t)\,\nabla R(x, t)]}{\rho^2(x, t)}      (A.9)

derived in Appendix B, and therefore the component velocities u_n and v_n can be determined.
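A direct MATLAB sketch of equations (A.7)-(A.9), assuming a complex filter response R stored as a 3D array over x, y, and t; the use of MATLAB's gradient for the derivatives and the small-denominator guard are illustrative choices, not the thesis implementation.

function [un, vn] = component_velocity(R)
% COMPONENT_VELOCITY  Component velocities from a complex filter response R,
% using the phase identity (A.9) and equations (A.7)-(A.8).
rho2 = abs(R).^2;                          % squared amplitude rho^2(x, t)
[Rx, Ry, Rt] = gradient(R);                % spatial and temporal derivatives of R

phix = imag(conj(R) .* Rx) ./ rho2;        % phi_x = Im[R* Rx] / rho^2
phiy = imag(conj(R) .* Ry) ./ rho2;
phit = imag(conj(R) .* Rt) ./ rho2;

den = phix.^2 + phiy.^2;                   % phi_x^2 + phi_y^2
un  = -phit .* phix ./ max(den, eps);      % equation (A.7)
vn  = -phit .* phiy ./ max(den, eps);      % equation (A.8)
end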
Appendix B
Phase Identity Derivation
Deriving the phase relation from Fleet and Jepson [13]:

    \nabla\phi(x, t) = \frac{\Im[R^*(x, t)\,\nabla R(x, t)]}{\rho^2(x, t)}      (B.1)

Letting A_x denote \partial A(x, t)/\partial x, and with x = (x, y), we can write:

    \nabla\phi(x, t) = (\phi_x, \phi_y, \phi_t)

If we only look in the x dimension, this phase identity becomes:

    \phi_x = \frac{\Im[R^*(x, t)\, R_x]}{\rho^2(x, t)}                          (B.2)

R and R^* are defined as below, for \rho, \phi \in \mathbb{R}:

    R(x, t) = \rho(x, t)\, e^{j\phi(x, t)}                                      (B.3)

    R^*(x, t) = \rho(x, t)\, e^{-j\phi(x, t)}                                   (B.4)

Start by solving for the derivative of R using the chain rule:

    R_x = \frac{\partial}{\partial x}\left[\rho(x, t)\, e^{j\phi(x, t)}\right]
        = \rho(x, t)\,\frac{\partial}{\partial x}\left[e^{j\phi(x, t)}\right] + e^{j\phi(x, t)}\,\frac{\partial}{\partial x}\left[\rho(x, t)\right]

We can rewrite this equation as:

    R_x = j\rho(x, t)\, e^{j\phi(x, t)}\,\phi_x + e^{j\phi(x, t)}\,\rho_x       (B.5)

Find R^*(x, t)\, R_x by plugging in equations (B.4) and (B.5):

    R^*(x, t)\, R_x = \rho(x, t)\, e^{-j\phi(x, t)}\left[j\rho(x, t)\, e^{j\phi(x, t)}\,\phi_x + e^{j\phi(x, t)}\,\rho_x\right]

    R^*(x, t)\, R_x = j\rho^2(x, t)\,\phi_x + \rho(x, t)\,\rho_x                (B.6)

Since \rho, \phi \in \mathbb{R} and using equation (B.6) we get:

    \Im[R^*(x, t)\, R_x] = \rho^2(x, t)\,\phi_x                                 (B.7)

Therefore,

    \phi_x = \frac{\Im[R^*(x, t)\, R_x]}{\rho^2(x, t)}                          (B.8)

which is equivalent to (B.2).

This can be extended to all 3 dimensions, to get the phase identity given in (B.1).
Appendix C
Fleet and Jepson: Computing 2D
Velocity
Fleet and Jepson suggest a least-squares solution to resolve a complete 2D velocity field from the component velocities. This solution unfortunately assumes some smoothness over a local region, since it utilizes velocity neighborhoods to reach a final velocity.

Let a be the vector of unknowns, (a_0, a_1, a_2, b_0, b_1, b_2)^T. The vector \tilde{x} = (\tilde{x}, \tilde{y}) is equal to x - x_0, the spatial offset between the location of the current estimate x and that of the pixel in question, x_0. Also recall that the velocity estimate v_n can be written as s_n\hat{n}, where s_n is the speed and \hat{n} is the direction of the estimate. For each component velocity in the neighborhood, equation (C.1) is solved:

    (n_x,\; n_x\tilde{x},\; n_x\tilde{y},\; n_y,\; n_y\tilde{x},\; n_y\tilde{y})\, a = s_n      (C.1)

If s is an Mx1 column vector of the speeds s_n for each local component velocity, we reach a system of equations Na = s, where N is an Mx6 matrix. Minimization by least-squares is then performed on \|Na - s\|. M must be greater than 6 to fully solve the system of equations.
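A minimal sketch of this least-squares solve, assuming the M component velocities in the neighborhood have already been collected into vectors of normal directions, offsets, and speeds; all names below are illustrative.

function v0 = solve_2d_velocity(nx, ny, dx, dy, s)
% SOLVE_2D_VELOCITY  Least-squares estimate of the 2D velocity at x0 from M
% component velocities in a neighborhood (Appendix C).  nx, ny are the normal
% directions, dx, dy the offsets x - x0, and s the speeds; all are Mx1 vectors.
N = [nx, nx.*dx, nx.*dy, ny, ny.*dx, ny.*dy];   % Mx6 constraint matrix (C.1)
a = N \ s;                                      % least-squares solution of Na = s
v0 = [a(1); a(4)];                              % velocity at x0: (a0, b0)
end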
Appendix D
MATLAB Code
D.1
Optical Flow Code
D.1.1
function OF
% This function calculates a 2D velocity for each pixel in an image
% sequence.
function [I Vx VyJ = OF(Iinit,fO, int_flag, beta)
% INPUTS
% linit = input image sequence. 3D array
% fO = tuning frequency for filters
% intflag = boolean to indicate if image should be interpolated prior to
% processing.
% beta = parameter used to determine coverage in frequency space
% OUTPUTS
% I = image sequence at end
% Vx, Vy = 2D velocities in x, y directions
Iinit = double(Iinit);
%typecast -> double
%% remove portions that dont move
101
to
[dl d2 dT] = gradient(Iinit);
clear dl d2;
dT2D = sum(abs(dT),3);
dT2D(abs(dT2D)>O) = 1;
20
dTin3 = repmat(dT2D,[1,1,size(Iinit,3)]);
Iinit = linit.*dTin3;
clear dTin dTin3 dT;
%% interpolate if intflag
[sl s2 s3] = size(linit);
if intflag
I = zeros(2*s1,2*s2,s3);
for i = 1:s3
I(1:2*sl-1,l:2*s2-1,i) = interp2(Iinit(:,:,i));
30
end;
else I = Iinit;
end;
clear Iinit sl s2 s3;
%%filter params
tSTART = clock;
if int_flag % scale frequency if interpolated
fO = f0/2;
end;
40
% calculate sigma and sigmak;
sigmak = fO*(2^beta-1)/(2^beta+1);
sigma = 1/sigmak;
%% set thresholds
thresh = .00001; % min val of velocity
102
pmin
= 0;
%min val of p (if small, negligible gradient);
vmin
= 0;
%min val of phase normal gradient
%% set up filterbank
50
% space s.t. each filter ctr is at least 2sigmak away in freq;
% dTHETA = tan(sigmak/fO); wstep = 2dTHETA;
% spacing 1/(sigma*fO) = (2 ^beta-1)/ (2 ^beta+l)
wstepmin = 2*atand(1/(sigma*f0));
%%run OF3D
tl = clock;
[Vxn Vyn ifilt] = OF3Dsep(I, sigma, fO, wstepmin); % get component vels
t2 = clock;
tRUN = etime(t2,tl);
60
fprintf('\n TIME OF3Dsep: %g\n',tRUN);
clear t1 t2 tRUN;
if int_flag % resize if image was interpolated
Vxnout = Vxn(1:2:size(Vxn,1),1:2:size(Vxn,2),:,:);
Vynout = Vyn(1:2:size(Vyn,1),1:2:size(Vyn,2),:,:);
lout = I(1:2:(size(I,1)),1:2:size(I,2),:);
clear I Vxn Vyn;
I = lIout;
Vxn = Vxnout;
Vyn = Vynout;
70
clear lout Vxnout Vynout;
Ifilt = 1/2*lfilt;
%scale back
end;
dTin4 = repmat(dT2D, [1,1,size(Vxn,3) ,size(Vxn,4)]);
Vxn = Vxn.*dTin4;
103
Vyn = Vyn.*dTin4;
clear dT dT1 dTin dTin4;
%% final output (from Vxn, Vyn)
so
tl = clock;
% set normalization parameters
dlmt = 2;
stdlmt = 5;
Imt = 10/12;
clear beta dT2D fO i int_flag Ifilt ;
clear pmin sigma sigmak st thresh vmin wstepmin;
[Vx Vy] = validvel(Vxn, Vyn, dlmt, stdlmt,lmt); % get 2D velocity
t2 = clock;
tRUN = etime(t2,t1);
fprintf('\n TIME valid_vel: %g\n',tRUN);
clear dlmt stdlmt;
clear t1 t2 tRUN;
tEND = clock;
tTOT = etime(tEND,tSTART);
fprintf('\n TIME testOF3D: %g secs, (%g min)\n',tTOT,tTOT/60);
clear tEND tTOT;
clear img imgE Enan ctr ss;
clear Vxc Vyc Cxc Cyc Vmag Cmag;
tEND = clock;
tTOT = etime(tEND,tSTART);
fprintf('\n TOTAL TIME running: %g secs, (%g min)\n',tTOT,tTOT/60);
clear tSTART tEND tTOT;
D.1.2
function OF3Dsep
% This function returns the component velocities based on the input
% parameters.
function [Vxn Vyn Ifilt] = OF3Dsep(I, sigma, f0, wstepmin)
% INPUTS
% I = image sequence
% sigma = standard deviation of the Gaussian
% f0 = tuning frequency
% wstepmin = spacing between filters
% OUTPUTS
% Vxn, Vyn = 4D velocity outputs
% Ifilt = length of the filter used
warning off MATLAB:divideByZero;
nfilts = floor(360/wstepmin);
% number of filters that will fit
w0deg = 0:360/nfilts:359;
fprintf('number of filters: %d\n',nfilts);
fprintf('filter spacing: %3.2f\n',360/nfilts);
%% define output vectors
[srow scol st] = size(I);
Vxn = zeros(srow,scol,st,nfilts);
Vyn = zeros(srow,scol,st,nfilts);
R = zeros(srow,scol,st,nfilts);
%%run compvel code
for i = 1:nfilts
w0 = w0deg(i)*2*pi/360;
fprintf('\nfilter #%d: w0deg = %3.2f\n',i,w0deg(i));
%% run Gabor function
[R(:,:,:,i) Ifilt] = gabor3Dsep(I,sigma,f0,w0);
%%run compute velocity code
[Vxn(:,:,:,i) Vyn(:,:,:,i)] = compvel(R(:,:,:,i));
end;
D.1.3
function gabor3Dsep
% This function filters the image with the Gabor filters
function [R ssigmak] = gabor3Dsep(I,sigma,f0,w0)
% INPUTS
% I = image sequence
% sigma = standard deviation of Gaussian
% f0 = tuning frequency
% w0 = filter orientation
% OUTPUTS
% R = filtered image
% ssigmak = size of the filter
t1 = clock;
%% generate Gaussian
[sy sx st] = size(I);
ct = ceil(st/2);
cy = sy/2+1;
cx = sx/2;
[G X Y T] = g3D(sx, sy, st, sigma); % create 3D Gaussian
G3 = G(:,:,ct);
sv = round(sigma);
Glog = zeros(sy,sx,st);
% mask values outside stddev
for i = 1:sy
for ji = 1:sx
for k = 1:st
yd = abs(cy-i);
xd = abs(cx-ji);
td = abs(ct-k);
Glog(i,ji,k) = (sqrt(xd^2+yd^2+td^2)<sv);
end;
end;
end;
clear yd xd td;
clear i ji k sv;
G3 = G3.*Glog(:,:,ct);
%% create sinusoid component
sinmodx = exp(j*(2*pi*f0.*X*cos(w0)));
sinmody = exp(j*(2*pi*f0.*Y'*sin(w0)));
%% make gabor filters
gx = normpdf(X,0,sigma);
gy = normpdf(Y',0,sigma);
gt = normpdf(T,0,sigma);
gt = gt/(sum(gt));
gabx = sinmodx.*gx;
gaby = sinmody.*gy;
g = gaby*gabx;
clear gx gy gabx gaby sinmodx sinmody;
%% convert to final 2D signal
g = g.*Glog(:,:,ct);
g = g/(sum(g(:)));%normalize, so sum = 1;
ft = fft2(g);
fts = fftshift(ft);
cX = floor(sx/2+1)+round(sx*f0*cos(w0));
cY = ceil(sy/2+1)-round(sy*f0*sin(w0));
ssigmak = ceil(sx/sigma);
% split into x, y
gmask = zeros(size(g));
for i = cY-ssigmak:cY+ssigmak; % ROW/ Y direction
for k = cX-ssigmak:cX+ssigmak; % COL/X direction
gmask(i,k) = (sqrt((i-cY)^2+(k-cX)^2)<ssigmak);
% 1 if dist < sigmak away from ctr
end;
end;
clear i k;
ftsmask = fts.*gmask;
imask = ifftshift(ftsmask);
g = ifft2(imask);
clear cY cX srow scol txt;
clear ft fts ftsmask gmask r c;
%% 2D multiplication in FOURIER
sr = 2*sy;
sc = 2*sx;
dr = (sr-sy);
dc = (sc-sx);
st = size(I,3);
gft = fft2(g,sr,sc);
I2D = zeros(size(I));
for i = 1:st;
Ift = fft2(I(:,:,i),sr,sc);
I2Dmult = ifft2(Ift.*gft);
I2D(:,:,i) = I2Dmult(sy-(sy/2-1):sy+sy/2,sx-(sx/2-1):sx+sx/2);
end;
clear i sr sc st dr dc;
clear gpad gft Ipad Ift I2Dmult;
%% convolve -> R
R = I2D;
for irow = 1:size(I,1)
It = I2D(irow,:,:);
It2D(:,:) = It(1,:,:);
R(irow,:,:) = conv2(It2D,gt,'same');
end;
clear irow;
clear It I2D It2D;
t2 = clock;
tRUN = etime(t2,t1);
fprintf('TIME gabor3Dsep: %g\n',tRUN);
clear t1 t2 tRUN;
D.1.4
function compvel
% This function finds the component velocities from the output of the image
% convolved with Gabor filters.
function [Vx Vy] = compvel(R);
% INPUTS
% R = output of the image convolved with Gabor filters
% OUTPUTS
% Vx, Vy = component velocities
t1 = clock;
%% find phase gradient: calc gradient w/ method from Fleet
[srow scol st] = size(R);
flt = 1/18*[-1 8 0 -8 1]; % derivative filter
for i = 1:st
dX(:,:,i) = conv2(R(:,:,i),flt,'same');
dY(:,:,i) = conv2(R(:,:,i),flt','same');
end;
for irow = 1:srow
Rt = R(irow,:,:);
Rt2D(:,:) = Rt(1,:,:);
dT(irow,:,:) = conv2(Rt2D,flt,'same');
end;
clear flt;
%% calc phi_x,y,t
phiX = imag(conj(R).*dX)./(abs(R).^2);
phiY = imag(conj(R).*dY)./(abs(R).^2);
phiT = imag(conj(R).*dT)./(abs(R).^2);
%% find velocities
Pnorm = phiX.^2+phiY.^2;
Vx = -phiT.*(phiX./Pnorm);
Vy = -phiT.*(phiY./Pnorm);
t2 = clock;
fprintf(' TIME compvel: %g secs\n',etime(t2,t1));
clear t1 t2;
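For reference, the quantities computed in compvel correspond to the following phase-gradient relations, restated here in equation form for readability; $R_x$, $R_y$, $R_t$ denote the filtered-output derivatives dX, dY, dT in the code above.

\[
\phi_x = \frac{\mathrm{Im}\,[R^{*} R_x]}{|R|^2}, \qquad
\phi_y = \frac{\mathrm{Im}\,[R^{*} R_y]}{|R|^2}, \qquad
\phi_t = \frac{\mathrm{Im}\,[R^{*} R_t]}{|R|^2},
\]
\[
v_x = -\phi_t \,\frac{\phi_x}{\phi_x^2 + \phi_y^2}, \qquad
v_y = -\phi_t \,\frac{\phi_y}{\phi_x^2 + \phi_y^2}.
\]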
D.1.5
function g3D
% This function creates a centered 3D Gaussian of mu = 0;
function [g x y t] = g3D(lx, ly, lt, sigma)
% INPUTS
% lx, ly, lt = lengths in x, y and time
% sigma = standard deviation of the Gaussian
% OUTPUTS
% g = 3D Gaussian
% x,y,t = corresponding vectors
%% set up vectors
if (mod(lx,2))
% lx is odd
xmin = -floor(lx/2);
xmax = floor(lx/2);
else
xmin = -lx/2+1;
xmax = lx/2;
end;
if (mod(ly,2))
% ly is odd
ymin = -floor(ly/2);
ymax = floor(ly/2);
else
ymin = -ly/2+1;
ymax = ly/2;
end;
if (mod(lt,2))
% lt is odd
tmin = -floor(lt/2);
tmax = floor(lt/2);
else
tmin = -lt/2+1;
tmax = lt/2;
end;
dx = 1;
dy = 1;
dt = 1;
x = xmin:dx:xmax;
y = ymin:dy:ymax;
t = tmin:dt:tmax;
y = -y+1;
[X Y] = meshgrid(x,y);
T = t;
inp = sqrt(X.^2+Y.^2);
g2D = normpdf(inp, 0 , sigma)/(sqrt(2*pi)*sigma);
gt = normpdf(T,0,sigma);
g = zeros(size(X,1),size(X,2),length(T));
for i = 1:length(T)
g(:,:,i) = g2D*(gt(i));
end;
clear gt g2D;
D.1.6
function valid_vel
% This function computes the normalized 2D velocities from a set of
% component velocities.
function [Vx Vy] = valid_vel(Vxn, Vyn, dlmt, stdlmt, lmt)
% INPUTS
% Vxn, Vyn = component velocities (4D vectors, Vx, Vy, frame, filter)
% dlmt = max distance from mean
% stdlmt = max std deviation
% lmt = fraction of vals needed to form sample
% OUTPUTS
% Vx, Vy = 2D velocities
[s1 s2 s3 s4] = size(Vxn);
%% find valid velocities by taking mean (over small std)
% standardize over filters
[Vxd4 Vyd4] = stdize_dimSUB(Vxn,Vyn,4,dlmt,stdlmt);
fprintf(' \nvel4');
% outputs -> nans from Vxn, or outliers
Vxnan4 = sum(~isnan(Vxd4),4); %sum # non-nans over 4th dim (filters)
Vynan4 = sum(~isnan(Vyd4),4);
% sum over s4 values ->
% sum velocity values over filters, frames
% scale by # nans at each 2D point
Vx0 = Vxd4;
Vx0(isnan(Vxd4)) = 0;
Vy0 = Vyd4;
Vy0(isnan(Vyd4)) = 0;
Vxns = sum(Vx0,4)./Vxnan4; % generates 3-D vect, SCALED
Vyns = sum(Vy0,4)./Vynan4; % to scale, divide by # non-nan filters (Vxnan4)
Vxns(Vxnan4<s4*lmt) = nan; % locations where too few values were valid
Vyns(Vynan4<s4*lmt) = nan;
% Vxnan4 indicates # valid pts (NOT nan -> want high for good avg, sml std dev)
clear Vxnan4 Vynan4;
clear Vxd4 Vyd4;
clear Vx0 Vy0;
% standardize over frames
[Vxd3 Vyd3] = stdize_dimSUB(Vxns,Vyns,3,dlmt,stdlmt);
fprintf( ' \nvel3\n');
Vxnan3 = sum(~isnan(Vxd3),3); % sum # non-nans over 3rd dim (time frames)
Vynan3 = sum(~isnan(Vyd3),3);
Vx0 = Vxd3;
Vx0(isnan(Vxd3)) = 0;
Vy0 = Vyd3;
Vy0(isnan(Vyd3)) = 0;
Vxavg = sum(Vx0,3)./Vxnan3; % generates 2-D vect, SCALED
Vyavg = sum(Vy0,3)./Vynan3;
%scaling by # of non-nan filts
Vx = Vxavg;
Vy = Vyavg;
Vx(Vxnan3<s3*lmt) = nan;
Vy(Vynan3<s3*lmt) = nan;
clear Vxnan3 Vxd3;
clear Vynan3 Vyd3;
clear Vx0 Vy0;
clear Vxavg Vyavg;
clear Vxns Vyns;
D.1.7
function stdize_dimSUB
% This function performs the standardization
function [Vxd Vyd] = stdize_dimSUB(Vx,Vy,dim,dlmt,stdlmt)
% INPUTS
% Vx, Vy = component velocities
% dim = dimension over which to standardize (3 or 4)
% dlmt = max distance from the mean
% stdlmt = max std dev
% OUTPUTS
% Vxd, Vyd = standardized velocities
Vx0 = Vx;
Vx0(isnan(Vx)) = 0;
Vy0 = Vy;
Vy0(isnan(Vy)) = 0;
%% DIM= 3
if dim == 3;
Vxstd = std(Vx0,0,dim);
for i = 1:size(Vx,dim);
vx = Vx(:,:,i);
dx = (Vx(:,:,i)-mean(Vx0,dim))./Vxstd;
vx((abs(dx)>dlmt)&(abs(Vxstd)>stdlmt)) = nan;
Vxd(:,:,i) = vx;
clear vx dx;
end;
clear Vxstd Vx0;
Vystd = std(Vy0,0,dim);
for i = 1:size(Vy,dim);
vy = Vy(:,:,i);
dy = (Vy(:,:,i)-mean(Vy0,dim))./Vystd;
vy((abs(dy)>dlmt)&(abs(Vystd)>stdlmt)) = nan;
Vyd(:,:,i) = vy;
clear vy dy;
end;
clear Vystd Vy0;
%% DIM = 4
elseif dim == 4;
Vxstd = std(Vx0,0,dim);
for i = 1:size(Vx,dim);
vx = Vx(:,:,:,i);
dx = (Vx(:,:,:,i)-mean(Vx0,dim))./Vxstd;
vx((abs(dx)>dlmt)&(abs(Vxstd)>stdlmt)) = nan;
Vxd(:,:,:,i) = vx;
clear vx dx;
end;
clear Vxstd Vx0;
pack;
Vystd = std(Vy0,0,dim);
for i = 1:size(Vy,dim);
vy = Vy(:,:,:,i);
dy = (Vy(:,:,:,i)-mean(Vy0,dim))./Vystd;
vy((abs(dy)>dlmt)&(abs(Vystd)>stdlmt)) = nan;
Vyd(:,:,:,i) = vy;
clear vy dy;
end;
clear Vystd Vy0;
end
D.2
Deblur Code
% This function takes an image and deblurs it using a regularization
% filter. PSFs are created using velocities which are provided.
function [I] = OFdeblur(I,n,r,Vx,Vy)
% INPUTS
% I = 2D image that needs to be deblurred
% Vx, Vy = velocity vectors
% n = degree of interpolation
% r = value of regularization term (constant)
% OUTPUTS
% I = deblurred image
%% pre-allocate arrays;
[sr sc] = size(I);
I4 = zeros(sr*n+1-n,sc*n+1-n);
Vx4 = zeros(sr*n+1-n,sc*n+1-n);
Vy4 = zeros(sr*n+1-n,sc*n+1-n);
Vmag4 = zeros(sr*n+1-n,sc*n+1-n);
Vang = zeros(sr,sc);
%% interpolate to degree n
n2 = 2^(n-1);
img = interp2(I,n-1);
Vx4 = 2^(n-1)*interp2(Vx,(n-1));
Vy4 = 2^(n-1)*interp2(Vy,(n-1));
Vmag4 = sqrt(Vx4.^2+Vy4.^2);
Vang = atan2(-Vy,Vx)*180/pi;
[sr4 sc4] = size(img);
clear Vx Vy;
RE = 10;
% remove edges
Iout = zeros(sr, sc);
h2 = zeros(size(I4));
t1 = clock;
for i = RE:sr-RE; % for each row
t1i = clock;
for j = RE:sc-RE; % for each column
fprintf(' . ');
% extract corresponding location in interpolated image
indi = n2*i-(n2-1);
indj = n2*j-(n2-1);
% determine length, angle parameters
len = Vmag4(indi,indj);
theta = Vang(i,j);
rlen = round(len);
%throw out invalid values; rlen = 0 or nan
if (isnan(rlen) || ~rlen)
Iout(i,j) = img(indi,indj);
else % valid values
PSF = fspecial('motion',rlen,theta); % create PSF
[p1 p2] = size(PSF);
if length(PSF)==l % do not deblur if PSF size = 1
Iout(i,j) = img(indi,indj);
else
K = fft2(PSF); % fourier transform
H = conj(K)./(K.*conj(K) + r); % regularization filter
h = ifft2(H); % convert to space domain
h2 = 0*h2;
% clear array
h2(1:p1,1:p2) = h(p1:-1:1,p2:-1:1); % flip over axes
ctrr = floor((p1+1)/2);
ctrc = floor((p2+1)/2);
h2 = circshift(h2, [indi-ctrr,indj-ctrc]); % shift array
Imult = img.*h2; % perform multiplication
Iout(i,j) = sum(Imult(:)); % perform sum
end % length(PSF)
end % rlen
end %j
t2i = clock;
fprintf('\nT(i=%d) = %g\n',i, etime(t2i,t1i));
end %i
clear t1i t2i;
t2 = clock;
% fprintf('\nT[min] = %g\n',etime(t2,t1)/60);
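For orientation, a hypothetical end-to-end invocation of the optical flow and deblurring functions might look as follows. This is a sketch only; the input sequence Iseq and all parameter values are illustrative assumptions rather than settings used in the thesis.

% Hypothetical usage sketch (illustrative values, not from the thesis).
% Iseq is assumed to be a rows x cols x frames double array already in memory.
f0   = 0.25;                        % assumed filter tuning frequency
beta = 0.8;                         % assumed frequency-coverage parameter
[I Vx Vy] = OF(Iseq, f0, 0, beta);  % phase-based optical flow, no interpolation
Iout = OFdeblur(I(:,:,end), 2, 0.01, Vx, Vy);  % deblur last frame with n = 2, r = 0.01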
Bibliography
[1] Ivar Austvoll. A study of the Yosemite sequence used as a test sequence for
estimation of optical flow. In Image Analysis, volume 3540 of Lecture Notes in
Computer Science, pages 659-668, 2005.
[2] J.L. Barron, D.J. Fleet, and S.S. Beauchemin. Performance of optical flow techniques. International Journal of Computer Vision, 12(1):43-77, 1994.
[3] Moshe Ben-Ezra and Shree K. Nayar. Motion-based motion deblurring. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 26(6):689-698, 2004.
[4] Michael J. Black. Frequently Asked Questions. Department of Computer Science,
Brown University, 2005. http://www.cs.brown.edu/people/black.
[5] Michael J. Black and Yaser Yacoob. Recognizing facial expressions in image sequences using local parameterized models of image motion. International Journal of Computer Vision, 25(1):23-48, 1997.
[6] Atanas Boev, Chavdar Kalchev, Atanas Gotchev, Tapio Saramäki, and Karen Egiazarian. Efficient motion estimation utilizing quadrature filters. unk, unk(unk):unk, unk.
[7] Jonathan W. Brandt. Improved accuracy in gradient-based optical flow estimation. International Journal of Computer Vision, 25(1):5-22, 1997.
[8] Andrés Bruhn, Joachim Weickert, and Christoph Schnörr. Combining the advantages of local and global optic flow methods. In Pattern Recognition: 24th DAGM Symposium, Zurich, Switzerland, Proceedings, volume 2449 of Lecture Notes in Computer Science, pages 454-462, 2002.
[9] Jesús Chamorro-Martínez, Javier Martínez-Baena, Elena Galán-Perales, and Belén Prados-Suárez. Dealing with multiple motions in optical flow estimation. In Pattern Recognition and Image Analysis, volume 3522 of Lecture Notes in Computer Science, pages 52-59, 2005.
[10] Tony F. Chan and Jianhong Shen. Image Processing and Analysis. Society for
Industrial and Applied Mathematics, 2005.
[11] Department of Computer Science, University of Otago, Dunedin, New Zealand. Computer Vision Research Group. http://www.cs.otago.ac.nz/research/vision/Resources/index.html.
[12] David J. Fleet. Measurement of Image Velocity. Kluwer Academic Publishers,
1992.
[13] David J. Fleet and Allan D. Jepson. Computation of component image velocity from local phase information. International Journal of Computer Vision,
5(1):77-104, 1990.
[14] David J. Fleet and Allan D. Jepson. Stability of phase information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(12):1253-1268, 1993.
[15] David J. Fleet and Yair Weiss. Optical flow estimation. unk, unk.
[16] fxguide.com. Art of Optical Flow, Feb 2006. http://www.fxguide.com/article333.html.
[17] B. Galvin, B. McCane, K. Novins, D. Mason, and S. Mills. Recovering motion
fields: An evaluation of eight optical flow algorithms. In British Machine Vision
Conference, 1998.
[18] Temujin Gautama and Marc M. Van Hulle. A phase-based approach to the
estimation of the optical flow field using spatial filtering. IEEE Transactions on
Neural Networks, 13(5):1127-1136, 2002.
[19] Hsueh-Ming Hang, Yung-Ming Chou, and Sheu-Chih Cheng. Motion estimation
for video coding standards. Journal of VLSI Signal Processing, 17:113-136, 1997.
[20] Fredric J. Harris. On the use of windows for harmonic analysis with the discrete
Fourier transform. Proceedings of the IEEE, 66(1):51-83, 1978.
[21] Berthold K.P. Horn and Brian G. Schunck. Determining optical flow. Artificial
Intelligence, 17:185-203, 1981.
[22] Jiří Jan. Medical Image Processing, Reconstruction and Restoration. Taylor &
Francis Group, 2006.
[23] Deepa Kundur and Dimitrios Hatzinakos. Blind image deconvolution. IEEE Signal Processing Magazine, 13(3):43-64, 1996.
[24] Jae S. Lim. Two-Dimensional Signal and Image Processing. Prentice Hall PTR,
1990.
[25] Alan V. Oppenheim and Alan S. Willsky. Signals & Systems. Prentice Hall, 2nd
edition, 1997.
[26] RE:Vision Effects, Inc. RE: Vision Effects founders to receive Academy Award,
Jan 2007. http://www.revisionfx.com/company/press-releases/01222007.