>> Rick Szeliski: Okay, good afternoon everyone. It’s my great
pleasure to welcome Raanan Fattal who is visiting us this week from the
Hebrew University of Jerusalem. Raanan actually got his PhD from
Hebrew University in 2006 and then he did a postdoc at Berkeley. And
he has been faculty there since 2008. He is a long-time collaborator
of some of us here. We had co-authored a paper on edge preserving
decompositions in 2008 and he has also been a collaborator of
[indiscernible]. And he is going to tell us today about Blur-Kernel
Estimation.
>> Raanan Fattal: Thank you Rick, thank you for inviting me to MSR and
to give this talk. So I will be talking about deblurring. It’s work
with Amit Goldstein, who used to be a PhD student of mine; he left for
industry right in the middle.
Okay, so as we all know, in many practical scenarios we hold the camera
in our hand or it is on some moving vehicle. We can’t keep it fixed
and the exposure time is not 0. So the camera sees the motion,
integrates it, and the resulting image is blurry, which is something
that we don’t like; details are missing and so on. So deblurring is
all about removing this blur.
So the typical model that people have used is the following: the
blurry image B is the convolution of the unblurred, sharp image with
some blur-kernel that basically tells how long the camera was exposed
at different offsets that were integrated, plus some pixel-independent
noise term N of x.
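As a rough illustration of this formation model, B = k convolved with I plus N, here is a minimal NumPy/SciPy sketch; the function name and the Gaussian noise choice are illustrative assumptions, not from the talk:

```python
import numpy as np
from scipy.signal import fftconvolve

def blur(I, k, noise_sigma=0.01, rng=None):
    """Synthesize B = k * I + N: convolve the sharp image I with the
    blur-kernel k and add pixel-independent Gaussian noise N(x)."""
    rng = np.random.default_rng() if rng is None else rng
    B = fftconvolve(I, k, mode="same")              # the camera integrating the motion
    return B + noise_sigma * rng.standard_normal(B.shape)
```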
Recently people have started asking, “What’s the scope of this model?
Is this an accurate model? Does this model explain every blur that you
may see?” So the answer is no. It is valid for small rotations of
the camera along the X and Y directions. It is not valid when there
is translation of the camera, because if the camera moves we
might see [indiscernible] effects, occlusions. And also it is not
valid for the cases where we have rotation of the camera around the
optical axis.
But I think that when people hold their camera in their hand these are
the two most significant angles that come into account, but then again.
There are two classic categories of image deblurring. In non-blind
deconvolution we know the blur-kernel, we have measured it somehow,
and all we have to do, and it’s not a simple task, is to
deconvolve the blurry image with the blur-kernel. And it’s not a
trivial task simply because there is noise and the model is not always
perfectly accurate. Blind image deconvolution says that we don’t have
the blur-kernel either, and we have to recover both the blur-kernel and
the sharp image. So now we have this under-determined problem; there
are more unknowns than equations.
This work, and actually a few others that I will mention, kind of opens
a different category that does not fall into either. So we are
estimating the blur-kernel, we don’t have it; nevertheless we are not
recovering the sharp image in the process, so it’s not a part of the
algorithm. So we get the blur-kernel out of this new algorithm and then
we need some non-blind deblurring in order to produce the final image.
So image deblurring, I think, is one of the most researched problems in
image processing; there has been tons of work. I think the most
dominant approach is the MAP approach, where the formation model of the
blurry image is transformed, using Bayes’ rule, into the prior times
[indiscernible] model over the latent sharp image and the blur-kernel.
In the first attempts people used Gaussian models for the images
or their derivatives, and later on people started using the total
variation [indiscernible], the L1. More recently there is the MAPk, where
there is integration across all the possible images. So people compute
the marginal distribution across all the possible images and they
optimize for the kernel in an independent manner.
This requires major integrations, I mean integrations in a very high
dimensional space. [indiscernible] and [indiscernible] developed an
alternative, a normalized metric, that tackles one of the problems that
was originally in the MAPk approach, and there was another acceleration
of the MAPk approach. A different type of method makes explicit use
of the fact that there are sharp edges in the original image and tries
to recover the blur from the deviation of the edges that we see in the
input blurry image.
So O’Neill has a work using the two color model to explain sharp edges.
Cho and Lee recover an image that behaves like a sharp image using some
inverse diffusion process and then look at the difference between
this sharpened image and the input image, which is supposed to give the
blur-kernel. Cho et al. do something quite similar where they detect
edges, look at their profile and use the [indiscernible] transform to
recover the profile itself.
There are works that try to bring hardware into the process. There
are different ways of playing with the shutter in order to introduce
some sharpness into the blur-kernel in the time integration, the coded
[indiscernible] that allows better spectral behavior and also the
recovery of the scale of the blur. And there is the acquisition of
parameters, basically to generate the blur-kernel from the motion, the
rotation, and other work of [indiscernible].
And then there are new methods that try to incorporate a more
detailed model that does not only account for a translational type of
blurring, but also for the rotation. These are relatively new works.
They typically use the MAP formulation to recover the 3 dimensional
blur-kernels once you add the additional rotation.
So this work kind of goes down a different path, and the motivation I
think comes from the simple visual observation that a blurry image
is in fact a superposition of a sharp image just moved around. So the
image kind of resembles itself at various offsets, because it’s a sum
of the same signal just translated. Then the question is: how do we
formalize that and find the similarities inside the image which would
indicate the blur?
So some theory: there is a well known power law that people use to
describe the falloff of the power spectrum of natural images; it is
supposed to decay roughly like one over the frequency to the power of 2.
So if we believe in this model we could take the following approach: we
could take the Laplacian filter, for example the 5 point Laplacian
filter, and compute a filter which is the square root of that filter,
meaning that D is a filter such that if I convolve D with itself I
obtain the Laplacian filter. There are actually many such
decompositions, but there is only one that is symmetric, and since L is
symmetric such a decomposition should exist.
So D is a filter that, if I convolve it with itself, gives the
usual 5 point Laplacian or any other type of Laplacian. So let’s
see what we can do with such a filter. I convolve the image
with this filter and look at the power spectrum of this signal. In
Fourier space we have the Convolution Theorem that separates it into
the image times the differentiation filter squared. The
differentiation filter squared is actually the Laplacian; it has the
same spatial response as the Laplacian, and we know that Laplacians
behave like omega squared. Every differentiation adds a multiplication
by omega in Fourier space, meaning that this filter actually
whitens the spectrum of the image.
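A small sketch of this whitening step, done circularly in Fourier space; rather than forming the spatial filter D explicitly, it uses the equivalent fact that |D_hat|^2 equals the magnitude of the Laplacian’s spectrum (an implementation choice of mine, not necessarily how the paper builds D):

```python
import numpy as np

def whiten(img):
    """Apply a whitening filter D whose self-correlation is the 5-point
    Laplacian, i.e. |D_hat|^2 = |L_hat|, circularly via the FFT."""
    h, w = img.shape
    L = np.zeros((h, w))
    L[0, 0] = -4.0
    L[0, 1] = L[1, 0] = L[0, -1] = L[-1, 0] = 1.0   # 5-point Laplacian stencil
    L_hat = np.fft.fft2(L).real                      # real, since the stencil is symmetric
    D_hat = np.sqrt(np.abs(L_hat))                   # symmetric square root of the Laplacian
    return np.real(np.fft.ifft2(np.fft.fft2(img) * D_hat))
```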
Okay. So in the case of a blurry image we could just convolve with this
whitening filter and get this identity. I think we are missing one
absolute value part here. And we know that those two cancel one
another, I am sorry, this one cancels this one, so we remain
with the power spectrum of the blur-kernel, which is like half way, right.
We want to know the blur-kernel; here is its power spectrum. We are
missing the phase; we could use some phase recovery method, and there
are quite a few of them. So this is the general direction that we
will follow, except that there are some complications that we will
have to deal with first.
So another useful identity that we will use is the Wiener-Khinchin
Theorem, which relates the Fourier transform of the autocorrelation of
a function to the power spectrum of that signal. Okay, so the
autocorrelation is actually the convolution of the signal with a
mirrored version of itself, which in Fourier space is a point-wise
multiplication between the signal and its conjugate. So we get the
absolute value squared in Fourier space; it’s a very useful relation.
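The relation is easy to check numerically; a minimal 1D sketch (using the circular autocorrelation, which is the convenient form for the FFT):

```python
import numpy as np

def autocorrelation(f):
    """Circular autocorrelation of f: f convolved with a mirrored copy of itself."""
    F = np.fft.fft(f)
    return np.real(np.fft.ifft(F * np.conj(F)))

# Wiener-Khinchin: the Fourier transform of the autocorrelation is the power spectrum |F|^2.
f = np.random.randn(256)
assert np.allclose(np.fft.fft(autocorrelation(f)).real, np.abs(np.fft.fft(f)) ** 2)
```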
So this relation basically introduces a spatial counterpart to what we
have seen before, right. If we know that in Fourier space we are
expecting to get a white spectrum for the image convolved with our
whitening filter, that means that the autocorrelation of the image
convolved with that filter in real space is supposed to give us the
delta function. And similarly, if we are looking at the blurry image,
which gave the power spectrum of the blur-kernel, we are now supposed
to obtain in real space the autocorrelation of the blur-kernel. Okay,
we can see those [inaudible].
Anyway, here we see the power spectrum of the whitened image. So it’s
I convolved with D, and this is the autocorrelation in real space. What
we see here is that guy up there. So this is whitening the image
with the Laplacian and this is the autocorrelation that we get in
space. So some people have the impression that every differentiation
filter is supposed to whiten a natural image, but it’s not the case.
I think that you should be able to agree that this is more constant
than this is. It’s not constant, although we expect it to be, and we
will talk about that too, but it’s more constant than this. I mean
both of these images seem to decrease as we go to the origin.
So there was work by [indiscernible] that actually whitened the
spectrum of the image using the Laplacian rather than the square root
of the Laplacian, and they had very severe biases in their recovery of
the blur-kernel.
>> So the spectrum is for a generic image, this is what you are
showing?
>> Raanan Fattal: It’s the power spectrum of some image, yes, some
natural image. Actually those are two of them, we will see them, but
yes; I should have shown the image. By the way, in one dimensional
space, so if you assume that the blur is 1 dimensional, you can
basically do the same, but the differentiation can be done along that
particular direction. So you could scan all the directions, check along
which direction the image seems to be most blurry, take that direction
and use a differentiation filter to deblur the image.
And actually in previous work by [indiscernible] they used the correct
filter, so it was first order differentiation, and they should have
achieved the [indiscernible] whitening, although there is this issue
that phase recovery is harder for 1D signals than 2D signals; it’s not
unique. In 2D it’s thought to be unique. It has to do with the
factorization of multivariate polynomials and there are some open
questions in algebra, but the assumption is that it’s unique in 2D
and not unique in 1D. I mean, that can be shown.
Okay. So as we have seen, there is a “but” here: this model predicts an
isotropic behavior of the power spectrum of natural images. There is
no dependence on the angle, but we know that the edge content in
images does influence the power spectrum. So those are the two
images that we have seen earlier. These are the power spectra of
these images, not the whitened ones. And we see that these
functions are very far from being isotropic.
>> How accurate is this exponent of 2, like if you take a million
images and calculate [inaudible]? Is it [inaudible]?
>> Raanan Fattal: So, well, maybe on a larger sample. Well, in manmade
scenes you would always expect to see these vertical and horizontal
ones, so I would expect some biases even in larger samples. But we
don’t have that privilege anyway.
What we did is we came up with a more refined power law, and the idea
was to kind of look at the profile of these functions at various
orientations. And this is what we get on a log scale. And we see that
we are actually getting these linear functions and they just seem to be
shifted by constants, meaning that there is some multiplicative factor
that is different along each direction, which explains why this model
was used so frequently. So there is some small deviation from
that. And we use this model of the power spectrum of natural images
for the recovery of the blur-kernel.
So we assume that there is some scale C that depends on the angle theta
here, which we can recover from the coordinates of the Fourier mode;
it’s the inverse tangent. Okay. So given that observation we
might need this tool, so let’s look at it. The Fourier Slice Theorem
gives us this relation. It’s a relation between the Fourier transform
of a projection of a 2D signal onto a 1D line and a line in Fourier
space. So what we have here is we pick some direction and we project
the image just by integrating along this direction. This gives us a 1D
signal. Now we compute the Fourier transform of this 1D signal that was
integrated along the direction defined by theta. And the identity
tells us that this equals a slice, just these values in Fourier space.
This can be derived by thinking about the projection, or integration
step, as if we were convolving the image with a function that is
a delta function on one axis and a constant function on the other axis.
Then we are looking at a 1D line in Fourier space, because the others
are just constant so there is no content there, and this is what we get.
So what we have here is a parameterization of a line in Fourier space
through an orientation vector and a scalar. So omega is a 2 dimensional
coordinate in Fourier space and C is a 1 dimensional coordinate along
the slice in Fourier space.
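The theorem is also easy to verify numerically for the axis-aligned case (theta = 0); for a general theta one would rotate the image first. A minimal check:

```python
import numpy as np

# Fourier Slice Theorem, checked for theta = 0: sum (project) the image along
# one axis, take a 1D FFT, and compare with the corresponding line of the 2D FFT.
img = np.random.rand(128, 128)
projection = img.sum(axis=0)                 # integrate along one direction -> 1D signal
assert np.allclose(np.fft.fft(projection), np.fft.fft2(img)[0, :])
```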
So applying this relation gives us the following final relations. So
this is the Wiener-Khinchin relation and now we have this relation, the
Fourier Slice Theorem. So at the end of the day we have a relation
between the Fourier transform of the autocorrelation of the projected
image and a slice in Fourier space. And we are kind of decoupling the
C of theta variables this way.
So let’s just apply this same relation to the blurry image and we get
basically this relation. So we take the image and, I am sorry, we
first project it to 1D along some angle, we differentiate it in 1D, we
compute the autocorrelation and then go to Fourier space. Then we are
getting a slice of the power spectrum of the kernel in Fourier space
times some unknown constant. But those constants are decoupled from one
another: I am working with different projection orientations, and in
each one I am getting this unknown number C of theta that I do not know
and have to recover.
So this has a real-space counterpart: here we are just projecting the
image, whitening it, computing the autocorrelation, and we are
expecting to see the autocorrelation of the projected kernel times some
scalar.
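A rough sketch of this per-orientation computation; it works circularly in Fourier space for brevity, whereas the talk stresses that in practice the autocorrelation is computed in real space to handle non-cyclic boundaries, and the use of scipy.ndimage.rotate for the projection is my assumption:

```python
import numpy as np
from scipy.ndimage import rotate

def f_theta(blurry, theta_deg):
    """Project the blurry image along direction theta, whiten the 1D projection,
    and return its autocorrelation: the f_theta of the talk, known only up to
    the scale c_theta and the additive offset m_theta."""
    p = rotate(blurry, theta_deg, reshape=True, order=1).sum(axis=0)  # 1D projection
    omega = 2.0 * np.pi * np.fft.fftfreq(len(p))
    lap_hat = np.abs(2.0 * np.cos(omega) - 2.0)   # |d_hat|^2 for the 1D whitening filter d
    P = np.fft.fft(p)
    # Wiener-Khinchin: autocorrelation of the whitened projection, back in real space.
    return np.real(np.fft.ifft(np.abs(P) ** 2 * lap_hat))
```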
So the algorithm: how do we use these relations to recover the
blur-kernel? I guess you can guess it from the relations that we have
seen. So we take the image, we project it, whiten it, and compute the
autocorrelation to obtain these functions F of theta. So for each
projection orientation we have F of theta. So F of theta gives us
almost what we need, which is the autocorrelation of the blur-kernel
along one direction, the projected one, or slices in Fourier space, but
there are two numbers that we do not know.
So there is the C of theta that we don’t know, and we will discuss it,
and there is also another quantity which is a scalar, an additive
scalar that we do not know. And that results from the fact that D is a
differentiation filter, so it loses the DC information of the signal,
and that remains the case also in the autocorrelation of the function.
So there is this uncertainty between F of theta and the true
autocorrelation of that function.
Okay. So to account for these we have some physical priors on the
kernel. One thing is that we expect the blur-kernel to be positive
valued, and it has to have limited support. So it’s not infinite, we
can bound it, and we actually assume that the user is giving us a
bound. And it has a constant sum; we can assume that it sums to 1. So
we show that one can reduce both these unknowns, using the constant
sum, to a different unknown which describes the extent of the support
of the kernel in each direction.
Okay, because if I know the extent of the kernel, and I know that
beyond that point the kernel should be 0, I know how to offset it and
recover M of theta. And I know that it needs to sum to 1, so I know how
to recover C of theta. So if I get F of theta and I have these two
unknowns, plus this assumption about the extent of the kernel, I am
supposed to obtain both M of theta and C of theta.
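A rough sketch of how those two priors might pin down the two unknowns for one orientation, following the reasoning above; this is my reading of the talk, not the paper’s exact procedure, and `half_support` stands for the assumed extent S of theta of the projected kernel:

```python
import numpy as np

def normalize_f_theta(f, half_support):
    """Fix the additive offset m_theta and the scale c_theta of a measured,
    circular autocorrelation f (lag 0 stored in f[0]), assuming the projected
    kernel vanishes beyond half_support and that the kernel sums to 1 (so its
    autocorrelation also sums to 1)."""
    g = np.fft.fftshift(f).astype(float)       # move lag 0 to the center
    center = len(g) // 2
    m = g[center + 2 * half_support]           # here the autocorrelation must already be 0
    g = g - m                                  # remove the additive offset
    lags = np.abs(np.arange(len(g)) - center)
    g[lags > 2 * half_support] = 0.0           # enforce the limited support
    return g / g.sum()                         # rescale so the implied kernel sums to 1
```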
So the algorithm basically recovers the blur-kernel, its extent and its
support. These are the three variables, and it does that from the F of
theta that we compute. So we get the image, we start projecting it at
different orientations. I will summarize that: we differentiate in
1D, we compute the 1 dimensional autocorrelation and we get these F of
theta functions for every theta. So this is done once. Now we are
recovering the blur-kernel and its support, basically iteratively,
using the following iteration, but this iteration is done only at the
scale of the blur-kernel, not the dimension of the image. The dimension
of the image only affects the computation of these one dimensional
functions.
So we start with some initial guess for S of theta, some conservative
initial guess, which is the point where the autocorrelation that we
computed, basically F of theta, takes its minimal value. And we
estimate the kernel using the assumed support variables. So we get F of
theta and, as we said, we can recover C of theta, M of theta and obtain
the power spectrum of the kernel. And now we can use some phase
retrieval algorithm, I will talk a little about it, to recover K. And
then, given K, we re-estimate the extent of the support of the kernel
along each orientation by projecting it, computing the autocorrelation
and considering the point where the autocorrelation has decayed enough.
And we repeat these steps iteratively.
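Put together, the outer loop might look roughly like this; `f_theta`, `normalize_f_theta`, `assemble_power_spectrum`, `phase_retrieval`, `initial_support_guess` and `estimate_support` are placeholders for the steps described in the talk, not the paper’s actual routines:

```python
def estimate_kernel(blurry, thetas, iters=10):
    """Alternate between (1) recovering the kernel by phase retrieval from the
    normalized per-orientation autocorrelations and (2) re-estimating the
    support of the kernel along each orientation from the recovered kernel."""
    f = {t: f_theta(blurry, t) for t in thetas}              # done once, on the full image
    support = {t: initial_support_guess(f[t]) for t in thetas}
    for _ in range(iters):                                    # cheap: kernel-sized work only
        spectrum = assemble_power_spectrum({t: normalize_f_theta(f[t], support[t])
                                            for t in thetas})
        k = phase_retrieval(spectrum, support)                # step 1: recover the kernel
        support = {t: estimate_support(k, t) for t in thetas} # step 2: update the extents
    return k
```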
So here we can see an example. So we start with some initial guess
for the support of the kernel. So this is theta, the angle, and
this is the extent of the kernel that we start with. And we run
these iterations, step 1 and step 2, and we are recovering both the
kernel and converging to the true support, which is in blue. So
this is an actual synthetic example where we have blurred the image
with a kernel which we knew. So there is convergence both of the
support and of the kernel itself.
And apparently phase retrieval algorithms really benefit from the most
restricted prediction that you have of the support of the kernel that
you are trying to recover. So every point where you can say it should
be 0 is a great help for the [indiscernible] algorithm to converge.
>> So you have the sum equal to one constraint. Is it for all angles?
>> Raanan Fattal: All angles, yes.
>> What [inaudible]? The projection of the kernel over [inaudible]?
>> Raanan Fattal: So the sum of the 2 dimensional kernel, if you assume
that this is 1, you are basically assuming that the [indiscernible]
transform of the kernel at 0 is 1.
>> Okay, yeah, right.
>> Raanan Fattal: So the autocorrelation, which is the square of 1, is
1, and also its projected versions. So basically when you project a
kernel to 1D you maintain its sum, right.
>> So what happens if it’s multi-modal in some direction?
>> Raanan Fattal: It could be and it’s actually what you see, this one
mode here and another here.
>> So if you are [inaudible]?
>> Raanan Fattal: So you might have here, yeah, large value in the
kernel; large value here, small and another large value. It might be
the case; there is no limitation here.
>>: [inaudible].
>> Raanan Fattal: Yes, it would be [indiscernible] over here. One lobe,
another lobe, I mean there is no assumption of [inaudible] modality
here.
So how do we recover the phase? We want the kernel, but we can recover
many kernels K, all of which have the correct power spectrum. So given
the power spectrum we can recover all of these kernels and they only
differ by their phase content, right. So, well, there are infinite
solutions; as we can see here is the parameterization of these. So in
one dimensional space one can show that there is a finite number of
solutions when we include these constraints.
And as I said, it’s thought that in 2D space and higher there is only
one unique solution, but the problem is that all the phase retrieval
algorithms out there are not guaranteed to converge to that right
solution, and it’s very jumpy. So they do converge to different
solutions. So it’s not the matter of uniqueness that is the issue, but
the ability to converge to the right phase, especially when you have
noise and you can’t measure it; you never know whether you are there
or not.
So the typical phase recovery that people use is an algorithm from the
70s. It does a very simple iteration. So we start with some given power
spectrum, it’s F. And what we do is we start with some initial guess
of the function in real space, the kernel in real space. We compute
its [indiscernible] transform, and now we enforce the constraints in
Fourier space, which basically tells us to change its magnitudes to the
given magnitudes, but we maintain the same phase. Then we go back to
real space and now we apply the spatial constraints, which are the
positivity and the limited support. We just start setting values to 0
and clamp them below 1 if needed, or maintain some normalization. And
we repeat this process again. This is a very old strategy, but it is
still the state of the art.
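The iteration he is describing is essentially the classic error-reduction scheme (Gerchberg-Saxton/Fienup style); a minimal sketch, assuming we are given the target Fourier magnitude and a boolean support mask:

```python
import numpy as np

def error_reduction(mag, support_mask, iters=500, rng=None):
    """Recover a non-negative kernel from its Fourier magnitude by alternating
    between the Fourier constraint (impose the measured magnitude, keep the
    current phase) and the spatial constraints (positivity, limited support,
    constant sum)."""
    rng = np.random.default_rng() if rng is None else rng
    k = rng.random(mag.shape) * support_mask        # random initial guess
    for _ in range(iters):
        K = np.fft.fft2(k)
        K = mag * np.exp(1j * np.angle(K))          # keep the phase, fix the magnitude
        k = np.real(np.fft.ifft2(K))
        k = np.clip(k, 0.0, None) * support_mask    # positivity and limited support
        if k.sum() > 0:
            k /= k.sum()                            # the kernel sums to 1
    return k
```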
So let’s summarize the algorithm. So I am given a blurry image; at
first I am computing these F of theta functions. So I am taking that
image, I am rotating it, projecting it by just summing the pixels along
one direction and then I am whitening it. So I am differentiating it
in one dimension and computing the autocorrelation, and we see these 1D
functions stacked here. So this is the [indiscernible] coordinate and
these are the slices, okay. So each one of them is multiplied by some
arbitrary C of theta that I need to recover and there is also the
additive M of theta. This is why we see these discontinuities and so
on, right.
And now comes the iterative phase where we are trying to estimate M of
theta and C of theta, but we are doing it actually indirectly through
the S of theta, the extent of the blur-kernel, which we are updating.
And what we are doing here is basically the following iteration. So we
are assuming that we know the extent of the kernel; it’s not M of theta
and C of theta, it’s S of theta, the extent. And given those we
normalize each of these 1 dimensional functions. If we assume that we
have S of theta, we know M of theta and C of theta; so we can normalize
them. This is the corrected image, so it looks much nicer than this
one. So this image differs from that one just by adjusting each of the
rows here by those scalars.
Each line here corresponds to some slice in Fourier space, which we
just copy, and these give us the power spectrum of the blur-kernel. Now
we use phase retrieval, an iterative phase retrieval, to recover the
kernel from this power spectrum. Then we re-estimate the S of theta,
the compactness, the size of the kernel along each direction. We go
back, we re-normalize, and do this repeatedly until we converge.
So one note about the noise: in the original blur model we have the
convolution plus the noise. So the noise is actually being reduced by
this projection operation, because we have independent noise in each
pixel, and by transferring from the blurry image into these F of theta
functions we are integrating along one direction of the image. And if
we have N pixels then we are reducing the noise by 1 over square root
of N.
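The 1 over square root of N claim is just the usual averaging argument; a two-line check, purely illustrative:

```python
import numpy as np

# Summing N independent noise values grows their standard deviation only by sqrt(N),
# while the projected signal grows by N, so the relative noise level drops by 1/sqrt(N).
noise = np.random.default_rng(0).standard_normal((1000, 1000))
print(noise.std(), noise.sum(axis=0).std() / 1000)   # ~1.0 versus ~1/sqrt(1000) ≈ 0.03
```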
>> [inaudible]. So now that you have a [inaudible] of the image, is
there any value [inaudible]?
>> Raanan Fattal: Of the kernel?
>> [inaudible]?
>> Raanan Fattal: So in this process we never recover the image.
>> Okay, well, you can’t, I mean, okay, so fine, all right.
>> Raanan Fattal: But, I can use some non-blind deconvolution to
recover the image; it doesn’t help me.
>> Okay, you could do that and then you could get a much more accurate
estimate of the power spectrum of the image and then you could repeat
the whole process by modifying your whitening. Would there be any
value in that?
>> Raanan Fattal: You mean C of theta?
>> Not necessarily, maybe it’s not exactly a power law, maybe it’s
something slightly different that you could actually measure what it
was and then go through your whole process again.
>> Raanan Fattal: Right, we never tried that. We never tried that; the
most immediate thing, I think, is actually recovering C of theta. That
sounds like the immediate thing I would have tried. No, we haven’t
tried that; it’s interesting. We were actually in this mind set where
we said, “Okay, no playing with the image”. We wanted to go from this
high dimensional image to these tiny kernels and just work there as
much as we can and that’s it.
>> So what do you do when you rotate the image, because each projection
has a slightly different size and the Fourier transform [inaudible]?
>> Raanan Fattal: True, right, so that is basically the reason why we
haven’t --. Well I kept on mentioning the spatial counterparts. I
didn’t really need them; I could have done everything in Fourier space,
but the problem is that the image is not cyclic so in Fourier we start
seeing these biases. So that’s what made us compute the
autocorrelations in real space where there are a lot of very reasonable
ways of dealing with the boundaries.
So what you are saying is that the autocorrelation that I can compute
here is actually narrow and as I am rotating I have a larger one. But
the thing is that we need F of theta just at the size of 2 times the
width of the blur-kernel. So we just don’t have that problem.
Okay. So input image, if we could dim the light? Oh, we don’t do it
because of the recording, right?
>> Rick Szeliski: It’s okay when you say dim the lights the person in
the back does this.
>> Raanan Fattal: I see, okay.
[laughter]
>> Raanan Fattal: So input image/output image. The blink is supposed
to blind you. Input image/output image, comparison with other methods.
Input/output.
>> I am confused, why do you do the blinding?
>> Raanan Fattal: I did not prepare these slides and I forgot to remove
it. I know it’s very annoying, it’s my student, he, yeah.
Here is a comparison again with other methods. So I think that in terms
of quality we are kind of meeting the state of the art, which is
somewhere between Levin and Cho and Lee. Input/output/comparison.
Input/output. It should be a comparison of that; input/output.
So the nice thing is that this power law, 1 over omega squared, is
actually the result of the presence of step edges in the image. I mean
you could do this exercise: just take a Heaviside function, compute
its [indiscernible] transform, see that it’s 1 over omega, and the
square of that would be 1 over omega squared.
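This exercise can also be done numerically; a small sketch (restricting to low frequencies is my choice, to stay away from the distortions that the periodic boundary introduces at high frequencies):

```python
import numpy as np

n = 4096
step = np.zeros(n)
step[n // 2:] = 1.0                          # an ideal step ("Heaviside") edge
k = np.arange(1, n // 16, 2)                 # low odd frequencies (even ones vanish for this step)
power = np.abs(np.fft.rfft(step))[k] ** 2
slope = np.polyfit(np.log(k), np.log(power), 1)[0]
print(round(slope, 2))                       # comes out close to -2: power falls like 1/omega^2
```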
The thing is that apparently [indiscernible] also gives this spectrum.
And it actually gives the spectrum even from small chunks of the image.
So we have this advantage over methods that search for well resolved
edges and do this local fitting and so on, where [indiscernible]
actually reveals, for us, I mean it gives the whitening, the
[indiscernible] of 1 over omega squared, which we can use for whitening
and recovering the blur-kernel. So very cluttered images, like desert
or foliage, actually play into our hands.
So output --.
>> Can you use a small enough spatial support that you can get a nicely
spatially adapted kernel or something?
>> Raanan Fattal: I think we played with that just a bit. So if the
image is large enough we could just run the algorithm here, here, here
and recover these different blur-kernels, but then you need some
generalization of this model for the entire image. But at this point
we decided: why should we do it with our algorithm? I mean many
algorithms can be applied at different points in space and then recover
the 3D blur-kernel somehow. But it’s an interesting direction; I
don’t think I have seen someone doing it. So it’s kind of surprising.
>> [inaudible].
>> Raanan Fattal: Yes.
>> [inaudible].
>> Raanan Fattal: I don’t remember the details, but what they were
suggesting is to apply a translation-invariant algorithm in this part,
and this part and this part, just sample them. So you get different 2
dimensional blur-kernels. Now you have to search for the 3 dimensional
blur-kernel, such that if you would kind of like project it --. So it’s
not what they have been doing. They have been working with a 3D --.
>> It came from rotation. You are saying everything is from
translation, but if you sort of added a rotation, let’s say around the
center for example [inaudible].
>> Raanan Fattal: Right, so this is what I am calling the 3 dimensional
blur-kernel, because it has another axis of rotation. And then given a
few 2 dimensional blur-kernels, could you estimate the 3 dimensional
one? It’s an interesting question.
>> They have a [indiscernible] paper and [indiscernible] have a similar
paper that really do that. They take these 2D estimates as a bootstrap
for trying [inaudible].
>> Raanan Fattal: I see, right. But that wasn’t some global
optimization over the 3 dimensional kernel?
>> [inaudible].
>> Raanan Fattal: Okay, some numbers.
So these numbers are for a 1 megapixel blurry image. So it takes us
about 2 seconds to compute all the F of theta and project the image
onto the different orientations. Then 12 seconds is spent on the
phase retrieval in the outer iteration; there are lots of iterations
there. We are actually starting with 30 initial guesses for the phase
retrieval. The phase retrieval is an issue; nevertheless this part is
supposed to scale with the blur-kernel size and this part is supposed
to scale with the image size.
So there could be cases where it might be better or worse. Both of them
are really trivially parallelizable, because projecting the image onto
different directions can be done on different CPUs. And the phase
retrieval, when you start from different initial guesses and just start
iterating, is just this generic optimization algorithm which can be
done on multiple processors.
We also ran a quantitative comparison. So we used a data set of 8
images, from nature images to manmade images. It’s not the images --.
So we used 9 kernels from the data set that Levin built. We did not
use her images, because her images are 256 by 256 pixels, which is too
small for our algorithm to work, because our algorithm basically
estimates statistics. So the smallest images on which we have been
successful enough were 512 squared. So her data set was just too
small for us.
And then we used her metric, which is basically asking: what is the
deconvolved image that you get from your kernel minus the true sharp
image, versus the deconvolved image that you get from the true kernel
minus the true image --.
>> Which deconvolution technique?
>> Raanan Fattal: We used Dilip’s Hyper-Laplacian and then we look at
the fraction of the images that is below epsilon if I remember
correctly.
So this is what we obtained. So those are images you are not familiar
with; we used them, but we did use the same blur patterns that she
used. And these are the results that we get.
So we are on par with most of the state of the art in [indiscernible]
of this measure. And we are not doing that great here, but okay. One
failure case is --. So what happens when we have a repetitive pattern
in the input image itself, intrinsically? So this image is kind of
similar to a shifted version of itself, right.
So here is a blurry image, this is our result; it’s very
[indiscernible]. So I don’t have the sharp version of this image
because I just found it on the web, but I did search for another sharp
image of the same bridge and I computed the autocorrelation of that
function.
So that should have been a delta function, and look what we get. We get
something of quite a different form, just because we have these
periods, consistent periods. There is this distance such that if you
move by it you get a peak in the autocorrelation function. By the way,
if these were large enough and the blur-kernel was smaller, then we
might have had a small region in which it was sufficient for us to
recover the blur-kernel, but in this case it’s relatively small and
this is a classic scenario where we cannot expect our method to work.
>> Is your orientation, the kernel orientation -- if you sample the
orientation [inaudible], then can you deconvolve that as well?
>> Raanan Fattal: No, it won’t help me to [indiscernible] this.
>> Because each light had a little bit different angle.
>> Raanan Fattal: Right, but still at that particular --. Well, I think
that the autocorrelation --. So this is supposed to give me an
autocorrelation along this direction, right. If I move the image in
this direction I am supposed to see these additional peaks. But I
think that even if I move a little bit in the orientation I would still
see --. So it’s not like a [inaudible] of high precision; the
correlation that you see is substantial. I think it is not a very fine
line, that’s my guess. Unless the image is huge, but I don’t think in
practice that would help much.
>> I am sorry to ask this question, but what assumption is it about the
image that says that its whitened autocorrelation [inaudible]?
I mean I guess this is [inaudible].
>> Raanan Fattal: Yes, so if you think about derivative images, you
have these peaks somewhere, right. Now you want to say that
statistically, if I saw a peak, which was basically an edge, there is
no fixed distance that I could move and at that particular fixed
distance I would see another edge. So we have edges in images, but just
2 edges --. There is no reason for 2 edges to always be at this
particular distance from one another, but here it happens.
>> And what if one just ignored that direction and computed the kernel?
>> Raanan Fattal: Yeah, so I might have a hole in the kernel, but then
again I don’t know how to do it [inaudible], because I don’t know that
that was the direction. I mean I don’t know if it was a blur-kernel
that did it; I don’t know if it’s the real image. There is this major
ambiguity that I don’t --. I mean if you tell me this direction is bad
then maybe I could do something, but I could not do it --. It might be
the result of a blur-kernel; a real blur-kernel could have delta,
delta, delta that it produces.
>> Yeah, I mean, not looking at total image, but just looking at 6 of
these stripes it could have shown why your blur-kernel [inaudible].
>> Raanan Fattal: Yes, exactly.
>> So it’s interesting that the tree regions don’t seem to have that
problem, because they are sort of periodic.
>> Raanan Fattal: They are very similar in your eyes.
>> They are periodic, but maybe not as strictly periodic.
>> Raanan Fattal: Yes, they might have dominant scales, but they have
enough variation around them. Yeah, I know it’s kind of surprising, you
might think that there is a very particular scale that would play here,
but if you look here it’s kind of different than there. Then you’re
integrating across the entire image, and then there is here, so somehow
they, yeah.
So let me summarize. It’s an easily parallelizable algorithm and the
image size only affects one component of the algorithm, the projection
step. So the larger the image is, the lower the noise is, but I am
guessing that many other algorithms would enjoy that. So we are not
repeatedly recovering the latent image as the MAP algorithms do, and
therefore we are much faster than them. And we are not relying on our
ability to detect specific edges. So for example a multi-modal blur, as
Dilip mentioned, it’s always hard to say, “Where was the edge”. So we
don’t have that issue.
But then this model, and all the [indiscernible] transforms that we
have been using so much, requires the blur to be a convolution. And
also we cannot operate on small images, just because the statistics
have not converged, and definitely not spatially varying blur; that’s
out of the question.
Okay.
That’s it; thank you.
[clapping]
>> Raanan Fattal: Questions?
Yes.
>> Sorry I am stuck on the image of the bridge. So suppose you ran it
through your algorithm [inaudible]. You get some blur-kernel that
would be a more succinct explanation of that image. You could find
some [inaudible] bridge image that didn’t have the repeated
[inaudible].
>> Well that’s what the middle image is, right. It basically
hallucinates repetitions of --.
>> Raanan Fattal: This is the whitened one.
>> The middle image is the result of your technique, right?
>> Raanan Fattal: Yes.
>> So what it does is it hallucinates repetitive structure here in
order to make these things sharper, right? Well maybe not sharper, but
somehow more statistically right.
>> Raanan Fattal: Exactly, so if I would differentiate this image and
look at the autocorrelation that would be white.
>> So this image has a much higher total variation probably, but if one
tried to minimize that then --.
>> Raanan Fattal: Now it’s a fight between a very long kernel and the
natural image prior. And actually this algorithm for recovery has
that, right.
>> Right.
>> And then is it a property of this image or of the algorithm? Like I
am trying to think, would it be a puzzle for other [inaudible]
algorithms too or would they be able to find it?
>> It’s how many checks and balances you put in, right. Like for
example the edge-based technique, which I am most familiar with.
>> Raanan Fattal: They won’t suffer from it.
>> As long as, you know, the top edges of the pylon are focused on, we
would get the right thing. If it happened to zoom in on the strings and
the search area was big enough that it actually found the string, it
would be very convinced by that. But of course if you put in enough
checks and balances and say, “It’s just an isolated thing on this image
region”, you know, if you had some sort of a RANSAC-like thing that
maybe said, “Sometimes there is real repetition so we will do that”.
In other words, the more you look at regions where there is a single
step edge and nothing else -- because one of the things that struck me
looking at your earlier examples is that with our technique we would
have a harder time with the tulips, where the white tulip is over the
background of a very green textured thing, because it really zooms in
locally on color. Here likely the sky and the pylons are very good, but
where there was that faint little echo of a white tulip against green
textured leaves our technique would be struggling.
Yeah, there is no free lunch. The best thing you can do is hope that
there is a glitch somewhere.
[laughter]
>> Raanan Fattal: Yeah.
>> And that still isn’t even that easy because your glitch is usually
saturated.
>> Raanan Fattal: Right, right.
>> Rick Szeliski: Okay, thank you.
>> Raanan Fattal: Okay, thank you.