
>> Larry Zitnick: All right. It's my pleasure to introduce Daniel Freedman. He graduated from
Harvard with his PhD a while ago, and after that, went on to be a tenured professor at
Rensselaer followed by a stint at multiple industrial research places such as HP and IBM and
finally ended up at MSR in Israel. And some of your earlier work was on active contours and
segmentation I believe. And today he's going to be talking about a lot of his more recent work.
So looking forward to it.
>> Daniel Freedman: Thank you Larry. So, yeah. So I gave this sort of whimsical title. I decided
that it would be more interesting to stick a whole bunch of things together rather than talk
about one thing just to keep you interested. So to start, I want to start with a Lemma. If we
have a simplicial complex K and we, I’m just joking. We’re not going to start with that. That
Lemma actually appeared in a paper of mine from about three years ago, and I will talk about
something related to that a little later, but hopefully in a way that's a little more fun than that.
Okay. So the idea is just to give a high-level overview of a few different topics I’ve worked on in the
last three-ish years. Hopefully you'll stay awake and maybe even find it fun.
Okay. So just a little bit quickly about me. Larry already mentioned, but I joined MSR, the ATL
in Israel, as a Senior Researcher just about a year ago. Before that I did, I was a physics
[inaudible] undergraduate at Princeton, and then I finished my PhD at Harvard in 2000. I was a
Professor of Computer Science at RPI for nine years, and at some point in the middle got
tenure. I came to Israel on a Fulbright. I was a visiting professor of Applied Math at Weizmann
Institute and, well, this is Troy in the winter where I used to live and that’s Zikhron Ya’aqov also
in the winter where I live now. So we were happy there, and so we, I bounced around a few
different places in Israel, but now I'm at MSR and it's been fun.
Let me tell you a little bit about our group, which is currently called the OneVision Group. So
basically the ATL is divided between two sites, in Haifa and Herzliya; our group in Haifa is
the OneVision Group. We try to do computer vision research which is on the boundary with
machine learning, generally, although not always. The idea is that it is often useful if it relates
to Microsoft business. And our main focus recently has been on things like 3-D sensing and
object recognition. I'll talk a little bit later about some of the 3-D sensing stuff, but
unfortunately, like I was telling Larry, there's some NDA sort of things, so I'll just give a kind of
high-level overview of that.
So general things I want to talk about: so the first part of the talk I want to talk about things
that I did that are related to image editing. And this was actually an interesting departure for
me because I didn't, until that point I was spending my time doing two things basically: vision
and computational geometry slash topology, and this is stuff that's on the boundary with
graphics. It's not real graphics. It's more, sort of, but it's as far as I went into that world. And it
was fun because we could actually draw pictures of things and it looks, the results often look
nicer than the vision results.
Second thing is algebraic topology. Obviously, I’m not a topologist, but I play one on TV. I'm
interested in how to use this in problems that are of interest to us. We did do some work
which I won't talk about where that theorem comes from, which was actually like more in the
CS theory sort of boundary with this kind of math stuff, so we published in [inaudible] and
Discrete and Computational Geometry and places like that. But what I'm interested in talking
about today is how you can use these tools in computer vision, in particular,
homology groups; and hopefully, it's like a bit of a, it’s a heavy field from the point of
view of machinery, but I'm going to try and give a nice sort of overview, and if you like it, I'd be
happy to point you in the direction of some interesting papers you can read, not just from
myself, obviously, but sort of overview papers. And I'll talk a little bit about the 3-D sensing
work. I guess I'm a little restricted in what I can say because of the NDA, but the problem’s an
interesting one. So I'll give a flavor of that. And finally, fancy pants features. Some sort of
features that we’re trying to develop for images. Also, this is, there is no NDA there. It’s
ongoing, and I just want to give a little taste of the sort of thing we’re working on. So those are
the things I want to talk about.
So let's start with image editing. And this is joint work with Pavel Kisilev and Zachi Karni who
were at HP labs, Craig Gotsman, who is at the Technion in Israel, and Renjie Chen and Ligang Liu
who were at Zhejiang University in China. Okay. So the first problem here is the image resizing
or retargeting problem. Some of you may know this, but just to give the overview of what
we're trying to do, we want to change the aspect ratio of an image. So we have an image that's
of some aspect ratio, we want to stretch it or squash it or something like that, and this is a
common problem that people want to, a common operation people want to do in, you know,
sort of print things for different size albums or whatever. And, of course, you could just rescale
the axis in the ordinary way just by stretching. You could do that and that would be fine if it's
okay for people to become fatter, or alternatively to become thinner, which you'd think people
would like to do, but it actually looks very strange and artificial. So the question really is: is
there a better way of doing this?
Okay. So that’s the problem. So now what does this look like? So if we just, so here's an image
and if we just do the regular uniform sort of squeezing, right? What we get is something that
looks like that, and you can see that here there is no person, I mean it's not sort of, an obvious
thing that bothers, but it's not the same image that we were looking at before clearly, right? So
what we'd like to do instead is do something like that. Now what's happened here is you can
notice that the parts that we care about, namely the island and so forth, the aspect ratio’s more
or less been preserved so it looks realistic. Now, of course, that has to come at a cost
somewhere else, so if you look at the sky, that isn't the case there. And here there's been some
more stretching, but the principle is you don't really care about what happened over there or in
the water, you don't really see those details, okay? So to draw it again, there's the picture of
the important parts, so to speak, where we want to maintain the aspect ratio, or salient parts,
and then the yellow parts are unimportant and we are allowed to have some sort of distortion
there. And that’s the whole game, is basically to do that. Okay?
Now how do you measure what parts are important? Well, there's lots of ways to do that. You
can do things that are sort of the very, very simple like gradients, which aren’t very
sophisticated, and that will at least give you something like, well I know this is a white wall or
maybe this part of this sky, and so I know that that's not important. Or you can do something
more interesting, and there's a lot of work in the vision community on saliency, recently there's
even a saliency data set that people run on. All of that can be used, and we take that as sort of
external or exogenous for now; we just take it as given, and we are more interested in, given
that measure, how do you do the deformation correctly? Okay.
So seam carving, I can't say, I'm not 100 percent sure that was the very first thing, if it wasn’t
the very first approach it was the second approach to this. But it was one that popularized the
problem. So I guess a lot of times the, that's sort of one of the great achievements of an
algorithm is not necessarily what the algorithm is but popularizing the problem it’s trying to
solve, that's what the seam carving guys did. And it was a very nice work. And what they
wanted to do is basically do this by removing unimportant seams from the image. Okay? So if
you have an image that looks, well this isn’t a very interesting image, but let's imagine there
was something inside. And a seam would be a kind of curved column. And I'm going
to measure, based on how much saliency there is in that curved column, I’m going to try and
remove a seam that's unimportant, like maybe something like that, and assume that there's
unimportant stuff under the red and it would then get squashed. And we just removed seams
one after the other. And that was a very nice idea.
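The seam removal step can be sketched as a small dynamic program. This is a generic illustration of the idea, not the seam carving authors' code; the energy map stands in for whatever saliency measure you plug in.

```python
import numpy as np

def min_vertical_seam(energy):
    """Dynamic program: find the cheapest 8-connected top-to-bottom seam.

    energy: 2-D array of per-pixel importance. Returns a list giving the
    seam's column index in each row."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for r in range(1, h):
        for c in range(w):
            lo, hi = max(c - 1, 0), min(c + 2, w)
            cost[r, c] += cost[r - 1, lo:hi].min()
    # Backtrack from the cheapest cell in the bottom row.
    seam = [int(np.argmin(cost[-1]))]
    for r in range(h - 2, -1, -1):
        c = seam[-1]
        lo, hi = max(c - 1, 0), min(c + 2, w)
        seam.append(lo + int(np.argmin(cost[r, lo:hi])))
    return seam[::-1]

def remove_seam(img, seam):
    """Delete one pixel per row, narrowing the image by one column."""
    return np.array([np.delete(row, c) for row, c in zip(img, seam)])
```

Repeating this until the desired width is reached is the whole resizing loop; deleting whole pixel columns at a time is what makes the method discrete.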
Now the problem with doing this is that it's discrete and anything that's discrete will have the
[inaudible] discrete sort of artifact. So what you might get, so here's an image of a bunch of
women, and if you apply seam carving, you can look at this image for a second and you can
start to notice there's something strange; she has no leg. Right? This is not uncommon, this
sort of thing, especially if you have narrow structures like that. This is maybe more than a 4:3 to
3:2 change, but it's still in the realm; it’s not squashing something to a very, very thin kind of wedge,
and you end up getting this thing where people lose limbs or whatever.
Okay. So what we did was we decided we would go towards a more continuous approach.
Now, obviously, the idea behind that is to avoid these kind of discrete artifacts, okay? In
addition, it's going to just generally give the whole thing, even when there aren't sort of strong
artifacts, it's going to give the whole thing sort of the smoother look. The problem is that it’s
sort of heavier to compute. So you trade one versus the other. So what does this look like?
This is the image again. And what we want to do is, so we already saw that that’s sort of the
result that we want. Now what does it look like actually? Sorry. So we’re going to overlay a
quadrangulation, which begins just as a bunch of squares or rectangles or whatever, let's say
squares, and we’re going to deform that quadrangulation into another quadrangulation which
might look like that. Now if you look at it, you can see that the thing we were advertising in the
previous slide holds, so what is that, what do I mean by that? So over here in this section, right
where you have the island, you have lots of little squares, and they're still squares, almost. So
you haven’t had much deformation in terms of the aspect ratios. And up here they’ve become
these sort of elongated rectangles. And that's sort of the type of thing that we're going to
expect, and of course it's not, I mean in the end these things are no longer rectangles at all,
they’re some sort of trapezoids or whatever, but the point is that they’re more or less close to
that.
>>: May I ask a quick question?
>> Daniel Freedman: Yeah.
>>: Basically you're saying to try to preserve the [inaudible] make things even worse for
[inaudible].
>> Daniel Freedman: Exactly. Yeah, yeah. You have to pay somewhere.
>>: [inaudible] term, I mean, the other thing that happened here is, of course, is the salience
got smaller.
>> Daniel Freedman: That's right.
>>: [inaudible] scale.
>> Daniel Freedman: That's a very good point. And, in fact, there is, in about three slides I’ll
show you, but to begin with there's no term and then one can add in the term like that. Okay.
So before I just go on, just to make clear, the question was sort of you saw that these things
have actually, it’s sort of undesirable, they become smaller, you don't want that. So to begin
with we won’t have that. So let’s see what the original sort of objective function is going to do
in constraints.
Okay. So each quad is going to be defined by four points, which we’ll label one corner (i, j), and
then (i+1, j) and so forth around. So each, we're going to have a term that’s basically
applying to each of the four edges, and I’m just going to show you one of them because they're
all identical, okay? So let's say we are looking at this edge right here. And what we want is we
want, we are given the X’s and the Y’s. Here are the original coordinates of the four points
around the quadrangle, and the tilde variables are the new coordinates, and we’re going to
compute those new coordinates.
We are also going to compute some intermediate variables, which we call the stretch variables,
which are A and B. So what do we want? We want that the new quad should be more or less
an axis-aligned affine transformation. In other words, just basically a square going to a
rectangle of the old variables, of the old square. So how does that look in theory? Well, we
have this term E_ij, which basically just looks at the difference along this edge of the X’s and
the Y’s. And we're saying that it should basically, the delta X should just, the new guy should
just be a scaled version of the delta X of the old guy and likewise for the Y’s. That scaling factor
might be different for X and Y, here it's A and here it’s B, but that's what we are aiming for. So
if it was exactly an axis-aligned deformation, then this energy would be zero, and if it is not
exactly, if it becomes some sort of trapezoid, well hopefully it will be more or less a rectangle.
So that's the energy, that energy term.
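Written out, that per-edge term is just a squared deviation from an axis-aligned scaling. A minimal sketch in Python, with names of my own standing in for the talk's x, y and tilde variables:

```python
def edge_energy(p, q, p_new, q_new, a, b):
    """Squared deviation of one quad edge from an axis-aligned affine map:
    the new edge vector should equal the old one scaled by a in x, b in y.
    p, q: old endpoints (x, y); p_new, q_new: new endpoints; a, b: stretches."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    dx_new, dy_new = q_new[0] - p_new[0], q_new[1] - p_new[1]
    return (dx_new - a * dx) ** 2 + (dy_new - b * dy) ** 2
```

An exact square-to-rectangle map makes every edge term zero, matching the statement above; a trapezoidal quad pays a positive penalty.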
Now, that doesn't say anything at all so far about what's important and what's not important.
So let's go on and talk about how we’re going to introduce that, and we’ll introduce that via
constraints. So we have three types of constraints. The first are not very interesting. It's just
boundary constraints on X tilde and Y tilde, namely if you live along the boundary you have to
live along the boundary. You can't, the thing in the end, the image has to be a rectangle. So
that's sort of clear. The second sort of constraints is that these are things that must be positive,
these stretch variables. In other words, I can't have some sort of reflections and things flipping
over. Or at least I'm going to try not to.
And the third is the interesting one. And this is where we get this importance, and it says that
the ratio of A to B, I’m sorry, before we look at this let's suppose that we've stretched more in
the X direction, so it's some square thing and we’ve pulled it, okay? So the ratio of A to B,
remember A is the stretch in the X and B is the stretch in Y is lower bounded by one, in other
words I must try to stretch more in X than Y, and is upper bounded by something, okay? That's
something is a per quad measurement which depends on the importance.
So, in fact, this upper bound is going to be a decreasing function of importance. If the thing is
super important, I'm going to set s_ij to one, which effectively says I have to scale by the exact
same amount in both A and B. Or X and Y, if you like. If it's not important at all I can set this
thing to infinity and then I can do whatever I want and anything in between. And that’s the
constraint. Now, you could do this with a soft constraint. And we tried that too, rather than a
hard constraint like this. But we found that the hard constraint worked much better. Of
course, if you stretch more in Y than X then this would just flip.
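The three constraint families can be summarized as a feasibility check on one quad. The talk only says the upper bound is some decreasing function of importance; the 1/importance form below is my own illustrative choice, not the paper's.

```python
def stretch_feasible(a, b, importance, eps=1e-9):
    """Per-quad constraints on the stretch variables, assuming a >= b
    (more stretch in x; swap the roles otherwise):
      - a, b must be positive (no reflections / flips),
      - 1 <= a/b <= u(importance), where u decreases with importance,
        so a fully salient quad (u = 1) scales equally in x and y."""
    if a <= 0 or b <= 0:
        return False
    upper = 1.0 / max(importance, eps)  # illustrative decreasing bound
    return 1.0 - eps <= a / b <= upper + eps
```

The boundary constraints on the X tildes and Y tildes would sit alongside these; everything stays linear in the variables, which is what keeps the problem a quadratic program.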
Okay. That's about the optimization. I'm sorry that's the general problem. So the optimization
itself, we’ve got these variables, these X tildes, Y tildes, A's and B's that we want to solve for,
and if we sort of just collect them together into one big Z vector, then it's not too hard to see
that it's going to be a quadratic objective function; and we have linear constraints, so we have a
quadratic program which we can solve.
We solve this in several ways. So MATLAB is much too slow to do this on something of any
reasonable size, so we used CVX, we used some commercial
packages at the time, and they all worked fine. Now, some comments about some sort of
variations you can do to this problem, so one is you could use a different norm in energy. So
we had a square term. So instead of squares we could use a one norm, an infinity norm or
something, and of course it still remains convex. It made absolutely no difference. I mean, very
little. You could see some minor changes. So that wasn't very interesting.
One thing that is more interesting is prevention of fold-overs. So what do I mean by that? So
this is a problem that plagues continuous methods as opposed to discrete methods. So suppose
you have two quads, one beside each other. There's nothing to prevent me from mapping,
remember I'm just giving you X tildes and Y tildes, and I could get something like this and then it
folded over on itself. And this is something that graphics people know very well and vision
people, or at least this vision person knew less before I did this, but it’s sort of a standard thing,
you can say that now you don't know how to render and so you can prevent this easily by
having extra constraints which just say: you can't do that. This X tilde, for example, has to be
bigger than this one, so it'll always be on the correct side. And the way we did that, in fact, is
not just that it's greater than, but it's greater than with a certain margin. So you can't have an
infinitely, I’m sorry, a zero width rectangle. You have to have some width. And so you can
prevent that.
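The fold-over constraints amount to keeping neighboring grid coordinates ordered with a minimum gap. A tiny check (the margin value is illustrative):

```python
import numpy as np

def folds_over(x_coords, margin=1.0):
    """True if any adjacent pair of deformed grid coordinates crosses or
    comes closer than `margin`, i.e. a quad has folded over or collapsed
    to (near) zero width. x_coords: new x positions of grid lines, in
    their original left-to-right order."""
    gaps = np.diff(np.asarray(x_coords, dtype=float))
    return bool((gaps < margin).any())
```

In the optimization these appear as linear constraints of the form x̃(i+1) − x̃(i) ≥ margin, so convexity of the program is preserved.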
And this was interesting, by the way, just because there were sort of methods that try to do this
in a local optimization sense that can’t do this. So it's not, it’s very simple here, but it actually
added something to it. And then finally, the question from before, which is how do we deal with
large important regions becoming small. And here it’s very easy. You can just add in a linear
term. We would maximize a linear term in the importance, or minimize minus a
linear term in the importance. So just an extra linear term and the thing remains a quadratic
program.
So let's see some pictures of this. So we’ll start with the image we had from before, and that
was the original, right? And there's the uniform scaling. And here, obviously, it’s just uniform.
You don't see any terrible artifacts, just what we said before. Here is the seam carving, which
as we pointed out, she’d lost her leg as we saw before, but actually something else I didn’t
point out over there, this person sort of disappears. It's not as sort of abrupt as losing the leg,
but just kind of disappears. And here's our result. The quadratic programming result, you can,
if you look you can see certain artifacts as well. Like, for example, the fact that you get sort of a
certain curvature of this line, which was straight before. But on the other hand, the women all
look relatively normal compared to the way they looked before simply because their saliency
was considered to be high. Of course, I should say that when we compared all methods, seam
carving and I'll show you some other methods, we use the same saliency measure across all of
them. So that isn’t an issue.
Okay. Some other pictures. So here's an example which addresses exactly the issue of small
things, salient things becoming small. So here we had the extra linear term which encourages a
large saliency and the house was deemed salient and so actually becomes very large here. It's a
bit of a matter of taste whether you like that or not. But you can clearly see the effect there. The
other methods we’re showing, so you can see the seam carving, and there were some there,
there were some successes to seam carving, there was something called multi-operator. There
was two methods, one of ours, which was the Local-Global, that’s a previous method of ours
which didn't have this size term, and another one which was called Optimize Scale and Stretch.
I won't go into the details there, but you can see, and again, these things are a bit of a matter of
taste. There was an attempt to have a competition at one point organized by, I think it was
Shai Avidan, who was one of the original seam carving people. Here, you can see, I think we do a
little bit better at keeping the table and the wine and more or less in its right aspect as opposed
to pretty much all methods except the Local-Global, which is very similar. That was our
previous method and does sort of similar type of results, although it's a little smaller there. And
here again, you see the size issue with the fish. As I said, there was a competition at some point
run on some several sort of standard sets of images.
Now let me show you something we're not good at at all. And I sort of alluded to this before,
which is preserving straight lines. You saw that a little bit with the curve of that cement block.
Here you can see something a little bit more blatant. If you look at the bend of the, and it’s
terrible, right? And you can do, there's a hack, you can fix this semi automatically if you have
someone sort of say I want this line to be straight. And you get something reasonable, and
what they're basically, it's just along that line you're putting very strong constraints on those
stretch variables to say that they’re more or less the same. There's nothing magic, it’s not a
very satisfying result though. Okay. So that's a sort of, that’s the image resizing and retargeting
type problem that continues right now. What you could do is you could take an extension of
this and do something related, which is a little bit more graphics-y maybe, which is not images,
just shape deformations, and we want sort of natural shape deformations.
Okay. So what is this idea? So here shapes are going to replace images, and shapes are some
sort of 2-D contour and what's inside of it. And triangles are going to replace quads, and now it's
sort of the standard graphics formulation with triangulations. And now we want to say that
each triangle should undergo a nearly natural deformation. And there were a few pieces of
work that tried to do this. I'll tell you our little innovation on it, if you're familiar with the work,
and it has to do with what we consider a natural deformation. So natural could be something
like Euclidean or similarity, right? Or it could be something a little more complicated. And the
more complicated thing that we did was something that, it was actually, both of course
Euclidean and similarity are groups of transformations, we ended up coming up with a set that
sort of interpolates, we took similarity as sort of the gold standard, in other words, if you had
something that was really salient, you'd want it to be a similarity transformation in the new
deformation, and if it wasn't salient, you could do something that was a bit simpler. I'm sorry, a
bit more deformative. And what was that?
Okay, so let's say if something’s really not salient at all you could do whatever you want with it.
And that would just be a general affine deformation. So we want to interpolate between
similarity and affine somehow based on how salient you are. So there's lots of ways to do this,
and the way we did it, our little sort of secret sauce here, was the following: you just take, so any
deformation in 2-D is going to look like some two by two matrix times the thing plus another
two by one vector for the translation. So let's focus on the two by two matrix. If you take the
singular value decomposition, the SVD, what one finds is that, of course, for a general
transformation the singular values can be any nonnegative values. If you take the SVD of a similarity
transformation, the singular values must be the same. Okay? If you just take a look at the ratio
of the singular values, the big one to little one, you can say, okay now I'm going to basically
transform from one to the other based on how big that is. And if the thing is sort of, it’s
supposed to be really salient, I'm going to make that thing be close to one, that ratio, and if it’s
not salient, it can be whatever it wants. So it's very similar in that regard to what we saw
before, only instead of those A's and B's [inaudible] variables, we are now looking at singular
values. And that was sort of what we did.
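The singular-value test is easy to state in code; this is the generic computation, not the paper's full deformation energy:

```python
import numpy as np

def stretch_ratio(A):
    """Ratio of the larger to the smaller singular value of a 2x2 linear
    map. A similarity (rotation plus uniform scale) gives exactly 1;
    larger values mean a more general, more 'deformative' affine map."""
    s = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)
    return s[0] / s[1]
```

Constraining this ratio toward 1 on salient triangles, and leaving it free elsewhere, is the interpolation between similarity and general affine described above.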
Now how does this look in pictures? So here the game is the following: you can see that there's
the Kool-Aid guy. And what you do is you see these little green dots here, you are going to
basically, someone's going to move them manually to somewhere that they want, and we're
going to compute the new transformation based on the constraints that those green things
have to move where we want them to. So you can get something like this, so someone
specified, I want the hand to move down and whatever, and it computes the deformation.
Now, actually it’s pretty good. You could see that the left, sorry, his left arm, our right, this one
here is a bit weird, right? Because it’s all 2-D. So there's no out-of-plane anything. But
other than that it looks sort of semi-reasonable. There's the frog doing some sort of dance.
And you can do this in images too and you can get something like that. So we start here and we
move to here and you get a reasonable looking transformation.
Okay. So that's the end of that problem. Now I want to talk about a different problem that I
did in sort of this image editing world which was recoloring. So we are on to something new
but still within image editing. And here, this is sort of a problem of transferring color scheme.
So I want, I'm going to have the two images, a source and a target, I'm going to take these
colors from the source and transfer them to the target. And the key is to retain, this is some
sort of vague hand wavy thing, I want to retain the original object’s look and feel. So I'll show
you a picture and you'll see what I mean by that. So here's the source, some sort of crest, and
here's the target, if you like football that's Adrian Peterson, and I want to somehow take the
colors from the source and transfer them to the target. And of course, if you can imagine,
there's lots of different ways to do that.
To begin with, I'm going to assume I have a segmentation and I'm not going to worry about
that, and you'll see later on that there are some errors occasionally due to the segmentation,
and that’s something that I take as given, and I don't really care about the segmentation. I use
some tool of my own that I developed a while ago, but you could do anything. Now, based on
what's within the segmentation, the mask, I’m going to take some sort of distribution over
colors, okay whatever space, you can specify what space you want, and basically I'm going to
compute a color transform, which we'll call big Psi, and then I'll just run the part that's in the
mask through that and I will get this and that's the final result.
Now what do I mean by the look and feel here other than the fact that yeah, okay, I got the
colors from the source to the target? If you look at the red over here, the little red rectangle,
and the corresponding red rectangle, I don't know if you can see with the granularity of the
projector, but you sort of retain the wrinkles. And you can do things that are, sort of give you
something rough but don't get that level of thing, but that's what we’re going to look for is that
level of detail is preserved. Okay. So that's what we want to do.
So, by the way, this appeared at CVPR a year or two ago I guess. Two years ago. And a variant
appeared at ICPR afterwards. Okay. So which colors match? So we're going to basically be
agnostic about this and we’re going to say that the user can specify some sort of ground
distance between colors. So I could have something like RGB, and that's generally not a good
idea, or I could do HSV or LAB or something like that and match colors on that basis, and that's
going to be our ground distance. Or I could do something that's weirder, I could match things
based on their brightness, or something like the opposite, call it one minus brightness, so that
the bright end of the source matches to something that's dark in the target, or whatever, or
something more exotic; whatever it is, I'm not going to
specify that. I'm going to use that as a sort of a basic tool and then compute things that sort of
work based on the ground distance that's specified. What's interesting, as you'll see, is because we do
something a little bit more sophisticated, we actually, if you want to stick to perceptual color
spaces, it turns out based on what we do that using RGB is actually just about as good as using
LAB, which is sort of interesting, it's not something we anticipated.
So what do we do? There's three steps here. The third step is simple; it's just the first two
steps that matter. The first step is based on the ground distance. We're going to compute a
coarse color transform by the transportation problem. So the transportation problem is
basically the earthmovers distance. So, in fact, what we are going to do is we’re going to take
something like the EMD, almost very similar, but we're going to add two wrinkles to it. So the
earth mover's distance, just to remind you: you have the earth in the source and the earth in
the target, and it's a transformation between the distributions themselves; based on
how much work I have to do to move it, I'm going to minimize that amount of
work. Now, what do we do that's different from the earth mover's distance? Well, the first thing
is we relax the conservation constraints. That’s the most important. So why do we do this?
The conservation constraints say that you have to move the right amount of earth from here
over to here. I have a certain amount of earth in the target; I'm going to move it all over the
source. I don't want to do that for the following reasons.
Suppose, I'll give you a little toy example. But this is something we saw in practice. Suppose
you have two images and we're going to match based on brightness. And one image is blue and
the other is red. So we’re just matching the light blue to light red and dark blue to dark red.
But one image is 50 percent light blue and 50 percent dark blue and the other image is 40
percent light red and 60 percent dark red. Well, I'm going to be mismatching some of the light
and not light because I don't have the same amounts in both ones, right? And that's something
I want to avoid. So I’m going to relax that conservation and say the conservation must be true
within a certain slack. You can't relax it completely; if I totally relaxed it then I’d just be doing nearest
neighbor effectively, and that would mean that if there's one little noisy pixel, well, I'm
possibly matching everything to that. So I don't do that. But I allow some slack. It doesn't
have to be totally conserved, but it has to be conserved up to a factor, so I can't move more than twice
the amount of what's here over to there, or whatever; you set that slack.
So that was one change. And that still leaves the whole program a linear program. It just
adds in some more constraints: instead of equality constraints I have linear inequalities that get added
into the convex program. The second change is a flow smoothness term, and what do I mean by
a flow smoothness term? Well, in the
original earth mover's distance, you don't have any constraint that says two colors that are
similar in the source ought to map to nearby colors in the target; I don't have anything that
says that, so I want to
put something like that in. And depending on how you do it, that basically will lead
you either way: if it's an L1-type term you'll still get a linear program; if it's an L2 term you'll get
a quadratic program and so forth. That's less important than the first change, but it still does add something
to it.
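The relaxed transportation step can be illustrated as a small linear program, here with SciPy on a toy version of the blue/red brightness example; the bin masses, brightness values, and slack factor are made up for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def color_flow(src_mass, tgt_mass, cost, slack):
    """Transportation LP with relaxed conservation on the source side:
    every target bin is fully covered (column sums equal tgt_mass), but
    each source bin may supply between (1 - slack) and (1 + slack) times
    its own mass. slack = 0 recovers the usual EMD constraints."""
    n, m = cost.shape
    c = cost.ravel()                      # flow f[i, j], flattened row-major
    A_eq = np.zeros((m, n * m))           # column sums == tgt_mass
    for j in range(m):
        A_eq[j, j::m] = 1.0
    rows = np.zeros((n, n * m))           # row-sum selectors
    for i in range(n):
        rows[i, i * m:(i + 1) * m] = 1.0
    A_ub = np.vstack([rows, -rows])
    b_ub = np.concatenate([(1 + slack) * src_mass, -(1 - slack) * src_mass])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=tgt_mass,
                  bounds=(0, None), method="highs")
    return res.x.reshape(n, m), res.fun

# 50/50 dark/light source vs. 60/40 dark/light target, matched on brightness.
src, tgt = np.array([0.5, 0.5]), np.array([0.6, 0.4])
cost = np.abs(np.array([[0.2], [0.8]]) - np.array([[0.25, 0.75]]))
_, strict = color_flow(src, tgt, cost, slack=0.0)
_, relaxed = color_flow(src, tgt, cost, slack=0.3)
```

With strict conservation, some of the light-source mass is forced onto the dark target bin; with slack, the dark bin simply over-supplies, which is exactly the mismatch the talk describes avoiding.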
Okay. So what I said here was this is going to give us a coarse color transformation. But a
coarse color transformation won't keep the look and feel. Why? Because I'm binning things.
I've got a bunch of bins and I'm just mapping one bin to another bin and so forth. Right? So
basically remember the sort of look and feel part was the wrinkles ought to match to other
wrinkles. Wrinkles mean subtle changes in color: it's the same color but with subtle changes in
brightness. Here it's getting to be something really blocky. So I don't want that.
So that's going to give me the second step, which is: within each bin, instead of mapping one
bin center to another bin center, I'm going to get a linear transformation within that bin. And
that's going to give you what you want.
So how do we do this? So if we look at the flow to a given target bin, right? We have certain
source bins that map to it. Remember the earthmovers distance isn't combinatorial; it basically
allows a few different, each bin can have a bunch of bins that map to it with different weights.
So what I can do is I can compute the mean and covariance of the source bins mapping to a target
bin, and I can compute easily the mean and covariance within the target bin just by looking at
the size of it. And now, basically, my second step says: I have the mean and
covariance of one thing, and I map it to the mean and covariance of another thing, and I'm going to do this
via some sort of affine transformation in color space. So we have all these little local affine
transformations, linear transformations plus an offset, and then I'll stitch them together in
some simple way, which I'll talk about afterwards. But basically what's going to happen is
it's a locally linear thing which gives you a globally nonlinear thing, and that's good. So
the question is really how do I do this affine transformation? And we call that, it's a stupid
name, but I couldn't think of a better one, an SMSP: the stretch-minimizing structure-preserving
transformation. So why do I need this? Isn't mapping between two Gaussians just a really easy
thing to--Yeah.
>>: Just quickly. These are 2-D, 3-D bins you're mapping?
>> Daniel Freedman: That's right.
>>: And you're moving the, these little bins themselves [inaudible] important too?
>> Daniel Freedman: Exactly. That's right. And this one is easy, right? We could just do
something like scale so the individual variances match and then apply whatever offset we need
for the means. You could do that. And, in fact, there was a nice paper on recoloring by Reinhard
from 2001, which everyone seems to like to compare with, which was in SIGGRAPH I
think that year, and so you could do that, and that's a good idea for LAB sometimes.
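The simple baseline just described, per-channel mean and variance matching, can be sketched as follows. Note that Reinhard et al. apply it in an Lab-type space; the plain per-channel loop here is only to illustrate the scale-and-shift operation, not their exact method.

```python
import numpy as np

def match_channel_stats(src, tgt):
    """Per channel: scale src's std to tgt's, then shift to tgt's mean."""
    out = np.empty_like(src)
    for c in range(src.shape[-1]):
        s, t = src[..., c], tgt[..., c]
        scale = t.std() / max(s.std(), 1e-12)
        out[..., c] = (s - s.mean()) * scale + t.mean()
    return out

rng = np.random.default_rng(0)
src = rng.normal(0.2, 0.05, size=(64, 64, 3))  # synthetic "source" pixels
tgt = rng.normal(0.6, 0.10, size=(64, 64, 3))  # synthetic "target" pixels
res = match_channel_stats(src, tgt)
```

After the transfer, the per-channel statistics of the result match the target's by construction.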
Okay? But, in general, if you're in some sort of general color space, it's a bad idea and a
warning here: this is an animation which I don't think quite worked out the way I wanted, but
we'll try it anyways. So here we go. So let's say we are now in 2-D, not 3-D. So here's one and
here's the other. There's the source and the target and I want to map one to the other.
So if I just squash individual variances, what's going to happen here? I'm going to push this guy
down and I’m going to pull this this way, and obviously that's a bad idea because I'm just sort of
simply amplifying noise and squashing signal and this is the part that doesn't work well. Yeah. I
thought it worked when I did it. But anyways, the point is you end up squashing and stretching
the noise, and that's not a good idea. So stretching amplifies noise. Instead, a better idea
might be to rotate. And that's sort of very natural. And, again, it didn't work the way I
wanted, but the rotation is better. So what we want to do when we want to rotate, we want to
keep axis lengths preserved. And so what we call that is this SMSP. We want to minimize the
stretch and we want to preserve the structure. And preserving the structure basically means I
want to keep the orthogonal axes orthogonal. So that's the basic
idea there.
So what does this mean in practice? How do we do this? When I say the orthogonal axes, I mean
the principal axes of the covariance matrix: the goal is that they should remain orthogonal, and
then whatever they're mapping to, I should squash as little as possible. And you can formulate
this in a way where you can find the global optimum. In fact, it ends up leading to a mixed
continuous-combinatorial problem, which in 3-D is easy to solve because the combinatorial part
has only six options. But, in general, if you did it in N-D, and this might be a useful thing,
then you can solve the combinatorial part by the Hungarian method. And you can do it in a
reasonable amount of time.
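Here is a hedged sketch of that construction as I read it: take the principal axes of both covariances, choose the axis pairing that minimizes a stretch measure (brute force over permutations here; the Hungarian method in general N-D), and map source axes onto target axes. This is an illustration of the idea, not the paper's exact algorithm.

```python
import itertools
import numpy as np

def smsp_sketch(cov_src, cov_tgt):
    ws, Vs = np.linalg.eigh(cov_src)           # axis lengths^2, directions
    wt, Vt = np.linalg.eigh(cov_tgt)
    ls, lt = np.sqrt(ws), np.sqrt(wt)
    d = len(ls)
    # Combinatorial part: which source axis maps to which target axis.
    # Brute force over the d! pairings (six options in 3-D; the
    # Hungarian method handles general d efficiently).
    best_perm, best_stretch = None, None
    for perm in itertools.permutations(range(d)):
        stretch = sum(np.log(lt[perm[i]] / ls[i]) ** 2 for i in range(d))
        if best_stretch is None or stretch < best_stretch:
            best_perm, best_stretch = perm, stretch
    # Linear map: rotate source axes onto the paired target axes,
    # scaling each axis only as much as needed.
    S = np.diag([lt[best_perm[i]] / ls[i] for i in range(d)])
    return Vt[:, list(best_perm)] @ S @ Vs.T

cov_s = np.diag([4.0, 1.0])
cov_t = np.diag([1.0, 4.0])   # same shape, axes swapped
A = smsp_sketch(cov_s, cov_t)  # no stretch needed: axes are just swapped
```

On this example the optimal pairing requires no stretching at all, so the map is a pure rotation/reflection that carries one covariance exactly onto the other.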
So I didn't go into great detail about how one does that, but that's one of the nice
pieces in the paper. And the third step is how do you stitch these together?
You're just going to do some sort of convex combination of these transformations. So in a given
bin I just look at nearby bins and see what they map to, what their thing is, and based on where
I am in the bin, I sort of take some sort of weighted average and that's just a simple sort of hack
that gives you something where you don't get any artifacts. That's all.
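The stitching step can be sketched as a convex combination of the local affine maps; the Gaussian weighting below is an illustrative choice, not necessarily the one used in the paper.

```python
import numpy as np

def stitched_transform(x, centers, As, bs, sigma=1.0):
    """Blend per-bin affine maps (A_k, b_k) with weights summing to 1."""
    d2 = ((centers - x) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum()                               # convex combination
    return sum(w[k] * (As[k] @ x + bs[k]) for k in range(len(As)))

centers = np.array([[0.0, 0.0], [10.0, 0.0]])  # two bin centers
As = [np.eye(2), 2.0 * np.eye(2)]              # their local linear maps
bs = [np.zeros(2), np.zeros(2)]
y = stitched_transform(np.array([1.0, 0.0]), centers, As, bs)
# Near the first center, the first map dominates, so y ~ (1, 0).
```

Because the blend varies smoothly with the input color, the locally linear maps combine into a globally smooth, nonlinear transformation without blocky artifacts.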
So let me show you some pictures. So the source here is the blue car and the target is the
green car and we want to color the green car blue. Reinhard, this method which basically says
let's look in LAB space, let’s just squash L’s and A's and B's and that's all, works really well here
because it's a good perceptual thing and there's only one color there. So it does what it's
supposed to. Interestingly, here I don't think we do any better; I don't think we do any worse.
We do fine. What's interesting is that we use RGB in this method, and if you did that with
Reinhard you wouldn't get good results, not great results anyways. Somehow this SMSP
transformation more or less figures out what to do. But that wasn't the original impetus;
the original impetus for this thing is
that sometimes you have bimodal or tri-modal distributions or things like that. Here's the example you
saw, and Reinhard, because this method is basically trying to fit a
Gaussian to something that's multimodal, invents colors. Not surprisingly it mixes the colors
together, so that blue and yellow become sort of one Gaussian, and this green, purple
and white become another color. And so you end up getting some sort of funny things in here.
Obviously, you get something that's a little better.
Again, here the distance, the ground distance was RGB. Now you could say well, let's look at
different ground distances. So here was an example where the ground distance was based on
brightness, matching bright to dark and dark to bright. You see some artifacts
here. In particular, the edges of the segmentation weren't terrific. But in general,
the rest of it is reasonable. Now here's another example where it's tri-modal. It
was hard to find tri-modal examples. I ended up finding this; apparently this is the most hated
third jersey in all of sports. And we, you get, maybe it looks better here, I don't know. But
anyways, it does the color matching. These are sort of standard examples.
You could also do something a little bit different. You could say, what if my source and my
target were the same image, except that I take the source to be the lit part and
the target be the part in shadow. So the mask for the source is the part that's illuminated, all of
this, and the mask for the target is the shadowed part and you could do that then; and now the
matching, the ground distance would be based not on brightness but actually on RGB
normalized by brightness, right? So RGB over R plus G plus B and then you get something which
does shadow removal pretty well. Okay, it's not perfect, I don’t know if you can see from there,
there are definitely artifacts around the edges and there are probably better things one can do
for shadow removal, but it wasn't supposed to do that so we thought that was kind of cute.
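The brightness-normalized ground distance used for the shadow example is easy to write down: compare colors by their chromaticity RGB/(R+G+B), which is (ideally) unchanged by the intensity drop inside a shadow.

```python
import numpy as np

def chromaticity(rgb):
    """Brightness-normalized color: RGB / (R + G + B)."""
    rgb = np.asarray(rgb, dtype=float)
    s = rgb.sum(axis=-1, keepdims=True)
    return rgb / np.maximum(s, 1e-12)

lit = np.array([120.0, 80.0, 40.0])    # a surface in the light
shadow = 0.3 * lit                     # same surface, dimmer
# A ground distance in chromaticity space sees these as the same color,
# which is what lets the lit region match its own shadowed region.
```

In practice, of course, shadows are not a pure intensity scaling, which is one reason the edges show artifacts.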
And then just for fun, you can show what happens if you do different kinds of transformations.
So here you have a source and a target. Here they're transformed trying to match bright to
bright, and here they're transformed trying to match bright to dark and dark to bright, and
of course you get something that looks a little different.
Okay. So that's the end of that section on image-editing. Now let's switch gears entirely and
talk about some algebraic topology work. And this is joint work with my former student Chao
Chen, who is now finishing his second postdoc, and Christoph Lampert, who is at IST in
Austria. Okay. So a short intro to algebraic topology. Traditionally it's considered one of
the purest areas of math, but in the last 10 years or so there's been interest in
two different directions of things one can do with algebraic topology. One is computational
slash algorithmic results. So this is, again, the sort of STOC/SODA crowd or whatever, and they're
interested in things you can say about the complexity of different things you want to do:
computing homology groups and so on.
By the way, I did some work in this as well, and I once talked to a guy who was a pure
mathematician about this. He said, oh yeah, we've done things like that too. We found this one
algorithm: if N is the size of the complex, which
is basically the size of the space, the complexity is something like this, and he writes it down
in some notation. And I said, well, what is that? I don't know that notation. He said, well,
that's our own notation: it means two to the two to the two to the ... to the N, a tower 256
levels high. So basically, even if N is one or two, you have problems. But he thought
this was interesting. So that's the perils of being a pure mathematician and dealing with
complexity.
Anyways, so that's sort of one direction. Another direction is to use algebraic topology in
fields that are of interest to people in either computer science or engineering. This includes
computer vision, sensor networks, which was quite popular, and some work in biochemistry as well
from the computational biochemistry crowd. So this is sort of the wave we were riding. In
particular, we did some work applying this in vision, in addition to the complexity-type work. The key
thing is that obviously this is a very technical field. There's a lot of machinery one has to use in
order to really get up to speed in it, and that's sort of a little bit annoying in trying to give it in a
survey. But anyways, I think that I will try to give the overview of what's involved. And like I
said, if you're interested I'd be happy to point you to some papers that are very good as intros
for this area.
Now, in general we can say that topology is about invariance to continuous deformations.
Felix Klein, the 19th-century German geometer, talked about geometry being the study of
invariances. So Euclidean geometry is the study of invariances to the group of Euclidean
transformations, and affine geometry is the study of invariances to the group of affine
deformations, and so forth. So
here we’re interested in the much more general idea of invariances to all continuous
deformations. And so here's the picture. So there's a cube, and that is homeomorphic to the
sphere; it could be the interiors too, but let's say we're talking about the surfaces. And it's
not homeomorphic to the donut, or torus. So this is a standard example, right?
Now the gold standard for saying these things is what's called homeomorphism, and we say X is
homeomorphic to Y if there is a function F which maps X to Y such that F and its inverse are
continuous. That's the definition and so that cube and the, or the surface of the cube and the
surface of the sphere will have such an F whereas the surface of the sphere and the surface of
the torus will not have such an F. That's just not at all useful for computation, right? We are
computational people. How can I use this? It's hard enough to find an F when one exists; what
if the point here is that there's no such F? Then I have to show that no F exists. So
you can't really compute with this.
So that's what leads us to the algebraic part. This is point set topology; this is just general. Now
we want to compute, we want to define algebraic invariants. So we're going to define, on each
space like this, a set of groups. There are two main kinds: homotopy
groups and homology groups. Homotopy groups are much more natural. They make more
sense. They involve things like a rubber band is sliding around and we are not going to use
those because they are harder to compute with. Homology groups are less intuitive, but they
are easier to compute with. So that's what we use.
Now why, again, what is the idea here? If you have homeomorphic spaces, this implies that you
have isomorphic groups. And, in fact, the groups often have simple structures, which allows us
to ask: are two groups isomorphic? So I can compute the groups for this and for this, check
whether they are isomorphic based on a simple rank kind of computation, yes or no, and I'm done.
The converse is not always true: if two groups are isomorphic it doesn't necessarily mean
the spaces are homeomorphic, but for most nice-looking spaces, okay, that's hand-wavy, the converse
does hold. So it's strictly weaker, but it's still interesting to
compute with.
Okay. So that's the intro. Now what are homology groups? Okay, now this is sort of the crux of
the matter, and I'm going to give you some intuition, maybe some of you know this already.
But this is the part which involves lots of machinery and let's see if we can get the intuition. So
informally, these things count the number of holes in the space, holes of different dimensions.
So here there are three holes. That was easy, right? Here, let's think of this thing, I haven't
drawn it right because I'm not very good at drawing, but let's think of this thing as the surface
of the cylinder. Nothing inside. How many holes? Well, I'm actually talking about kind of a 1-D
hole, so there's really one there. A kind of thing that goes around. So you can think of this as a
tunnel, in that case. So those are 1-D examples. What if I have this and I talk about a 2-D
example, again, this being the surface, nothing’s filled in? Well, there's one 2-D hole inside,
right? And that’s a 2-D example. A single 2-D hole. So here we are talking about 1-D holes,
here we’re talking about 2-D holes. So one often calls these tunnels and those voids, but
you can have 3-D holes and higher. You can also have 0-D holes, which is a strange concept,
but we can deal with that in a second.
So that's informally. Formally, these things count the number of non-bounding cycles. So let's
draw a picture here to explain what this means. So here again is the torus, always the surface.
I'm not interested in what’s inside. So here's the surface of the torus, here's a cycle, now this is
clearly a cycle. Sometimes when we talk about cycles later it's something a little more
abstruse, but this is a cycle, and it bounds something. That's not what I want; that's not interesting.
And this is a cycle which doesn't bound anything, right? It's not the boundary of anything
because, remember, the inside is empty, and that is interesting. I mean, maybe you have to,
it’s not interesting to everyone. But it basically tells us about the number of holes in the space.
This thing actually has two non-bounding cycles. You see the other one is the one that goes
around this way. Right?
Now, even more formally, H is Z quotient B: H = Z/B. What does that mean? Z is the set
of all cycles, B is the set of boundary cycles, and I perform this quotient operation. So let
me explain, let's ignore that for a second. Let's say each element is not actually just a
non-bounding cycle but a whole family of them. So here's an example. These two are basically
the same guy, right? I just moved one to the other. In other words, if you like, and this is not
exactly accurate here, you can think of these guys as rubber bands; I can just slide one along
and get from here to here. So this idea means that I'll just call all of these guys the same
thing and that's my coset. And you can imagine the same thing with the other one.
There’s actually more than two. There's actually three or four depending on how you count
them. Because I can add these two guys together: I can take something that looks like this
plus something that looks like this, there's sort of an addition operation, and I can also take
the thing that doesn't bound anything and say, well, that's not interesting to me, but that's also
sort of an element. And the key is, the way to think of it is I'm not going to consider those other
things, because I'm going to think of it the same way I think about the rank, or the basis
rather, of a vector space. Here are two orthogonal vectors, and I can also add in things that
are sums of them, right? But there's only two that are really
making anything up. And the same thing here. I'm going to take the generators, as you
say in group theory.
Now, to do this formally, there are lots of ways to do it; there are different types of homology:
singular homology, simplicial homology. In simplicial homology, you build, think of
building a mesh, or a simplicial complex, which is a more general type of mesh. And then I can
start doing things like adding triangles together; I define an addition operation. That gives me
what are called chains, and then based on that I can look at boundary groups and cycle groups,
based on whether something has a boundary or not. We can do all this formally, but for now this
is the intuition: I'm going to count the number of non-bounding cycles. So that's algebraic
topology, or homology theory, in one slide.
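To make the "count non-bounding cycles" recipe concrete, here is a tiny self-contained computation of Betti numbers over GF(2) for the hollow triangle (one connected component, one 1-D hole), using the rank formula b_k = (n_k - rank d_k) - rank d_(k+1). The complex and matrices are made up for illustration.

```python
import numpy as np

def rank_gf2(M):
    """Rank of a 0/1 matrix over GF(2), by Gaussian elimination."""
    M = (np.array(M, dtype=int) % 2).copy()
    rows, cols = M.shape
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i, c]), None)
        if pivot is None:
            continue
        M[[r, pivot]] = M[[pivot, r]]          # move pivot row up
        for i in range(rows):
            if i != r and M[i, c]:
                M[i] ^= M[r]                   # eliminate mod 2
        r += 1
    return r

# Hollow triangle: vertices {0,1,2}, edges {01,12,02}, no filled face.
# Rows of d1 are vertices, columns are edges (1 = vertex lies on edge).
d1 = np.array([[1, 0, 1],
               [1, 1, 0],
               [0, 1, 1]])
d2 = np.zeros((3, 0), dtype=int)               # no 2-simplices

# d_0 is the zero map, so every vertex chain is a 0-cycle.
b0 = 3 - rank_gf2(d1)                          # one connected component
b1 = (3 - rank_gf2(d1)) - rank_gf2(d2)         # one 1-D hole (the tunnel)
```

Filling in the triangle would add a column to d2, killing the 1-D hole, exactly the "cycle that now bounds something" picture.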
And now let's get to vision.
So I want to do this problem. I want to say curve evolution with topological control. So back to
vision. And if you have questions about that, ask now; I'm going to deal with a much
more specific case, so the general theory won't be as important. So what's curve
evolution? So standard in vision, like snakes, I’m going to evolve a curve because I want it to go
somewhere. I want to segment something; I want to track something. Usually segmentation.
And this generalizes to surfaces. So I have, generally, a partial differential equation which
looks like this: dC/dt = F N, some force F times the normal. It pushes things out in the normal
direction, and the strength of that force depends on what I'm trying to do. If I'm trying to drive
it to a place where there are strong gradients, that's the standard thing to do, then I can find
something that has strong gradients, or I can find something more interesting that maybe
depends on the interior of the curve or the surface and drive it to something. It's a standard
technique.
Now what I want to do here, I want to do this, take a general equation like this, it doesn't
matter what the F is, okay the F will be application-specific, but I want to evolve a curve like
that while fixing its topology. So here's an example of the sort of thing we used to get. Level
sets were a nice technique which allowed you to get this sort of thing. I start with
a curve out here, the big red curve, it evolves inwards, and eventually it snaps right around
these blue things, and look, the topology has changed. And this was considered an advantage.
People said this is great because we can have arbitrary topology. I can start with one, I get
another one, and sometimes it is an advantage. And then sometimes it isn't.
Why isn't it? Sometimes I know the topology of the thing I'm looking for. If you're into medical
imaging, for example, you might know something like the liver has the topology, at least the
surface has the topology of a sphere. The prostate does not. It has the topology of a torus
because there is this thing going through the middle of it. I might want to use that fact. There
are other examples, obviously, in vision itself, where we want to get something which is just a
single piece. So we might want instead something that looks more like that. Now you can
argue whether you want that or not, but here the idea is this white part is noise. I know
somehow that it has to be one piece.
There are other methods which do this. There are digital methods which enforce digital-type
topology; they end up with some strange artifacts, like these long skinny fingers. There are
also continuous methods, but they don't really address topology directly. They say things like:
I want to ensure that two pieces which are far in geodesic distance but close in Euclidean
distance don't touch each other. What does that mean? If two pieces that are far apart in
geodesic distance but close in Euclidean distance are trying to close in on each other, I won't
let them; I'll repel them. Again, that's a heuristic, and it works sometimes and not other times.
In our case, we're going to look at a simple example here where we are looking at just curves,
not surfaces, and the topology is just the topology of a circle, but this can be generalized to
more interesting cases with surfaces. Now here is the term that we are going to add. So we
have to talk about something called robustness. We’re going to focus on curves which are level
sets of a function phi. And this works nicely with the existing level set formulations of all this
kind of curve evolution stuff. So usually you take the zero level set of something.
Now the robustness of a homology class, so a homology class, going back to what we said, is a
piece of topology. The robustness of that homology class is basically how hard or easy is it to
eliminate that class by perturbing the level set function. So let's take a look to make this clear.
Here is a level set function and here is say the zero level set and it’s three pieces, right? And
that looks just like this. There's the zero level set. So we have three homology classes. Now
I'm going to cheat a little: I'm going to say let's destroy all of them but one, so there's just
one left. So let's perturb phi to get exactly one homology class. Now
in this case this is simple, right? Let's stop calling them homology classes and call them components.
But in general, one could do this for weirder cases.
So we just want to get down to one component. What could we do? Well, we can do that. Let's see
what happened. That piece went down, that piece of function, and this piece of function went
up. Right? When that happens, when I now take the slice, let's go back here, I have
lost this piece entirely and these two have become joined. And you can see that exactly there.
So what's interesting is if you look at what happened here, what did we raise? We raised the
local minimum. Here, we lowered a saddle point. And that's not an accident. Critical points
are actually crucial to this whole thing. So let's see that.
So we want to talk about total robustness now. And that's related to this idea of destroying
everything but one. The key is that the robustness of class is closely related to critical points.
Let's see it again. So here we have a bunch of different critical points. We have, I’ve not drawn
all of them, but we have this that we labeled M zero, which we’ll ignore for a second, we have
this saddle, and we have this local minimum. And what we find is that it's not just that I want
to raise one and lower the other: the robustness, taking this as the zero level set, is actually
just equal to the absolute value of the function value at the critical point.
So here the robustness of this guy was just equal to this little distance here: I just had to
lower this a little bit and I could destroy the thing. Here I had to do a little more work. I had to
raise it that high and then I've destroyed that class. And as I said, this is not an
accident. Now, if you're familiar with Morse theory, this is just a sort of generalization of
Morse theory. Morse theory says I can look at smooth functions on spaces, and if I look at their
critical points, that is very closely related to the topology. It's a very intimate relationship.
Here, these are actually not smooth functions at all, right? They're often signed distance
functions. So these aren't real critical points, actually. But it's okay, we can work our way
around that: you can define things which act just like critical points but are defined for
non-smooth functions, something called tame functions, which is a much more general class.
Anyways, I won't get into details, but it works.
So what do I want to do? I want to destroy everything but one component because in the end I
want to have just one component right? I want to have the topology be the topology of a
circle. So I'm going to define the total robustness to be the sum of the squares, instead of the
absolute values, just because it's nicer, of the function values at all the critical points except
for the global minimum. Why am I excluding the global minimum? Because I want to keep one
component. So this is the total robustness, and then we have a theorem, which we've proven in
the paper, that says that minimizing the total robustness ensures that the topology remains the
topology of a circle.
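A toy discrete version of this penalty, in 1-D, might look like the following; the naive neighbor test for critical points and the example values are just for illustration, not how one would treat a real level-set function.

```python
import numpy as np

def total_robustness(phi):
    """Sum of squared values at interior critical points, excluding the
    global minimum (whose component we want to keep)."""
    crit_vals = []
    for i in range(1, len(phi) - 1):
        is_min = phi[i] < phi[i - 1] and phi[i] < phi[i + 1]
        is_max = phi[i] > phi[i - 1] and phi[i] > phi[i + 1]
        if is_min or is_max:
            crit_vals.append(float(phi[i]))
    crit_vals.remove(min(crit_vals))           # spare the global minimum
    return sum(v ** 2 for v in crit_vals)

# Two dips below zero separated by a bump: critical values are
# -2.0 (global min, kept), 0.5 (the "saddle"), and -1.0 (extra dip),
# so the penalty is 0.5**2 + 1.0**2 = 1.25.
phi = np.array([2.0, -2.0, 0.5, -1.0, 2.0])
tr = total_robustness(phi)
```

Driving this sum to zero flattens the extra dip and the bump, which is exactly the "raise the minimum, lower the saddle" picture from the slides.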
Now why is this important to me? Obviously, if I just minimize this thing I'll get the topology
that I want, but what's more interesting is that if I add it to the flow, it drives the evolution
towards the correct topology without restricting us. What do I mean by that?
It doesn't say we have to have this topology of a curve the whole time; it will just drive us
towards that. We can start with some other topology, and eventually it will drive us in that
direction.
Restriction to the exact topology at all times leads to artifacts. Like I said, you often end up
with long skinny fingers: you could have something that has the topology of a circle but with all
these weird little things coming off it, and that's kind of fake topology,
right? It is the correct topology, but it's weird looking. So this gets around that. This term
is continuous in phi, which is useful for taking derivatives, and what you end up with is this
flow. If you know the level set formulation, the standard flow one gets is a hyperbolic
equation, d(phi)/dt = -F |grad(phi)|, which I don't want to get into, but it's the standard
conversion from a curve evolution to a level set evolution. The new term is this topological
control term, and what's
interesting here is basically you have a sum of delta functions around the critical points and this
thing is equal to zero or one depending whether that critical point is relevant or not for the
computation. And that's the new term. That drives everything. So let me show you some
pictures of what you get with this.
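For reference, one explicit Euler step of the standard part of this flow, d(phi)/dt = -F |grad(phi)|, without the new topological term (which needs the critical-point machinery), might look like this; F is taken constant for simplicity, and the grid and radius are made up.

```python
import numpy as np

def level_set_step(phi, F=1.0, dt=0.1):
    """One explicit Euler step of d(phi)/dt = -F |grad(phi)|."""
    gy, gx = np.gradient(phi)
    return phi - dt * F * np.sqrt(gx ** 2 + gy ** 2)

# Signed distance to a circle of radius 5; with F = 1 the zero level
# set moves outward along the normal by roughly dt per step.
n = 41
y, x = np.mgrid[0:n, 0:n] - n // 2
phi = np.sqrt(x ** 2 + y ** 2) - 5.0
phi1 = level_set_step(phi)
```

For a signed distance function |grad(phi)| is about 1, so phi drops by about dt everywhere and the zero level set expands slightly.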
So here are some toy examples; we'll see some more real examples in a second. Here is what's
called the geodesic active contour. It's basically like a fancy snake. You can see
there are three components there. And here, when you add in topological control, you get one
component, and what we've changed here is we've merged components. Now
that's not the only way to change the topology. Let me show you another example. Here's an
octopus. And here's the octopus with topology control. And here we have a different situation.
If you look at the green, excuse me, the blue rectangles, what we've done there is we've
actually done something different which is to remove holes. That's not the same as merging
components. It’s different. I’m actually taking away holes. But we've actually done a third
thing too, which you can see over by the green parts, and that's we've torn handles. So this
handle here has actually been torn and now you get something which has the correct topology
and not all these extra handles. Here's another example, and you can see similarly a lot of hole
removal and some handle tearing.
Here some pictures of brains. This is 3-D now, although I didn’t talk about 3-D, but you can do
it for 3-D. That's the Chan-Vese flow, which is different from geodesic active contours: it
basically tries to segment by keeping the interior of each of the two parts sort of uniform. So
the inside should have, if you like, a certain gray level or a certain distribution, and the
outside should have a different distribution. And there you can see that you don't get the
topology of a sphere: you can see these handles here and here. It's maybe a little hard to see.
And now what I'm going
to show you is something you can get by doing this, and you’ll notice that's a little hard to see,
but you don't get those handles anymore. The topology is fixed. But we added other artifacts:
it became sort of grainy. That was actually due to the fact that there are many almost-critical
points and they all contribute. It didn't look good there. In terms of the 2-D slices it does
look a lot better, but that was a subject of ongoing investigation by Chao, to see whether we
could improve it.
In the meantime, though, he said, and this appeared in CVPR a couple of years ago with Chao and
Christoph Lampert, can we do this with GrabCut? This would maybe be more interesting. And this
is harder to do because GrabCut is a sort of global optimization; how do we introduce this
there? Everything before was local. So the way we do this, remember
you do GrabCut to get a segmentation, I'm talking about min-cut, really. So you label some
points, we used examples from the graph cut database: you label some interior points and some
exterior points, and you let the thing run and give you the best
segmentation. So how do we do this here? We run the whole thing once, just the
regular min-cut, and then we take the inside and we say to ourselves, okay, let's readjust the
unary term. We'll readjust the unary term based on having run our topology machinery: the first
cut on its own gives you something which isn't quite the right topology, so we adjust the
topology of that, fold it into the unary term, and rerun the whole thing.
And you can show some very, very simple theorems on what this
guarantees. It doesn't guarantee everything. Chao's sort of a theory guy, so he likes to prove
everything, and he was disappointed he couldn't prove everything here. But it does give us some
nice pictures.
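A crude sketch of the two-pass idea: run the regular cut once, "fix" the topology of the result, and fold it back into the unary term before re-running. Here the topology fix is just "keep the largest connected component", a stand-in for the actual robustness machinery, and the min-cut itself is assumed to come from some graph-cut library; only the unary adjustment is shown.

```python
import numpy as np
from scipy import ndimage

def topology_adjusted_unary(unary_fg, first_pass_mask, strength=5.0):
    """Bias the foreground unary toward the largest component of the
    first-pass mask (the stand-in "topology fix")."""
    labels, k = ndimage.label(first_pass_mask)
    if k == 0:
        return unary_fg
    sizes = ndimage.sum(first_pass_mask, labels, range(1, k + 1))
    keep = labels == (1 + int(np.argmax(sizes)))
    adjusted = unary_fg.copy()
    adjusted[keep] -= strength                      # cheaper to stay foreground
    adjusted[first_pass_mask & ~keep] += strength   # penalize noise pieces
    return adjusted

mask = np.zeros((10, 10), dtype=bool)
mask[1:6, 1:6] = True                          # main object from pass one
mask[8, 8] = True                              # spurious noise component
adj = topology_adjusted_unary(np.zeros((10, 10)), mask)
# A second min-cut would then be run with `adj` as its unary term.
```

The point of routing the correction through the unary term, rather than hard-constraining the cut, is that the second pass can still trade off the topology prior against the data.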
So here's the standard example. Here is an image and there is the tri-map; in the tri-map, the
idea is that you know this part is inside, you know that part is outside, and you have to compute
the rest, this black part. And so min-cut gives you
this. And we get the proper topology here, and it's very good at finding these sort of thin
structures, but also at filling in this sort of piece, or several pieces, and again
here. Christoph did a very in-depth piece of work actually measuring the performance
rather than just showing pictures. I like pictures.
Okay. So that's it for topology. And, like I said, if you find that interesting I'd be happy to point
you in other directions. Now I want to switch to some things I've done recently, in the
last year.
>>: We’re kind of running out of time.
>> Daniel Freedman: Oh, are we? It's an hour not an hour and a half?
>>: Yeah, it’s more like an hour.
>> Daniel Freedman: More like an hour. Okay. How about we’ll do 3-D sensing in five minutes
and we’ll skip the last part. So 3-D sensing, and I can’t tell you too much anyways due to the
NDA so this is the problem, this is joint work with Eyal Krupka, who’s our lab, who is our
manager, Yoni Smolin, my intern, and Ido Leichter. So time of flight sensors sends out some
light and measures how long it takes until it comes back. So here's the picture. We have, the
little thing on the left is the camera, the little thing on the right is the piece of surface. I sent
out a beam, measure it coming back, and now I have the time. And of course, what I end up
doing is not measuring time, but I measure phase. And the time of flight sensor does this for all
directions within the field of view. And that's how you get your picture. Now, in fact what you
get is an integral over time of the received signal, to get rid of noise.
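The phase-to-depth conversion he describes is the standard one: the modulated light comes back shifted by a phase proportional to the round-trip time. A minimal sketch (a textbook formula, not the sensor's actual pipeline; the 80 MHz modulation frequency is just an illustrative number):

```python
import math

C = 299_792_458.0  # speed of light in m/s

def depth_from_phase(phase_rad, mod_freq_hz):
    """Depth from the measured phase shift of the modulated light:
    round-trip time = phase / (2*pi*f), depth = c * time / 2."""
    round_trip = phase_rad / (2.0 * math.pi * mod_freq_hz)
    return C * round_trip / 2.0

# e.g. a quarter-cycle phase shift at an assumed 80 MHz modulation
print(round(depth_from_phase(math.pi / 2, 80e6), 3))  # 0.468 (metres)
```

One thing this makes visible: phase wraps at 2π, so the unambiguous range is c/(2f), just under 1.9 m at 80 MHz.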
Multipath. So this is the problem we tackled. More than one path arrives at the sensor along
the same ray. And so there's the camera, there's the surface, that's what we had before. Now
let’s add in a floor. And now, I could have another path that went like that, and of course that
would come back along the same ray, and I don't really know which one's which; the blue one is
correct, the other one is incorrect. The problem is that multipath causes trouble
because the depth is based on the time of flight, and I don't get either time of flight;
in fact what I get is some sort of weird mixture of both paths together. Now this causes major
problems. So our idea was to diagnose and hopefully eliminate it. And we wanted to take into
account many kinds of multipath, not just the simple kind, and the idea was to do it with a
theoretically well-justified yet lightweight algorithm. Why lightweight? Because we really have no
time to do this. This isn't just real-time, this is like crazy real-time with almost no computational budget.
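The "weird mixture" has a precise form: at the modulation frequency each path contributes a phasor (an amplitude and a phase), and the sensor only observes their sum, so the recovered phase belongs to neither path. A minimal sketch with made-up amplitudes and phases:

```python
import cmath

def mixed_phase(paths):
    """Sum the per-path phasors (amplitude, phase) and return the phase
    the sensor would report for the combined return."""
    total = sum(a * cmath.exp(1j * p) for a, p in paths)
    return cmath.phase(total)

direct = (1.0, 0.5)   # amplitude and phase of the true (blue) path
bounce = (0.6, 1.5)   # extra path via the floor: longer, so larger phase

p_true = mixed_phase([direct])           # phase of the real path alone
p_seen = mixed_phase([direct, bounce])   # what the sensor actually reports
print(p_seen > p_true)  # True: the bounce drags the phase, and hence the
                        # estimated depth, away from the correct value
```

The reported phase lands strictly between the two paths' phases, which is exactly why neither time of flight is recovered.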
Now what's Lambertian multipath? Just very quickly: an ideal Lambertian surface gives off an
infinitesimal amount of light equally in all directions, so we have a picture like that. If you
have one infinitesimal amount coming in, that's not going to bother you, but you have an
infinite number of infinitesimal amounts, and you add those up and then you get something.
So what do you get? Something like this. Let's say they're all bouncing off of different points
of the same nearby surface, each one giving off a little bit in that direction, and you end up
with this very ugly smeary thing which looks like this. So this is the regular two-path. Here's
intensity versus distance, and here's what you get with Lambertian: something that's sort of
smeared out like this. And then, of course, you can get three-path, or two-path plus
Lambertian, whatever is relevant for the surface. And that's what I
wanted to tell you. This is a hard problem, an interesting problem, and we have a nice solution,
which is great, and I can't tell you too much about it, but that's it. Thank you.