>> Nikhil Devanur Rangarajan: Hello and welcome. It's my great pleasure
to introduce Vijay from Georgia Tech. Vijay was my Ph.D. advisor, and
he's very well known for a lot of his work: algorithms, exact and
approximation algorithms, complexity, algorithmic game theory and so
on. And today he is going to talk about, I think, what is his favorite
research: Micali-Vazirani algorithm for finding maximum matching in
general graphs. So this -- I think this algorithm was done a long time
ago. It was when Vijay was a Ph.D. student. He was a first-year
graduate student, and I think Micali was in his second year. So an
amazing algorithm, beautiful, and Vijay is going to tell us why it's
even more beautiful than we think it is.
>> Vijay Vazirani: Thank you all for coming. It's really an honor to
give this talk in front of such a distinguished audience. Yeah, so as
Nikhil was recalling some incidents from the last millennium which seem
like prehistoric times now, let me say a few more words about it.
So in the fall of '79 I started my Ph.D. in Berkeley. And I think the
first class I walked into was Dick [Inaudible]'s [inaudible]
optimization class. And within a few days of the class I met up with
Silvio Micali, and we became the best of friends. And within a few
weeks we started working on this algorithm. And basically we didn't
know how hard it was. We were too naïve to understand what we had
started working on. And basically we put all our eggs into this one
basket 24/7 for the rest of the year missing out on courses and
everything else. And after many ups and downs, lo and behold, it got
done by spring of 1980 and was submitted to FOCS where it got accepted
which was a big surprise for us. And we had both planned a trip to
Sicily in the summer, but we took along those, you know, big sheets of
paper on which you had to paste the FOCS paper. It was all typed up in
Sicily and sent over to the U.S. And then we were so exhausted with the
whole thing that we just gave up on it for many, many years.
So I'll pick up on this story somewhere later in this talk. But let me
start with matching itself which, as you all know, occupies a very,
very central position in the theory of algorithms. In almost every
realm of algorithms that we know of matching has been one of the first
problems to be solved and has almost always yielded some very powerful
tools and techniques and sometimes entire paradigms for the entire
field of algorithms.
And among the paradigms, I'll list a few here. So Kuhn gave the very
powerful primal-dual paradigm while solving bipartite weighted
matching. Edmonds gave the definition of polynomial [inaudible]
solvability itself while [inaudible] non-bipartite matching. Valiant
gave the definition of sharp P while finding the complexity of
computing the number of perfect matchings in a bipartite graph, or the
zero-one permanent. And in work done while I was a post-doc at Harvard
in '83-'84, we gave the equivalence between random generation and
approximate counting for self-reducible problems. And this was what
lies at the core of what is today an entire field, or an entire area
namely the MCMC method. I think my battery has finally given up on me.
Is there a pointer? There must be, right? Ah, here. Yeah?
>> : Yeah.
>> Vijay Vazirani: Okay, good. Yeah, excellent. Okay. So but this is
not -- And let me just say that wherever you turn in algorithms,
whether it's randomized algorithms [inaudible] famous algorithm by
plugging in random numbers which, you know, made us aware of the Schwartz
lemma, or it's online algorithms, or it's parallel algorithms, or it's
zero-one phenomena in [inaudible] in any graph or it is [inaudible],
time and time again it's matching that gave us paradigms always for our
field.
And in my work in game theory and economics, I realize that matching
has given paradigms in that field also. The Gale-Shapley stable
marriage algorithm is a key algorithm that studies how to obtain
solutions in the core in polynomial time. The Shapley-Shubik result on
double-sided matching markets is again a very basic one. And in the
Internet age, building up on previous work on online bipartite matching
we gave an algorithm for the adwords market. And this has spawned off a
huge amount of research on this very important question which supports
almost single-handedly the revenues of Google.
So at this point let me start. This is enough of philosophizing. I
think Lovász's book makes many attempts at trying to say why this has
been the case, why matching has been so central. But let me get on to
absolutely nuts and bolts from now on and into actual matching.
So just to get everybody on board, let's even define matching. So given
a bipartite graph on bipartition UV -- By the way, so a set of edges is
a matching if they don't share any vertices. So like these 2 red edges
are a matching. And I'll not mark the saturated vertices specially, but
I'll always draw unmatched vertices with these big blobs. The saturated
vertices are clear where they are. And a matching of maximum cardinality in a graph
is called a maximum matching, and the maximum cardinality may be only
one. Like in the star graph there, you cannot pick a second edge
without meeting at a vertex namely the center.
An alternating path with respect to a matching M is just a simple path
consisting alternately of matched and unmatched edges, like this 3-length path. And an augmenting path is an alternating path between 2
unmatched vertices like 1, 2, 3, 4, 5, 5-length path. It obviously has
to be odd because it starts and ends with unmatched edges. And the
important property of such a path is that if you flip unmatched and
matched edges on this, you still get a valid matching. And it's
[inaudible] cardinality because like here there are 3 unmatched edges
and 2 matched edges, and when you flip you get 3 matches and 2
unmatched edges. And then you can find another augmenting path here and
flip matched and unmatched edges there. And that's a perfect matching;
you cannot do better than that obviously. And that's how combinatorial
matching algorithms operate. Here's a non-bipartite graph that's a
matching with these 2 unmatched edges, and there's an augmenting path.
We flip and we get a maximum matching which again here is a perfect
matching.
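To make the flipping step concrete, here is a minimal sketch (an illustration added here, not code from the talk, under an assumed representation: the matching is a dict mapping each matched vertex to its partner, and the path is a list of vertices starting and ending at unmatched vertices):

```python
def augment(matching, path):
    """Flip matched/unmatched edges along an augmenting path, in place."""
    # Remove the old matched edges (every second edge on the path).
    for i in range(1, len(path) - 1, 2):
        matching.pop(path[i], None)
        matching.pop(path[i + 1], None)
    # Add the previously unmatched edges; the matching grows by one edge.
    for i in range(0, len(path) - 1, 2):
        matching[path[i]] = path[i + 1]
        matching[path[i + 1]] = path[i]

# Example: the 5-length path 0-1-2-3-4-5 with matched edges (1,2), (3,4)
# turns 2 matched edges into 3.
matching = {1: 2, 2: 1, 3: 4, 4: 3}
augment(matching, [0, 1, 2, 3, 4, 5])
assert matching == {0: 1, 1: 0, 2: 3, 3: 2, 4: 5, 5: 4}
```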
So as I said all maximum matching algorithms combinatorial ones go this
way. There are -- Thanks to Lassey's paper there are very beautiful
algebraic algorithms now based on finding inverses and determinants of
matrices but less [inaudible]. So you start with empty matching. While
there's an augmenting path with respect to M, find any one and augment. And
when there are no more augmenting paths, this will be maximum. That's a
simple proof. So, of course, this cannot go on more than n over 2 times
before you get maximum matching. So order n iterations. It's very, very
simple to implement this in order m time where m is the number of
edges. Okay? So this gives you an order mn algorithm. But it's highly
wasteful because as you can imagine that right in the beginning all you
get is 1-length paths and you are paying an entire order m time for
that.
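A hedged sketch of this generic scheme, using the augment routine sketched above; find_augmenting_path is a placeholder standing in for any order m alternating search, assumed here rather than specified in the talk:

```python
def maximum_matching(graph):
    """Generic scheme: repeatedly find any augmenting path and flip it."""
    matching = {}
    while True:
        path = find_augmenting_path(graph, matching)  # assumed O(m) routine
        if path is None:
            return matching  # no augmenting path left => matching is maximum
        augment(matching, path)  # flip; at most n/2 augmentations in total
```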
Right? Initially you should take an unmatched vertex and your
augmenting path will be just a simple edge. Right? So this is highly
wasteful. So what we should do is go to this better scheme which is,
for now, let's just say that we always augment along the shortest path.
Always augment along a shortest path with respect to the current
matching. And if so, the length of the path will be non-decreasing. And
so first we'll get some 1-length paths and then we'll augment along 3-length paths and along 5-length paths and so on.
And moreover, there are more properties here. If this is the matching
after these augmentations, these 4 augmentations, then not only is this
an augmenting path with respect to M symmetric difference p but in fact
this is an augmenting path with respect to M itself. So each one of
these paths of the same length must be all disjoint, is what I'm trying
to say, from each other.
Okay, so these paths of the same length are all disjoint, vertex
disjoint, and they're all augmenting paths with respect to the matching
here. Again, these 2 are disjoint, and they are both augmenting paths
with respect to the matching here. Okay? And in fact this set will be a
maximal set of minimum-length paths; maximal in the sense that you
cannot add another path of the same length into this set, a disjoint
one. And, okay, is that clear?
So that gives us the following scheme. With respect to the current
matching, find a maximal set of disjoint minimum-length paths all at
once in one iteration. Augment along all of them -- they're all
disjoint; you can augment along all of them -- and then go to the next
iteration. What's the advantage? So let's call this a phase. And that
advantage is that only order square-root of n phases are needed even
for general graphs.
And it's a 2-line proof basically. You carry this process for square-root n phases, okay, and look at the matching at the end of square-root
n phases. And take its symmetric difference with maximum matching.
Every path there must be at least square-root n long, so there can be
at most square-root n such paths. So the deficiency at this point is
only square-root n, so you'll need only square-root n more phases.
Okay?
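In symbols, the 2-line argument just given reads as follows, with M the matching after square-root n phases and M* any maximum matching:

```latex
\[
  |M^\ast| - |M| \;\le\; \#\{\text{augmenting paths in } M \oplus M^\ast\}
  \;\le\; \frac{n}{\sqrt{n}} \;=\; \sqrt{n},
\]
```

since the augmenting paths in the symmetric difference are vertex-disjoint and each has length at least square-root n; so at most square-root n further phases finish the job.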
And in bipartite graphs it is fairly easy to come up with an order m
algorithm per phase for finding all of these paths at once. So, again,
we are not finding a maximum set of disjoint paths, just a maximal set
of disjoint paths. Right? Otherwise, we would be done in phase one
which is maximum set of disjoint paths and phase one would be a maximum
matching obviously. That's what we have set out to do.
Okay. So as you can notice this work was done simultaneously on both
sides of the Iron Curtain. I don't know how that happened but those are
the dates. So the big open question here was how to do it for general
graphs, and that's what got done in 1980, that incident that I
described to you.
So this was our claim in that paper, in the FOCS paper. It turns out
that there were almost no proofs in that paper and hardly any
definitions either so -- well, a few definitions but many more were
needed. And so 10 years later I took up this task of proving it. And it
took me more time than was needed to find the algorithm. It
took about 2 years to find the proofs. And all claims but one got
proven at that point. And I'll tell you which claim it was, but in the
meantime Gabow and Tarjan had already given a data structure that would
ensure linear time. Okay, otherwise, with the technology of 1980 this
should've had an [inaudible] function there.
But there's still an open question there, and I'll say more about it
much later. Okay, so I'm going to jump right into algorithms now, and
please stop me any time because everything is built vertically up from
now on. So we start with the bipartite case. And with respect to the
current matching, let's define the level of a vertex to be the
length of the shortest alternating path from an unmatched
vertex on the U side to the vertex v. Okay? So for instance here there
are 2 unmatched vertices on the U side and obviously the level of these
2 will be zero. The level of this guy will be 1, 2, 3 -- Oh, no. Not 3.
One also -- and so on. Right? One, 2, 3. So the unmatched vertex here,
having the smallest level will be the length of the augmenting path.
And let's just look into the problem of assigning levels to all
vertices. And obviously what needs to be done here is an alternating
breadth-first search starting from all vertices on the U side. And at
level i, we look from all level i vertices to find level i plus 1
vertices along the edges of appropriate parity.
Okay. So here's the process. We start with all the unmatched vertices
on the U side. This is not for that graph but in general. And at
search level zero we look along unmatched edges and give these guys level
1. And then at search level 1, we look along matched edges to give these
guys level 2, level 3 and so on. Now one more thing we do is that all
the vertices that assign this guy a level, they'll become predecessors
of this vertex. Okay, so every vertex has a set of predecessors. So in
the case of this vertex, there'll be only one predecessor. But in the
case of these odd level vertices, there could be many predecessors. But
we need to keep track of all those. But that can be done really fast,
you know, in the same time that you scan along this edge. You can plug
yourself into this linked list.
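A sketch of this alternating breadth-first search (assumed representations, not the paper's code: adj is an adjacency-list dict, matching maps matched vertices to their partners, U is the left side of the bipartition):

```python
from collections import defaultdict

def alternating_bfs(adj, matching, U):
    """Assign levels and predecessor lists, bipartite case."""
    level = {u: 0 for u in U if u not in matching}  # free U-vertices: level 0
    preds = defaultdict(list)
    frontier, i = list(level), 0
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                # even search levels scan unmatched edges, odd ones the matched edge
                if (i % 2 == 0) == (matching.get(u) == v):
                    continue
                if v not in level:
                    level[v] = i + 1
                    nxt.append(v)
                if level[v] == i + 1:
                    preds[v].append(u)  # every vertex assigning the level
        frontier, i = nxt, i + 1
    # (the real routine stops at the first level containing unmatched
    # V-vertices; running to exhaustion is kept here only for simplicity)
    return level, preds
```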
Okay. So 3 give rise to 4. Can such an edge happen? No because this is
an odd cycle. We are in the bipartite case. Can such an edge happen?
Yes, this is only an even cycle. But what good is this edge? If you
were to use this edge, you'll find a longer path than if you went
directly. So we might as well forget about this edge. So the picture
looks rather simple. Okay, and at some point we will hit the unmatched
vertices on the V side and that'll be the length of the minimum
augmenting path. And at that point we don't need to search any more. We
can drop the rest of the search. And at this point we reap the benefits
of the search which is that we actually find augmenting paths. So we
pick an unmatched vertex here and pick an arbitrary predecessor at each
step and throw away this path. We're going to augment along this path.
We can throw it away because we want maximal set of disjoint paths, so
everything else has to be disjoint from this, so we can throw it away.
And since we throw this away, we can also throw away all edges incident
at these vertices, obviously. Right? And this vertex. And now look this
guy has no more predecessors. So if we came down here, we'll never be
able to complete that into a path. So there's no reason to keep this
vertex. Right? It'll never be useful. And in this manner we throw away all
of that, so all of this is linear time in the number of edges. Okay?
And at this point we pick any other unmatched vertex and trace down
predecessors. And we got 3 -- we get the second path.
So what we got is these 2 disjoint paths even though the graph really
had 3 disjoint paths. But we don't want a maximum set of disjoint
paths; we want a maximal set of disjoint paths, and we managed to get
it in order m time. Yeah, you have a question. No? Okay.
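A hedged sketch of this reaping step (simplified: here a trace that dead-ends just abandons its free vertex, whereas the routine described above deletes dead vertices and their edges on the fly, which is what makes it linear-time):

```python
def extract_disjoint_paths(free_right, level, preds):
    """Trace predecessors from each free right-side vertex, keeping paths disjoint."""
    used, paths = set(), []
    for v in free_right:
        path, cur, ok = [v], v, True
        while level[cur] > 0:  # walk down predecessors to level 0
            cur = next((p for p in preds[cur] if p not in used), None)
            if cur is None:  # all predecessors already consumed
                ok = False
                break
            path.append(cur)
        if ok:
            used.update(path)  # "throw away" the path so later ones are disjoint
            paths.append(path[::-1])
    return paths
```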
So general graphs. All right. Okay. One thing to notice right away is
that now there is no U side and V side, so we just have some unmatched
vertices. And vertices can have paths of both parity. So this guy has a
path of length 4 and a path of length 5. Right? And we are, of course,
looking for minimum-length paths so we want minimum length paths from
an unmatched vertex. So let's give a few definitions.
So with respect to the current matching define the: even level of a
vertex to be the length of a minimum even length alternating path from
an unmatched vertex, any unmatched vertex, to v; odd level to be the
length of the minimum odd length alternating path from an unmatched
vertex to v -- Right? -- min level to be the smaller of the 2; and max
level to be the bigger of these 2.
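In symbols, one rendering of these four definitions:

```latex
\begin{align*}
\mathrm{evenlevel}(v) &= \min\{\,|p| : p \text{ an even alternating path from a free vertex to } v\,\},\\
\mathrm{oddlevel}(v)  &= \min\{\,|p| : p \text{ an odd alternating path from a free vertex to } v\,\},\\
\mathrm{minlevel}(v)  &= \min(\mathrm{evenlevel}(v), \mathrm{oddlevel}(v)),\\
\mathrm{maxlevel}(v)  &= \max(\mathrm{evenlevel}(v), \mathrm{oddlevel}(v)).
\end{align*}
```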
Okay? So here for instance the min level of this guy is 4. Its max
level is 5. And its even level is 4, and its odd level is 5, and so on.
So we have to define 2 different levels for the 2 parities. And this is
the big difference between the bipartite and non-bipartite case: in the
bipartite case, minimum length alternating paths are breadth-first
search honest in the sense that if I took a minimum alternating path
from f to some vertex v and u sits on it then this is the minimum path
from f to u itself. There cannot be a shorter path. Not so in
non-bipartite graphs. Why? Because maybe v or some other vertex on this
path appears on the minimum path from f to u. So if you combine that
with this path you get a self-intersecting path, and so you need a
longer path from f to u to find the minimum path from f to v.
And the reason it can sit over here without contradicting the fact that
this is the shortest path is that it sits here with opposite parity.
And let me give you an example right away. So the odd level of this
vertex is 7. One, 2, 3, 4, 5, 6, 7. But that doesn't help you find an
odd level to this or an even level to this because those will be self-intersecting paths. But if we arrive here with a path of length 9
then we can come down here and find an even level to this guy.
And if we come here with a path of length 11 to v then we can find an
even level to b. Okay? And this is not just an academic exercise, you
know, we are not doing this for fun because, after all, there may be an
augmentation right here. And so we need to find the even level of b to
find this unique augmenting path in this graph.
Right? So what's the moral of the story? The moral of this story is
that we need to find longer and longer alternating paths to v just in
order to find the minimum alternating paths to some other vertices. Now
what's the complexity of finding long paths? Suppose I want a
Hamiltonian path or half a Hamiltonian path. Okay. That's NP-complete. So
it looks like this should take exponential time. So how on Earth are we
going to do it in linear time? Right? Well, because of the incredible
structure of matching and these blossoms. And that's the whole story
here. So let me jump into that part of the whole story.
So again we start with this alternating breadth-first search and mark
predecessors as before. And now this edge can happen, right? We are
non-bipartite so there could be odd cycles. What do we do here? So
let's see what Edmonds would have done here. What Edmonds would've done
is that he would have collapsed these 5 vertices into 1 vertex all at a
macronode. And all the edges incident at these which are all unmatched
because, you know, all the matched edges are -- this thing is fully
matched to the extent possible. So all these edges are unmatched. So
the macronode basically says the following, you know, "Think of me as 1
single node. Find an augmenting path. Once you are done I will do the
right thing inside this. I will manage the insides of this macronode,
and I will produce to you a correct augmenting path -- alternating path
inside the original graph." So if you come in here the manager of this
blossom will say, "Oh, do this." If you come in here, it'll say, "Do
that." Right? And then of course blossoms can get [inaudible] and all
that but that's the whole story. And you can see that it should not
take more than [inaudible] and that was it.
But of course if you collapse these into 1 node, you lost all the
length information. How are you going to find minimum length augmenting
paths in this situation? So the right thing to do here is to somehow
give max levels to these 4 vertices and not more. Not to this guy. Just
to these 4 vertices. And if you don't then, for instance, at search
level 6 you'll not find this 7 and you will miss out on some augmenting
path. Right? So I will call this edge a bridge and this the base of the
blossom. I'll redefine blossom completely from this perspective but
it'll take some work to get to the definition. But I can -- I'll really
define a bridge. So I'm going to say that an edge u, v is a prop if
u is a predecessor of v. And any edge that's not a prop, I'll say that
it's a bridge. Okay, so for instance here all these edges are props and
this edge is a bridge. More or less in all these figures, edges that
are vertical or almost vertical will be props, and edges that are
horizontal typically will be bridges. Okay?
And this way of drawing the figures is really -- I mean you can get
led into theorems already without proving anything because you almost
see that it's true. But that's fallacious. Anyway, so what is a base
now? So in this case this is a bridge. What is a base? The base is the
-- Well, follow predecessors from both these end points until you come
to unmatched vertices and look for bottlenecks and pick the highest
such bottleneck and call that the base. Okay? So for instance here,
follow from here and here following predecessors. You came down here.
Predecessors. Came down here following predecessors and you have to go
through this vertex. And this is the highest such vertex that you have
to go through. So this is the highest bottleneck, and we'll call that a
base. In this case, however, I just added some more edges to that
previous figure. This is not a base anymore because we can come from
here all the way here and from here all the way here in a disjoint
manner. So the only base is this one. Okay? So that's the base. And in
fact this "blossom" should be giving max levels not only to these 2
guys but also a max level of 7 to these guys and 8 to these guys and of
course no 9 here because that would be self-intersecting.
So that's the base that we need to find inside this blossom. Okay? So
let's list what we want to find, what we hope to find through some
process. We want to find this "blossom" by identifying the base which
we want to be the highest bottleneck with respect to predecessors.
Okay, but now look at this particular bridge of 5. We can follow
predecessors here. We can follow predecessors here. And the base is
this. But there's already a blossom here, right, which was found
earlier. And if we come here, we don't want to traverse this whole path
one edge at a time. In particular this could be a very, very long path
and we may have to do this many, many times. We don't want to keep
going through this path, otherwise, we'll lose the linear time part of
the algorithm. So somehow we want to be able to skip blossoms, the
previous blossoms. Okay?
And all of this is accomplished by this new procedure called double
depth-first search which was discovered 32 years ago. But I still call
it new because nobody knows about it. So I'm going to show it to you in
a very simple setting of a layered graph. Okay? So in this layered
graph I'll assume that the endpoints of the bridge are a and b. So
there are layers all the way from the highest to the lowest layer being
zero. And the layers correspond to min levels of the vertices in the
previous graph. And a and b are at the highest layer. And edges go from
high layers to lower layers skipping layers which corresponds to
skipping nested blossoms. And we want to accomplish these 2 tasks. And
the time that we have allowed for this is linear in the number of edges
above the base because the stem after the base could be very long and
we don't want to keep running over that over and over again. Okay?
So here, again, is the whole picture. We have this layered graph. At
the highest level are these 2 vertices a and b, and a is red and b is
blue. And every edge skips down. And we want to find the highest
bottleneck. So look at all the paths from a to layer zero and b to
layers zero. Is there a vertex that they all have to go through? In
this case of course it's this guy. Right? And there could have been
more but this would be the highest even if this path went down.
Whatever. And as you can see some edges are long but some edges are
only length 1. So we start with the -- So this double depth-first
search just to break the ice, red will stay one step ahead of blue. So
red starts.
>> : That way the long edges are skipping blossoms?
>> Vijay Vazirani: Yes. And you'll see how in the next figure what
exactly -- But that's what they're supposed to represent. Okay, so I'm
just defining this whole process on an arbitrary layered graph of this
kind. And later we'll see how to patch it to the original graph. So red
starts first. Blue catches up to be at the same layer as red. Red goes.
Blue goes. And they're met up. Question: is this a bottleneck? Okay, so
the first thing that blue does -- at first blue backtracks to find
something as low as this. So backtracks. Are there any edges out of
this? No. Then backtrack. Any edges out of this? Yes. If so, take it.
Take an edge. Now they are both at the same level, so they proceed with
the search. Red goes here. Blue goes here -- Oh, blue goes below red so
red has to catch up. And, oh, look they met again. Is this a
bottleneck? So blue says, "I'm going to backtrack first." So backtracks
here. No more edges out of it. Backtracks. No more edges. Backtracks.
All the way to b. That means all of this has been searched from and
nothing out of these vertices. So it puts a barrier here saying, "Don't
backtrack beyond this point." And now red backtracks; it says, "Let me
try to find something as low as this." So backtracks. Nothing out of
this. Backtracks. Yes, there's something out of this. Go here. They can
start search again. Now blue says, "Okay, I'm going to backtrack first
and find something as low as this." Nothing out of this. Nothing out of
this. And encounters the barrier. That means no need to backtrack any
more. So moves the barrier here. Then red says, "Okay, I'm going to
look for something as low as this." Backtracks, backtracks, backtracks,
backtracks. Once red backtracks all the way to a, that's the end of the
story.
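For reference, here is a brute-force check for the task this double depth-first search solves on the layered graph -- not the linear-time DDFS itself, whose barrier bookkeeping is exactly what the walkthrough above describes. The names down (vertices one step lower) and layer are assumed representations:

```python
def reaches_layer0(down, layer, s, banned):
    """DFS: can s reach layer 0 while avoiding the banned vertex?"""
    stack, seen = [s], {s}
    while stack:
        v = stack.pop()
        if layer[v] == 0:
            return True
        for w in down[v]:
            if w not in seen and w != banned:
                seen.add(w)
                stack.append(w)
    return False

def highest_bottleneck(down, layer, a, b):
    """Highest vertex lying on every a-to-layer-0 and every b-to-layer-0 path."""
    for v in sorted((v for v in layer if v not in (a, b)),
                    key=lambda v: -layer[v]):
        if not reaches_layer0(down, layer, a, v) and \
           not reaches_layer0(down, layer, b, v):
            return v  # removing v disconnects both a and b: the blossom base
    return None  # no bottleneck: two disjoint paths exist (an augmentation)
```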
Okay. And this whole thing is linear in the number of edges of this
graph, not of the original graph. We'll come to that. That's a very
long story. And what we got here is a red tree and a blue tree. And
every red vertex has a red path from a to it. Every blue vertex has a
blue path from b to it. And there's a red path from a to the base, and a
blue path from b to the base. Right? Now what happens here? What's the
highest common bottleneck here for the bridge 5, 5?
[ Silence ]
There is no bottleneck because this guy can reach all the way to this
unmatched vertex and this guy can reach all the way to this unmatched
vertex. So there's no bottleneck but there's an augmenting path. Okay?
So we add to our wish list here one more. If no bottleneck then reach 2
unmatched vertices and the min aug path will be found. So I'm going to
change this figure a bit by adding these 2 edges.
And what happens is I'm going to take you back to the place where these
2 met. And now blue backtracks saying, "I want to find something as
deep as this." So nothing out of this? Yes, there's something out of
this. So they continue. And they found 2 different vertices at level
zero. Okay, so what we got is a bit more than what we asked for;
namely, we also got these properties. That for each red vertex there's
a red path from a to v and there's a red path from a to the base, in
this case. And in the other case we have a red path from a to one
vertex and a disjoint blue path from b to the other unmatched vertex.
And that gives us the augmentation.
>> : So [inaudible] vertex [inaudible]...
>> Vijay Vazirani: Time you meet it -- you reach it. Yeah, yeah.
There's lots of tie-breaking but it doesn't matter how you broke the
tie. All right? Is that okay? So all of this will happen in the
original graph where these would be predecessors or you skip blossoms,
and as you can see in Edmonds case, blossom is just one odd cycle at
the end of a stem which ends with a matched edge. In our case this
whole thing is a blossom, and there are all kinds of paths going from
the 2 endpoints all the way to the common bottleneck. So it could be
huge humongous thing. Okay? Okay. So, yeah. Here coming to this
requirement that we should be able to jump to the base of
the blossom. Well, let me say that one part of it is easy and one part
of it is still a mystery, an unsolved mystery. So let me say the easy
part. The easy part is, I say to you how do you make sure you can jump
from here to the base? Well, when you discovered this blossom you
create a new node here, the blossom node. It points to the base and all
the vertices of this blossom point to this so that when you came down
here, you jump directly here in 1 step -- or 2 steps.
Okay. And when the new blossom is created, it comes up with its own new
blossom node which points to its own base and all vertices including
this guy. This guy is in this second blossom. They all point to this
blossom node.
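A hedged sketch of this bookkeeping (names assumed, not the paper's): each new blossom gets a fresh node recording its base and its bridge, and every swallowed vertex points at that node, so a later visit can jump toward the base in a step or two:

```python
class BlossomNode:
    def __init__(self, base, bridge):
        self.base = base      # the highest bottleneck found by double DFS
        self.bridge = bridge  # the (u, v) bridge, kept for path recovery

blossom_of = {}  # vertex -> the blossom node it points to

def jump_toward_base(v):
    """One hop: a blossom vertex jumps to its blossom's base."""
    return blossom_of[v].base if v in blossom_of else v
```

For nested blossoms this hop may need repeating, which is exactly the set-union issue discussed near the end of the talk.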
Okay, now let's define a very central concept, that of the tenacity of
a vertex. So the tenacity of a vertex will be the sum of its even level
plus the odd level. Okay? So for instance these vertices are of
tenacity 9: 6 plus 3, 5 plus 4, 5 plus 4, 6 plus 3. Right? These green
vertices are all of the green tenacity, whatever that is. These purple
vertices are all of the purple tenacity. These yellow vertices are all
of the yellow tenacity. And purple is bigger than green and yellow is
bigger than purple. You're thinking about it. Okay. Okay.
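In symbols:

```latex
\[
  \mathrm{tenacity}(v) \;=\; \mathrm{evenlevel}(v) + \mathrm{oddlevel}(v),
  \qquad\text{e.g. } 6+3 \;=\; 5+4 \;=\; 9 \text{ for the four vertices above.}
\]
```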
Now let's come to this big difference about breadth-first search
honesty, and let me point out one of the big significances of this
notion of tenacity. Basically it helps us limit the amount of breadth-first search dishonesty, rather. [Inaudible]. So there's a limited
amount of breadth-first search honesty to the following extent. Okay?
Now I'm giving you a definition. So on this path, p, from f to v, we'll
say that u is breadth-first search honest on this path if this is the
correct length from f to u. And by correct I mean if this is even then
this is the even level of u, and if this is odd then this is the odd
level of u. So it's not occurring at a bigger distance than it should.
Okay? And the theorem is suppose u lies on an even level v or odd level
v path and has tenacity at least that of v then it must be breadth-first search honest on each such path. It cannot be at a wrong
distance.
So for instance let's look at any even level v or odd level v to these
green vertices. So like this guy is an even level path. This is an odd
level path. And notice that the purple vertices which are of tenacity
bigger than green are at the correct distance because this is the right
even level of this purple vertex. And the yellow vertices are also at
the correct distance. So it's only the vertices of lower tenacity on
this path that can be at a wrong distance. Okay? And that's important.
>> : Why can't you take a purple one then the green ones continue...
>> Vijay Vazirani: Right. So for instance if you want the even level to
this, the green guy is at a wrong level. This is more than the odd
level of this. This is the odd level of this. So the green guy which is
of lower tenacity is at the wrong length. Now I'm giving one more
definition. So let v be a vertex of tenacity t, and let p be any min
level or max level v path. So let's say this is a min level v path, p.
Then let H v, p be the highest vertex of tenacity bigger than t,
strictly bigger than t on p. So highest I mean [inaudible] from the
unmatched vertex. So go from here on this path from here to v, and
certainly some vertices are of higher tenacity, in particular this
fellow.
So look at the highest one, which is of bigger tenacity. And do the
same for the max level path to v. So this is the max level path to v. I
go from here -- I'm sorry -- this unmatched vertex to v, and I again
look for the highest vertex of tenacity bigger than t on this path.
>> : So all these are correct [inaudible]?
>> Vijay Vazirani: No. There could be vertices of very low tenacity
here which are at totally wrong levels. Right? But this is at the
correct level. Right? It's a good point. But, yeah, it's not as simple
as that. Okay? So this is a definition with respect to a given path. So
this is with respect to this path. This is with respect to this path.
But the point is that this set is a singleton: H(v, p), where p
is any min level v or max level v path. There's only one such vertex
that they all go through, all these paths go through and that of course
[inaudible] at the right distance. And we'll denote it by the base of
this vertex. Okay? And of course base v must be breadth-first search
honest on each such path.
So for instance here the base of these green vertices is purple. And
every even and odd path to them goes through purple. The base of the
purple vertices is yellow, this guy. And the base of the yellow
vertices is this white guy. So finally I can define the notion of this
blossom from this perspective of minimum length alternating paths. So
let b be any vertex of tenacity bigger than t. Then the blossom of
tenacity t with base b is all those vertices which have tenacity at
most t and base b.
Okay. So for instance for the greens -- Oh so sorry. The blossom of
tenacity green with base this guy are these 4 vertices. The blossom of
tenacity purple with this base is all the purple and green vertices.
The blossom of tenacity yellow with this base is all but this vertex.
Is that okay? So blossoms may not be as simple as this. Things can be
even more complicated. Like this bridge is of tenacity 9 and this is of
eleven. The blossom of tenacity green with this base are these 4 green
vertices. The blossom of tenacity purple with this base is the purple
and the green vertices. Sorry. The blossom of tenacity 9 with this base
is the green vertices, and the blossom of tenacity eleven with this
base is the purple and the green vertices.
Is that okay? You are thinking.
>> : So why are the green vertices part of that [inaudible]?
>> Vijay Vazirani: Because I just defined the blossom like this: any
vertex whose tenacity is at most t and whose base is b, is in the
blossom. So if I want the blossom of tenacity eleven with this base,
these guys have tenacity at most purple and they have base this guy, so
I have to say that these green and these purple are in this blossom of
tenacity eleven. It'll all make sense in a moment. I mean, everything
is going to come together slowly, slowly, what's the reason for all
this and....
But if there are any questions, I'll be happy to answer them. Okay.
Sorry. So this was vertex b. So this is -- Yeah, okay. Nevermind.
Okay, now this is a vertex v. Its base is this. The base of this base
is this. The base of this guy is this. Okay? So I'm going to call this
the base square of v; so this is the base of the base. And the base of
the base of the base of v is this, base cube of v. Okay? And here's a
theorem. Every even level v or odd level v path must go through the
base, base square or base cube and so on. And this is very important.
We'll apply all of these facts to the algorithm.
Anything? Okay.
>> : This order it has to...
>> Vijay Vazirani: Sorry?
>> : Does it have to go in this order?
>> Vijay Vazirani: It'll go through the order base, base -- Sorry.
Yeah, yeah, yeah. Yeah. It'll go through the biggest base and then the
next smaller until it hits the -- Yeah. Yeah. As you go to bigger and
bigger bases, you'll go lower and lower and higher and higher tenacity
until you go to the unmatched vertex. Now I'm going to define the
tenacity of an edge. Next. So if the edge is unmatched it is the even
level of u plus the even level of v plus 1, so that this whole
thing forms some kind of a walk. But we'll come to its significance in
a moment. If it is a matched edge then it's the odd level of u plus 1
plus the odd level of v. Okay?
So here in this graph remember these vertices were all tenacity 9? And
in fact these edges are all of tenacity 9 because like this edge has
tenacity 9 because it's 4 plus 4 plus 1. This edge is odd plus odd plus
1, so 5 plus 3 plus 1 and so on. Here, yeah, the edges are marked with
the color of their tenacity just to make the picture slightly
prettier. So these are the green tenacity edges. These are the purple
tenacity edges. These are the yellow tenacity edges.
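In symbols, the two cases just described, with M the current matching:

```latex
\[
  \mathrm{tenacity}(u,v) \;=\;
  \begin{cases}
    \mathrm{evenlevel}(u) + \mathrm{evenlevel}(v) + 1 & \text{if } (u,v) \notin M,\\
    \mathrm{oddlevel}(u) + \mathrm{oddlevel}(v) + 1 & \text{if } (u,v) \in M.
  \end{cases}
\]
```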
>> : So are they just finding the [inaudible] structure of the...
>> Vijay Vazirani: Hmm?
>> : Are they just finding this [inaudible]...
>> Vijay Vazirani: Blossom structure, yeah.
>> : Blossom structure.
>> Vijay Vazirani: Yeah. Yeah. Well, you'll see what's going on. This
tenacity of the edge is key. Okay, so let u, v be a bridge of tenacity
t. Then I'm going to call -- This is an informal definition right now.
I'm going to give you a formal definition a little bit later. So
informally the support of this bridge is all the vertices of tenacity t
found in the double depth-first search when you go from u and v down.
All the vertices of tenacity t that you encounter. And in some sense,
these are the vertices that define the new blossom -- with many grains
of salt thrown in. Okay?
So the bridge is responsible for assigning max levels to these
vertices. Okay? So the support of this green bridge is the green
vertices. The support of the purple bridge is the purple vertices. And
the support of the yellow bridge is the yellow vertices. Okay.
>> : So when you say the [inaudible]?
>> Vijay Vazirani: Well -- Well...
>> : [Inaudible]...
>> Vijay Vazirani: You will find it, yeah. If you go from the 2
endpoints, you will encounter all the vertices. We should get their max
levels right now. So what do I mean by right now? That is this thing.
So there are 2 main ideas in the algorithm. One is double depth-first
search which I already said what it is. But the other is this very
precise synchronization of events, and this is what needs to be
described next. So what is the algorithm? What it is doing is some kind
of an intertwined breadth-first search and a double depth-first search.
The breadth-first search step is no different from bipartite case. So
all it does is at search level i you search from all the vertices of
level i. You know, I'm talking about the right parity. Look around the
edges or right parity. Any vertex encountered that way which has not
gotten a level yet should get level i plus 1. And whether that's even
or odd depends on the parity. Okay? And then in this search level
itself, we will look at all bridges of tenacity 2i plus 1, and this'll
become clear in a moment why it should be 2i plus 1 and not anything
else. And we run double depth-first search on each of these bridges to
find their supports and give these guys their max levels. So it's
basically if this is breadth-first searches and these are double depth-first searches, it's all intertwined like this. One breadth-first
search, one double depth-first search or many double depth-first
searches. Then one entire breadth-first search and then one entire
sequence of double depth-first searches. Okay?
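A hedged, high-level sketch of this synchronization; every helper name here is a placeholder for a routine described elsewhere in the talk, not the paper's own code:

```python
from collections import defaultdict

def phase(graph, matching, max_search_level):
    bridges = defaultdict(list)   # tenacity -> bridges discovered so far
    init_levels(graph, matching)  # free vertices get min level 0 (assumed)
    for i in range(max_search_level + 1):
        # MIN: scan edges of the right parity from all level-i vertices,
        # assigning min level i + 1 and predecessors; edges that assign no
        # min level are bridges, filed in the bucket for their tenacity.
        search_level_min(graph, matching, i, bridges)
        # MAX: process exactly the bridges of tenacity 2i + 1 now, so that
        # every vertex of tenacity 2i + 1 gets its max level at this moment.
        for (u, v) in bridges[2 * i + 1]:
            double_dfs(graph, u, v)  # finds the support, forms the blossom
```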
>> : So these double depth-first searches of the fixed tenacity are
going to be touching disjoint sets of edges?
>> Vijay Vazirani: Yes. Good point. Good point. There's a lot to be
said about that. But the way we'll define blossoms which will be
slightly different from this [inaudible] definition because -- Okay. So
ask me this in a moment, and I'll clarify it completely. Okay, so the
purpose of this routine is to find all vertices having min level of i
plus 1. And this is straightforward. I mean, you know, 2 looks at 3, 3
looks at 4, and so on. Right? Now here comes -- When we search from 4
to find 5, we come across this edge. This edge is not going to give min
level to this guy. Oops. It is not going to give a min level to this
guy because it already has a min level. Okay? And this edge is not
going to give a min level to this guy, so this is not a prop. So what
is it? It's a bridge. What is its tenacity? We know it. It's unmatched,
the edge is unmatched, so it must be even plus even plus 1. So it must
be 9. Okay? So when should we do double depth-first search on it? Well,
the formula says 9 minus 1 over 2 which is 4. Okay? So we should do it
having done min on level 4, we should come here and do double depth-first search on that bridge. Okay? And these double depth-first
searches, their purpose is to find all vertices having tenacity 2i plus
1 and every one of them and give them their max level which will be
this tenacity minus the min levels.
Okay? So at search level 4, double depth-first search is done
here to find the support of this bridge, and these vertices get their
max levels. And that way at search level 6, this guy can label this 7
and so on. Now I claim that proving this is straightforward. There is
nothing different from the bipartite case. It's a straightforward
induction. If you've done the right thing so far, if you look from all
the vertices that have min level i along the correct parity, you will
find all the vertices of min level i plus 1. There's no mysteries
there. What about proving this particular claim, that we do find all
vertices having tenacity 2i plus 1? Well, first of all we have to show
that double depth-first search does find the support of the blossom -- oh, sorry, support of the bridge -- and assume that we have done this on
the side separately. The other thing we must do is to show that every
bridge of tenacity 2i plus 1 has been found, has been found by this
time. By the time min finishes all bridges of tenacity 2i plus 1 have
been found, they are in a list and we can start processing them one by
one. And when we are done processing all of them we would have found
all the vertices of tenacity 2i plus 1.
So let's see how hard this claim is. So here is an edge of tenacity
19. Obviously we would like to do double depth-first search on search
level 9. Let's see if its tenacity is known before search level 9. It
turns out that the tenacity of this one is 13, so at 13 minus 1 over 2,
which is 6, we know this even level. This one is 15, so at 15 minus 1
over 2, which is 7, we know this even level. So at search level 7
itself we know the tenacity of this
fellow, 19, so that's well before search level 9. And at search level 9
we can double depth-first search here and find these 4 vertices which
are of tenacity 19.
Okay? There's one more interesting example. What happens when this
vertex hits this vertex? Remember, in the bipartite case when a higher
level vertex hit a lower level vertex we said that this edge is useless
because why would you take such a path when you can take a shorter path
like this? Right? Should we just throw this edge out? And the answer is
no because we may be able to complete this blossom and thereby use this
edge going that-a-way and, you know, find something there.
Okay? So what is happening here? It turns out this even level is 8 and
this is a bridge of tenacity 13, 8 plus 4 plus 1. And let's see whether
it gets its tenacity by search level 6 when we should do the
double depth-first search. And it will because at search level 5 it will do
double depth-first search on this bridge and assign this 8 and so we
know its tenacity is 13. We keep it in the right bucket and at search
level 6 we double depth-first search on this bridge and find max levels
to these guys, all these guys. Right?
And in general -- I mean I gave you some examples just to provide
[inaudible] but here's the proof that we will have found all bridges of
tenacity 2i plus 1 by search level i. The reason is that pick any edge
u, v. We can show that both endpoints will
have tenacity at most the tenacity of u, v. And if the tenacity is strictly smaller
then we know both levels of v by the right search level. If the
tenacity of v is exactly equal to u, v and u, v is a bridge then the
relevant level is the min level of v which is known by the right search
level. So in either case, the two endpoints of the bridge are well
known before we need to process it so it'll be processed at the right
time. Okay, so we can cross this off. We have done this. All bridges of
tenacity 2i plus 1 have been found. And at this point I always stop to
ask if we are done. And the obvious answer to this is no. Are we done
in terms of proving this claim?
So I have sort of argued to you that double depth-first search will do
the right thing and that all the bridges of tenacity 2i plus 1 would be
found by this time. But what else is needed to complete the proof by
induction to show that all vertices having tenacity 2i plus 1 will get
their tenacity if you do all of this stuff, namely process all these
bridges at this time in the right way? What else do we need to show?
[ Silence ]
And I want to give a minute because this is the central point of my
second paper. And I've never obtained an answer ever on this point in
all the talks that I have given so far.
[ Silence ]
>> : You haven't obtained the correct answer.
>> Vijay Vazirani: Sorry?
>> : The correct answer.
>> Vijay Vazirani: No, not even an answer.
[ Silence ]
I mean, if you want to show that every vertex of tenacity 2i plus 1,
it'll be encountered by just doing a double depth-first search on all
these bridges. What must we show?
[ Silence ]
So what we must show is that every vertex of tenacity 2i plus 1 lies in
the support of some bridge of tenacity 2i plus 1. How do we know that?
Well in the case of min for instance, we know that if somebody has min
level of i plus 1, there must be a vertex of level i right next to it.
Right? So the culprit is adjacent to this, you can blame it. But in
this case the culprit is far away. Who knows where it is or whether it
is or it is not. And the point is that it is and the theorem is that on
every max level v path there is a unique bridge of tenacity equal to
tenacity v and v lies in its support.
So as I said this is the central theorem of the second paper and
there's a whole slew of development that needs to be done to arrive at
a proof of this. And I'll sketch the proof later on the board if people
are excited and there's time. But the proof of this also requires this
definition of blossoms, requires us to say many things about how
minimum length alternating paths go through blossoms. And all of those
properties go to proving the algorithm. And we'll use those properties
first because I want to complete the algorithm, and then I can go into
the proof of this theorem. And at this point I can give you a purely
graph theoretic definition of support of a bridge. Oops, again. Sorry.
Call this w. So the support of bridge u, v is all those vertices w such
that their tenacity is the same as the tenacity of this bridge and there is
a max level w path containing this bridge. Okay?
So at this point sort of similar to your question is are the supports
of all the bridges disjoint? And the answer is no. Like these 2 bridges
have tenacity 7, and they both contain these tenacity 7 vertices in
their support. So in this case whose support will this lie in, in the
algorithm? It doesn't matter, whoever is processed first. If you
process this first, you'll find a blossom node here, find these guys 7.
And then if you process this next, you'll come here and jump directly
here. And you won't go through these all over again. And of course vice
versa if you go through the other bridge first. But, you know, those
are little tie breakings. We don't care for that. And there is one more
point to be made which I'll say later when I come to the white board.
Okay, another question, can there be such an edge, this green edge?
[Inaudible] there can be. Right? Nobody says such and such edge is not
around. We are non-bipartite. Every edge can be present. But what is
it? It's certainly not a prop because this guy gets its min level from
here. This guy gets its min level from here. So it's not a prop, so
what is it? It's a bridge. But it's a bridge to nowhere. Right? What is
a bridge? It has empty support. It's a little worse than Sarah Palin's
bridge. Right? This is another bridge to nowhere. So for instance this
has tenacity 17 and there's no vertex of tenacity of 17 which has this
on its max level path. And the point is that you will have to process
this bridge at the right search level but you'll immediately jump to -- from both endpoints you'll jump to this guy and realize that it lies
within the blossom and there's no support.
Another point, very important point, is this I said was a bridge of
tenacity 13 and should be processed at search level 6, 13 minus 1 over
2. But we know its tenacity at search level 5 when we process this
bridge. Right? Why do we not do double depth-first search on this right
away? At search level 5? And the reason will become clear in this
example. So here this is a bridge of tenacity 11. And at search level 5
it assigns 8 here. And at search level 6 we know this tenacity which is
15. The correct thing is to process it at search level 15 minus 1 over
2 which is 7. But suppose we say, "Okay, at search level 6 we already
know its tenacity, why not process it?"
We do double depth-first search from these two end points and assign 15
to all these which is correct. But here it is not correct because these
two guys should get -- they have a tenacity of 13 through this bridge.
And the point is that we were assuming that all the vertices of lower
tenacity -- sorry, all the vertices that we encounter in double depth-first search which haven't got a tenacity must get this tenacity.
Right? And that we can assume only if we do this double depth-first
search at the right search level. So for instance here we should
process this bridge at search level 6. We should find tenacities of
these 2. And then at 7 when we process this, these 2 already belong to
this blossom so we will jump directly here, never even encounter these
2, and give 15 to only the right ones.
And so all of that is needed to make sure that the proof of induction
goes through. And this is what I meant by the precise synchronization
of events. There's nothing different that can be done here.
Okay, so one more thing which is what I will keep coming back to is we
are to mark blossoms so that we jump to their base efficiently. Now
look what happens here. This blossom is nested inside this blossom
actually; this is a bigger tenacity blossom. But when we come down
here, we'll come in two steps here and two steps here. And this nesting
could've been order m big and we may have to come down this chain of
blossoms order n times. And that's order n square work already. So what
should we do?
Well, there are many things to do. We need to compute the base of the
maximal blossom at each point. And either use the set-union algorithm
of Bob Tarjan's which will multiply the running time by inverse
[inaudible] function. Or we can use the Gabow-Tarjan result which does
this in linear time for a certain class of unit [inaudible]. But then it
resorts to a RAM model of computation which assumes that [inaudible]
log m bit numbers is unit time which you may or may not like. This is
the [inaudible] model.
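For reference, a minimal union-find with path compression and union by size, the classical structure behind the inverse-Ackermann bound just mentioned (a standard sketch, not the Gabow-Tarjan specialization):

```python
parent, size = {}, {}

def find(x):
    """Find the representative of x's set, compressing the path."""
    if parent.setdefault(x, x) != x:
        parent[x] = find(parent[x])  # path compression
    return parent[x]

def union(x, y):
    """Merge the sets containing x and y, larger set on top."""
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    if size.setdefault(rx, 1) < size.setdefault(ry, 1):
        rx, ry = ry, rx
    parent[ry] = rx
    size[rx] = size[rx] + size[ry]
```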
But in the 1980 paper, we claim that path compression suffices because
once you do this once and everybody points to their base star, you
don't have to go through -- If you just jump to the lowest base, you
won't do that much work. And there are tons and tons of vertices and
edges here to charge it to. But this is one claim I couldn't verify in
1989 when the paper was written. But recently Bob Tarjan himself has
gotten excited about this and hopefully there'll be a resolution. I
don't know. That'll be good to clear up this last one thing. Okay, so
finally at some point, at some search level, we'll search from a bridge
and encounter 2 unmatched vertices. And at that point we need the path.
So assume that we can find this path and this path somehow by the same
process. And we need the path from here to here. Well, so what we do is
-- So I'm just concentrating on this part. What we do is we want a path
from here to the base. So we go -- So for this purpose the blossom also
points to the 2 endpoints of the bridge. This guy knows he's red. He
goes to the red endpoint and finds a path from here to v using red
edges. And from here to the base using blue edges. And in the second
search -- Sorry. In the second search this guy jumps directly to its
base, okay, and again needs a path from here to here. But this guy
knows he's red and he needs a red path which is easy.
Okay? And, you know, the extra that we got gets used. Remember? And so we got
this whole path here. We patch it together with these two paths and
that'll be guaranteed to be a minimum length augmenting path because of
theorems that I haven't stated which will come when we prove the main
theorem. Okay?
Then we remove this path and we augment along it. But there's a lot
more stuff left in this blossom. Right? There are many, many more
vertices and edges left in this blossom. And are we sure that we have
the right structures left over for those vertices? Right? How will the
next path be found through the half-eaten blossom? And the point is, we
never will need those because of the theorem that base, base square and
so on occur on every even level v and odd level v path.
Right? So every even level v and odd level v path use base, base square
and so on. So once you got rid of this part, the rest of the vertices
and the blossoms can never reach this unmatched vertex with a shortest
path because they also need base, base square, base cube and so on. And
all of those are thrown out.
So that's the point that these are all removed and the half-eaten
blossom itself can be thrown out. And that's how -- once we got this
one path, we can throw out all this structure and go to the next and
get a maximal set of disjoint paths just like the bipartite case.
So now I made two outrageous-sounding claims maybe in my abstract that
this algorithm is simple and natural. I'd like to ask you, you know,
given the difficulties and given the structure that has to be overcome,
what could be simpler or more natural than this? You do have to find
the min levels. You do have to find the max levels. I mean, I don't
have a theorem to this effect but....
And my second claim was that it can fit into one's mind as a single
thought. And for that I have a visual reason because I really see it as
a one-thought algorithm. Which is that, you know, you do this. You
know, you find these bridges and find max levels and find bridges and
find max levels and at some level you find [inaudible] 3 vertices and
that's it. I mean it sits in front of you as one thought. I think.
That's all. Thank you. I think I have 15 minutes for the proof if
somebody is interested.
>> Nikhil Devanur Rangarajan: Okay. Any questions?
>> : So for doing this [inaudible]?
>> Vijay Vazirani: So if this were to work then so would the bipartite.
You know, [inaudible]. And given that more than 32 years have passed --
since 1973 to now, so 40 years almost -- it's probably unlikely. So I presented
this algorithm in Moscow last month, and there was a colleague of
[Inaudible], Mikhail [Inaudible]. He came up with some very nice
questions about maybe embedability of the structure into metric spaces
and some properties there. So those I thought were the most relevant
future directions. I don't think this will help, weight matching or I
don't know. One more thing he mentioned was that this algorithm was
taught by [Inaudible] in 1984. And up to that point I was under the
impression that nobody had understood it besides the authors, and so
now I was corrected and there was at least one person who had
understood it.
So I can get into a couple of theorems if you wish. Yeah?
>> : [Inaudible].
>> Vijay Vazirani: [Inaudible]. Should we -- I just -- I have about 10
more minutes. Should I just do this? Finish it off.
>> : Just take it off.
>> Vijay Vazirani: Take it offline. Okay. Okay.
>> : Thank you [inaudible].
[ Audience applause ]