>> Nikhil Devanur Rangarajan: Hello and welcome. It's my great pleasure to introduce Vijay from Georgia Tech. Vijay was my Ph.D. advisor, and he's very well known for a lot of his work: algorithms, exact and approximation algorithms, complexity, algorithmic game theory and so on. And today he is going to talk about, I think, what is his favorite research: the Micali-Vazirani algorithm for finding maximum matching in general graphs. I think this algorithm was done a long time ago, when Vijay was a Ph.D. student. He was a first-year graduate student, and I think Micali was in his second year. So an amazing algorithm, beautiful, and Vijay is going to tell us why it's even more beautiful than we think it is.

>> Vijay Vazirani: Thank you all for coming. It's really an honor to give this talk in front of such a distinguished audience. Yeah, so as Nikhil was recalling some incidents from the last millennium, which seem like prehistoric times now, let me say a few more words about it. In the fall of '79 I started my Ph.D. at Berkeley. And I think the first class I walked into was Dick [inaudible]'s [inaudible] optimization class. And within a few days of the class I met up with Silvio Micali, and we became the best of friends. And within a few weeks we started working on this algorithm. And basically we didn't know how hard it was. We were too naive to understand what we had started working on. And basically we put all our eggs into this one basket, 24/7, for the rest of the year, missing out on courses and everything else. And after many ups and downs, lo and behold, it got done by spring of 1980 and was submitted to FOCS, where it got accepted, which was a big surprise for us. And we had both planned a trip to Sicily in the summer, but we took along those, you know, big sheets of paper onto which you had to paste the FOCS paper. It was all typed up in Sicily and sent over to the U.S. And then we were so exhausted with the whole thing that we just gave up on it for many, many years. So I'll pick up on this story somewhere later in this talk.

But let me start with matching itself, which, as you all know, occupies a very, very central position in the theory of algorithms. In almost every realm of algorithms that we know of, matching has been one of the first problems to be solved, and it has almost always yielded some very powerful tools and techniques and sometimes entire paradigms for the entire field of algorithms. And among the paradigms, I'll list a few here. Kuhn gave the very powerful primal-dual paradigm while solving bipartite weighted matching. Edmonds gave the definition of polynomial-time solvability itself while solving non-bipartite matching. Valiant gave the definition of #P while finding the complexity of computing the number of perfect matchings in a bipartite graph, or the zero-one permanent. And in this work, while I was a post-doc at Harvard in '83-'84, we gave the equivalence between random generation and approximate counting for self-reducible problems. And this is what lies at the core of what is today an entire area, namely the MCMC method. I think my battery has finally given up on me. Is there a pointer? There must be, right? Ah, here. Yeah?

>> : Yeah.

>> Vijay Vazirani: Okay, good. Yeah, excellent. Okay.
So -- and let me just say that wherever you turn in algorithms, whether it's randomized algorithms, with [inaudible]'s famous algorithm of plugging in random numbers, which, you know, made us aware of the Schwartz-Zippel lemma, or it's online algorithms, or it's parallel algorithms, or it's zero-one phenomena in [inaudible] in any graph, or it is [inaudible] -- time and time again it's matching that gave us paradigms for our field. And in my work in game theory and economics, I realized that matching has given paradigms in that field also. The Gale-Shapley stable marriage algorithm is a key algorithm that shows how to obtain solutions in the core in polynomial time. The Shapley-Shubik double-sided matching market is again a very basic result. And in the Internet age, building on previous work on online bipartite matching, we gave an algorithm for the AdWords market. And this has spawned a huge amount of research on this very important question, which supports almost single-handedly the revenues of Google.

So at this point let me start; this is enough philosophizing. I think Lovász's book makes many attempts at trying to say why this has been the case, why matching has been so central. But let me get on to absolutely nuts and bolts from now on and into actual matching. So just to get everybody on board, let's even define matching. We're given a bipartite graph on bipartition U, V. By the way, a set of edges is a matching if they don't share any vertices. So these 2 red edges are a matching. And I'll not mark the matched vertices specially, but I'll always draw unmatched vertices as these big blobs; where the matched vertices are is clear. And a matching of maximum cardinality in a graph is called a maximum matching, and the maximum cardinality may be only one, like in the star graph there: you cannot pick a second edge without meeting at a vertex, namely the center. An alternating path with respect to a matching M is just a simple path consisting alternately of matched and unmatched edges, like this 3-length path. And an augmenting path is an alternating path between 2 unmatched vertices, like this 1, 2, 3, 4, 5 -- 5-length -- path. It obviously has to be odd, because it starts and ends with unmatched edges. And the important property of such a path is that if you flip unmatched and matched edges on it, you still get a valid matching. And it's of bigger cardinality, because here there are 3 unmatched edges and 2 matched edges, and when you flip you get 3 matched and 2 unmatched edges. And then you can find another augmenting path here and flip matched and unmatched edges there. And that's a perfect matching; you cannot do better than that, obviously. And that's how combinatorial matching algorithms operate. Here's a non-bipartite graph; that's a matching with these 2 unmatched vertices, and there's an augmenting path. We flip and we get a maximum matching, which again here is a perfect matching. So as I said, all combinatorial maximum matching algorithms go this way. There are -- thanks to Lovász's paper, there are very beautiful algebraic algorithms now, based on finding inverses and determinants of matrices, but less [inaudible]. So you start with the empty matching. While there's an augmenting path with respect to M, find any one and augment. And when there are no more augmenting paths, this will be maximum. That's a simple proof. So, of course, this cannot go on more than n over 2 times before you get a maximum matching. So order n iterations.
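To make the flipping step concrete, here is a minimal sketch in Python. It assumes the matching is stored as a dict `mate` mapping each matched vertex to its partner; that representation, and all the names, are my own, not from the talk.

```python
def augment(mate, path):
    """Flip matched and unmatched edges along an augmenting path.

    `path` lists the vertices of the path; its two endpoints are
    unmatched, and its edges alternate unmatched, matched, ..., unmatched.
    """
    # Every even-indexed edge (path[0]-path[1], path[2]-path[3], ...) is
    # currently unmatched; the flip makes exactly those edges matched.
    for i in range(0, len(path) - 1, 2):
        u, v = path[i], path[i + 1]
        mate[u] = v
        mate[v] = u

# Example: for path = [a, b, c, d] with b-c matched before the call,
# a-b and c-d are matched afterwards: 2 matched edges instead of 1.
```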
It's very, very simple to implement this in order m time, where m is the number of edges. Okay? So this gives you an order mn algorithm. But it's highly wasteful, because, as you can imagine, right in the beginning all you get is 1-length paths and you are paying an entire order m time for each of them. Right? Initially you should just take an unmatched vertex, and your augmenting path will be a single edge. So this is highly wasteful. So what we should do is go to this better scheme, which is, for now: let's just say that we always augment along a shortest path with respect to the current matching. And if so, the length of the path will be non-decreasing. So first we'll get some 1-length paths, and then we'll augment along 3-length paths, and along 5-length paths, and so on. And moreover, there are more properties here. If this is the matching after these 4 augmentations, then not only is this an augmenting path with respect to M symmetric difference P, but in fact it is an augmenting path with respect to M itself. So each one of these paths of the same length must be disjoint from each other, is what I'm trying to say. Okay, so these paths of the same length are all vertex disjoint, and they're all augmenting paths with respect to the matching here. Again, these 2 are disjoint, and they are both augmenting paths with respect to the matching here. Okay? And in fact this set will be a maximal set of disjoint minimum-length paths; maximal in the sense that you cannot add another path of the same length into this set. And, okay, is that clear?

So that gives us the following scheme. With respect to the current matching, find a maximal set of disjoint minimum-length paths, all at once, in one iteration. Augment along all of them -- they're all disjoint; you can augment along all of them -- and then go to the next iteration. What's the advantage? So let's call this a phase. And the advantage is that only order square-root of n phases are needed, even for general graphs. And it's a 2-line proof, basically. You carry this process for square-root n phases, okay, and look at the matching at the end of square-root n phases. Take its symmetric difference with a maximum matching. Every path there must be at least square-root n long, so there can be at most square-root n such paths. So the deficiency at this point is only square-root n, so you'll need only square-root n more phases. Okay? And in bipartite graphs it is fairly easy to come up with an order m algorithm per phase for finding all of these paths at once. So, again, we are not finding a maximum set of disjoint paths, just a maximal set of disjoint paths. Right? Otherwise, we would be done in phase one: a maximum set of disjoint paths in phase one would give a maximum matching, obviously, which is what we have set out to do. Okay.
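The phase scheme just described can be written as a skeleton like this; `find_maximal_disjoint_shortest_paths` is only a placeholder for the per-phase search that the rest of the talk develops, so this is a sketch of the outer loop, not the algorithm itself.

```python
def maximum_matching(graph):
    """Phase scheme: O(sqrt(n)) phases, each augmenting along a maximal
    set of vertex-disjoint minimum-length augmenting paths."""
    mate = {}                      # start with the empty matching
    while True:
        # One phase: a maximal (not maximum!) set of vertex-disjoint
        # shortest augmenting paths w.r.t. the current matching.
        paths = find_maximal_disjoint_shortest_paths(graph, mate)
        if not paths:              # no augmenting path left: maximum matching
            return mate
        for path in paths:         # the paths are disjoint, so all of them
            augment(mate, path)    # can be augmented in one go

# The two-line proof, in comment form: path lengths never decrease, so
# after sqrt(n) phases every augmenting path is longer than sqrt(n); the
# symmetric difference with a maximum matching splits into vertex-disjoint
# augmenting paths, hence fewer than sqrt(n) remain, i.e. at most about
# sqrt(n) more phases are needed.
```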
So as you can notice, this work was done simultaneously on both sides of the Iron Curtain. I don't know how that happened, but those are the greats. So the big open question here was how to do it for general graphs, and that's what got done in 1980, in that incident that I described to you. So this was our claim in that paper, the FOCS paper. It turns out that there were almost no proofs in that paper, and hardly any definitions either -- well, a few definitions, but many more were needed. And so 10 years later I took up this task of proving it. And it took me more time than was needed to find the algorithm: it took about 2 years to find the proofs. And all claims but one got proven at that point. I'll tell you which claim it was; but in the meantime Gabow and Tarjan had already given a data structure that would ensure linear time. Okay; otherwise, with the technology of 1980, this would've had an inverse Ackermann function there. But there's still an open question there, and I'll say more about it much later.

Okay, so I'm going to jump right into the algorithms now, and please stop me any time, because everything is built vertically up from now on. We start with the bipartite case. With respect to the current matching, let's define the level of a vertex v to be the length of the shortest alternating path from an unmatched vertex on the U side to v. Okay? So for instance here there are 2 unmatched vertices on the U side, and obviously the level of these 2 will be zero. The level of this guy will be 1, 2, 3 -- oh, no, not 3; one also -- and so on. Right? One, 2, 3. So the smallest level of an unmatched vertex on the V side will be the length of the minimum augmenting path. And let's just look into the problem of assigning levels to all vertices. Obviously what needs to be done here is an alternating breadth-first search starting from all unmatched vertices on the U side. And at level i, we look from all level i vertices to find level i plus 1 vertices along the edges of the appropriate parity. Okay. So here's the process. We start with all the unmatched vertices on the U side -- this is not for that graph, but in general. And at search level zero we look along unmatched edges and give these guys level 1. And then at search level 1, we look along matched edges to give these guys level 2, then level 3, and so on. Now, one more thing we do: all the vertices that assign this guy a level become predecessors of this vertex. Okay, so every vertex has a set of predecessors. In the case of this vertex, there'll be only one predecessor. But in the case of these odd level vertices, there could be many predecessors, and we need to keep track of all of those. But that can be done really fast, you know, in the same time that you scan along this edge: you can plug yourself into this linked list. Okay. So 3 gives rise to 4. Can such an edge happen? No, because that would be an odd cycle, and we are in the bipartite case. Can such an edge happen? Yes; this is only an even cycle. But what good is this edge? If you were to use this edge, you'd find a longer path than if you went directly. So we might as well forget about this edge. So the picture looks rather simple. Okay, and at some point we will hit the unmatched vertices on the V side, and that'll be the length of the minimum augmenting path. And at that point we don't need to search any more; we can drop the rest of the search. And at this point we reap the benefits of the search, which is that we actually find augmenting paths. So we pick an unmatched vertex here, pick an arbitrary predecessor at each step, and throw away this path. We're going to augment along this path. We can throw it away because we want a maximal set of disjoint paths, so everything else has to be disjoint from it. And since we throw this away, we can also throw away all edges incident at these vertices, obviously. Right? And this vertex. And now look: this guy has no more predecessors. So if we came down here, we'd never be able to complete that into a path. So there's no reason to keep this vertex. Right? It'll never be useful.
And in this manner we throw away all of that, so all of this is linear time in the number of edges. Okay? And at this point we pick any other unmatched vertex and trace down predecessors. And we get the second path. So what we got is these 2 disjoint paths, even though the graph really had 3 disjoint paths. But we don't want a maximum set of disjoint paths; we want a maximal set of disjoint paths, and we managed to get it in order m time. Yeah, you have a question. No? Okay.
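Here is a sketch of the whole bipartite phase just described: an alternating breadth-first search that records all predecessors, stopped at the first layer containing free V-side vertices, followed by tracing predecessors to peel off a maximal set of disjoint shortest paths. `U`, `adj` and `mate` are my names (a vertex absent from `mate` is unmatched), and the sketch favors brevity over the careful eager edge deletion that makes the real phase linear time.

```python
from collections import defaultdict

def bipartite_phase(U, adj, mate):
    """One phase: a maximal set of disjoint shortest augmenting paths."""
    level = {u: 0 for u in U if u not in mate}   # free U-vertices at level 0
    preds = defaultdict(list)                     # ALL predecessors, not one
    frontier, i, finish = list(level), 0, []
    while frontier and not finish:
        nxt = []
        for x in frontier:
            # even levels scan unmatched edges; odd levels the matched edge
            ys = ([y for y in adj[x] if mate.get(x) != y] if i % 2 == 0
                  else ([mate[x]] if x in mate else []))
            for y in ys:
                if y not in level:
                    level[y] = i + 1
                    nxt.append(y)
                if level[y] == i + 1:
                    preds[y].append(x)            # record every predecessor
        i, frontier = i + 1, nxt
        if i % 2 == 1:                            # odd layer reached: any free
            finish = [v for v in frontier if v not in mate]   # vertex ends a path

    dead = set()                                  # "thrown away" vertices
    def trace(x):                                 # walk predecessors to level 0
        if x in dead:
            return None
        if level[x] == 0:
            return [x]
        for p in preds[x]:
            tail = trace(p)
            if tail is not None:
                return tail + [x]
        dead.add(x)      # no live predecessor left: this vertex is never useful
        return None

    paths = []
    for v in finish:
        path = trace(v)
        if path is not None:
            dead.update(path)                     # keep the paths disjoint
            paths.append(path)
    return paths
```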
So, general graphs. All right. Okay. One thing to notice right away is that now there is no U side and V side; we just have some unmatched vertices. And vertices can have paths of both parities. So this guy has a path of length 4 and a path of length 5. Right? And we are, of course, looking for minimum-length paths, so we want minimum-length paths from an unmatched vertex. So let's give a few definitions. With respect to the current matching, define: the even level of a vertex v to be the length of a minimum even-length alternating path from an unmatched vertex -- any unmatched vertex -- to v; the odd level to be the length of a minimum odd-length alternating path from an unmatched vertex to v; the min level to be the smaller of the 2; and the max level to be the bigger of the 2. Okay? So here, for instance, the min level of this guy is 4 and its max level is 5; its even level is 4 and its odd level is 5, and so on. So we have to define 2 different levels for the 2 parities. And this is the big difference between the bipartite and non-bipartite case: in the bipartite case, minimum-length alternating paths are breadth-first search honest, in the sense that if I took a minimum alternating path from f to some vertex v, and u sits on it, then this prefix is the minimum path from f to u itself. There cannot be a shorter path. Not so in non-bipartite graphs. Why? Because maybe v, or some other vertex on this path, appears on the minimum path from f to u. So if you combined that with this path you'd get a self-intersecting path, and so you need a longer path from f to u to find the minimum path from f to v. And the reason it can sit over here without contradicting the fact that this is the shortest path is that it sits here with the opposite parity. And let me give you an example right away. So the odd level of this vertex is 7: one, 2, 3, 4, 5, 6, 7. But that doesn't help you find an odd level for this, or an even level for this, because those would be self-intersecting paths. But if we arrive here with the path of length 9, then we can come down here and find an even level for this guy. And if we come here with a path of length 11 to v, then we can find an even level for b. Okay? And this is not just an academic exercise -- we are not doing this for fun -- because, after all, there may be an augmentation right here. And so we need to find the even level of b to find this unique augmenting path in this graph. Right? So what's the moral of the story? The moral of this story is that we need to find longer and longer alternating paths to v just in order to find the minimum alternating paths to some other vertices. Now, what's the complexity of finding long paths? Suppose I want a Hamiltonian path, or half a Hamiltonian path. That's NP-complete. So it looks like this should take exponential time. So how on Earth are we going to do it in linear time? Right? Well, because of the incredible structure of matching and these blossoms. And that's the whole story here. So let me jump into that part of the story.

So again we start with this alternating breadth-first search and mark predecessors as before. And now this edge can happen, right? We are non-bipartite, so there can be odd cycles. What do we do here? So let's see what Edmonds would have done here. What Edmonds would've done is collapse these 5 vertices into 1 vertex, a macronode. And all the edges incident at these are unmatched, because, you know, this thing is fully matched to the extent possible; so all these edges are unmatched. So the macronode basically says the following: "Think of me as 1 single node. Find an augmenting path. Once you are done, I will do the right thing inside. I will manage the insides of this macronode, and I will produce for you a correct augmenting path -- alternating path -- inside the original graph." So if you come in here, the manager of this blossom will say, "Oh, do this." If you come in here, it'll say, "Do that." Right? And then of course blossoms can get nested and all that, but that's the whole story. And you can see that it should not take more than [inaudible], and that was it. But of course if you collapse these into 1 node, you've lost all the length information. How are you going to find minimum-length augmenting paths in this situation? So the right thing to do here is to somehow give max levels to these 4 vertices and not more. Not to this guy; just to these 4 vertices. And if you don't then, for instance, at search level 6 you'll not find this 7, and you will miss some augmenting path. Right? So I will call this edge a bridge and this the base of the blossom. I'll redefine blossoms completely from this perspective, but it'll take some work to get to the definition. What I can really define right now is a bridge. So I'm going to say that an edge u, v is a prop if u is a predecessor of v. And any edge that's not a prop, I'll say is a bridge. Okay; so for instance here all these edges are props and this edge is a bridge. More or less, in all these figures, edges that are vertical or almost vertical will be props, and edges that are horizontal typically will be bridges. Okay? And this way of drawing the figures is really -- I mean, you can get led into theorems already without proving anything, because you almost see that they're true. But that's fallacious. Anyway, so what is a base now? So in this case this is a bridge. What is a base? Well, follow predecessors from both these endpoints until you come to unmatched vertices, look for bottlenecks, pick the highest such bottleneck, and call that the base. Okay? So for instance here: follow predecessors from here and from here. You came down here following predecessors, came down here following predecessors, and you have to go through this vertex. And this is the highest such vertex that you have to go through. So this is the highest bottleneck, and we'll call that the base. In this case, however -- I just added some more edges to that previous figure -- this is not a base anymore, because we can come from here all the way here, and from here all the way here, in a disjoint manner. So the only base is this one. Okay? So that's the base. And in fact this "blossom" should be giving max levels not only to these 2 guys, but also a max level of 7 to these guys and 8 to these guys -- and of course no 9 here, because that would be self-intersecting. So that's the base that we need to find inside this blossom. Okay?
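Going back to the prop/bridge distinction for a moment: in code it is just a lookup, assuming `preds` maps each vertex to the list of vertices that assigned it its min level, as recorded during the search (names are mine).

```python
def is_prop(u, v, preds):
    """A prop: one endpoint gave the other its min level."""
    return u in preds.get(v, ()) or v in preds.get(u, ())

def is_bridge(u, v, preds):
    """Every edge that is not a prop is a bridge; bridges are what trigger
    double depth-first search and blossom formation."""
    return not is_prop(u, v, preds)
```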
So let's list what we want to find, what we hope to find through some process. We want to find this "blossom" by identifying the base, which we want to be the highest bottleneck with respect to predecessors. Okay, but now look at this particular bridge at level 5. We can follow predecessors here; we can follow predecessors here. And the base is this. But there's already a blossom here, right, which was found earlier. And if we come here, we don't want to traverse this whole path one edge at a time. In particular, this could be a very, very long path, and we may have to do this many, many times. We don't want to keep going through this path; otherwise, we'll lose the linear-time aspect of the algorithm. So somehow we want to be able to skip blossoms, the previously found blossoms. Okay? And all of this is accomplished by this new procedure called double depth-first search, which was discovered 32 years ago. But I still call it new, because nobody knows about it. So I'm going to show it to you in a very simple setting of a layered graph. Okay? In this layered graph I'll assume that the endpoints of the bridge are a and b. There are layers all the way from the highest down to the lowest layer, which is zero. And the layers correspond to min levels of the vertices in the previous graph. And a and b are at the highest layer. And edges go from high layers to lower layers, skipping layers, which corresponds to skipping nested blossoms. And we want to accomplish these 2 tasks. And the time that we have allowed for this is linear in the number of edges above the base, because the stem below the base could be very long and we don't want to keep running over that over and over again. Okay? So here, again, is the whole picture. We have this layered graph. At the highest level are these 2 vertices a and b; a is red and b is blue. And every edge skips down. And we want to find the highest bottleneck. So look at all the paths from a to layer zero and from b to layer zero. Is there a vertex that they all have to go through? In this case, of course, it's this guy. Right? And there could have been more, but this would be the highest, even if this path went down. Whatever. And as you can see, some edges are long, but some edges are only of length 1. So we start with the -- so in this double depth-first search, just to break the ice, red will stay one step ahead of blue. So red starts.

>> : So the long edges are skipping blossoms?

>> Vijay Vazirani: Yes. And you'll see in the next figure what exactly -- but that's what they're supposed to represent. Okay, so I'm just defining this whole process on an arbitrary layered graph of this kind. And later we'll see how to patch it into the original graph. So red starts first. Blue catches up to be at the same layer as red. Red goes. Blue goes. And they've met up. Question: is this a bottleneck? Okay, so the first thing that happens is blue backtracks -- blue backtracks first, to find something as low as this. So it backtracks. Are there any edges out of this? No. Then backtrack. Any edges out of this? Yes. If so, take it. Take an edge. Now they are both at the same level, so they proceed with the search. Red goes here. Blue goes here -- oh, blue goes below red, so red has to catch up. And, oh, look, they met again. Is this a bottleneck? So blue says, "I'm going to backtrack first." So it backtracks here. No more edges out of it. Backtracks. No more edges. Backtracks. All the way to b. That means all of this has been searched from, and nothing comes out of these vertices.
So it puts a barrier here, saying, "Don't backtrack beyond this point." And now red backtracks; it says, "Let me try to find something as low as this." So it backtracks. Nothing out of this. Backtracks. Yes, there's something out of this. Go here. They can start the search again. Now blue says, "Okay, I'm going to backtrack first and find something as low as this." Nothing out of this. Nothing out of this. And it encounters the barrier. That means there's no need to backtrack any more. So it moves the barrier here. Then red says, "Okay, I'm going to look for something as low as this." Backtracks, backtracks, backtracks, backtracks. Once red backtracks all the way to a, that's the end of the story. Okay. And this whole thing is linear in the number of edges of this graph, not of the original graph. We'll come to that; that's a very long story. And what we got here is a red tree and a blue tree. And every red vertex has a red path from a to it. Every blue vertex has a blue path from b to it. And there's a red path from a to the base, and a blue path from b to the base. Right? Now, what happens here? What's the highest common bottleneck here for the bridge 5, 5?

[ Silence ]

There is no bottleneck, because this guy can reach all the way to this unmatched vertex and this guy can reach all the way to this unmatched vertex. So there's no bottleneck, but there's an augmenting path. Okay? So we add one more item to our wish list: if there is no bottleneck, then reach 2 unmatched vertices, and the minimum augmenting path will be found. So I'm going to change this figure a bit by adding these 2 edges. And I'm going to take you back to the place where these 2 met. And now blue backtracks, saying, "I want to find something as deep as this." So, nothing out of this? Yes, there's something out of this. So they continue. And they found 2 different vertices at level zero. Okay, so what we got is a bit more than what we asked for; namely, we also got these properties: that for each red vertex v there's a red path from a to v, and there's a red path from a to the base, in this case. And in the other case, we have a red path from a to one unmatched vertex and a disjoint blue path from b to the other unmatched vertex. And that gives us the augmentation.

>> : So [inaudible] vertex [inaudible]...

>> Vijay Vazirani: The time you meet it -- you reach it. Yeah, yeah. There's lots of tie-breaking, but it doesn't matter how you break the ties. All right? Is that okay?
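The real double depth-first search -- two coordinated DFSs with the barrier, running in time linear in the edges above the base -- is delicate, so what follows is only a brute-force reference in Python for what it computes: the highest common bottleneck of the two bridge endpoints, or `None` when no bottleneck exists, in which case two disjoint paths to layer 0 give the augmentation. `out[v]` lists v's downward edges, `level[v]` is its layer, and `vertices` ranges over the layered graph; the names and the quadratic method are my own.

```python
def reaches_layer0(start, out, level, removed):
    """Can `start` reach some layer-0 vertex while avoiding `removed`?"""
    stack, seen = [start], {start}
    while stack:
        v = stack.pop()
        if level[v] == 0:
            return True
        for w in out[v]:
            if w != removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def highest_bottleneck(a, b, out, level, vertices):
    """Highest vertex that every path from a to layer 0 AND every path
    from b to layer 0 must pass through (the base).  Returns None when
    there is no bottleneck, i.e. red and blue can reach two distinct
    layer-0 vertices disjointly and an augmenting path exists."""
    cuts = [w for w in vertices
            if w not in (a, b)
            and not reaches_layer0(a, out, level, removed=w)
            and not reaches_layer0(b, out, level, removed=w)]
    return max(cuts, key=lambda w: level[w], default=None)
```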
So all of this will happen in the original graph, where these would be predecessors, or you skip blossoms. And as you can see, in Edmonds' case a blossom is just one odd cycle at the end of a stem which ends with a matched edge. In our case this whole thing is a blossom, and there are all kinds of paths going from the 2 endpoints all the way to the common bottleneck. So it could be a huge, humongous thing. Okay? Okay. So, yeah, here, coming to this requirement that we should be able to jump to the base of the blossom. Well, let me say that one part of it is easy and one part of it is still a mystery, an unsolved mystery. So let me say the easy part. The easy part is: how do you make sure you can jump from here to the base? Well, when you discovered this blossom, you created a new node here, the blossom node. It points to the base, and all the vertices of this blossom point to it, so that when you come down here, you jump directly here in 1 step -- or 2 steps. Okay. And when the new blossom is created, it comes with its own new blossom node, which points to its own base; and all its vertices, including this guy -- this guy is in this second blossom -- all point to this blossom node. Okay, now let's define a very central concept, that of the tenacity of a vertex. The tenacity of a vertex will be the sum of its even level and its odd level. Okay? So for instance these vertices are of tenacity 9: 6 plus 3, 5 plus 4, 5 plus 4, 6 plus 3. Right? These green vertices are all of the green tenacity, whatever that is. These purple vertices are all of the purple tenacity. These yellow vertices are all of the yellow tenacity. And purple is bigger than green, and yellow is bigger than purple. You're thinking about it. Okay. Okay.
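As tiny helpers, assuming `evenlevel` and `oddlevel` store the two minimum path lengths, with `float('inf')` when no alternating path of that parity exists (a representation I'm assuming, not one from the talk):

```python
def tenacity(v, evenlevel, oddlevel):
    """Even level plus odd level; infinite if either parity is unreachable."""
    return evenlevel[v] + oddlevel[v]

def minlevel(v, evenlevel, oddlevel):
    return min(evenlevel[v], oddlevel[v])

def maxlevel(v, evenlevel, oddlevel):
    return max(evenlevel[v], oddlevel[v])
```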
Now let's come to this big difference about breadth-first search honesty, and let me point out one of the big significances of this notion of tenacity. Basically it helps us limit the amount of breadth-first search dishonesty, rather. So there is breadth-first search honesty to the following extent. Okay? Now I'm giving you a definition. On this path p from f to v, we'll say that u is breadth-first search honest on this path if this is the correct length from f to u. And by correct I mean: if this length is even, then it is the even level of u, and if it is odd, then it is the odd level of u. So u is not occurring at a bigger distance than it should. Okay? And the theorem is: suppose u lies on an even level v or odd level v path and has tenacity at least that of v; then it must be breadth-first search honest on each such path. It cannot be at a wrong distance. So for instance let's look at any even level or odd level path to these green vertices. So this is an even level path; this is an odd level path. And notice that the purple vertices, which are of tenacity bigger than green, are at the correct distance, because this is the right even level of this purple vertex. And the yellow vertices are also at the correct distance. So it's only the vertices of lower tenacity on this path that can be at a wrong distance. Okay? And that's important.

>> : Why can't you take a purple one, then the green ones continue...

>> Vijay Vazirani: Right. So for instance if you want the even level to this, the green guy is at a wrong level. This is more than the odd level of this; this is the odd level of this. So the green guy, which is of lower tenacity, is at the wrong length. Now I'm giving one more definition. So let v be a vertex of tenacity t, and let p be any min level or max level v path. Let's say this is a min level v path, p. Then let H(v, p) be the highest vertex of tenacity bigger than t -- strictly bigger than t -- on p. By highest I mean farthest from the unmatched vertex. So go along this path from here to v; certainly some vertices are of higher tenacity, in particular this fellow. So look at the highest such vertex, the highest vertex of bigger tenacity. And do the same for the max level path to v: I go from this unmatched vertex to v, and I again look for the highest vertex of tenacity bigger than t on this path.

>> : So all these are correct [inaudible]?

>> Vijay Vazirani: No. There could be vertices of very low tenacity here which are at totally wrong levels. Right? But this one is at the correct level. Right? It's a good point. But, yeah, it's not as simple as that. Okay? So this is a definition with respect to a given path. So this is with respect to this path. This is with respect to this path. But the point is that this set -- H(v, p) over all p, where p is any min level v or max level v path -- is a singleton. There's only one such vertex that all these paths go through, and it of course occurs at the right distance. And we'll denote it the base of this vertex. Okay? And of course base(v) must be breadth-first search honest on each such path. So for instance here the base of these green vertices is purple, and every even and odd path to them goes through purple. The base of the purple vertices is yellow, this guy. And the base of the yellow vertices is this white guy. So finally I can define the notion of a blossom from this perspective of minimum-length alternating paths. Let b be a vertex of tenacity bigger than t. Then the blossom of tenacity t with base b is all those vertices which have tenacity at most t and base b. Okay. So for instance the blossom of tenacity green with base this guy is these 4 vertices. The blossom of tenacity purple with this base is all the purple and green vertices. The blossom of tenacity yellow with this base is all but this vertex. Is that okay? So blossoms may not be as simple as this. Things can be even more complicated. Like, this bridge is of tenacity 9 and this one is of tenacity 11. The blossom of tenacity 9 with this base is the green vertices, and the blossom of tenacity 11 with this base is the purple and the green vertices. Is that okay? You are thinking.

>> : So why are the green vertices part of that [inaudible]?

>> Vijay Vazirani: Because I just defined the blossom like this: any vertex whose tenacity is at most t and whose base is b is in the blossom. So if I want the blossom of tenacity 11 with this base: these guys have tenacity at most purple, and they have base this guy, so I have to say that these green and these purple vertices are in this blossom of tenacity 11. It'll all make sense in a moment. I mean, everything is going to come together slowly, slowly -- what's the reason for all this and... But if there are any questions, I'll be happy to answer them. Okay. Sorry. So this was vertex b. So this is -- yeah, okay, never mind. Okay, now this is a vertex v. Its base is this. The base of its base is this; I'm going to call this the base square of v, the base of the base. And the base of the base of the base of v is this, base cube of v. Okay? And here's a theorem. Every even level v or odd level v path must go through the base, base square, base cube and so on. And this is very important. We'll apply all of these facts to the algorithm. Anything? Okay.

>> : This order it has to...

>> Vijay Vazirani: Sorry?

>> : Does it have to go in this order?

>> Vijay Vazirani: It'll go through them in order. Yeah. It'll go through the biggest base and then the next smaller, until it hits the unmatched vertex. As you go to bigger and bigger bases, you'll go lower and lower in level and higher and higher in tenacity, until you get to the unmatched vertex.
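The base, base square, base cube chain can be read off base pointers directly; a sketch assuming `base[v]` holds the base recorded for v (`None` at the outermost level). By the theorem just stated, every even level v or odd level v path passes through every vertex this walk yields.

```python
def base_chain(v, base):
    """Yield base(v), base^2(v), base^3(v), ... up the blossom nesting."""
    while base.get(v) is not None:
        v = base[v]
        yield v
```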
Now I'm going to define the tenacity of an edge. If the edge is unmatched, it is the even level of u plus the even level of v plus 1, so that this whole thing forms some kind of a walk. But we'll come to its significance in a moment. If it is a matched edge, then it's the odd level of u plus 1 plus the odd level of v. Okay? So here in this graph, remember these vertices were all of tenacity 9? And in fact these edges are all of tenacity 9, because this edge has tenacity 9: it's 4 plus 4 plus 1. This edge is odd plus odd plus 1, so 5 plus 3 plus 1, and so on. Here, yeah, the edges are marked with the color of their tenacity, just to make the picture slightly prettier. So these are the green tenacity edges, these are the purple tenacity edges, these are the yellow tenacity edges.

>> : So are they just finding the [inaudible] structure of the...

>> Vijay Vazirani: Hmm?

>> : Are they just finding this [inaudible]...

>> Vijay Vazirani: Blossom structure, yeah.

>> : Blossom structure.

>> Vijay Vazirani: Yeah. Yeah. Well, you'll see what's going on. This tenacity of the edge is key. Okay, so let u, v be a bridge of tenacity t. Then I'm going to call -- this is an informal definition right now; I'm going to give you a formal definition a little bit later. So informally, the support of this bridge is all the vertices of tenacity t found in the double depth-first search when you go down from u and v -- all the vertices of tenacity t that you encounter. And these are, in some sense, the vertices that define the new blossom -- in some sense, with many grains of salt thrown in. Okay? So the bridge is responsible for assigning max levels to these vertices. Okay? So the support of this green bridge is the green vertices. The support of the purple bridge is the purple vertices. And the support of the yellow bridge is the yellow vertices. Okay.

>> : So when you say the [inaudible]?

>> Vijay Vazirani: Well -- well...

>> : [Inaudible]...

>> Vijay Vazirani: You will find it, yeah. If you go from the 2 endpoints, you will encounter all these vertices. They should get their max levels right now. So what do I mean by "right now"? That is this thing. So there are 2 main ideas in the algorithm. One is double depth-first search, which I already told you about. But the other is this very precise synchronization of events, and this is what needs to be described next. So what is the algorithm? What it is doing is some kind of an intertwined breadth-first search and double depth-first search. The breadth-first search step is no different from the bipartite case. All it does is, at search level i, you search from all the vertices of level i, along the edges of the right parity. Any vertex encountered that way which has not gotten a level yet gets level i plus 1. And whether that's even or odd depends on the parity. Okay? And then, in this search level itself, we will look at all bridges of tenacity 2i plus 1 -- it'll become clear in a moment why it should be 2i plus 1 and not anything else -- and we run double depth-first search on each of these bridges to find their supports and give those vertices their max levels. So basically, if these are breadth-first search steps and these are double depth-first searches, it's all intertwined like this: one breadth-first search step, then one double depth-first search or many double depth-first searches; then one entire breadth-first search step, and then one entire sequence of double depth-first searches. Okay?

>> : So these double depth-first searches of a fixed tenacity are going to be touching disjoint sets of edges?

>> Vijay Vazirani: Yes. Good point. Good point. There's a lot to be said about that. But the way we'll define blossoms will be slightly different from this [inaudible] definition because -- okay, ask me this in a moment, and I'll clarify it completely.
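Before the MIN routine itself, here is the intertwining as a skeleton. It is only a sketch: `min_step` stands for the breadth-first step about to be described, `ddfs_support` for the double depth-first search from earlier, and every name is my own.

```python
from collections import defaultdict

def edge_tenacity(u, v, matched, evenlevel, oddlevel):
    """evenlevel(u) + evenlevel(v) + 1 for an unmatched edge,
    oddlevel(u) + oddlevel(v) + 1 for a matched edge."""
    lvl = oddlevel if matched else evenlevel
    return lvl[u] + lvl[v] + 1

def search_phase(graph, mate, evenlevel, oddlevel, minlevel, maxlevel, n):
    bridges = defaultdict(list)          # bridges bucketed by tenacity
    for i in range(n):                   # search levels 0, 1, 2, ...
        # MIN: find every vertex of min level i+1.  Each bridge goes into
        # bucket t as soon as its tenacity t is known; it will be processed
        # at search level (t - 1) / 2, never earlier, never later.
        for (u, v, matched) in min_step(graph, mate, i):
            t = edge_tenacity(u, v, matched, evenlevel, oddlevel)
            bridges[t].append((u, v))
        # Then, in this same search level, run double depth-first search on
        # exactly the bridges of tenacity 2i+1; their supports are the
        # vertices of tenacity 2i+1, and max level = tenacity - min level.
        for (u, v) in bridges[2 * i + 1]:
            for w in ddfs_support(u, v):
                maxlevel[w] = (2 * i + 1) - minlevel[w]
```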
Okay, so the purpose of this routine, MIN, is to find all vertices having min level i plus 1. And this is straightforward. I mean, you know, 2 looks at 3, 3 looks at 4, and so on. Right? Now here comes -- when we search from 4 to find 5, we come across this edge. This edge is not going to give a min level to this guy -- oops -- it is not going to give a min level to this guy, because it already has a min level. Okay? And this edge is not going to give a min level to this guy either, so it is not a prop. So what is it? It's a bridge. What is its tenacity? We know it. The edge is unmatched, so it must be even plus even plus 1. So it must be 9. Okay? So when should we do double depth-first search on it? Well, the formula says (9 minus 1) over 2, which is 4. So, having done MIN at level 4, we should come here and do double depth-first search on that bridge. Okay? And these double depth-first searches -- their purpose is to find all vertices having tenacity 2i plus 1, every one of them, and give them their max levels, which will be this tenacity minus their min levels. Okay? So at search level 4, double depth-first search is done here to find the support of this bridge, and these vertices get their max levels. And that way, at search level 6, this guy can label this one 7, and so on. Now, I claim that proving this is straightforward. There is nothing different from the bipartite case; it's a straightforward induction. If you've done the right thing so far, and you look from all the vertices that have min level i along the correct parity, you will find all the vertices of min level i plus 1. There are no mysteries there. What about proving this claim, that we do find all vertices having tenacity 2i plus 1? Well, first of all we have to show that double depth-first search does find the support of the bridge -- and assume that we have done this on the side, separately. The other thing we must do is to show that every bridge of tenacity 2i plus 1 has been found by this time. By the time MIN finishes, all bridges of tenacity 2i plus 1 have been found; they are in a list, and we can start processing them one by one. And when we are done processing all of them, we will have found all the vertices of tenacity 2i plus 1. So let's see how hard this claim is. So here is an edge of tenacity 19. Obviously we would like to do double depth-first search on it at search level 9. Let's see if its tenacity is known before search level 9. It turns out that the tenacity of this endpoint is 13, so we know this even level at search level (13 minus 1) over 2, which is 6. This one is 15, so at (15 minus 1) over 2, which is 7, we know this even level. So at search level 7 itself we know the tenacity of this fellow, 19 -- well before search level 9. And at search level 9 we can do double depth-first search here and find these 4 vertices, which are of tenacity 19. Okay? There's one more interesting example. What happens when this vertex hits this vertex? Remember, in the bipartite case, when a higher level vertex hit a lower level vertex, we said that this edge is useless, because why would you take such a path when you can take a shorter path like this? Right? Should we just throw this edge out? And the answer is no, because we may be able to complete this blossom and thereby use this edge going that-a-way and, you know, find something there.
Okay? So what is happening here? It turns out this even level is 8, and this is a bridge of tenacity 13: 8 plus 4 plus 1. And let's see whether it gets its tenacity by search level 6, when we should do the double depth-first search. And it will, because at search level 5 we will do double depth-first search on this bridge and assign this one 8, and so we know its tenacity is 13. We keep it in the right bucket, and at search level 6 we do double depth-first search on this bridge and find max levels for all these guys. Right? And in general -- I mean, I gave you some examples just to provide [inaudible], but here's the proof that we will have found all bridges of tenacity 2i plus 1 by search level i. The reason is: pick any edge u, v. We can show that both endpoints of this edge will have tenacity at most that of the edge u, v. And if the tenacity of the endpoint is strictly smaller, then we know both its levels by the right search level. If the tenacity of v is exactly equal to that of u, v, and u, v is a bridge, then the relevant level is the min level of v, which is known by the right search level. So in either case, the two endpoints of the bridge are known well before we need to process it, so it'll be processed at the right time. Okay, so we can cross this off. We have done this: all bridges of tenacity 2i plus 1 have been found. And at this point I always stop to ask if we are done. And the obvious answer to this is no. Are we done in terms of proving this claim? So I have sort of argued to you that double depth-first search will do the right thing, and that all the bridges of tenacity 2i plus 1 will be found by this time. But what else is needed to complete the proof by induction, to show that all vertices having tenacity 2i plus 1 will get their tenacity if you do all of this stuff, namely process all these bridges at this time in the right way? What else do we need to show?

[ Silence ]

And I want to give it a minute, because this is the central point of my second paper. And I've never obtained an answer on this point in all the talks that I have given so far.

[ Silence ]

>> : You haven't obtained the correct answer.

>> Vijay Vazirani: Sorry?

>> : The correct answer.

>> Vijay Vazirani: No, not even an answer.

[ Silence ]

I mean, if you want to show that every vertex of tenacity 2i plus 1 will be encountered by just doing double depth-first search on all these bridges, what must we show?

[ Silence ]

What we must show is that every vertex of tenacity 2i plus 1 lies in the support of some bridge of tenacity 2i plus 1. How do we know that? Well, in the case of MIN, for instance, we know that if somebody has min level i plus 1, there must be a vertex of level i right next to it. Right? So the culprit is adjacent to it; you can blame it. But in this case the culprit is far away. Who knows where it is, or whether it is or it is not. And the point is that it is, and the theorem is that on every max level v path there is a unique bridge of tenacity equal to the tenacity of v, and v lies in its support. So as I said, this is the central theorem of the second paper, and there's a whole slew of development that needs to be done to arrive at a proof of it. And I'll sketch the proof later on the board if people are excited and there's time. But the proof of this also requires this definition of blossoms; it requires us to say many things about how minimum-length alternating paths go through blossoms. And all of those properties go into proving the algorithm.
And we'll use those properties first, because I want to complete the algorithm, and then I can go into the proof of this theorem. And at this point I can give you a purely graph-theoretic definition of the support of a bridge. Oops, again. Sorry. Call this w. So the support of bridge u, v is all those vertices w such that their tenacity is the same as the tenacity of this bridge and there is a max level w path containing this bridge. Okay? So at this point -- sort of similar to your question -- are the supports of all the bridges disjoint? And the answer is no. Like, these 2 bridges have tenacity 7, and they both contain these tenacity 7 vertices in their support. So in this case, whose support will this lie in, in the algorithm? It doesn't matter -- whichever is processed first. If you process this one first, you'll form a blossom node here and assign these guys 7. And then if you process this one next, you'll come here and jump directly here, and you won't go through these all over again. And of course vice versa if you go through the other bridge first. But, you know, those are little tie-breakings; we don't care about that. And there is one more point to be made, which I'll say later when I come to the whiteboard. Okay, another question: can there be such an edge, this green edge? [Inaudible] there can be. Right? Nobody says such and such edge is not around. We are non-bipartite; every edge can be present. But what is it? It's certainly not a prop, because this guy gets its min level from here and this guy gets its min level from here. So it's not a prop, so what is it? It's a bridge. But it's a bridge to nowhere. Right? What kind of bridge is it? It has empty support. It's a little worse than Sarah Palin's bridge. Right? This is another bridge to nowhere. So for instance this one has tenacity 17, and there's no vertex of tenacity 17 which has this on its max level path. And the point is that you will have to process this bridge at the right search level, but from both endpoints you'll immediately jump to this guy and realize that it lies within the blossom and there's no support. Another point, a very important point, is this: I said this was a bridge of tenacity 13 and should be processed at search level 6, (13 minus 1) over 2. But we know its tenacity at search level 5, when we process this bridge. Right? Why do we not do double depth-first search on it right away, at search level 5? And the reason will become clear in this example. So here this is a bridge of tenacity 11. And at search level 5 it assigns 8 here. And at search level 6 we know this tenacity, which is 15. The correct thing is to process it at search level (15 minus 1) over 2, which is 7. But suppose we say, "Okay, at search level 6 we already know its tenacity; why not process it?" We do double depth-first search from these two endpoints and assign 15 to all these, which is correct. But here it is not correct, because these two guys should get -- they have a tenacity of 13, through this bridge. And the point is that we were assuming that all the vertices that we encounter in double depth-first search which haven't got a tenacity must get this tenacity. Right? And that we can assume only if we do this double depth-first search at the right search level. So for instance here, we should process this bridge at search level 6; we should find the tenacities of these 2.
And then at 7, when we process this one, these 2 already belong to this blossom, so we will jump directly here, never even encounter these 2, and give 15 to only the right ones. And so all of that is needed to make sure that the proof by induction goes through. And this is what I meant by the precise synchronization of events. There's nothing different that can be done here. Okay, so one more thing, which is what I keep coming back to: we have to mark blossoms so that we can jump to their base efficiently. Now look what happens here. This blossom is nested inside this blossom, actually; this is a bigger tenacity blossom. But when we come down here, we'll come in two steps here and two steps here. And this nesting could be order n deep, and we may have to come down this chain of blossoms order n times. And that's order n squared work already. So what should we do? Well, there are many things to do. We need to compute the base of the maximal blossom at each point. And either use the set-union algorithm of Bob Tarjan's, which will multiply the running time by an inverse Ackermann function. Or we can use the Gabow-Tarjan result, which does this in linear time for a special class of union-find problems. But then it resorts to a RAM model of computation, which assumes that manipulating log n bit numbers takes unit time, which you may or may not like. This is the [inaudible] model. But in the 1980 paper, we claimed that path compression suffices, because once you do this once and everybody points to their base star, you don't have to go through -- if you just jump to the lowest base, you won't do that much work. And there are tons and tons of vertices and edges here to charge it to. But this is the one claim I couldn't verify in 1989 when the paper was written. But recently Bob Tarjan himself has gotten excited about this, and hopefully there'll be a resolution. I don't know. It'll be good to clear up this last one thing.
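The path compression claim can be sketched like this, assuming `base[v]` is the pointer recorded when v's blossom was formed (`None` if v is in no blossom yet). Whether this compression alone keeps the whole algorithm linear time is exactly the claim that remained unverified.

```python
def base_star(v, base):
    """Base of the outermost blossom containing v, with path compression."""
    root = v
    while base.get(root) is not None:   # walk up the nesting to the
        root = base[root]               # outermost base
    while base.get(v) is not None:      # second pass: point the whole chain
        nxt = base[v]                   # directly at that outermost base
        base[v] = root
        v = nxt
    return root
```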
Okay, so finally, at some point, at some search level, we'll search from a bridge and encounter 2 unmatched vertices. And at that point we need the path. So assume that we can find this path and this path somehow, by the same process. And we need the path from here to here. Well, so what we do is -- I'm just concentrating on this part -- we want a path from here to the base. So for this purpose the blossom also points to the 2 endpoints of the bridge. This guy knows he's red. He goes to the red endpoint and finds a path from here to v using red edges, and from here to the base using blue edges. And in the second search -- sorry -- in the second search, this guy jumps directly to its base, okay, and again needs a path from here to here. But this guy knows he's red, and he needs a red path, which is easy. Okay? And, you know, the extra properties we got from double depth-first search are used here. Remember? And so we got this whole path here. We patch it together with these two paths, and that is guaranteed to be a minimum-length augmenting path, because of theorems that I haven't stated, which will come when we prove the main theorem. Okay? Then we remove this path and we augment along it. But there's a lot more stuff left in this blossom. Right? There are many, many more vertices and edges left in this blossom. And are we sure that we have the right structures left over for those vertices? How will the next path be found through the half-eaten blossom? And the point is, we will never need those, because of the theorem that base, base square and so on occur on every even level v and odd level v path. Right? Every even level v and odd level v path uses base, base square and so on. So once you've gotten rid of this part, the rest of the vertices in the blossom can never reach this unmatched vertex with a shortest path, because they also need base, base square, base cube and so on, and all of those are thrown out. So that's the point: these are all removed, and the half-eaten blossom itself can be thrown out. And that's how, once we've got this one path, we can throw out all this structure and go on to the next, and get a maximal set of disjoint paths, just like in the bipartite case.

So now, I made two outrageous-sounding claims, maybe, in my abstract: that this algorithm is simple and natural. I'd like to ask you, you know, given the difficulties and given the structure that has to be overcome, what could be simpler or more natural than this? You do have to find the min levels. You do have to find the max levels. I mean, I don't have a theorem to this effect, but... And my second claim was that it can fit into one's mind as a single thought. And for that I have a visual reason, because I really see it as a one-thought algorithm. Which is that, you know, you do this: you find these bridges and find max levels, and find bridges and find max levels, and at some level you find [inaudible] vertices, and that's it. I mean, it sits in front of you as one thought. I think. That's all. Thank you. I think I have 15 minutes for the proof if somebody is interested.

>> Nikhil Devanur Rangarajan: Okay. Any questions?

>> : So for doing this [inaudible]?

>> Vijay Vazirani: So if this were to work, then so would the bipartite. You know, [inaudible]. And given how many years have passed -- since 1973 to now -- probably unlikely. So I presented this algorithm in Moscow last month, and there was a colleague of [inaudible] there, Mikhail [inaudible]. He came up with some very nice questions about, maybe, embeddability of the structure into metric spaces, and some properties there. So those I thought were the most relevant future directions. I don't think this will help weighted matching, or -- I don't know. One more thing he mentioned was that this algorithm was taught by [inaudible] in 1984. And up to that point I was under the impression that nobody had understood it besides the authors; so now I was corrected, and there was at least one person who had understood it. So I can get into a couple of theorems if you wish. Yeah?

>> : [Inaudible].

>> Vijay Vazirani: [Inaudible]. Should we -- I just -- I have about 10 more minutes. Should I just do this? Finish it off?

>> : Just take it offline.

>> Vijay Vazirani: Take it offline. Okay. Okay.

>> : Thank you [inaudible].

[ Audience applause ]