>> Yuval Peres: Good morning everyone. We are happy to have Michael from Stanford tell us about algorithms for bipartite matching problems and some applications. >> Michael Kapralov: Thank you, Yuval. So I will tell you about algorithms for bipartite matching problems with connections to streaming and sparsification. Let me start with the motivation. The need to process modern massive datasets imposes constraints on the type of algorithms that we can use: very often we have constraints on the space usage of the algorithm, and also very often on the type of access that the algorithm can have to the data. So, for example, we can no longer assume that we can just load the whole input into memory and have random access to it. This raises the need to design succinct representations of the input that preserve, perhaps approximately, the properties of the input that we care about. For graph algorithms, which are the main topic of the talk, cut preserving graph sparsification has become an important way to get a concise representation of the input and has become a fundamental part of the algorithmic toolkit. Since its invention in 1996 by Benczur and Karger it has found numerous applications to undirected flow and cut problems. However, sparsification for directed graphs is still a challenging open problem. So this talk is centered around the following topics. First I will talk about some algorithms for bipartite matching problems that use sparsification and random walks in novel ways. Here we should note that matchings are in a sense midway between directed and undirected flow. Then we will talk about the question of how we can actually implement cut preserving graph sparsification in modern data models. Then we will also talk about a new notion of sparsification that we have for bipartite matching problems, and if time permits I will say a few words on some new connections between different notions of sparsification, in particular between spectral sparsification and spanners. More precisely, this talk will have three or four parts depending on how much time I have. I will first present some sublinear time algorithms for finding perfect matchings in regular bipartite graphs. Then I will talk about a new notion of sparsification related to matchings in the streaming model and show some applications. Then I will mention some work that we did on getting a distributed streaming implementation of cut sparsification, and finally some connections between spectral sparsification and spanners leading to efficient algorithms for spectral sparsification. So in the first part I want to talk about sublinear time algorithms for finding perfect matchings in regular bipartite graphs. We will get an algorithm that runs in time O(n log n). Let me start with the background. Here we have a bipartite graph G. The sides of the bipartition will be denoted by P and Q, and as a consequence of regularity the sizes of P and Q are the same, which we denote by n. The number of edges is denoted by m. Recall that the graph G is d-regular if the degree of every vertex is equal to d, so in particular the number of edges is just n times d. A subset of edges M is a matching if no two edges in M share an endpoint, and M is a perfect matching if M is a matching and the size of M is exactly n, that is, M matches all of the vertices in the graph. It is easy to see using Hall's theorem that every d-regular bipartite graph has a perfect matching.
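As an aside, the standard counting argument behind this last claim is short enough to write out: for any set A on the P side, all d|A| edges incident to A land in its neighborhood N(A), which gives Hall's condition.

```latex
% Hall's condition for a d-regular bipartite graph G = (P \cup Q, E):
\[
  d\,|A| \;=\; e\bigl(A, N(A)\bigr) \;\le\; d\,|N(A)|
  \quad\Longrightarrow\quad |N(A)| \;\ge\; |A|
  \qquad \text{for every } A \subseteq P,
\]
% so Hall's theorem yields a perfect matching.
```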
Finding one such matching in a d-regular bipartite graph is the subject of this part of the talk. These graphs have been studied extensively in the context of expander constructions, routing, scheduling and task assignment, and have several applications in combinatorial optimization. In particular, I will also show applications to two problems: the first is edge coloring of bipartite multigraphs and the second is Birkhoff-von Neumann decompositions of doubly stochastic matrices. This problem has actually seen quite a bit of algorithmic history, close to 100 years. The first algorithm can be dated all the way back to König in 1916, when König gave an algorithmic proof of existence. At that time of course people were not thinking about algorithms, but one can see that König's proof runs in O(mn) time. In 1973 Hopcroft and Karp gave their famous algorithm for finding maximum matchings in general bipartite graphs that runs in time O(m√n). In 1982 Gabow and Kariv considered the question restricted to regular graphs and obtained a very beautiful linear time algorithm for finding a perfect matching when the degree of the graph is a power of two. This is a really nice algorithm; in fact it doesn't use augmenting paths. Instead it uses Euler tours to decompose the graph into regular graphs of smaller degree. After that there were three improvements over about 20 years: first by Cole and Hopcroft, then by Schrijver, and finally by Cole, Ost and Schirra, who in 2000 obtained a linear time algorithm that works for general degrees, so they removed the assumption that d is a power of two. This algorithm is extremely efficient; linear time is just the time that we need to read the input. What else can we hope for here? The question that we ask is: do we actually need to read the whole input? Can we go sublinear here? For sublinear algorithms it is of course important to fix the format in which the data is given, so for the purposes of this talk we assume that the graph is given in adjacency array representation, which means that each vertex has an array of incident edges. A natural conjecture would be the following. What if we take a random sample of the edges of the graph, where each edge is present independently with a certain probability? Maybe we can prove that for certain sampling rates a perfect matching will be preserved in the sample with high probability. If we could do that, we could then run some standard algorithm like Hopcroft-Karp for general graphs and maybe get an improvement. Well, this is a reasonable conjecture and furthermore it turns out to be true. This is something that we proved in 2009. We show that it is sufficient to sample a uniform subgraph of a certain size; the size is given by an expression that depends on n and the degree, but the main point is that it is never bigger than about n√n. If we take such a sample of this regular graph, then we show that a perfect matching will be preserved in the sample with high probability. Using Hopcroft-Karp, in the right regime for the sampling, this gives us an algorithm with runtime n^1.75, which is sublinear for dense enough graphs, so this was a first result. So we do have a sublinear algorithm, but n^1.75 doesn't look like a natural stopping point. It seems that this should be improvable, and also it seems that if uniform sampling works then most probably non-uniform sampling can help improve the runtime. That is also correct.
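The uniform-sampling approach just described can be pictured with a short sketch. This is only an illustration: the sampling probability p below is a free parameter, not the expression from the paper, and the Hopcroft-Karp step is delegated to networkx, assuming its bipartite hopcroft_karp_matching routine.

```python
import random
import networkx as nx
from networkx.algorithms import bipartite

def sample_and_match(edges, left, right, p, seed=None):
    """Keep each edge independently with probability p, then run Hopcroft-Karp
    on the sampled subgraph.  The claim discussed above is that for a suitable
    p (depending on n and d) the sample still contains a perfect matching with
    high probability."""
    rng = random.Random(seed)
    H = nx.Graph()
    H.add_nodes_from(left, bipartite=0)
    H.add_nodes_from(right, bipartite=1)
    H.add_edges_from(e for e in edges if rng.random() < p)
    return bipartite.hopcroft_karp_matching(H, top_nodes=left)

# Toy usage: a 2-regular bipartite graph on 3 + 3 vertices; p = 1 keeps all edges.
left, right = ["p0", "p1", "p2"], ["q0", "q1", "q2"]
edges = [("p0", "q0"), ("p0", "q1"), ("p1", "q1"),
         ("p1", "q2"), ("p2", "q2"), ("p2", "q0")]
print(sample_and_match(edges, left, right, p=1.0))
```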
We show that a two-stage sampling scheme, that is, uniform sampling followed by a non-uniform sampling process, together with a specialized analysis of the runtime of Hopcroft-Karp on these subsampled graphs, gives us a runtime which is worst-case n^1.5 and in fact is linear in the size of the uniform sample. At this point n^1.5 is a fairly natural runtime for bipartite matching algorithms, especially given the Hopcroft-Karp algorithm, and furthermore one can see that this runtime is optimal if we commit to the scheme that we are using, that is, uniform sampling first and then running the Hopcroft-Karp algorithm. However, the structure of worst-case examples suggested to us that perhaps we can get an improvement if we somehow manage to combine the sampling process and the process of augmentation. This in fact can be done, and this is the main result of this part. We show that there exists a randomized algorithm for finding a perfect matching in a d-regular bipartite graph, as long as the graph is given in adjacency array representation, that takes O(n log n) time both in expectation and with high probability. So first… >>: [inaudible]. >> Michael Kapralov: Okay. >>: Presumably the proof has two stages? >> Michael Kapralov: Yes. In the proof I will show you the expectation part, and the high probability bound will follow easily. Let me note the following. First, the runtime of this algorithm is independent of the degree of the graph, so basically we are independent of the size of the input. Furthermore, the runtime is within a logarithmic factor of the output complexity, because we need order n time just to output the matching. So now I will show you the algorithm, which is in fact quite simple, and give the analysis. The algorithm will use augmenting paths to repeatedly increase the size of the currently constructed matching. At this point let me remind you that an augmenting path with respect to a partial matching is a path that starts at an unmatched vertex on one side of the graph, the P side, and then alternates between unmatched and matched edges until it ends at an unmatched vertex on the Q side of the graph. We need a randomization of this process, and a very natural randomization is the following: instead of taking an arbitrary step, at odd steps let's take a uniformly random outgoing edge which is unmatched, and still take matched edges at even steps. This is something that we refer to as the alternating random walk. Let me give an example. Here we have a 4-regular graph and a matching of size 3. The green nodes are unmatched, the blue nodes are matched. The alternating random walk starts at a uniformly random unmatched node on the P side, a green node, takes a uniformly random outgoing edge, takes the matched edge back, and proceeds in this way. Note, for example, that it can easily visit certain vertices more than once. Eventually it arrives at an unmatched node on the Q side of the graph. Great. It should be noted that if we have the sequence of steps taken by the alternating random walk from the P to the Q side, then we can get an augmenting path from this sequence of steps simply by removing loops. Here we have a loop; if we remove it, we get a length three augmenting path. And now our algorithm works as follows. We start with the empty matching and then, repeatedly for k from 1 to n, we run the alternating random walk with respect to the matching that we have constructed so far, wait until it hits an unmatched vertex on the Q side of the graph, augment using the augmenting path that we get from this walk, and proceed.
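Before the analysis, here is a self-contained sketch of the walk-and-augment loop just described, for a simple d-regular bipartite graph given as adjacency lists. The function and variable names are mine, and the loop-removal step, keeping only the last departure from each visited vertex, is one concrete way to realize the "remove loops" step mentioned above.

```python
import random

def random_walk_perfect_matching(adj, P, seed=None):
    """Sketch of the alternating-random-walk algorithm: adj[u] lists the
    neighbours of u, P is the left side.  Returns a dict mapping each P
    vertex to its partner in a perfect matching."""
    rng = random.Random(seed)
    match = {}                                    # vertex -> matched partner
    for _ in range(len(P)):
        start = rng.choice([u for u in P if u not in match])
        walk = [start]
        while True:
            u = walk[-1]                          # current P-side vertex
            # Odd step: uniformly random *unmatched* outgoing edge.
            v = rng.choice([w for w in adj[u] if match.get(u) != w])
            walk.append(v)
            if v not in match:                    # reached an unmatched Q vertex
                break
            walk.append(match[v])                 # even step: follow the matched edge back
        # Remove loops: from each vertex, keep only its last departure.
        last = {x: i for i, x in enumerate(walk)}
        path, i = [], 0
        while i < len(walk):
            path.append(walk[i])
            i = last[walk[i]] + 1
        # Augment along the resulting alternating path.
        for j in range(0, len(path) - 1, 2):
            match[path[j]] = path[j + 1]
            match[path[j + 1]] = path[j]
    return {u: match[u] for u in P}

# Toy usage on a 2-regular graph (a 6-cycle).
adj = {"p0": ["q0", "q1"], "p1": ["q1", "q2"], "p2": ["q2", "q0"],
       "q0": ["p0", "p2"], "q1": ["p0", "p1"], "q2": ["p1", "p2"]}
print(random_walk_perfect_matching(adj, ["p0", "p1", "p2"], seed=0))
```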
So I will now show that the algorithm above finds a perfect matching in O(n log n) time. To do that it will be convenient to introduce the following concept: we define the matching graph H, which depends on the graph G and a partial matching M, in the following way. Let me illustrate this. Here again we have a bipartite graph and a matching M of size 3. Let me first orient all edges from P to Q. Then I will add a source and a sink. The source is connected to the unmatched vertices on the left, and these edges are drawn thick because each of them in fact stands for d parallel edges. The sink is connected to the unmatched nodes on the right in the same way. Now let's look at the matched edges, and we will just contract all matched edges into supernodes. So this is our matching graph H. Our algorithm can be formulated in a very simple way in terms of this matching graph. What we are doing is the following: we start with the empty matching and then repeatedly run the simple random walk from the source in this matching graph and wait until it hits the sink. Once we have the sequence of steps, we augment using the path that we obtain from it. So what we need to show, the main lemma in our analysis, says the following: if we have our d-regular bipartite graph and a matching M that leaves 2k nodes unmatched, so k nodes on each side, then the expected time until the simple random walk in the matching graph, started from the source, hits the sink is at most 1 + n/k. So when we start with a very small matching, a lot of nodes are unmatched and k is large, so it will be extremely easy to find an augmenting path with respect to this matching. It will get progressively harder, but the cumulative effort will be small anyway. So now let me prove this statement. The proof will be very simple, and it will be convenient to modify the matching graph a little bit. Let's look at the source and the sink and merge them into one supernode. Okay. So the process we were running on the matching graph was the simple random walk from s to t. Now this directly corresponds to starting the random walk at this new supernode s and waiting until it gets back to s. What we need to analyze, then, is the expected return time to this vertex s. What really helps is the fact that the graph that we are getting is a balanced directed graph. As is probably obvious to most of you, this will be easy to analyze, and in fact we know that for a balanced directed graph the stationary distribution of the simple random walk can be described in a simple way. So first let's check that it is actually balanced. We have several types of nodes here. There is the supernode, and then there are these blue nodes that correspond to the matched edges that we contracted. In this case they have in-degree 3 and out-degree 3, and in general d - 1. The green nodes have out-degree d and in-degree d, because the thick edges stand for d parallel edges, so they are balanced too. Good. So we have a balanced directed graph. Now we know that the stationary distribution of the simple random walk on such a graph is uniform over edges, and so the mass at a vertex is proportional to the vertex's out-degree.
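For the record, the two facts just used, the stationary distribution of the simple random walk on a balanced (Eulerian) directed graph and the standard return-time formula, read as follows, with m_H denoting the total number of edges of H:

```latex
% Simple random walk on a balanced directed graph H with m_H edges:
\[
  \pi(v) \;=\; \frac{\deg_{\mathrm{out}}(v)}{m_H},
  \qquad
  \mathbb{E}\bigl[\text{return time to } s\bigr]
  \;=\; \frac{1}{\pi(s)}
  \;=\; \frac{m_H}{\deg_{\mathrm{out}}(s)} .
\]
```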
Now, what we are interested in is the return time to the special node s, and as just noted the expected return time is the inverse of the stationary probability at s, so now we can prove the result that we want. The out-degree of the supernode s is d times k, where k is the number of nodes that were left unmatched. So intuitively, when the matching is small and a lot of nodes are unmatched, this random walk will spend a lot of time at s, and that is good for us. Now we can do the calculation, and the calculation shows that the quantity is at most 1 + n/k. This proves the main lemma. Now it is easy to get the runtime analysis, because we simply have n steps and each step takes expected time at most 1 + n/k. So the runtime is bounded by the sum of these quantities for k from 1 to n, and that is exactly n times 1 plus the harmonic number of n, which is O(n log n) again. Now, this was the expected time analysis. To get the high probability result we can just apply standard techniques: truncate the random walks appropriately and use concentration. Okay. Great. So this was the O(n log n) algorithm for recovering one perfect matching. Now let me show some applications. The first application is to edge coloring of bipartite multigraphs, and here we get an extremely simple O(m log n) algorithm. This is slightly slower than the best known, which is O(m log d) where d is the degree, but the algorithm is so simple that I do want to state it. The algorithm works in two steps. The first step is standard: we take the bipartite multigraph and transform it into a bipartite regular graph. Now in the next step we can simply take out matchings from the d-regular graph that we get, one by one. Each matching will take n log n time to find, and we will be done in O(m log n) time in total. Here I am skipping one point: when we run the alternating random walk, it is important to be able to sample a uniformly random unmatched outgoing edge efficiently. This has to be verified, and it can indeed be shown that the sampling can be implemented in constant amortized time. What seems very nice about this is the fact that our algorithm for recovering one perfect matching is extremely efficient, taking n log n time irrespective of the size of the input, and so we can find such edge colorings in a very simple manner just by taking out matchings one by one. Another application is to decompositions of doubly stochastic matrices. Here, if we are given an n by n doubly stochastic matrix with m nonzero entries, then the Birkhoff-von Neumann decomposition theorem says that every such matrix can be represented as a convex combination of at most m permutation matrices. The question is: can we recover such a decomposition efficiently? Let me just sketch how this works and… >>: [inaudible]. >> Michael Kapralov: So b is the number of bits that we use to represent the entries of the matrix. Since it is a doubly stochastic matrix, we need to specify what kind of representation we use. There are some known algorithms that find such a decomposition; for example, they take O(mb) time to find a single matching in the support of the doubly stochastic matrix, and they take O(mb log n) time to compute the whole decomposition. I just want to say that we have a very simple algorithm with a very efficient runtime here, because we can view this matrix as a multigraph and essentially the same analysis will go through.
So we can run our algorithm as long as we can implement the sampling stage, which is sampling a uniformly random unmatched outgoing edge. This is a little harder in this case than in the edge coloring case, but we can in fact implement it in O(log n) time, and we get some efficient algorithms. Let me skip this. Great. So this was the main algorithm and two applications. Now I want to mention some lower bounds. We prove two statements here. First, we prove that randomization is crucial to obtaining sublinear time algorithms; in particular, any deterministic algorithm has to take at least linear time. So the algorithm of Cole, Ost and Schirra, which finds a matching in time linear in the size of the input, is essentially optimal. Furthermore, we show that we cannot improve upon the n log n runtime if we want an algorithm that works with high probability. Essentially, what this shows is that while we cannot rule out the existence of an algorithm that finds a matching in O(n) expected time and succeeds with probability one half, say, if we want an algorithm that succeeds with high probability, then it has to take at least n log n time in the worst case. Great. So this completes the first part of the talk. I talked about sublinear time algorithms for finding perfect matchings in regular bipartite graphs, and some of them, at least the first ones, use sparsification. Now I want to spend a few minutes mentioning a different project that we worked on. This is about graph sparsification and how we can implement graph sparsification on modern data architectures, in particular on ActiveDHTs; I will define what that is in just a few minutes. Even though we used graph sparsification in the first part to obtain the first sublinear time algorithms, I didn't define it, so let me define it now. If we have a weighted undirected graph G, then a graph H is a cut sparsifier, an epsilon cut sparsifier, of G if all cuts in H are within a 1 plus or minus epsilon factor of the corresponding cuts in G. This is a great concept, because if H is sparse then we can use H instead of G in cut-based optimization problems, getting better runtimes. The famous theorem of Benczur and Karger, proved in '96, shows how to construct such sparsifiers. In particular, they show that one can compute a probability p_e for each edge e such that if the edges of the graph are sampled with these probabilities and the sampled edges are then weighted appropriately, the resulting weighted graph H will be a sparsifier of G with high probability. Furthermore, Benczur and Karger also gave a nearly linear time algorithm for finding these probabilities and hence constructing the sparsifier. Since 1996 this has found numerous applications to cut and flow problems, and in fact has become an integral part of the algorithmic toolkit, arguably alongside such fundamental primitives as BFS and DFS. So this motivates the need to obtain efficient implementations of sparsification in modern data models. The question that we ask here is: can one get an efficient implementation of cut sparsification in a distributed streaming setting? Ideally we want an algorithm that works in a single pass in a distributed streaming setting. To put this in perspective, one might think of the situation where the graph does not fit into the memory of one compute node. Our architecture for this will be ActiveDHT, which I will define in a few moments.
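The generic sample-and-reweight step behind this kind of sparsification is easy to write down in isolation; the sketch below simply takes the sampling probabilities p_e as input, since computing good probabilities (Benczur and Karger derive them from edge strengths) is exactly the nontrivial part.

```python
import random

def sample_and_reweight(weighted_edges, p, seed=None):
    """Importance-sampling step used in cut sparsification: keep edge e
    independently with probability p[e] and scale its weight by 1/p[e],
    so the weight of every cut is preserved in expectation.  The hard
    part, choosing good p[e], is assumed to be done elsewhere."""
    rng = random.Random(seed)
    H = {}
    for e, w in weighted_edges.items():
        if rng.random() < p[e]:
            H[e] = w / p[e]            # reweight so E[weight of e in H] = w
    return H

# Toy usage on a 4-cycle with unit weights; p[e] = 0.5 for every edge.
edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0, ("d", "a"): 1.0}
print(sample_and_reweight(edges, {e: 0.5 for e in edges}, seed=0))
```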
It should be noted that efficient implementations are known for the random access model and the one-pass streaming model, but we also want to be efficient in the distributed setting. So let me say a few words about what an ActiveDHT is, and before I do that I need to remind you of MapReduce. MapReduce is an immensely successful paradigm that has transformed off-line analytics and bulk data processing. In MapReduce, data is stored as key-value pairs in a distributed file system, and computations are sequences of iterations of map and reduce steps. Mappers and reducers are essentially processes running on compute nodes, and there is a programming paradigm that specifies how they interact. The main point here is that MapReduce is great for off-line data processing. ActiveDHT, on the other hand, may potentially become as important for online data processing as MapReduce is for off-line problems. ActiveDHT stands for active distributed hash table, and the hash table is active in the sense that besides supporting lookups, deletions and insertions, it also supports running arbitrary functions on key-value pairs. There are some examples of these systems implemented, for example Twitter's Storm system and Yahoo's S4, and the main applications are distributed stream processing and continuous MapReduce. So one might think of this as MapReduce where the mappers and reducers do not interact according to the rigid paradigm of iterations; rather, they have the ability to talk to each other continuously. These are fairly new systems, and in fact there are challenges in implementing them which have not yet been fully solved, such as the inefficiency of small network requests and robustness, but people are working on that. This is all that I will say about ActiveDHTs. What we are interested in here is constructing a sparsifier on an ActiveDHT, so let me sketch how this will work. To do that, I would first like to look at how standard efficient algorithms for constructing sparsifiers work. In general, there are two steps. First one needs to find the sampling probabilities p_e, and once we have the probabilities we can sample independently using these rates and weight the edges appropriately. The most important step here is, of course, how we find the probabilities, and at a high level the observation that we use is that one can estimate these sampling probabilities using a hierarchy of union-find data structures. The benefit of this is that union-find is fairly easy to distribute. Of course there are some challenges that we need to work on to make this work. One of them is the fact that we need to estimate connectivities and sample at the same time; we have to control the size of the sample when this happens, but this can be done. Another interesting point is that when we distribute the union-find data structure, we have to ensure two things: first, that our distributed implementation does not lead to excessive communication, and furthermore, we need to ensure some load-balancing properties. That is, not only do we want small total communication, but also the communication should be spread somewhat evenly across compute nodes. These are challenges that we can overcome, and let me just state what we get: an efficient distributed stream processing algorithm which computes a sparsifier on ActiveDHTs in one pass, with favorable space usage properties and good communication and load balance.
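Since the construction above leans on union-find, here is the plain sequential primitive for reference, a standard disjoint-set structure with path compression and union by size; how a hierarchy of these is sharded across an ActiveDHT with low communication is the part specific to this work and is not shown here.

```python
class UnionFind:
    """Standard disjoint-set structure (path compression + union by size).
    A hierarchy of these, one per sampling level, is what the construction
    above uses to estimate edge connectivities."""

    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False                       # already in the same component
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx                   # union by size
        self.size[rx] += self.size[ry]
        return True

# Usage: process streamed edges and track connected components.
uf = UnionFind(["a", "b", "c", "d"])
for u, v in [("a", "b"), ("c", "d"), ("b", "c")]:
    uf.union(u, v)
print(uf.find("a") == uf.find("d"))            # True: all four vertices connected
```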
So I have to skip most of the details here, but I am happy to chat off-line if somebody is interested. Great. So far I have been talking about sublinear algorithms for matchings and about cut sparsification. In the remaining time I would like to talk a little bit about a new notion of sparsification related to bipartite matching problems that we recently introduced, and I will show you some applications to approximating matchings in one pass in the streaming model. So let me introduce the definition. Suppose that we have a bipartite graph G. The sides of the bipartition are P and Q, and for simplicity we will assume that the sides are equal, so the size of P is the size of Q, equal to n. We call a subgraph H an epsilon cover of G if H preserves the sizes of matchings between any pair of subsets A of P and B of Q up to an epsilon n additive error. Here is an example. Suppose we have a graph G; this is the P side, this is the Q side. The condition that H is an epsilon cover says that whichever two sets A and B we look at, if we compute the maximum matching between the two in G and then compare it to the maximum matching in H, the maximum matching in H should be at most an epsilon n additive term smaller. Of course, the main question that we are interested in here is: what is the optimal size of an epsilon cover for a graph on 2n nodes, n nodes on each side? This question asks for the general trade-off: we are given n and we are given epsilon, so what is the optimal size of the cover? We will also be interested in the following twist of this question. Suppose that I want a sparse cover: I want to represent the matchings in the graph using few edges. So I constrain my cover to have Õ(n) edges, that is n polylog n edges, which is a standard notion of small, say, for streaming algorithms. Now the question becomes: what is the smallest epsilon for which an epsilon cover with Õ(n) edges always exists? These are the two questions that we are interested in. To the best of our knowledge there is no prior work on this, so I will just go to our results. We prove the following. On the positive side, we give an efficient construction of a one half cover of any bipartite graph G that has a linear number of edges. Furthermore, we show that this is in fact tight, in the sense that if we constrain the cover to have Õ(n) edges, n polylog n for any polylog, we cannot have a cover for epsilon smaller than one half. If you want an epsilon cover for epsilon smaller than one half, then for some graphs it will need to have at least n^(1 + Omega(1/log log n)) edges, which is significantly bigger than any n polylog n. So this essentially completely characterizes the second question that we asked, about the best approximation that we can get with few edges. For the general case, the general trade-off in epsilon, we show that the optimal size of an epsilon cover is essentially equal to the largest possible number of edges in a so-called epsilon Ruzsa-Szemerédi graph. This is a very interesting family of graphs that comes up in PCP constructions, property testing and additive combinatorics, and I will say a few words about them at the very end.
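For reference, the covering condition just defined can be written compactly as follows, where M_F(A, B) denotes the maximum size of a matching between A and B that uses only edges of F:

```latex
% H is an \epsilon-cover of the bipartite graph G on P \cup Q with |P| = |Q| = n if
\[
  M_H(A,B) \;\ge\; M_G(A,B) \;-\; \epsilon n
  \qquad \text{for all } A \subseteq P,\ B \subseteq Q .
\]
```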
So these are our results. The natural question is how good this is: what does it mean to have a one half cover? To put this into perspective, let me remind you what our main motivation is. The main motivation is finding approximate matchings in the streaming model. Here the edges of the graph are given to us in arbitrary order in a stream and we can only use Õ(n) memory. The question is: what is the best approximation factor to the maximum matching in the graph that can be obtained in a single pass over the data? In this context one might think that a one half cover may not be useful, because one half seems like the half approximation that we can always get by just keeping a maximal matching. That is in fact not correct, and we show that a one half cover roughly corresponds to an approximation factor of two thirds for matchings. So here are our results related to streaming. The techniques that we use to construct our half cover yield the following. First, we get a two thirds approximation for the natural communication problem associated with matchings in one round, which I will define in a few slides. Furthermore, we get a lower bound of two thirds for one pass streaming algorithms; that is, we show that no one pass streaming algorithm using Õ(n) space can get a better than two thirds approximation to the maximum matching. And finally, so far this was the communication problem and a lower bound, but we will also show that our techniques are useful in the general streaming case, as long as we make the additional assumption that we don't have edge arrivals but vertices arriving in the stream; we will talk about this a little later. Great. In the remaining time I will do the following. First I will show the construction of what we call the matching skeleton. This is the matching sparsifier that is our main tool for these results, and we show that it is a half cover; I will have to skip the proof, however. Then I will talk a bit more about applications to streaming and also the connection between epsilon covers and Ruzsa-Szemerédi graphs. So the matching skeleton will be a sparse subgraph of G that in a sense preserves some useful information about matchings, and now I will give the construction. First I will make one technical assumption, that in our graph G there is a perfect matching of the P side. The general construction is very similar, but this case is easier to describe. What this says in particular is that the vertex expansion of all sets on the P side is at least one. One other thing that I will need is the definition of an alpha-matching: this is a fractional matching that matches each vertex on the P side exactly alpha times and each vertex on the Q side at most once, so alpha will be bigger than one. The construction of the matching skeleton proceeds in two steps. First I take the graph and come up with a decomposition of the vertex set of the graph into what we call expanding pairs. These will be pairs (S_j, T_j), and the corresponding vertex-induced subgraphs have increasing vertex expansion: the j-th subgraph has expansion denoted by alpha_j, and this expansion is the ratio of the sizes of the two sides of the pair. Once I have this decomposition, I choose a fractional matching inside each subgraph, and the edges that the fractional matchings are supported on will be the edges of the skeleton.
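In symbols (this is my notation for what was just described, with S_j on the P side and T_j on the Q side of the j-th pair):

```latex
% Vertex expansion, the expansion of the j-th pair, and the fractional
% \alpha_j-matching x^{(j)} supported inside G[S_j \cup T_j]:
\[
  \mathrm{exp}(S) = \frac{|N(S)|}{|S|}, \qquad
  \alpha_j = \frac{|T_j|}{|S_j|}, \qquad
  x^{(j)} \ge 0, \quad
  \sum_{e \ni p} x^{(j)}_e = \alpha_j \ (p \in S_j), \quad
  \sum_{e \ni q} x^{(j)}_e \le 1 \ (q \in T_j),
\]
% and the matching skeleton is the union over j of the supports of the x^{(j)},
% each chosen to be a forest.
```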
Okay, so how does the decomposition work? The decomposition works as follows. We start with the graph and we repeatedly find and remove sets S from the P side of the graph that have the smallest vertex expansion. For example, here we find a set S_0 that has the minimal possible vertex expansion and we remove it from the graph. This might seem ill-defined, and in fact it is as I just stated it, because there could be a lot of such sets that have the smallest possible vertex expansion, but one can show that there is always a maximal such set to remove, and that is what we do. So this set gets removed from the graph and we recurse on the rest. Again, we find the smallest expanding set on the P side and we remove it. This goes on until the remaining part of the graph has essentially the best possible expansion for such a graph, that is, expansion equal to the ratio of the sizes of the bipartition. So this is the decomposition. It can be shown that the vertex expansion goes up as we do this process, and each piece in the decomposition has vertex expansion equal to the ratio of the sizes of the sets in its bipartition. So in particular, in each such subgraph there exists a fractional alpha_j-matching, where alpha_j is this expansion. This matching can always be chosen to be supported on a forest, just by canceling cycles, and the edges of this forest are exactly the edges of our matching skeleton. So this is the construction. Now I have to skip the proof of the main property, but the main property is the following. Suppose we have two bipartite graphs G1 and G2 and we are interested in the maximum matching of the union of these two graphs. If we instead replace the first graph with its sparsifier, with its matching skeleton, then what we get is a two thirds approximation. This is the main property, and one can in fact derive from it that the matching skeleton is a half cover. What this means is that if we have a graph with n vertices on each side, then the matching skeleton of the graph will preserve the sizes of matchings between any pair of subsets up to an additive n/2 term. Again, let me stress that it might seem that something simple like a maximum matching would have these properties, but in fact that is not true; a maximum matching does not give a better than two thirds cover. So now let me sketch the connections to streaming. So far I defined the matching skeleton and showed that it is a half cover. Now let me show the connections to streaming, and here I need to define the following communication problem. >>: [inaudible] cover? >> Michael Kapralov: Oh, sure. So if we have a graph G which is balanced, in the sense that there are n vertices on each side, then a subgraph H is an epsilon cover if the following is true. We look at any pair of subsets, A on one side and B on the other side, calculate the maximum matching between these sets in G and in H, and compare them. The maximum matching in H should be at most an epsilon n additive term smaller. So in this case we get this property with epsilon equal to one half; we preserve these matchings up to an n/2 additive term. So the communication problem is the following. We have two communicating parties, Alice and Bob. Alice has a graph G1 and Bob has a graph G2 on the same set of vertices but with a different set of edges. Alice sends a message to Bob, after which Bob is supposed to output a 1 - epsilon approximation to the maximum matching of the union of the two graphs, so maybe this matching.
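The natural baseline mentioned below, Alice sending a maximal matching of her graph and Bob matching on the union, can be sketched as follows; the function names are mine, and the talk's actual protocol replaces Alice's message by the matching skeleton of G1, which is not reproduced here.

```python
import networkx as nx

def greedy_maximal_matching(edges):
    """Trivial one-pass rule: keep an edge iff both endpoints are still free."""
    matched, kept = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matched.update((u, v))
            kept.append((u, v))
    return kept

def one_round_protocol(alice_edges, bob_edges):
    """Baseline for the Alice/Bob problem above: Alice sends a maximal matching
    of G1 (only O(n) edges) and Bob outputs a maximum matching of the message
    together with his own graph G2.  This baseline gives a 1/2 approximation;
    sending the matching skeleton of G1 instead gives 2/3."""
    message = greedy_maximal_matching(alice_edges)          # Alice -> Bob
    union = nx.Graph()
    union.add_edges_from(message)
    union.add_edges_from(bob_edges)
    return nx.max_weight_matching(union, maxcardinality=True)

# Toy usage: Alice's greedy matching picks (p2, q1), blocking the better choice.
alice = [("p2", "q1"), ("p1", "q1")]
bob = [("p2", "q2")]
print(one_round_protocol(alice, bob))
```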
The questions that we are interested in here are a lot like the ones that we asked for matching covers. First, what is the minimum size of a message that Alice can send Bob that will always let Bob output a 1 - epsilon approximate matching of the union? This is again asking for a general trade-off. A restricted version of the question is: suppose we restrict the communication between Alice and Bob to be Õ(n), so n polylog n, quasi-linear in the number of vertices of Alice's graph; what is the best approximation that they can achieve? Well, a natural approach to this problem is to just ask Alice to send a maximal matching of her graph to Bob. This is very little communication, it takes Õ(n) communication, and it gives a one half approximation. So this is a nice problem, but why do we care about it? The motivation comes from the problem of approximating maximum matchings in one pass in the streaming setting. In fact, a lower bound for the communication problem immediately translates into a lower bound for streaming algorithms. An upper bound does not really translate directly into anything, but nevertheless the techniques that work for the communication problem will also let us get a result for streaming with vertex arrivals. There is some prior work on this problem. There has been significant progress on approximating matchings in the streaming model in k passes, but for k greater than one. For a single pass, the best known approximation is still one half, achieved by the trivial algorithm that just keeps a maximal matching. This was very recently improved to 1/2 + epsilon for a small positive constant epsilon, but this is under the additional assumption that the edges arrive in a random order in the stream, and we are interested in the case when the arrival order is adversarial. >>: Would you say [inaudible]? >> Michael Kapralov: This is Konrad, Magniez and Mathieu, a very recent result, maybe a month or two old. On the lower bound side, the only lower bound known is Omega(n^2) for one pass, but this is for computing exact matchings. So let me state our results. It follows immediately, using the results that we proved for the matching skeleton, that the communication complexity of obtaining a two thirds approximation to the maximum matching is Õ(n). In particular, instead of sending a maximal matching of her graph, Alice can compute the matching skeleton, which is sparse, and send it to Bob. So this is for the communication problem, but for the general streaming case we show that if we use the sparsification procedure given by the matching skeleton repeatedly in the streaming model, then as long as we have the assumption that not edges but vertices arrive in the stream, we get a 1 - 1/e approximation to the maximum matching. This takes linear space and only a single pass over the data. It should be noted here that 1 - 1/e can also be obtained in this setting using the Karp-Vazirani-Vazirani algorithm for the online version of the problem, but that algorithm is randomized, while our algorithm is deterministic. So far I showed that one half covers exist and that the communication complexity is quasi-linear when we want a two thirds approximation. A natural question is: what about better covers and better approximations? Here we show connections to a family of graphs known as epsilon Ruzsa-Szemerédi graphs. Unfortunately, I do not have enough time to define them properly.
But essentially, these graphs are defined by the property that their edge set can be partitioned into a union of induced matchings, and each such induced matching has size at least epsilon n. These graphs come up in applications in property testing, PCP constructions and additive combinatorics, and it is a major open problem to determine the optimal size of these graphs as a function of epsilon and n. The gaps between the best-known bounds are immense: for example, the best-known upper bound for these graphs is n^2 / log* n, and the best-known constructions for constant epsilon achieve a number of edges which is n^(1 + Omega(1/log log n)). We show that the general question of bounding the optimal size of epsilon covers is essentially equivalent to bounding the optimal size of epsilon Ruzsa-Szemerédi graphs. Furthermore, let me say how we obtain the lower bounds for streaming algorithms and for the communication complexity problem. This is done via an extension of a beautiful result by Fischer and others, where they construct epsilon Ruzsa-Szemerédi graphs for constant epsilon that achieve a number of edges n^(1 + Omega(1/log log n)); their construction works for constant epsilon, and here we extend it to work for epsilon arbitrarily close to one half. This immediately gives us lower bounds saying that our bounds on half covers and linear communication complexity are best possible. That is, if we insist on quasi-linear communication, two thirds is the best we can do, and if we insist on a quasi-linear number of edges, one half is the best that we can do for covers. This also implies a streaming bound, and this one half here is actually the largest epsilon that we can possibly hope for, because our construction of a one half cover precludes the existence of these graphs with a large number of edges for larger epsilon. So this concludes the discussion of our notion of sparsification for matching problems and applications to streaming. Now in the remaining minute let me say two words about some other work that we have been doing with Rina Panigrahy, and so this is… >>: [inaudible] [laughter]. >> Michael Kapralov: This shows some connections between spectral sparsification and spanners. In fact we show that one can obtain efficient algorithms for spectral sparsification using spanners of random subgraphs. I have also done some other work on online matching and prediction strategy problems and on differentially private low rank approximation, and I thank you for your attention. [applause]. >> Yuval Peres: Are there any questions? >>: [inaudible] matching lower bound for that [inaudible], are there cases or further assumptions where it could be faster, or [inaudible]? >> Michael Kapralov: This algorithm definitely takes time n log n on the complete graph, so yeah, I am not really sure. I don't know of any assumptions that would make it actually order n. But that is a good question. And the lower bound that we proved in fact precludes order n with high probability only for dense graphs, [inaudible] squared [inaudible]. So if the graphs are sparser, it is not clear, and in fact it cannot be true for very sparse graphs, because there is a linear time algorithm if the degree is constant. >>: [inaudible]. >> Michael Kapralov: There is nothing that I am aware of. That's interesting. >> Yuval Peres: Any more questions?
Thank you. [applause].