>> Konstantin Makarychev: It’s a great pleasure to have Nikhil Bansal
here at Microsoft Research. Nikhil is a professor at the Technical
University of Eindhoven. Before that he was a research manager at IBM
Research. Nikhil works on various types of algorithms, online
algorithms and approximation algorithms, and today he will talk to us
about the number of matroids.
>> Nikhil Bansal: Okay, thank you Konstantin and thanks for inviting
me. So I will talk about this work on the number of matroids. It is
joint work with Rudi Pendavingh, who is a colleague at TU Eindhoven,
and a master’s student, Jorn van der Pol.
Okay, so probably most of you know what matroids are and have used them
extensively. These are central objects in optimization, and one nice
thing about them is that they combine graph theory and linear algebra
together. But just to remind you: a matroid is basically a set system.
You have a universe of elements, let’s say one through n, and there is
a collection of subsets of U, which we call independent sets, which
satisfies two very basic axioms. The first axiom is that this
collection of subsets is downward closed: if I is independent, every
subset of I is also independent.
And the more interesting property is what’s called the extension
property. If you have two independent sets, let’s say I and I prime,
and I is strictly bigger than I prime, then there is a way to extend I
prime to a bigger independent set by including some element which is in
I but not in I prime. In other words, there exists some x in I minus I
prime such that I prime plus x is also independent.
And actually most of the terminology for matroids comes from linear
algebra. A typical example to keep in mind is that your universe is a
collection of vectors, and independent really means linearly
independent over some field. Then you see that if some set is
independent, every subset of it is independent. And the extension
property also holds [indiscernible]: you can always extend a smaller
set by adding some other linearly independent vector.
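Since everything later rests on just these two axioms, here is a minimal Python sketch, not from the talk, that checks them on an explicit set system; the uniform matroid U(2,3) used as the test case is my own illustrative choice:

```python
from itertools import combinations

def is_matroid(indep):
    """Check the two independence axioms on an explicit set system.
    `indep` is a collection of frozensets (the claimed independent sets)."""
    indep = set(indep)
    if frozenset() not in indep:
        return False
    for I in indep:
        # Axiom 1: downward closed -- every subset of an independent set is independent.
        if any(I - {x} not in indep for x in I):
            return False
        for J in indep:
            # Axiom 2 (extension): if |I| > |J|, some x in I - J extends J.
            if len(I) > len(J) and not any(J | {x} in indep for x in I - J):
                return False
    return True

# Uniform matroid U(2,3): all subsets of {1,2,3} of size at most 2 are independent.
u23 = [frozenset(s) for k in range(3) for s in combinations({1, 2, 3}, k)]
print(is_matroid(u23))  # True
```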
So one big problem in matroid theory is to understand what a matroid
looks like, because notice that unlike graph theory, where you have
some kind of structure, these matroids are defined by properties. So
it’s not really clear what the structure is. And this is of course a
whole industry in itself. A typical question is: How does a random
matroid, or a typical matroid, look? There have also been attempts to
define a certain notion of random matroid theory, but it’s much, much
less developed than random graph theory, for example, and we can’t even
answer very basic questions.
In fact one of the most basic questions that we can ask is: How many
matroids are there on n elements? If you want to define any probability
of something you had better be able to at least answer this. And
notice one trivial bound is of course two to the two to the n, because
on n elements there are up to two to the n subsets, and a matroid is
just some collection of subsets. So this is the trivial upper bound.
Now already in 1974, I guess almost forty years ago, Knuth showed
the following lower bound: the number of matroids is at least two
to this number, one over n times the central binomial coefficient.
Since the central binomial coefficient is roughly two to the n over
root n, this is roughly two to the two to the n over n to the three
halves. And the way he actually did it, and we will see this proof in
a few slides, is he constructed an explicit class of matroids known as
sparse paving matroids and he showed that their number is this much.
So if you summarize these two things, let’s see what’s already known.
Since we will be dealing with such big numbers, you look at these
things on the log-log scale, so we look at log log of the number of
matroids. Knuth’s result gives a lower bound of n minus three halves
log n minus some constant, and the trivial bound, after taking the log
twice, becomes n. So the number lies somewhere in between.
Now you might ask: Why do we bother about this tiny three halves log n?
Usually in approximation you only argue about constants, so this is
negligible. So it depends; you can give several answers. One thing is
that this log-log scale is a bit deceptive, right, because even X
versus X squared, once you take the log-log scale, just translates to
an additive one error between X and X squared. And the sort of more
important reason is that it has been widely conjectured in the matroid
literature that most matroids are sparse paving.
So people believe that Knuth’s bound is actually the right answer for
the number of matroids. There are various quantitative versions of
what that means, and I won’t bother you with the precise conjectures,
but they all say that Knuth’s bound is sort of close to the right
answer. Good, so just keep in mind Knuth’s bound with this three halves
log n. Now, we saw a naive upper bound of two to the two to the n.
Notice it can be trivially improved: we can shave off a half log n as
follows, by just looking at matroids of a given rank.
So again, what is the rank of a matroid? Recall, and we all know this,
that the maximal independent sets in a matroid are called bases, and
all of them have the same size because of the exchange property; if
not, you could extend one of them. So the size of each basis is
unique, and it’s called the rank of the matroid. And one way to
specify a matroid of rank R is to just tell you which sets of size R
are the bases. Because if I tell you that, you know everything else:
every subset of a basis is an independent set and every other R-set is
a non-base.
Okay. So this tells you that the number of matroids on n elements of
rank R is at most two to the n choose R, right. So good, if you take
log log of this: the first log removes this outer two, and then you
take the log of n choose R, which is maximized when R is n over two,
where it is like two to the n over root n. So if you take the log it’s
like n minus half log n. Okay. So this is for matroids of a particular
rank. And now notice that the total number of matroids is just at most
n plus one times the number of matroids of a given rank. And this n
plus one is a negligible factor; once you take two logs it disappears.
Okay, good. So this shows that log log of m_n is at most n minus half
log n. So to bound m_n we can essentially focus on the number of
matroids of a given rank; we don’t have to worry, it’s just going to
cost another factor of n plus one. So this was a trivial way to get
this half log n improvement. And again, almost forty years ago, Piff
gave a somewhat stronger upper bound where he could shave off another
half log n: he could show n minus log n plus order one. So again,
Knuth had three halves log n and Piff had log n. And this was sort of
the best state of affairs until now.
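As a numeric sanity check of the bounds just discussed (my own back-of-the-envelope computation, not part of the talk), one can evaluate the double logarithms exactly for a small even n:

```python
from math import comb, log2

n = 20
# log log of Knuth's lower bound 2^((1/n) * C(n, n/2)):
knuth = log2(comb(n, n // 2) / n)
# log log of the rank-based upper bound 2^(C(n, n/2)) (the n+1 factor is negligible):
rank_based = log2(comb(n, n // 2))
print(f"{knuth:.2f} <= loglog m_n <= {rank_based:.2f}  (trivial bound: {n})")
print(f"n - 1.5*log n = {n - 1.5 * log2(n):.2f},  n - 0.5*log n = {n - 0.5 * log2(n):.2f}")
```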
So what do we show? We tighten this upper bound and show that
basically we can recover Knuth’s lower bound up to an additive term,
and in fact more precisely our additive term is just one plus order
one. So what we show is that log log m_n is what Knuth had, plus one
plus little o of one, as opposed to Piff’s log n gap. If you remove
the log-logs, more precisely it looks like this: Knuth had this lower
bound and we have a factor two extra here, which translates to the
plus one when you take two logs.
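So, in summary, the bounds as stated in the talk look like this, with all smaller terms absorbed into the O(1) and o(1) (a paraphrase; the talk did not display them in exactly this form):

```latex
\begin{align*}
\log\log m_n &\ge n - \tfrac{3}{2}\log n - O(1)     && \text{(Knuth, 1974)}\\
\log\log m_n &\le n - \log n + O(1)                 && \text{(Piff)}\\
\log\log m_n &\le n - \tfrac{3}{2}\log n + 1 + o(1) && \text{(this work)}
\end{align*}
```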
Okay. And the interesting thing is that to prove this we actually don’t
need to know much about matroids. In fact I am not really a matroid
guy, so it’s all very, very basic facts. For almost 90 percent of the
talk we will not even talk about matroids; we will just talk about
independent sets in a graph. I will actually call these stable sets,
using the [indiscernible] terminology, because independent in a matroid
means something else, but I might confuse myself anyway because we are
usually used to calling these independent sets.
So this will be our main tool.
>>: [inaudible].
>> Nikhil Bansal: Okay. So it’s believed that the number of matroids
is close to the number of what I call sparse paving matroids, which is
the class Knuth constructed, but presumably the number of sparse paving
matroids could itself be more than this bound, so one doesn’t know,
right.
>>: [inaudible].
>> Nikhil Bansal: So I don’t know much about matroids, but.
>>: [inaudible].
>> Nikhil Bansal: Yeah, yeah, definitely. Like sparse paving is a
special class; we will see in a moment what I mean. It’s a very, very
special class, but somehow [indiscernible]. So like [indiscernible]
for matroids, for example, is not sparse paving, but the point is these
are very tiny fractions of this huge space of matroids.
>>: [inaudible].
>> Nikhil Bansal: Yeah, exactly. In fact the most explicit examples we
know are not sparse paving matroids, but we will see what these are in
a moment.
Okay. So the outline of the talk will be the following. First I will
tell you Knuth’s lower bound construction and what these sparse paving
matroids are. Then for a while I will talk about some technology to
count the number of stable sets in a graph, and then we will see how
this connects back to matroids in the end.
Okay. So in the next couple of slides I will tell you how Knuth came
up with this lower bound on the number of matroids. The first
observation is that if you want to specify a matroid of rank R, you can
also just specify the sets of size R, which I will call R-sets, that
are non-bases. If you tell me which sets of size R are non-bases, the
other sets of size R are the bases and all of their subsets are the
independent sets. So this gives you everything you need to know about
the matroid; it’s a complete description. And it’s convenient to
define the Johnson graph, which probably many of you know.
So this graph has two parameters, n and R, and it has n choose R
vertices, corresponding to the subsets of size R of these n elements.
And you put an edge between two vertices if they have R minus one
elements in common. So I think I have a picture. Yeah, so in other
words these are two sets of size R, and if they differ in just one
element there is an edge between them. A more convenient way to think
of them is that you have a zero-one vector of n dimensions with exactly
R ones, and there is an edge between two vectors U and V if you turn a
one into a zero and a zero into a one, right, so you just swap.
Now notice one thing: it’s a very structured graph. For one, it’s a
regular graph with degree R times n minus R. And why is that? How do
you get to a neighbor? You make one of these ones a zero, so there are
R choices, and where there was a zero you put a one, right, that’s a
swap, so there are R times n minus R options.
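For concreteness, here is a small Python sketch of the Johnson graph as just described (my own construction for illustration; the talk only defines the graph):

```python
from itertools import combinations

def johnson_graph(n, r):
    """Johnson graph J(n, r): vertices are the r-subsets of {1, ..., n};
    two vertices are adjacent iff they share r - 1 elements (one swap)."""
    vertices = [frozenset(s) for s in combinations(range(1, n + 1), r)]
    adj = {v: set() for v in vertices}
    for u, v in combinations(vertices, 2):
        if len(u & v) == r - 1:
            adj[u].add(v)
            adj[v].add(u)
    return adj

adj = johnson_graph(6, 3)
print({len(nbrs) for nbrs in adj.values()})  # {9}: regular of degree r*(n-r) = 3*3
```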
Okay, now here is actually a very simple theorem; it’s very easy to
prove. It says the following: pick any stable set in this Johnson
graph, in other words a collection of R-subsets no two of which differ
by one swap, so there is no edge between them. If you call those your
non-bases, then that gives you a matroid.
>>: Well, basically you are saying this is [inaudible].
>> Nikhil Bansal: Yeah, yeah, so if I just pick some stable set in the
Johnson graph, those are my non-bases, and every other R-set in this
Johnson graph is a basis. And that always gives me a matroid. And
actually for those of you who know what the basis exchange property is,
it’s very easy to see: whenever you have two bases you won’t get
blocked, you will always find a path. But again, if you don’t know
what basis exchange is, then don’t worry about it; it’s like a one line
proof, and we can take this result on faith. And these are precisely
what are called the sparse paving matroids, so this class. There are
various other characterizations of them, but this is one useful way to
think about it for our purposes. And these are extensively studied in
matroid theory as well.
>>: Okay, is this like stable [inaudible] property or do you want
something about [inaudible]?
>> Nikhil Bansal: Okay, so a stable set is sufficient. So what could
go wrong if it’s not a matroid? You take two bases and you can’t find
a path. And what is a path? You try to do one swap at a time. So
suppose I take two bases which differ in two elements; if it’s just one
swap there is a trivial path. So say they differ in two elements, but
then if I can’t find a path this way and I can’t find a path that way,
then both of these intermediate guys are non-bases, because they are
both blocking me. But then there is an edge between them, right. So
it’s just that; it’s sufficient. Of course there are other matroids
which [indiscernible], but that’s the whole proof.
So again, a stable set always gives a matroid, but of course it’s not
[indiscernible].
>>: So this is totally for sparse paving.
>> Nikhil Bansal: Yeah, so this is the definition of sparse paving, or
one definition of sparse paving. There are also various other
characterizations in terms of dual matroids and so on. But again, as I
said, we will just mostly talk about stable sets.
Okay. So yeah, as I said, sparse paving matroids are precisely the
stable sets of this Johnson graph. So let’s look at these for a while
and derive Knuth’s lower bound. First some notation: given a graph G,
let alpha(G) denote the size of the maximum stable set, and let i(G) be
the number of stable sets in the graph. Clearly i(G) is at least two
to the alpha(G), because if you have a stable set of size alpha(G),
every subset of it is a stable set. So what does this give us? One
way to lower bound the number of matroids is to lower bound the number
of sparse paving matroids, so that’s what we are going to do.
So let’s see what this naive bound gives. If you can lower bound
alpha(G), then you have a good lower bound on the number of stable sets
using this. So one naive lower bound is just the following: we saw
that this Johnson graph J(n, R) is regular with degree R times n minus
R, because neighbors are one swap away, and when R is like n over two
the degree is roughly n squared over four. And we know that any
d-regular graph has a stable set of size at least N over d plus one;
the greedy algorithm gives you this. So this Johnson graph with R
equal to n over two has about two to the n over root n vertices and the
degree is roughly n squared over four, so it gives you something like
two to the n over n to the 2.5.
So that’s alpha(G), and i(G) is at least two to that, two to the two to
the n over n to the 2.5. So that’s the trivial lower bound. And what
Knuth did is show that alpha(G) is actually at least a one over n
fraction of the vertices, instead of this four over n squared. And the
way he did it was to just give a very explicit n-coloring of the
Johnson graph. This coloring is very cute and very simple to describe.
Again, a vertex of the Johnson graph is just a zero-one vector v with R
ones, right. So associate the following number with the vertex v: you
take the i-th coordinate, multiply it by i, and sum over all the
coordinates, so the sum of i times v_i.
Now let’s focus on this number. I claim that for two neighboring
vertices this number differs by something strictly between minus n and
n. Because what is a swap? You drop some element i and you add some
element j, and these indices are always between one and n. So if I
look at this number modulo n, any two neighbors get different colors;
for any edge the two endpoints are colored with different values. So
it’s a very simple explicit coloring, and the largest color class is a
stable set containing at least a one over n fraction of the vertices.
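Here is a quick Python check of this coloring argument on a small instance (again my own illustration; the parameters n = 8, r = 4 are arbitrary):

```python
from itertools import combinations
from collections import Counter

n, r = 8, 4

def color(s):
    # Knuth's coloring: the sum of the chosen indices, modulo n.
    return sum(s) % n

vertices = [frozenset(c) for c in combinations(range(1, n + 1), r)]
# Proper coloring: a one-swap neighbor changes the sum by j - i, which is
# nonzero modulo n since 1 <= i, j <= n, so neighbors get different colors.
assert all(color(u) != color(v)
           for u, v in combinations(vertices, 2) if len(u & v) == r - 1)
# Some color class is a stable set of size at least C(n, r)/n.
print(max(Counter(map(color, vertices)).values()), len(vertices) / n)
```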
Now one thing, and I will also come back to this at the end: presumably
the maximum stable set could be bigger, but nobody knows of any other
way to lower bound it, so that’s sort of one trouble. If one could
push this somehow, maybe to two over n for example, then it would
actually give a tight result, right. I haven’t told you the upper
bound proof yet, but that is because we also prove an upper bound of
like two over n times the number of vertices [indiscernible].
>>: [inaudible].
>> Nikhil Bansal: No, no, that’s not clear actually. For small values
people have done experiments and the constant seems to be like 1.3 or
something. So there is no --.
>>: [inaudible].
>> Nikhil Bansal: Okay, okay, that might be, but I am not sure. There
is at least one of size N over two; I don’t know about N, but yeah,
okay.
Okay. So any questions so far? Because that proves the lower bound,
right: if alpha is at least this, then the number of stable sets is at
least two to that, and each stable set is a sparse paving matroid.
Okay. So that was the whole lower bound.
So now we come to our upper bound. Recall that our goal was to show
this upper bound: the log log of the number of matroids is at most the
lower bound plus one plus order of one. When we first started thinking
about this problem, the first idea was that if you want to upper bound
the number of all matroids, you had better be able to at least upper
bound the number of sparse paving matroids by something close to Knuth.
If you can’t even do sparse paving, there is no hope you can upper
bound everything.
So that’s the step we focused on first, and this was actually nice
because it is a very clean object: it’s just the number of stable sets
in the Johnson graph, so it’s a purely combinatorial problem. And
luckily the ideas developed here turned out to extend directly to
matroids, which was kind of fortunate. So for most of the talk I will
just talk about this. I will denote by s_n the number of sparse paving
matroids and show how to bound it, and then there will be the extension
to general matroids. Okay, good. So we will try to upper bound the
number of sparse paving matroids.
The first claim we will use is the following: the maximum stable set in
the Johnson graph is at most two over n times the number of vertices.
I will denote the number of vertices, this n choose n over two, by
capital N instead of writing it out every time. And notice Knuth
already showed a lower bound of one over n times N, and now we are
saying that the upper bound on the maximum stable set is at most twice
that. This follows from what’s known as Hoffman’s bound, as probably
many of you know. It says the following: if you look at the adjacency
matrix of a d-regular graph, and minus lambda is the smallest
eigenvalue (the smallest eigenvalue is always negative, because the
eigenvalues sum to zero), then lambda N over d plus lambda is an upper
bound on the size of any stable set. We will actually see a proof of
this in a couple of lines, and probably many of you have seen it
before.
Now for the Johnson graph the spectrum is very well understood; these
are widely studied graphs in algebraic graph theory. We saw that the
degree is like n squared over four, and it’s known that the smallest
eigenvalue is like minus n over two, so lambda is n over two in our
terminology. And if you just plug these numbers in, you get this upper
bound. So given this upper bound on alpha(G), let’s see what it gives
us, at least naively, for the number of stable sets. If the maximum
stable set in your graph is alpha(G), then the total number of stable
sets is at most N choose zero plus N choose one and so on, up to N
choose alpha(G).
So in our case alpha(G) is bounded by two over n times N, so the last
term dominates and the sum roughly looks like N choose two N over n.
And if you just use Stirling’s approximation of the binomial
coefficient, this gives you something like a constant times n, raised
to the power two N over n; the base is essentially the ratio of N to
the set size. Okay, so this naive way of using Hoffman’s bound
together with naive enumeration gives you this. And what lower bound
did we have? It was two to the alpha(G). So notice there are two
things that differ between this upper bound and lower bound. One is
the base of the exponent: here it is two and there it is like n. And
then there is this factor two sitting here also, which we won’t be able
to remove, because that’s the best upper bound we know on the stable
set. But this base n is sort of problematic, because if you take the
log it becomes a log n factor in the exponent, which is not quite what
we want.
So the point is that this naive way of counting the number of stable
sets is too lossy. What we will show, let me go back, is that instead
of n to the two N over n, the number of stable sets in the Johnson
graph is at most two to the twice N over n. Okay, so we can shave the
base down from n to two. And morally you should think of this bound as
saying that most stable sets in your Johnson graph are subsets of one
large stable set; it’s not like they are smeared out everywhere. And
actually this phenomenon happens more often all around us. The most
natural example is the hypercube.
So say you have an n-dimensional hypercube with two to the n vertices;
let’s call that capital N. It’s a bipartite graph, so we know that
alpha(G) is N over two. Again, the naive way of counting all possible
subsets of size up to N over two gives you an upper bound on the number
of stable sets of about half of two to the N, which is pretty much
useless.
Now, what’s the right answer? For the hypercube people have actually
nailed it down exactly, up to the constant; it’s quite an amazing
result. The right answer for the hypercube, instead of two to the N,
is like a constant times two to the N over two, and in fact this is a
very small constant, like three or something. So what this is saying
is that in the hypercube there are these two stable sets of size N over
two each, the two sides of the bipartition, and essentially every
stable set comes from one of these guys. So maybe this is the right
picture: the hypercube is this graph, and essentially everything either
comes from here or from here.
And intuitively this makes sense for the hypercube, because the moment
you try to pick a few vertices on one side, it blocks a lot of vertices
on the other side from being in your stable set. So if you want to
generate lots of stable sets, you had better just stick to one piece.
And this is the idea we will try to use. This example already tells
you that the reason this is happening is the expansion in the graph: if
you try to use something from here it blocks a lot of stuff there, so
somehow your bound should take the expansion into account. Otherwise,
if you do something naive, it won’t be strong enough.
And like I said, this kind of phenomenon happens more often. In fact
there is a very interesting result of Jeff Kahn. It says that for any
d-regular bipartite graph, not just the hypercube, the number of stable
sets is at most two to the N over two plus N over two d. So again,
this two to the N over two is the right main term, which morally says
that most stable sets come from one piece. And what’s nice about this
result is that it’s exactly tight: if you take N over two d disjoint
copies of K_{d,d}, each copy has two d vertices, so there are N
vertices in total, and for each copy there are about two to the d plus
one choices for the stable set, right, either any subset of this side
or any subset of that side. And if I raise this number to the N over
two d, I get exactly that bound. And there was a nice result by Zhao
recently which showed that this even holds for general d-regular
graphs, not just bipartite ones.
>>: And D indicates what pertaining to the graph?
>> Nikhil Bansal: D is the degree.
>>: The average degree?
>> Nikhil Bansal: Okay, so Jeff Kahn proved it for D regular, but maybe
these ideas are --.
>>: [inaudible].
>> Nikhil Bansal: Okay, so they also do general d-regular, but not
[indiscernible] bipartite regular.
>>: [inaudible].
>> Nikhil Bansal: Yeah, but I think there might be a simple argument
which shows that the extreme cases are regular, or you can do some
swapping argument and make it regular.
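To sanity-check the tight example mentioned a moment ago, here is a small brute-force count of stable sets in K_{d,d} (my own verification, not from the talk):

```python
from itertools import combinations

def count_stable_sets_kdd(d):
    """Brute-force count of stable sets in the complete bipartite graph K_{d,d}."""
    left = set(range(d))
    right = set(range(d, 2 * d))
    count = 0
    for k in range(2 * d + 1):
        for s in combinations(range(2 * d), k):
            if set(s) <= left or set(s) <= right:  # stable iff it stays on one side
                count += 1
    return count

for d in (2, 3, 4):
    print(d, count_stable_sets_kdd(d), 2 ** (d + 1) - 1)  # the counts match
# So N/(2d) disjoint copies give (2^(d+1) - 1)^(N/(2d)) stable sets in total,
# which is roughly 2^(N/2 + N/(2d)), matching Kahn's bound.
```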
Okay, so what do we show? Our result is another bound on the number of
stable sets. If you have any d-regular graph with smallest eigenvalue
minus lambda, then we can bound the number of stable sets by the
following. Let’s ignore the second term for now; this is the main
term. Recall this was Hoffman’s bound, right, the size of the largest
stable set. So basically we say the count is two to the Hoffman bound;
essentially everything is coming from just this guy. And the other
term is supposed to be negligible; it’s like two to the N over d with
some log squared factor sticking in.
Okay, yeah, and notice this captures the bipartite case: for any
d-regular bipartite graph we know that the smallest eigenvalue is minus
d, corresponding to the eigenvector which is plus one on one side and
minus one on the other. So lambda is d, and if you plug that into our
bound, this lambda over d plus lambda becomes a half, and it gives you
two to the N over two plus this small term, right.
So compared to Kahn’s argument, which had plus N over two d, we lose
this extra log squared additive term, but we sort of get the dominating
term right. And actually this also holds for any general graph, so
this also proves the bound for general graphs, because the smallest
eigenvalue is always minus d or higher.
>>: [inaudible].
>> Nikhil Bansal: Sorry?
>>: [inaudible].
>> Nikhil Bansal: Yeah, yeah, so my point is that we give a much more
general argument in terms of lambda. So why is this useful? Because
if your smallest eigenvalue is away from minus d, then this gives you a
much tighter bound; for our Johnson graph we don’t want something like
N over two, we want something like this capital N over small n.
So we can just use our bound off the shelf and it gives the right
order. So my point is that this theorem essentially recovers something
close to Kahn’s result, which was a long line of work, but it is also
much more general.
So I will try to give you a flavor of this result in the next couple of
slides. The whole idea of the proof is, given a stable set, to give
some kind of encoding scheme. It’s like a computer science way of
counting: we will show that you can always encode a stable set using a
small number of bits, and then the number of stable sets cannot be more
than two to the number of bits we use. This approach is the result of
a long line of work starting from [indiscernible] in the 60s, but our
approach is closest in spirit to a recent paper of Alon, although the
bounds they get are weaker and were not useful for our purposes. And
as I mentioned earlier, this encoding idea that we develop for stable
sets will also be useful for matroids. So the rest of the talk will
describe these two things.
And before I give the proof, here is a useful lemma, which I think is
also useful by itself. It says the following: suppose you have a
d-regular graph with smallest eigenvalue minus lambda. Then if you
look at any subset of vertices, call it A, and let G[A] be the graph
induced on A, the number of edges within A is at least this number,
where again the first part is the dominant term. So what this means
is, suppose you look at a random subset of vertices capital A.
>>: [inaudible].
>> Nikhil Bansal: e of G[A], yeah. So G[A] is the graph induced on A
and e is just the number of edges.
So it says that no matter which subset of vertices you pick, the number
of edges is at least something; it’s giving you a lower bound. And the
right way to look at this lower bound is the following. Suppose you
picked some random set A in your graph, and look at a typical vertex of
A. It has degree d in the original graph, and since A is a random set,
roughly an A over N fraction of its neighbors will be in A. And you
are picking A such vertices, so the total sum of degrees within A will
be d times A squared over N in expectation, and that’s twice the number
of edges.
So basically what this lemma is saying is that twice the number of
edges, the sum of degrees, is essentially what you would expect for a
random set, minus some error term which depends on what your smallest
eigenvalue looks like. So if lambda is much smaller compared to d,
then every set behaves like a random set. So let me rewrite this
again.
Okay, and how do we prove this? Again it’s a standard result, and you
do the usual thing. You want to count the number of edges in A, and if
M denotes the adjacency matrix of the graph, then you just sum M over
all pairs in A, which counts exactly what you want. You can write this
as chi transpose M chi, where chi is the indicator vector of A. And
then you do the usual thing: you expand chi in the eigenbasis of M. We
know that one of the eigenvectors is the all-ones vector, so I write
this chi_A as A over N times the all-ones vector plus another vector
which is orthogonal to this one vector; this is removing the bias.
Then I expand out the usual way and the bound pops out. So again,
probably many of you have used these arguments; I don’t need to
elaborate.
One corollary of this is that Hoffman’s bound just pops out directly:
if your set A is stable then the number of edges within it is zero, so
you just ask what is the largest A I can push so that this lower bound
stays at most zero. And if you do the algebra it comes out to be
exactly lambda N over d plus lambda, okay.
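Written out, the calculation sketched above goes like this, with M the adjacency matrix (a reconstruction of the standard argument, in my notation):

```latex
\begin{align*}
2\,e(G[A]) \;=\; \chi_A^{\top} M \chi_A
  &= \frac{|A|^2}{N^2}\,\mathbf{1}^{\top} M \mathbf{1} \;+\; \psi^{\top} M \psi \\
  &\ge \frac{d\,|A|^2}{N} \;-\; \lambda \|\psi\|^2
   \;=\; \frac{d\,|A|^2}{N} \;-\; \lambda\,|A|\Bigl(1 - \frac{|A|}{N}\Bigr),
\end{align*}
```

where $\chi_A = \frac{|A|}{N}\mathbf{1} + \psi$ with $\psi \perp \mathbf{1}$, so that $\|\psi\|^2 = |A|(1 - |A|/N)$, and the cross terms vanish since $M\mathbf{1} = d\mathbf{1}$. For a stable set $e(G[A]) = 0$, so $d\,|A|/N \le \lambda(1 - |A|/N)$, which rearranges to Hoffman's bound $|A| \le \lambda N/(d + \lambda)$.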
Okay, so the useful corollary of this result for our purposes will be
the following. Take any set A of size epsilon N, where epsilon is some
small constant, plus some little term that depends on the spectrum;
think of that as a negligible term, because for us lambda is much
smaller than d. Then essentially, what the lemma says is that this set
will have a lot of edges; it behaves like a random set of size epsilon
N. In particular, G[A] has a vertex of degree at least epsilon d, and
this is what will be useful to us.
Okay, so here is how we do the encoding, and in this slide I describe
the idea. There is some stable set I in our graph and I want to encode
it somehow. One way to encode it is to just write it down exactly,
this vertex and this vertex and so on. That gives exactly the naive
bound we had, because the size can be up to alpha(G) and you write log
N bits for each vertex.
But here is a potentially more useful way to do this, and maybe this
picture helps; it’s not the most exact picture, but it’s a useful
analogy. Suppose there is this stable set that you would like to
encode using a small number of bits. Now suppose you could find a
small set S; this S is supposed to be very tiny, and formally you
should think of it as having this size. This set S is contained in my
stable set, but it has a very, very large neighborhood. If I could
find such a set, then you know that the stable set must lie in the
remaining region, because this set S rules out all of this red part:
none of it can lie in your stable set.
So for the remaining guys in A, I just need one bit each, zero or one,
for whether it is in my stable set or not. In other words, the stable
set is completely specified by this sort of seed set, plus which of the
guys in the remainder A lie in it. So the number of possibilities for
I, if you have such an encoding, is N choose the size of S, for all the
possible ways of choosing the seed set; but once I choose my seed it
fixes my A, and then I only need to specify which subset of A is there,
so two to the size of A combinations. This will be the idea: to count
via an encoding scheme with these properties. And if you just plug
these bounds for S and A in, it gives exactly our result.
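Spelling out the counting (a paraphrase of the slide in my notation, using the sizes quoted later in the talk: |S| is about (N/d) log d and |A| is about the Hoffman bound):

```latex
i(G) \;\le\; \binom{N}{\le |S|}\, 2^{|A|}, \qquad
\log_2 i(G) \;\le\; |S|\,\log_2\!\frac{eN}{|S|} \;+\; |A|
  \;\approx\; O\!\Bigl(\frac{N}{d}\,\log^2 d\Bigr) \;+\; \frac{\lambda N}{d+\lambda},
```

which is exactly the shape of the theorem: the Hoffman term in the exponent plus a lower-order additive term.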
So why should such an encoding scheme exist? Let me try to give a
sense; it’s almost an exact proof. Here is a way to encode the stable
set. So initially there is some stable set, given by these red
vertices, that I want to encode, and I don’t want to just list all
these vertices; that’s too many bits. So I will do the following: I
start with all my vertices and arrange them in decreasing order of
degree, so this is the maximum degree vertex, then the next one, and so
on, and ties I break in some fixed order.
And let’s consider the following procedure. I look at the first
vertex, and maybe I have an animation for this: if it’s not in my
stable set, like this blue one, I just discard it. So the first guy is
not in my stable set, I discard it; the second one I discard. Now this
guy is in my stable set, so I choose it. And if I choose this guy, it
rules out a lot of other guys: all its neighbors in the remainder are
ruled out. So I keep doing this. Now these guys are ruled out so I
have to remove them, and in this remainder graph maybe the ordering of
the vertices changes, because it’s by degree, and I keep continuing
this procedure until the number of remaining vertices is down to this
Hoffman bound on the stable set. Because beyond that I can’t guarantee
any edges anymore; maybe it’s a stable set. And note the procedure
never kills a vertex of my stable set when I kill something, because I
only kill neighbors of chosen vertices.
So the crucial observation is that at the end of the day, when I stop
with so many vertices left, I have picked some set of vertices S. And
one thing is that this S that I have picked, these red vertices,
completely determines which set is left over. Why? Because this was a
completely deterministic procedure, right? For example, in the very
beginning you knew the ordering of the vertices, and at the end of the
day you are left with some set S; you see that the earliest vertex of S
in this ordering is this one, so I must have rejected the two before
it. And you keep applying this argument. So if I tell you this set S,
it tells you exactly the remainder set at the end of the day.
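Here is a runnable Python sketch of this encode/decode procedure as I understand it from the slide (the graph representation, tie-breaking rule, threshold parameter, and the 5-cycle example are my own choices; the threshold stands in for the Hoffman bound):

```python
def _next_vertex(adj, remaining):
    # Deterministic rule: max degree in the induced subgraph, fixed tie-break.
    return max(remaining, key=lambda u: (len(adj[u] & remaining), u))

def encode(adj, I, threshold):
    """Encode a stable set I of the graph `adj` (dict: vertex -> set of neighbors)
    as a small seed S plus one bit per vertex of the leftover set A."""
    remaining, S = set(adj), []
    while len(remaining) > threshold:
        v = _next_vertex(adj, remaining)
        if v in I:
            S.append(v)                    # chosen: record it ...
            remaining -= adj[v] | {v}      # ... and kill it and all its neighbors
        else:
            remaining.discard(v)           # not in I: just discard it
    A = sorted(remaining)
    return S, [int(u in I) for u in A]     # I equals S plus the marked part of A

def decode(adj, S, bits, threshold):
    """Replay the deterministic procedure; S alone determines the leftover A."""
    S, remaining, chosen = set(S), set(adj), set()
    while len(remaining) > threshold:
        v = _next_vertex(adj, remaining)
        if v in S:
            chosen.add(v)
            remaining -= adj[v] | {v}
        else:
            remaining.discard(v)
    A = sorted(remaining)
    return chosen | {u for u, b in zip(A, bits) if b}

# Tiny example on a 5-cycle (vertices must be orderable for the tie-break):
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
S, bits = encode(adj, {0, 2}, threshold=2)
assert decode(adj, S, bits, threshold=2) == {0, 2}
```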
>>: [inaudible].
>> Nikhil Bansal: Kind of, yeah. So this is just a small seed that
sort of determines what’s left over; you can think of it like that.
And there is just some calculation to show the size bound. Okay, and
what is the key property? The point is that we always arrange the
remaining vertices in order of degree, and we had this spectral lower
bound which says that large sets have a lot of edges inside them. So
as long as the remainder is large, the highest degree vertex kills a
lot of vertices when I pick it. So you can do some math and show that,
because of this, the number of vertices you pick cannot be too large:
every time you pick something you kill a lot of guys, and there is only
so much you can kill. And that basically tells you that the set S you
pick has size roughly log d over d times the number of vertices.
So just to recap, maybe that was a little quick. The idea is that you
want to encode some stable set; you find a subset S which rules out a
lot of things and which is itself not too large. Your stable set can
then only live in this region A, and this A is completely determined by
the set S, so for A you only need a zero-one indicator per vertex and
not the whole log N bits.
All right. So this was the part about how to encode stable sets. Now
how do we extend it to matroids? Now we are trying to count general
matroids, not just sparse paving matroids, so the non-bases are not
necessarily stable sets in the Johnson graph; they could be spread all
over the place. Okay, so we will use one more idea, and the main idea
will be the following. Again, a matroid can be specified by saying
which of the R-sets are non-bases, but I will efficiently encode these
non-bases instead of listing each one out there. And here is the
picture to keep in mind.
So I will find a small set S, and if I think of the Johnson graph, this
region is the neighborhood of S. In the previous setting I could say
that none of my stable set lies in this neighborhood, so the stable set
can only lie out here. But now, because we don’t have a stable set,
maybe some guys from this neighborhood are also non-bases. And that’s
the whole complication. But what we will show, and this is the next
key lemma for matroids, is that all the non-bases in this neighborhood
of S can be specified by essentially just twice as many sets as S
itself. And it will be clear in a moment what I mean by this.
So even though your matroid could have lots and lots of non-bases here
in the neighborhood, there is a very small witness for them. Let me
explain what I mean. Once I prove this lemma, the encoding is clear,
because I just store my set S and the non-bases in A as before, and the
only difference is that for the non-bases in this neighborhood, which
could be many, I will just have a very small witness which tells me
which ones they are. So the total number of bits I use is not too
much: roughly three times the size of S times log N bits, plus the
zero-one bits for A.
Okay, now what do I mean, and what is the idea? So what do we want to
show? Lots of non-bases here can be specified by a small number of
sets, and the idea will be the following. If I look at any R-set and
look at its neighborhood in the Johnson graph, there are these n
squared over four neighbors, and maybe lots of these vertices are
non-bases, right. One way to specify them is to list each of them, but
we can actually give a very compact representation: instead of listing
all of those which are non-bases, I just describe two sets which encode
which of them are non-bases. And the reason is the following
structural property: if X is an R-set which is a non-base, and there
are many non-bases in its neighborhood, then there is some kind of
underlying reason for those guys to be non-bases.
So let’s look at the following example. Say you have a graph and the
matroid is the spanning tree matroid. A set like this is a non-base
because it contains this circuit. Now look at the possible neighbors
of this guy. Neighbors are just what you get when you drop one element
and add another. So this is a neighbor, this is a neighbor, and you
have other neighbors where you break this cycle at some other edge.
But there can be several neighbors like this which keep the cycle. Now
for all these neighbors I can just list the reason: there is a cycle
there. So instead of specifying each of them with a lot of bits, I can
say, “Here is a cycle, and anything that contains this cycle is a
non-base,” which is very concrete. Just this one cycle gives you the
whole information, instead of storing everything. So this is the
intuitive idea: there is an inherent reason why things in the
neighborhood of something are non-bases.
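A tiny Python illustration of this cycle-witness idea in the graphic matroid, using a union-find acyclicity test (the five-vertex example is my own; the talk only shows a picture):

```python
def has_cycle(edges):
    """Union-find test: does this edge set contain a circuit (cycle)?"""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return True                    # u and v already connected: cycle
        parent[ru] = rv
    return False

# Graphic matroid on vertices {a,...,e}, rank R = 4. The R-set X below is a
# non-base: it contains the circuit C = {ab, bc, ca}.
C = [("a", "b"), ("b", "c"), ("c", "a")]
X = C + [("d", "e")]
assert has_cycle(X)
# Every neighbor of X that keeps C (swap the edge de for anything else) is
# dependent too, and the single circuit C is the witness for all of them.
for y in [("a", "d"), ("b", "e"), ("c", "d")]:
    assert has_cycle(C + [y])
```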
Okay, so here is a way to make this whole thing precise, and this is
the only place where we use some properties of matroids; it’s just this
one slide. Let me go over it in case some of you are more familiar
with matroids. The key lemma is the following: if you have a dependent
R-set X, so it’s a non-base, then we can associate with it two sets
which are witnesses for all the non-bases in the neighborhood of X.
There could be up to n squared over four non-bases there, but I will
just exhibit two sets which witness all of them. So here is the whole
proof. Look at some neighbor Y of X. It has this form: you drop some
element x from X and add some element y.
And let’s look at two cases. Okay, so this is case one: suppose your X
had rank R minus two or smaller (it can’t have rank R, since it is
dependent). Then I can just take my witness to be the set X itself,
because what is Y? You remove one element of X and add at most one, so
Y can’t have rank more than R minus one. So X itself is a very
concrete witness for such guys: everything in the neighborhood is
dependent. This is the trivial case.
The more interesting case is when your X has rank R minus one. In that
case, recall the basic property that X contains a unique circuit;
spanning trees might be the easiest example to keep in mind. So there
is one unique cycle. And what I will show in a second is that the
witness associated with this X is this unique circuit together with
what’s called the closure of X. The closure of a set is just what you
get if you add all those extra elements which don’t increase the rank
of X.
Okay, so why is this witness enough? Here is the proof. Suppose there
are two sets, X and Y, which are non-bases, and Y is a neighbor of X,
so it looks like this. The first claim, and this is just a one line
calculation, is the following: X is a non-base of rank R minus one and
Y is also dependent, so either it must be that when you remove this
element x from X the rank falls to R minus two, and even when you add y
it cannot go back up, or it must be that X plus y has rank R minus one.
One of these two cases must hold, and you can show that by just
submodularity of the rank function.
Okay, good. So if the rank of X plus y is R minus one, this means y
lies in the closure of X, because X already had rank R minus one. So
if I give you the closure of X as a witness, just this one set tells
you about all these neighbors Y. And if the rank of X minus x was R
minus two, notice that R minus two is strictly less than the
cardinality of X minus x, which is R minus one. So X minus x must
contain some circuit. But that circuit also lies in X, and X has a
unique circuit because it has rank R minus one. So just specifying
this circuit tells you the reason why the rank dropped to R minus two:
any neighbor Y you form this way still contains the circuit, so it is
still dependent. So that’s basically it, right: for each case we can
give a concrete witness.
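In compact form, the case analysis reads as follows, writing rk for the rank function and cl for the closure (my restatement of the slide):

```latex
\text{If } \mathrm{rk}(X) = R - 1 \text{ and } Y = X - x + y \text{ is dependent, then by submodularity}
\quad \mathrm{rk}(X - x) = R - 2 \quad\text{or}\quad y \in \mathrm{cl}(X).
```

In the first case the unique circuit $C(X)$ survives in $X - x$, so every such $Y \supseteq C(X)$ is dependent; in the second, every $y \in \mathrm{cl}(X)$ yields a dependent neighbor. So the pair $(C(X), \mathrm{cl}(X))$ witnesses all the dependent neighbors of $X$.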
So just maybe to recap: we essentially run the stable set procedure as
before and find the sets S and A, but now we also store the witnesses
for the neighborhood of each X in S, which is not too much extra, and
that gives the efficient encoding. So to conclude, we showed this much
stronger upper bound on the number of matroids, but we still have this
one plus order one gap. It would be nice if it could be reduced to
little o of one, but there is sort of an inherent bottleneck, at least
for this approach, and probably for any approach based on sparse paving
matroids.
And the reason is that we don’t quite understand what the maximum
stable set in the Johnson graph looks like; this relates to what was
asked earlier. Again, Knuth gave this explicit construction of size N
over n, and the best upper bound we have is only two N over n. And as
long as this remains, you will always have two to the alpha(G) with
this factor two sticking up in your exponent, and it becomes the plus
one when you take two logs, giving that one plus o of one no matter how
clever you are about the other things.
So sort of one natural way to break this gap would be to understand
what the truth really is. This might be a hard question, I think, but
on this actual problem of stable sets in the Johnson graph there are
lots of papers in the coding literature. People have done simulations
up to n equals 24 or something; that is already a huge graph, so they
use a lot of symmetries and all kinds of things. And the simulations
seem to suggest the answer is closer to the lower bound, but it’s hard
to judge, because it’s something like 1.07 to 1.3 times N over n.
Presumably it’s converging to N over n, but who knows, right; these are
small numbers.
But yeah, so even if this is the right answer, one problem is that we
don’t know of any other general techniques to upper bound stable sets
other than these eigenvalue methods, or the [indiscernible] method,
which is also sort of similar. But maybe one could use a stronger
Lasserre hierarchy or something to upper bound it; that’s the only kind
of technique I know. And perhaps if you had a method for certifying
that the upper bound is actually this, then maybe one could combine it
with the matroid techniques on top, do the same thing we did, and get
the right bound on m_n, but that’s probably a long program.
Okay, so that’s it and thank you for your attention.
[clapping]
>>: So this would imply the number of independent sets [inaudible]?
>> Nikhil Bansal: Yeah, so assuming that, although we don’t know how
this machinery would look, you would have to kind of carry it out,
right. Because for us, we used the Hoffman bound of two N over n, and
then on top we built all this eigenvalue machinery, da, da, da, to
actually show that most stable sets come from just this guy; it’s not
like N choose this, but more like two to the two N over n.
>>: Did you ever use the structure of [inaudible]?
>> Nikhil Bansal: So we used eigenvalue arguments in that to show such
a thing. So this was the --.
>>: [inaudible].
>> Nikhil Bansal: But hopefully, yeah, so that would be one piece one
has to do.
>>: No, but exactly what did you [inaudible]?
>> Nikhil Bansal: Yeah, exactly. And actually the theorem we can show
is only this. And the reason this is good is that it is like two to
the alpha(G), but maybe alpha(G) itself is tighter than this bound;
it’s not clear whether there is technology to push that. But
presumably it should work as a black box reduction, if that’s the right
thing.
Is there anything else?
>> Konstantin Makarychev: Any more questions?
>>: [inaudible].
>> Nikhil Bansal: Yeah, the [indiscernible] function exactly gives you
the Hoffman bound. So there was a paper from [indiscernible] back in
the 70s and all these things. Yeah, so all these things, the theta
function, eigenvalues, they are apparently all the same. I don’t quite
understand why, but, so.
>> Konstantin Makarychev: Let’s thank our speaker again.
[clapping]