>> Konstantin Makarychev: It’s a great pleasure to have Nikhil Bansal
here at Microsoft Research. Nikhil is a professor at the Technical
University of Eindhoven. Before that he was a research manager at IBM
Research. Nikhil works on various types of algorithms, online
algorithms and approximation algorithms, and today he will talk to us
about the number of matroids.
>> Nikhil Bansal: Okay, thank you Konstantin and thanks for inviting
me. So I will talk about this work on the number of matroids. It is
joint work with Rudi Pendavingh, who is a colleague at TU Eindhoven,
and a master’s student, Jorn van der Pol.
Okay, so probably most of you know what matroids are and have used them
extensively. These are central objects in optimization, and one nice
thing about them is that they combine graph theory and linear algebra
together. But just to remind you: a matroid is basically a set system.
You have a universe of elements, let’s say one through n, and there is
a collection of subsets of U, which we call independent sets, which
satisfies two very basic axioms. The first axiom is that this
collection of subsets is downward closed: if I is independent, every
subset of I is also independent.
And the more interesting property is what’s called the extension
property. If you have two independent sets, let’s say I and I prime,
and I is strictly bigger than I prime, then there is a way to extend I
prime to a bigger independent set by including some element which is in
I but not in I prime. In other words, there exists some x in I minus I
prime such that I prime plus x is also independent.
And actually most of the terminology for matroids comes from linear
algebra. A typical example to keep in mind is that your universe is a
collection of vectors, and independent really means linearly
independent over some field. Then you see that if some set is
independent, every subset of it is independent. And the extension
property also holds [indiscernible]: you can always extend a smaller
set by adding some other linearly independent vector.
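Since everything later rests on just these two axioms, here is a minimal Python sketch, not from the talk, that checks them on an explicit set system; the uniform matroid U(2,3) used as the test case is my own illustrative choice:

```python
from itertools import combinations

def is_matroid(indep):
    """Check the two independence axioms on an explicit set system.
    `indep` is a collection of frozensets (the claimed independent sets)."""
    indep = set(indep)
    if frozenset() not in indep:
        return False
    for I in indep:
        # Axiom 1: downward closed -- every subset of an independent set is independent.
        if any(I - {x} not in indep for x in I):
            return False
        for J in indep:
            # Axiom 2 (extension): if |I| > |J|, some x in I - J extends J.
            if len(I) > len(J) and not any(J | {x} in indep for x in I - J):
                return False
    return True

# Uniform matroid U(2,3): all subsets of {1,2,3} of size at most 2 are independent.
u23 = [frozenset(s) for k in range(3) for s in combinations({1, 2, 3}, k)]
print(is_matroid(u23))  # True
```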
So one big problem in matroid theory is to understand what a matroid
looks like, because notice that unlike graph theory, where you have
some kind of structure, these matroids are defined by properties. So
it’s not really clear what the structure is. And this is of course a
whole industry in itself. A typical question is: How does a random
matroid, or a typical matroid, look? There have also been attempts to
define a certain notion of random matroid theory, but it’s much, much
less developed than random graph theory, for example, and we can’t even
answer very basic questions.
In fact one of the most basic questions that we can ask is: How many
matroids are there on n elements? If you want to define any probability
of something you had better be able to at least answer this. And
notice one trivial bound is of course two to the two to the n, because
on n elements there are up to two to the n subsets, and a matroid is
just some collection of subsets. So this is the trivial upper bound.
Now already in 1974, I guess almost forty years ago, Knuth showed
the following lower bound: the number of matroids is at least two
to this number, one over n times the central binomial coefficient.
Since the central binomial coefficient is roughly two to the n over
root n, this is roughly two to the two to the n over n to the three
halves. And the way he actually did it, and we will see this proof in
a few slides, is he constructed an explicit class of matroids known as
sparse paving matroids and he showed that their number is this much.
So if you summarize these two things, let’s see what’s already known.
Since we will be dealing with such big numbers, you look at these
things on the log-log scale, so we look at log log of the number of
matroids. Knuth’s result gives a lower bound of n minus three halves
log n minus some constant, and the trivial bound, after taking the log
twice, becomes n. So the number lies somewhere in between.
Now you might ask: Why do we bother about this tiny three halves log n?
Usually in approximation you only argue about constants, so this is
negligible. So it depends; you can give several answers. One thing is
that this log-log scale is a bit deceptive, right, because even X
versus X squared, once you take the log-log scale, just translates to
an additive one error between X and X squared. And the sort of more
important reason is that it has been widely conjectured in the matroid
literature that most matroids are sparse paving.
So people believe that Knuth’s bound is actually the right answer for
the number of matroids. There are various quantitative versions of
what that means, and I won’t bother you with the precise conjectures,
but they all say that Knuth’s bound is sort of close to the right
answer. Good, so just keep in mind Knuth’s bound with this three halves
log n. Now, we saw a naive upper bound of two to the two to the n.
Notice it can be trivially improved: we can shave off a half log n as
follows, by just looking at matroids of a given rank.
So again, what is the rank of a matroid? Recall, and we all know this,
that the maximal independent sets in a matroid are called bases, and
all of them have the same size because of the exchange property; if
not, you could extend one of them. So the size of each basis is
unique, and it’s called the rank of the matroid. And one way to
specify a matroid of rank R is to just tell you which sets of size R
are the bases. Because if I tell you that, you know everything else:
every subset of a basis is an independent set and every other R-set is
a non-base.
Okay. So this tells you that the number of matroids on n elements of
rank R is at most two to the n choose R, right. So good, if you take
log log of this: the first log removes this outer two, and then you
take the log of n choose R, which is maximized when R is n over two,
where it is like two to the n over root n. So if you take the log it’s
like n minus half log n. Okay. So this is for matroids of a particular
rank. And now notice that the total number of matroids is just at most
n plus one times the number of matroids of a given rank. And this n
plus one is a negligible factor; once you take two logs it disappears.
Okay, good. So this shows that log log of m_n is at most n minus half
log n. So to bound m_n we can essentially focus on the number of
matroids of a given rank; we don’t have to worry, it’s just going to
cost another factor of n plus one. So this was a trivial way to get
this half log n improvement. And again, almost forty years ago, Piff
gave a somewhat stronger upper bound where he could shave off another
half log n: he could show n minus log n plus order one. So again,
Knuth had three halves log n and Piff had log n. And this was sort of
the best state of affairs until now.
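As a numeric sanity check of the bounds just discussed (my own back-of-the-envelope computation, not part of the talk), one can evaluate the double logarithms exactly for a small even n:

```python
from math import comb, log2

n = 20
# log log of Knuth's lower bound 2^((1/n) * C(n, n/2)):
knuth = log2(comb(n, n // 2) / n)
# log log of the rank-based upper bound 2^(C(n, n/2)) (the n+1 factor is negligible):
rank_based = log2(comb(n, n // 2))
print(f"{knuth:.2f} <= loglog m_n <= {rank_based:.2f}  (trivial bound: {n})")
print(f"n - 1.5*log n = {n - 1.5 * log2(n):.2f},  n - 0.5*log n = {n - 0.5 * log2(n):.2f}")
```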
So what do we show? We tighten this upper bound and show that
basically we can recover Knuth’s lower bound up to an additive term,
and in fact more precisely our additive term is just one plus order
one. So what we show is that log log m_n is what Knuth had, plus one
plus little o of one, as opposed to Piff’s log n gap. If you remove
the log-logs, more precisely it looks like this: Knuth had this lower
bound and we have a factor two extra here, which translates to the
plus one when you take two logs.
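So, in summary, the bounds as stated in the talk look like this, with all smaller terms absorbed into the O(1) and o(1) (a paraphrase; the talk did not display them in exactly this form):

```latex
\begin{align*}
\log\log m_n &\ge n - \tfrac{3}{2}\log n - O(1)     && \text{(Knuth, 1974)}\\
\log\log m_n &\le n - \log n + O(1)                 && \text{(Piff)}\\
\log\log m_n &\le n - \tfrac{3}{2}\log n + 1 + o(1) && \text{(this work)}
\end{align*}
```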
Okay. And the interesting thing is that to prove this we actually don’t
need to know much about matroids. In fact I am not really a matroid
guy, so it’s all very, very basic facts. For almost 90 percent of the
talk we will not even talk about matroids; we will just talk about
independent sets in a graph. I will actually call these stable sets,
using the [indiscernible] terminology, because independent in a matroid
means something else, but I might confuse myself anyway because we are
usually used to calling these independent sets.
So this will be our main tool.
>>: [inaudible].
>> Nikhil Bansal: Okay. So it’s believed that the number of matroids
is close to the number of what I call sparse paving matroids, which is
the class Knuth constructed, but presumably the number of sparse paving
matroids could itself be more than this bound, so one doesn’t know,
right.
>>: [inaudible].
>> Nikhil Bansal: So I don’t know much about matroids, but.
>>: [inaudible].
>> Nikhil Bansal: Yeah, yeah, definitely. Like sparse paving is a
special class; we will see in a moment what I mean. It’s a very, very
special class, but somehow [indiscernible]. So like [indiscernible]
for matroids, for example, is not sparse paving, but the point is these
are very tiny fractions of this huge space of matroids.
>>: [inaudible].
>> Nikhil Bansal: Yeah, exactly. In fact the most explicit examples we
know are not sparse paving matroids, but we will see what these are in
a moment.
Okay. So the outline of the talk will be the following. First I will
tell you Knuth’s lower bound construction and what these sparse paving
matroids are. Then for a while I will talk about some technology to
count the number of stable sets in a graph, and then we will see how
this connects back to matroids in the end.
Okay. So in the next couple of slides I will tell you how Knuth came
up with this lower bound on the number of matroids. The first
observation is that if you want to specify a matroid of rank R, you can
also just specify the sets of size R, which I will call R-sets, that
are non-bases. If you tell me which sets of size R are non-bases, the
other sets of size R are the bases and all of their subsets are the
independent sets. So this gives you everything you need to know about
the matroid; it’s a complete description. And it’s convenient to
define the Johnson graph, which probably many of you know.
So this graph has two parameters, n and R, and it has n choose R
vertices, corresponding to the subsets of size R of these n elements.
And you put an edge between two vertices if they have R minus one
elements in common. So I think I have a picture. Yeah, so in other
words these are two sets of size R, and if they differ in just one
element there is an edge between them. A more convenient way to think
of them is that you have a zero-one vector of n dimensions with exactly
R ones, and there is an edge between two vectors U and V if you turn a
one into a zero and a zero into a one, right, so you just swap.
Now notice one thing: it’s a very structured graph. For one, it’s a
regular graph with degree R times n minus R. And why is that? How do
you get to a neighbor? You make one of these ones a zero, so there are
R choices, and where there was a zero you put a one, right, that’s a
swap, so there are R times n minus R options.
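For concreteness, here is a small Python sketch of the Johnson graph as just described (my own construction for illustration; the talk only defines the graph):

```python
from itertools import combinations

def johnson_graph(n, r):
    """Johnson graph J(n, r): vertices are the r-subsets of {1, ..., n};
    two vertices are adjacent iff they share r - 1 elements (one swap)."""
    vertices = [frozenset(s) for s in combinations(range(1, n + 1), r)]
    adj = {v: set() for v in vertices}
    for u, v in combinations(vertices, 2):
        if len(u & v) == r - 1:
            adj[u].add(v)
            adj[v].add(u)
    return adj

adj = johnson_graph(6, 3)
print({len(nbrs) for nbrs in adj.values()})  # {9}: regular of degree r*(n-r) = 3*3
```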
Okay, now here is actually a very simple theorem; it’s very easy to
prove. It says the following: pick any stable set in this Johnson
graph, in other words a collection of R-subsets no two of which differ
by one swap, so there is no edge between them. If you call those your
non-bases, then that gives you a matroid.
>>: Well, basically you are saying this is [inaudible].
>> Nikhil Bansal: Yeah, yeah, so if I just pick some stable set in the
Johnson graph, those are my non-bases, and every other R-set in this
Johnson graph is a basis. And that always gives me a matroid. And
actually for those of you who know what the basis exchange property is,
it’s very easy to see: whenever you have two bases you won’t get
blocked, you will always find a path. But again, if you don’t know
what basis exchange is, then don’t worry about it; it’s like a one line
proof, and we can take this result on faith. And these are precisely
what are called the sparse paving matroids, so this class. There are
various other characterizations of them, but this is one useful way to
think about it for our purposes. And these are extensively studied in
matroid theory as well.
>>: Okay, is this like stable [inaudible] property or do you want
something about [inaudible]?
>> Nikhil Bansal: Okay, so a stable set is sufficient. So what could
go wrong if it’s not a matroid? You take two bases and you can’t find
a path. And what is a path? You try to do one swap at a time. So
suppose I take two bases which differ in two elements; if it’s just one
swap there is a trivial path. So say they differ in two elements, but
then if I can’t find a path this way and I can’t find a path that way,
then both of these intermediate guys are non-bases, because they are
both blocking me. But then there is an edge between them, right. So
it’s just that; it’s sufficient. Of course there are other matroids
which [indiscernible], but that’s the whole proof.
So again, a stable set always gives a matroid, but of course it’s not
[indiscernible].
>>: So this is totally for sparse paving.
>> Nikhil Bansal: Yeah, so this is the definition of sparse paving, or
one definition of sparse paving. There are also various other
characterizations in terms of dual matroids and so on. But again, as I
said, we will just mostly talk about stable sets.
Okay. So yeah, as I said, sparse paving matroids are precisely the
stable sets of this Johnson graph. So let’s look at these for a while
and derive Knuth’s lower bound. First some notation: given a graph G,
let alpha(G) denote the size of the maximum stable set, and let i(G) be
the number of stable sets in the graph. Clearly i(G) is at least two
to the alpha(G), because if you have a stable set of size alpha(G),
every subset of it is a stable set. So what does this give us? One
way to lower bound the number of matroids is to lower bound the number
of sparse paving matroids, so that’s what we are going to do.
So let’s see what this naive bound gives. If you can lower bound
alpha(G), then you have a good lower bound on the number of stable sets
using this. So one naive lower bound is just the following: we saw
that this Johnson graph J(n, R) is regular with degree R times n minus
R, because neighbors are one swap away, and when R is like n over two
the degree is roughly n squared over four. And we know that any
d-regular graph has a stable set of size at least N over d plus one;
the greedy algorithm gives you this. So this Johnson graph with R
equal to n over two has about two to the n over root n vertices and the
degree is roughly n squared over four, so it gives you something like
two to the n over n to the 2.5.
So that’s alpha(G), and i(G) is at least two to that, two to the two to
the n over n to the 2.5. So that’s the trivial lower bound. And what
Knuth did is show that alpha(G) is actually at least a one over n
fraction of the vertices, instead of this four over n squared. And the
way he did it was to just give a very explicit n-coloring of the
Johnson graph. This coloring is very cute and very simple to describe.
Again, a vertex of the Johnson graph is just a zero-one vector v with R
ones, right. So associate the following number with the vertex v: you
take the i-th coordinate, multiply it by i, and sum over all the
coordinates, so the sum of i times v_i.
Now let’s focus on this number. I claim that for two neighboring
vertices this number differs by something strictly between minus n and
n. Because what is a swap? You drop some element i and you add some
element j, and these indices are always between one and n. So if I
look at this number modulo n, any two neighbors get different colors;
for any edge the two endpoints are colored with different values. So
it’s a very simple explicit coloring, and the largest color class is a
stable set containing at least a one over n fraction of the vertices.
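Here is a quick Python check of this coloring argument on a small instance (again my own illustration; the parameters n = 8, r = 4 are arbitrary):

```python
from itertools import combinations
from collections import Counter

n, r = 8, 4

def color(s):
    # Knuth's coloring: the sum of the chosen indices, modulo n.
    return sum(s) % n

vertices = [frozenset(c) for c in combinations(range(1, n + 1), r)]
# Proper coloring: a one-swap neighbor changes the sum by j - i, which is
# nonzero modulo n since 1 <= i, j <= n, so neighbors get different colors.
assert all(color(u) != color(v)
           for u, v in combinations(vertices, 2) if len(u & v) == r - 1)
# Some color class is a stable set of size at least C(n, r)/n.
print(max(Counter(map(color, vertices)).values()), len(vertices) / n)
```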
Now one thing, and I will also come back to this at the end: presumably
the maximum stable set could be bigger, but nobody knows of any other
way to lower bound it, so that’s sort of one trouble. If one could
push this somehow, maybe to two over n for example, then it would
actually give a tight result, right. I haven’t told you the upper
bound proof yet, but that is because we also prove an upper bound of
like two over n times the number of vertices [indiscernible].
>>: [inaudible].
>> Nikhil Bansal: No, no, that’s not clear actually. For small values
people have done experiments and the constant seems to be like 1.3 or
something. So there is no --.
>>: [inaudible].
>> Nikhil Bansal: Okay, okay, that might be, but I am not sure. There
is at least one of size N over two; I don’t know about N, but yeah,
okay.
Okay. So any questions so far? Because that proves the lower bound,
right: if alpha is at least this, then the number of stable sets is at
least two to that, and each stable set is a sparse paving matroid.
Okay. So that was the whole lower bound.
So now we come to our upper bound. Recall that our goal was to show
this upper bound: the log log of the number of matroids is at most the
lower bound plus one plus order of one. When we first started thinking
about this problem, the first idea was that if you want to upper bound
the number of all matroids, you had better be able to at least upper
bound the number of sparse paving matroids by something close to Knuth.
If you can’t even do sparse paving, there is no hope you can upper
bound everything.
So that’s the step we focused on first, and this was actually nice
because it is a very clean object: it’s just the number of stable sets
in the Johnson graph, so it’s a purely combinatorial problem. And
luckily the ideas developed here turned out to extend directly to
matroids, which was kind of fortunate. So for most of the talk I will
just talk about this. I will denote by s_n the number of sparse paving
matroids and show how to bound it, and then there will be the extension
to general matroids. Okay, good. So we will try to upper bound the
number of sparse paving matroids.
The first claim we will use is the following: the maximum stable set in
the Johnson graph is at most two over n times the number of vertices.
I will denote the number of vertices, this n choose n over two, by
capital N instead of writing it out every time. And notice Knuth
already showed a lower bound of one over n times N, and now we are
saying that the upper bound on the maximum stable set is at most twice
that. This follows from what’s known as Hoffman’s bound, as probably
many of you know. It says the following: if you look at the adjacency
matrix of a d-regular graph, and minus lambda is the smallest
eigenvalue (the smallest eigenvalue is always negative, because the
eigenvalues sum to zero), then lambda N over d plus lambda is an upper
bound on the size of any stable set. We will actually see a proof of
this in a couple of lines, and probably many of you have seen it
before.
Now for the Johnson graph the spectrum is very well understood; these
are widely studied graphs in algebraic graph theory. We saw that the
degree is like n squared over four, and it’s known that the smallest
eigenvalue is like minus n over two, so lambda is n over two in our
terminology. And if you just plug these numbers in, you get this upper
bound. So given this upper bound on alpha(G), let’s see what it gives
us, at least naively, for the number of stable sets. If the maximum
stable set in your graph is alpha(G), then the total number of stable
sets is at most N choose zero plus N choose one and so on, up to N
choose alpha(G).
So in our case alpha(G) is bounded by two over n times N, so the last
term dominates and the sum roughly looks like N choose two N over n.
And if you just use Stirling’s approximation of the binomial
coefficient, this gives you something like a constant times n, raised
to the power two N over n; the base is essentially the ratio of N to
the set size. Okay, so this naive way of using Hoffman’s bound
together with naive enumeration gives you this. And what lower bound
did we have? It was two to the alpha(G). So notice there are two
things that differ between this upper bound and lower bound. One is
the base of the exponent: here it is two and there it is like n. And
then there is this factor two sitting here also, which we won’t be able
to remove, because that’s the best upper bound we know on the stable
set. But this base n is sort of problematic, because if you take the
log it becomes a log n factor in the exponent, which is not quite what
we want.
So the point is that this naive way of counting the number of stable
sets is too lossy. What we will show, let me go back, is that instead
of n to the two N over n, the number of stable sets in the Johnson
graph is at most two to the twice N over n. Okay, so we can shave the
base down from n to two. And morally you should think of this bound as
saying that most stable sets in your Johnson graph are subsets of one
large stable set; it’s not like they are smeared out everywhere. And
actually this phenomenon happens more often all around us. The most
natural example is the hypercube.
So say you have an n-dimensional hypercube with two to the n vertices;
let’s call that capital N. It’s a bipartite graph, so we know that
alpha(G) is N over two. Again, the naive way of counting all possible
subsets of size up to N over two gives you an upper bound on the number
of stable sets of about half of two to the N, which is pretty much
useless.
Now, what’s the right answer? For the hypercube people have actually
nailed it down exactly, up to the constant; it’s quite an amazing
result. The right answer for the hypercube, instead of two to the N,
is like a constant times two to the N over two, and in fact this is a
very small constant, like three or something. So what this is saying
is that in the hypercube there are these two stable sets of size N over
two each, the two sides of the bipartition, and essentially every
stable set comes from one of these guys. So maybe this is the right
picture: the hypercube is this graph, and essentially everything either
comes from here or from here.
And intuitively this makes sense for the hypercube, because the moment
you try to pick a few vertices on one side, it blocks a lot of vertices
on the other side from being in your stable set. So if you want to
generate lots of stable sets, you had better just stick to one piece.
And this is the idea we will try to use. This example already tells
you that the reason this is happening is the expansion in the graph: if
you try to use something from here it blocks a lot of stuff there, so
somehow your bound should take the expansion into account. Otherwise,
if you do something naive, it won’t be strong enough.
And like I said, this kind of phenomenon happens more often. In fact
there is a very interesting result of Jeff Kahn. It says that for any
d-regular bipartite graph, not just the hypercube, the number of stable
sets is at most two to the N over two plus N over two d. So again,
this two to the N over two is the right main term, which morally says
that most stable sets come from one piece. And what’s nice about this
result is that it’s exactly tight: if you take N over two d disjoint
copies of K_{d,d}, each copy has two d vertices, so there are N
vertices in total, and for each copy there are about two to the d plus
one choices for the stable set, right, either any subset of this side
or any subset of that side. And if I raise this number to the N over
two d, I get exactly that bound. And there was a nice result by Zhao
recently which showed that this even holds for general d-regular
graphs, not just bipartite ones.
>>: And D indicates what pertaining to the graph?
>> Nikhil Bansal: D is the degree.
>>: The average degree?
>> Nikhil Bansal: Okay, so Jeff Kahn proved it for D regular, but maybe
these ideas are --.
>>: [inaudible].
>> Nikhil Bansal: Okay, so they also do general d-regular, but not
[indiscernible] bipartite regular.
>>: [inaudible].
>> Nikhil Bansal: Yeah, but I think there might be a simple argument
which shows that the extreme cases are regular, or you can do some
swapping argument and make it regular.
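To sanity-check the tight example mentioned a moment ago, here is a small brute-force count of stable sets in K_{d,d} (my own verification, not from the talk):

```python
from itertools import combinations

def count_stable_sets_kdd(d):
    """Brute-force count of stable sets in the complete bipartite graph K_{d,d}."""
    left = set(range(d))
    right = set(range(d, 2 * d))
    count = 0
    for k in range(2 * d + 1):
        for s in combinations(range(2 * d), k):
            if set(s) <= left or set(s) <= right:  # stable iff it stays on one side
                count += 1
    return count

for d in (2, 3, 4):
    print(d, count_stable_sets_kdd(d), 2 ** (d + 1) - 1)  # the counts match
# So N/(2d) disjoint copies give (2^(d+1) - 1)^(N/(2d)) stable sets in total,
# which is roughly 2^(N/2 + N/(2d)), matching Kahn's bound.
```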
Okay, so what do we show? Our result is another bound on the number of
stable sets. If you have any d-regular graph with smallest eigenvalue
minus lambda, then we can bound the number of stable sets by the
following. Let’s ignore the second term for now; this is the main
term. Recall this was Hoffman’s bound, right, the size of the largest
stable set. So basically we say the count is two to the Hoffman bound;
essentially everything is coming from just this guy. And the other
term is supposed to be negligible; it’s like two to the N over d with
some log squared factor sticking in.
Okay, yeah, and notice this captures the bipartite case: for any
d-regular bipartite graph we know that the smallest eigenvalue is minus
d, corresponding to the eigenvector which is plus one on one side and
minus one on the other. So lambda is d, and if you plug that into our
bound, this lambda over d plus lambda becomes a half, and it gives you
two to the N over two plus this small term, right.
So compared to Kahn’s argument, which had plus N over two d, we lose
this extra log squared additive term, but we sort of get the dominating
term right. And actually this also holds for any general graph, so
this also proves the bound for general graphs, because the smallest
eigenvalue is always minus d or higher.
>>: [inaudible].
>> Nikhil Bansal: Sorry?
>>: [inaudible].
>> Nikhil Bansal: Yeah, yeah, so my point is that we give a much more
general argument in terms of lambda. So why is this useful? Because
if your smallest eigenvalue is away from minus d, then this gives you a
much tighter bound; for our Johnson graph we don’t want something like
N over two, we want something like this capital N over small n.
So we can just use our bound off the shelf and it gives the right
order. So my point is that this theorem essentially recovers something
close to Kahn’s result, which was a long line of work, but it is also
much more general.
So I will try to give you a flavor of this result in the next couple of
slides. The whole idea of the proof is, given a stable set, to give
some kind of encoding scheme. It’s like a computer science way of
counting: we will show that you can always encode a stable set using a
small number of bits, and then the number of stable sets cannot be more
than two to the number of bits we use. This approach is the result of
a long line of work starting from [indiscernible] in the 60s, but our
approach is closest in spirit to a recent paper of Alon, although the
bounds they get are weaker and were not useful for our purposes. And
as I mentioned earlier, this encoding idea that we develop for stable
sets will also be useful for matroids. So the rest of the talk will
describe these two things.
And before I give the proof, here is a useful lemma, which I think is
also useful by itself. It says the following: suppose you have a
d-regular graph with smallest eigenvalue minus lambda. Then if you
look at any subset of vertices, call it A, and let G[A] be the graph
induced on A, the number of edges within A is at least this number,
where again the first part is the dominant term. So what this means
is, suppose you look at a random subset of vertices capital A.
>>: [inaudible].
>> Nikhil Bansal: e of G[A], yeah. So G[A] is the graph induced on A
and e is just the number of edges.
So it says that no matter which subset of vertices you pick, the number
of edges is at least something; it’s giving you a lower bound. And the
right way to look at this lower bound is the following. Suppose you
picked some random set A in your graph, and look at a typical vertex of
A. It has degree d in the original graph, and since A is a random set,
roughly an A over N fraction of its neighbors will be in A. And you
are picking A such vertices, so the total sum of degrees within A will
be d times A squared over N in expectation, and that’s twice the number
of edges.
So basically what this lemma is saying is that twice the number of
edges, the sum of degrees, is essentially what you would expect for a
random set, minus some error term which depends on what your smallest
eigenvalue looks like. So if lambda is much smaller compared to d,
then every set behaves like a random set. So let me rewrite this
again.
Okay, and how do we prove this? Again it’s a standard result, and you
do the usual thing. You want to count the number of edges in A, and if
M denotes the adjacency matrix of the graph, then you just sum M over
all pairs in A, which counts exactly what you want. You can write this
as chi transpose M chi, where chi is the indicator vector of A. And
then you do the usual thing: you expand chi in the eigenbasis of M. We
know that one of the eigenvectors is the all-ones vector, so I write
this chi_A as A over N times the all-ones vector plus another vector
which is orthogonal to this one vector; this is removing the bias.
Then I expand out the usual way and the bound pops out. So again,
probably many of you have used these arguments; I don’t need to
elaborate.
One corollary of this is that Hoffman’s bound just pops out directly:
if your set A is stable then the number of edges within it is zero, so
you just ask what is the largest A I can push so that this lower bound
stays at most zero. And if you do the algebra it comes out to be
exactly lambda N over d plus lambda, okay.
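Written out, the calculation sketched above goes like this, with M the adjacency matrix (a reconstruction of the standard argument, in my notation):

```latex
\begin{align*}
2\,e(G[A]) \;=\; \chi_A^{\top} M \chi_A
  &= \frac{|A|^2}{N^2}\,\mathbf{1}^{\top} M \mathbf{1} \;+\; \psi^{\top} M \psi \\
  &\ge \frac{d\,|A|^2}{N} \;-\; \lambda \|\psi\|^2
   \;=\; \frac{d\,|A|^2}{N} \;-\; \lambda\,|A|\Bigl(1 - \frac{|A|}{N}\Bigr),
\end{align*}
```

where $\chi_A = \frac{|A|}{N}\mathbf{1} + \psi$ with $\psi \perp \mathbf{1}$, so that $\|\psi\|^2 = |A|(1 - |A|/N)$, and the cross terms vanish since $M\mathbf{1} = d\mathbf{1}$. For a stable set $e(G[A]) = 0$, so $d\,|A|/N \le \lambda(1 - |A|/N)$, which rearranges to Hoffman's bound $|A| \le \lambda N/(d + \lambda)$.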
Okay, so the useful corollary of this result for our purposes will be
the following. Take any set A of size epsilon N, where epsilon is some
small constant, plus some little term that depends on the spectrum;
think of that as a negligible term, because for us lambda is much
smaller than d. Then essentially, what the lemma says is that this set
will have a lot of edges; it behaves like a random set of size epsilon
N. In particular, G[A] has a vertex of degree at least epsilon d, and
this is what will be useful to us.
Okay, so here is how we do the encoding, and in this slide I describe
the idea. There is some stable set I in our graph and I want to encode
it somehow. One way to encode it is to just write it down exactly,
this vertex and this vertex and so on. That gives exactly the naive
bound we had, because the size can be up to alpha(G) and you write log
N bits for each vertex.
But here is a potentially more useful way to do this, and maybe this
picture helps; it’s not the most exact picture, but it’s a useful
analogy. Suppose there is this stable set that you would like to
encode using a small number of bits. Now suppose you could find a
small set S; this S is supposed to be very tiny, and formally you
should think of it as having this size. This set S is contained in my
stable set, but it has a very, very large neighborhood. If I could
find such a set, then you know that the stable set must lie in the
remaining region, because this set S rules out all of this red part:
none of it can lie in your stable set.
So for the remaining guys in A, I just need one bit each, zero or one,
for whether it is in my stable set or not. In other words, the stable
set is completely specified by this sort of seed set, plus which of the
guys in the remainder A lie in it. So the number of possibilities for
I, if you have such an encoding, is N choose the size of S, for all the
possible ways of choosing the seed set; but once I choose my seed it
fixes my A, and then I only need to specify which subset of A is there,
so two to the size of A combinations. This will be the idea: to count
via an encoding scheme with these properties. And if you just plug
these bounds for S and A in, it gives exactly our result.
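Spelling out the counting (a paraphrase of the slide in my notation, using the sizes quoted later in the talk: |S| is about (N/d) log d and |A| is about the Hoffman bound):

```latex
i(G) \;\le\; \binom{N}{\le |S|}\, 2^{|A|}, \qquad
\log_2 i(G) \;\le\; |S|\,\log_2\!\frac{eN}{|S|} \;+\; |A|
  \;\approx\; O\!\Bigl(\frac{N}{d}\,\log^2 d\Bigr) \;+\; \frac{\lambda N}{d+\lambda},
```

which is exactly the shape of the theorem: the Hoffman term in the exponent plus a lower-order additive term.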
So why should such an encoding scheme exist? Let me try to give a
sense; it’s almost an exact proof. Here is a way to encode the stable
set. So initially there is some stable set, given by these red
vertices, that I want to encode, and I don’t want to just list all
these vertices; that’s too many bits. So I will do the following: I
start with all my vertices and arrange them in decreasing order of
degree, so this is the maximum degree vertex, then the next one, and so
on, and ties I break in some fixed order.
And let’s consider the following procedure. I look at the first
vertex, and maybe I have an animation for this: if it’s not in my
stable set, like this blue one, I just discard it. So the first guy is
not in my stable set, I discard it; the second one I discard. Now this
guy is in my stable set, so I choose it. And if I choose this guy, it
rules out a lot of other guys: all its neighbors in the remainder are
ruled out. So I keep doing this. Now these guys are ruled out so I
have to remove them, and in this remainder graph maybe the ordering of
the vertices changes, because it’s by degree, and I keep continuing
this procedure until the number of remaining vertices is down to this
Hoffman bound on the stable set. Because beyond that I can’t guarantee
any edges anymore; maybe it’s a stable set. And note the procedure
never kills a vertex of my stable set when I kill something, because I
only kill neighbors of chosen vertices.
So the crucial observation is that at the end of the day, when I stop
with so many vertices left, I have picked some set of vertices S. And
one thing is that this S that I have picked, these red vertices,
completely determines which set is left over. Why? Because this was a
completely deterministic procedure, right? For example, in the very
beginning you knew the ordering of the vertices, and at the end of the
day you are left with some set S; you see that the earliest vertex of S
in this ordering is this one, so I must have rejected the two before
it. And you keep applying this argument. So if I tell you this set S,
it tells you exactly the remainder set at the end of the day.
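Here is a runnable Python sketch of this encode/decode procedure as I understand it from the slide (the graph representation, tie-breaking rule, threshold parameter, and the 5-cycle example are my own choices; the threshold stands in for the Hoffman bound):

```python
def _next_vertex(adj, remaining):
    # Deterministic rule: max degree in the induced subgraph, fixed tie-break.
    return max(remaining, key=lambda u: (len(adj[u] & remaining), u))

def encode(adj, I, threshold):
    """Encode a stable set I of the graph `adj` (dict: vertex -> set of neighbors)
    as a small seed S plus one bit per vertex of the leftover set A."""
    remaining, S = set(adj), []
    while len(remaining) > threshold:
        v = _next_vertex(adj, remaining)
        if v in I:
            S.append(v)                    # chosen: record it ...
            remaining -= adj[v] | {v}      # ... and kill it and all its neighbors
        else:
            remaining.discard(v)           # not in I: just discard it
    A = sorted(remaining)
    return S, [int(u in I) for u in A]     # I equals S plus the marked part of A

def decode(adj, S, bits, threshold):
    """Replay the deterministic procedure; S alone determines the leftover A."""
    S, remaining, chosen = set(S), set(adj), set()
    while len(remaining) > threshold:
        v = _next_vertex(adj, remaining)
        if v in S:
            chosen.add(v)
            remaining -= adj[v] | {v}
        else:
            remaining.discard(v)
    A = sorted(remaining)
    return chosen | {u for u, b in zip(A, bits) if b}

# Tiny example on a 5-cycle (vertices must be orderable for the tie-break):
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
S, bits = encode(adj, {0, 2}, threshold=2)
assert decode(adj, S, bits, threshold=2) == {0, 2}
```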
>>: [inaudible].
>> Nikhil Bansal: Kind of, yeah. So this is just a small seed that
sort of determines what’s left over; you can think of it like that.
And there is just some calculation to show the size bound. Okay, and
what is the key property? The point is that we always arrange the
remaining vertices in order of degree, and we had this spectral lower
bound which says that large sets have a lot of edges inside them. So
as long as the remainder is large, the highest degree vertex kills a
lot of vertices when I pick it. So you can do some math and show that,
because of this, the number of vertices you pick cannot be too large:
every time you pick something you kill a lot of guys, and there is only
so much you can kill. And that basically tells you that the set S you
pick has size roughly log d over d times the number of vertices.
So just to recap, maybe that was a little quick. The idea is that you
want to encode some stable set; you find a subset S which rules out a
lot of things and which is itself not too large. Your stable set can
then only live in this region A, and this A is completely determined by
the set S, so for A you only need a zero-one indicator per vertex and
not the whole log N bits.
All right. So this was the part about how to encode stable sets. Now
how do we extend it to matroids? Now we are trying to count general
matroids, not just sparse paving matroids, so the non-bases are not
necessarily stable sets in the Johnson graph; they could be spread all
over the place. Okay, so we will use one more idea, and the main idea
will be the following. Again, a matroid can be specified by saying
which of the R-sets are non-bases, but I will efficiently encode these
non-bases instead of listing each one out there. And here is the
picture to keep in mind.
So I will find a small set S, and if I think of the Johnson graph, this
region is the neighborhood of S. In the previous setting I could say
that none of my stable set lies in this neighborhood, so the stable set
can only lie out here. But now, because we don’t have a stable set,
maybe some guys from this neighborhood are also non-bases. And that’s
the whole complication. But what we will show, and this is the next
key lemma for matroids, is that all the non-bases in this neighborhood
of S can be specified by essentially just twice as many sets as S
itself. And it will be clear in a moment what I mean by this.
So even though your matroid could have lots and lots of non-bases here
in the neighborhood, there is a very small witness for them. Let me
explain what I mean. Once I prove this lemma, the encoding is clear,
because I just store my set S and the non-bases in A as before, and the
only difference is that for the non-bases in this neighborhood, which
could be many, I will just have a very small witness which tells me
which ones they are. So the total number of bits I use is not too
much: roughly three times the size of S times log N bits, plus the
zero-one bits for A.
Okay, now what do I mean, and what is the idea? So what do we want to
show? Lots of non-bases here can be specified by a small number of
sets, and the idea will be the following. If I look at any R-set and
look at its neighborhood in the Johnson graph, there are these n
squared over four neighbors, and maybe lots of these vertices are
non-bases, right. One way to specify them is to list each of them, but
we can actually give a very compact representation: instead of listing
all of those which are non-bases, I just describe two sets which encode
which of them are non-bases. And the reason is the following
structural property: if X is an R-set which is a non-base, and there
are many non-bases in its neighborhood, then there is some kind of
underlying reason for those guys to be non-bases.
So let’s look at the following example. Say you have a graph and the
matroid is the spanning tree matroid. A set like this is a non-base
because it contains this circuit. Now look at the possible neighbors
of this guy. Neighbors are just what you get when you drop one element
and add another. So this is a neighbor, this is a neighbor, and you
have other neighbors where you break this cycle at some other edge.
But there can be several neighbors like this which keep the cycle. Now
for all these neighbors I can just list the reason: there is a cycle
there. So instead of specifying each of them with a lot of bits, I can
say, “Here is a cycle, and anything that contains this cycle is a
non-base,” which is very concrete. Just this one cycle gives you the
whole information, instead of storing everything. So this is the
intuitive idea: there is an inherent reason why things in the
neighborhood of something are non-bases.
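A tiny Python illustration of this cycle-witness idea in the graphic matroid, using a union-find acyclicity test (the five-vertex example is my own; the talk only shows a picture):

```python
def has_cycle(edges):
    """Union-find test: does this edge set contain a circuit (cycle)?"""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return True                    # u and v already connected: cycle
        parent[ru] = rv
    return False

# Graphic matroid on vertices {a,...,e}, rank R = 4. The R-set X below is a
# non-base: it contains the circuit C = {ab, bc, ca}.
C = [("a", "b"), ("b", "c"), ("c", "a")]
X = C + [("d", "e")]
assert has_cycle(X)
# Every neighbor of X that keeps C (swap the edge de for anything else) is
# dependent too, and the single circuit C is the witness for all of them.
for y in [("a", "d"), ("b", "e"), ("c", "d")]:
    assert has_cycle(C + [y])
```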
Okay, so here is a way to make this whole thing precise, and this is
the only place where we use some properties of matroids; it’s just this
one slide. Let me go over it in case some of you are more familiar
with matroids. The key lemma is the following: if you have a dependent
R-set X, so it’s a non-base, then we can associate with it two sets
which are witnesses for all the non-bases in the neighborhood of X.
There could be up to n squared over four non-bases there, but I will
just exhibit two sets which witness all of them. So here is the whole
proof. Look at some neighbor Y of X. It has this form: you drop some
element x from X and add some element y.
And let’s look at two cases. Okay, so this is case one: suppose your X
had rank R minus two or smaller (it can’t have rank R, since it is
dependent). Then I can just take my witness to be the set X itself,
because what is Y? You remove one element of X and add at most one, so
Y can’t have rank more than R minus one. So X itself is a very
concrete witness for such guys: everything in the neighborhood is
dependent. This is the trivial case.
The more interesting case is when your X has rank R minus one. In that
case, recall the basic property that X contains a unique circuit;
spanning trees might be the easiest example to keep in mind. So there
is one unique cycle. And what I will show in a second is that the
witness associated with this X is this unique circuit together with
what’s called the closure of X. The closure of a set is just what you
get if you add all those extra elements which don’t increase the rank
of X.
Okay, so why is this witness enough? Here is the proof. Suppose there
are two sets, X and Y, which are non-bases, and Y is a neighbor of X,
so it looks like this. The first claim, and this is just a one line
calculation, is the following: X is a non-base of rank R minus one and
Y is also dependent, so either it must be that when you remove this
element x from X the rank falls to R minus two, and even when you add y
it cannot go back up, or it must be that X plus y has rank R minus one.
One of these two cases must hold, and you can show that by just
submodularity of the rank function.
Okay, good. So if the rank of X plus y is R minus one, this means y
lies in the closure of X, because X already had rank R minus one. So
if I give you the closure of X as a witness, just this one set tells
you about all these neighbors Y. And if the rank of X minus x was R
minus two, notice that R minus two is strictly less than the
cardinality of X minus x, which is R minus one. So X minus x must
contain some circuit. But that circuit also lies in X, and X has a
unique circuit because it has rank R minus one. So just specifying
this circuit tells you the reason why the rank dropped to R minus two:
any neighbor Y you form this way still contains the circuit, so it is
still dependent. So that’s basically it, right: for each case we can
give a concrete witness.
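In compact form, the case analysis reads as follows, writing rk for the rank function and cl for the closure (my restatement of the slide):

```latex
\text{If } \mathrm{rk}(X) = R - 1 \text{ and } Y = X - x + y \text{ is dependent, then by submodularity}
\quad \mathrm{rk}(X - x) = R - 2 \quad\text{or}\quad y \in \mathrm{cl}(X).
```

In the first case the unique circuit $C(X)$ survives in $X - x$, so every such $Y \supseteq C(X)$ is dependent; in the second, every $y \in \mathrm{cl}(X)$ yields a dependent neighbor. So the pair $(C(X), \mathrm{cl}(X))$ witnesses all the dependent neighbors of $X$.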
So just maybe to recap: we essentially run the stable set procedure as
before and find the sets S and A, but now we also store the witnesses
for the neighborhood of each X in S, which is not too much extra, and
that gives the efficient encoding. So to conclude, we showed this much
stronger upper bound on the number of matroids, but we still have this
one plus order one gap. It would be nice if it could be reduced to
little o of one, but there is sort of an inherent bottleneck, at least
for this approach, and probably for any approach based on sparse paving
matroids.
And the reason is that we don’t quite understand what the maximum
stable set in the Johnson graph looks like; this relates to what was
asked earlier. Again, Knuth gave this explicit construction of size N
over n, and the best upper bound we have is only two N over n. And as
long as this remains, you will always have two to the alpha(G) with
this factor two sticking up in your exponent, and it becomes the plus
one when you take two logs, giving that one plus o of one no matter how
clever you are about the other things.
So sort of one natural way to break this gap would be to understand
what the truth really is. This might be a hard question, I think, but
on this actual problem of stable sets in the Johnson graph there are
lots of papers in the coding literature. People have done simulations
up to n equals 24 or something; that is already a huge graph, so they
use a lot of symmetries and all kinds of things. And the simulations
seem to suggest the answer is closer to the lower bound, but it’s hard
to judge, because it’s something like 1.07 to 1.3 times N over n.
Presumably it’s converging to N over n, but who knows, right; these are
small numbers.
But yeah, so even if this is the right answer, one problem is that we
don’t know of any other general techniques to upper bound stable sets
other than these eigenvalue methods, or the [indiscernible] method,
which is also sort of similar. But maybe one could use a stronger
Lasserre hierarchy or something to upper bound it; that’s the only kind
of technique I know. And perhaps if you had a method for certifying
that the upper bound is actually this, then maybe one could combine it
with the matroid techniques on top, do the same thing we did, and get
the right bound on m_n, but that’s probably a long program.
Okay, so that’s it and thank you for your attention.
[clapping]
>>: So this would imply the number of independent sets [inaudible]?
>> Nikhil Bansal: Yeah, so assuming that, although we don’t know how
this machinery would look, you would have to kind of carry it out,
right. Because for us, we used the Hoffman bound of two N over n, and
then on top we built all this eigenvalue machinery, da, da, da, to
actually show that most stable sets come from just this guy; it’s not
like N choose this, but more like two to the two N over n.
>>: Did you ever use the structure of [inaudible]?
>> Nikhil Bansal: So we used eigenvalue arguments in that to show such
a thing. So this was the --.
>>: [inaudible].
>> Nikhil Bansal: But hopefully, yeah, so that would be one piece one
has to do.
>>: No, but exactly what did you [inaudible]?
>> Nikhil Bansal: Yeah, exactly. And actually the theorem we can show
is only this. And the reason this is good is that it is like two to
the alpha(G), but maybe alpha(G) itself is tighter than this bound;
it’s not clear whether there is technology to push that. But
presumably it should work as a black box reduction, if that’s the right
thing.
Is there anything else?
>> Konstantin Makarychev: Any more questions?
>>: [inaudible].
>> Nikhil Bansal: Yeah, the [indiscernible] function exactly gives you
the Hoffman bound. So there was a paper from [indiscernible] back in
the 70s and all these things. Yeah, so all these things, the theta
function, eigenvalues, they are apparently all the same. I don’t quite
understand why, but, so.
>> Konstantin Makarychev: Let’s thank our speaker again.
[clapping]