
>> Yuval Peres: Okay. Good morning, everyone. So actually in algorithms, finding
dense subgraphs is one of my own favorite topics. So I was delighted to see the title of
Aditya's talk. Please.
>> Aditya Bhaskara: Thank you. Thanks a lot, Yuval. So I'll be talking about finding
dense subgraphs and some other problems that are related to it. And, okay, so the
general theme of what I'm going to be talking about is that you're given a huge graph, which could
arise out of a variety of situations, and we want to find in it subgraphs which are dense.
And in a vague sense, this has applications to a lot of things. For instance, detecting
communities in social networks is a natural application. It's also been used for
detecting what's called link spam in Web graphs.
And of course it has a clustering feel to it. So it's used in a lot of clustering type
problems. So let me describe some of these a little bit before we move to the actual
problem.
The first example I consider is social networks. We have these graphs that arise
from friendships between people and things like this, and usually finding dense subgraphs
here means finding communities, say people who belong to an institution and things like this.
They usually have many more edges inside the community than they do to the outside.
So this is one kind of example where you would want to find dense subgraphs, and finding
such things is useful because maybe they all share some common feature that you'd like to exploit.
The second example I spoke of is what's called detecting link spam. This has been used
by some researchers to try to counter the problem where people keep linking to each
other's pages just so that their PageRank improves.
So one way of detecting this is to find small sets of vertices that have too many edges
among them. So then they're probably just trying to fool the system.
One thing about this example is that it's different from the earlier one in that you really
want the sets to be small. Maybe there are big organizations where pages legitimately
link to each other, and you don't want to find those.
Okay. So how do we formalize such a question? Here's a first cut: a problem that has been
called maximum density subgraph. You're given a graph and you want to find a subgraph H that
maximizes the ratio of the number of edges to the number of vertices. This ratio will be called
the density throughout this talk, and we want to maximize it.
And it turns out that this can be solved efficiently: in any graph you can find a subgraph
that maximizes this quantity. This has been known since the '80s, and I can give you the
reference; it's a flow-based algorithm, and there are also very fast algorithms with, say,
approximation ratio two for this.
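As a concrete illustration of the fast approximate approach (a hedged sketch, not the flow-based exact algorithm the speaker refers to), the standard greedy "peeling" heuristic repeatedly removes a minimum-degree vertex and remembers the densest intermediate subgraph; it is known to give a factor-2 approximation to the maximum density:

```python
# Greedy peeling sketch for maximum density subgraph (density = |edges| / |vertices|).
# Repeatedly remove a minimum-degree vertex and keep the best intermediate subgraph;
# this is known to give a factor-2 approximation (illustrative, not the exact algorithm).
def peel_densest(adj):
    """adj: dict mapping vertex -> set of neighbours (undirected, no self-loops)."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    alive = set(adj)
    edges = sum(deg.values()) // 2
    best_density, best_set = 0.0, set(alive)
    while alive:
        density = edges / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)
        v = min(alive, key=deg.get)        # peel a current minimum-degree vertex
        alive.remove(v)
        edges -= deg[v]                    # deg[v] counts only still-alive neighbours
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
    return best_density, best_set
```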
But note that there's no restriction on size here, which turns out to matter; people actually
care about it. There are applications in which we don't want to output subgraphs that are
too big.
One way of capturing this is to have an explicit restriction on the size. So the problem I'm
going to consider in more detail is one where you want to find a subgraph H on at most K
vertices, and you want the number of edges to be as large as possible.
This is a very natural problem, and it's what's called the densest k-subgraph problem.
As you can see, the key here is that it's a natural optimization problem with these
small-support constraints: you want only K vertices.
And it turns out to be a general challenge to handle constraints of this nature.
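For concreteness, the problem just described can be written as follows (simply restating the definition in the talk):

```latex
\textbf{Densest $k$-subgraph.} Given $G=(V,E)$ and an integer $k$:
\[
  \text{maximize } |E(S)| \quad \text{subject to } S \subseteq V,\ |S| \le k,
  \qquad \text{where } E(S) = \{\,\{u,v\}\in E : u,v \in S\,\},
\]
and the density of the returned set is $|E(S)|/|S|$.
```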
Okay. So so far we've seen practical motivations. So what about motivations inside
theory?
So here's a problem that has recently been studied in connection with the unique games
conjecture; it's what's called the small set expansion problem.
It's really simple to state. You're given a graph which has average degree D, let's say;
in fact, let's assume it's regular with degree D.
And you want to distinguish between two cases. The first is when there is a small set,
let's say of size delta N, where delta is given to you, in which almost all the edges stay
inside: say 99 percent of the edges incident to this set stay inside it. In the second case
you're promised that for any set H of size at most delta N, almost all edges go out.
So this is a promise problem, and you want to distinguish between these two cases.
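To make the two cases concrete, here is a small illustrative helper (names and thresholds are only for illustration) that measures what fraction of the edges incident to a set S stay inside S:

```python
# Illustrative helper for the small set expansion promise problem: what fraction of the
# edges incident to S have both endpoints inside S?
def fraction_inside(adj, S):
    """adj: dict vertex -> set of neighbours; S: a set of vertices."""
    S = set(S)
    twice_inside = sum(1 for v in S for u in adj[v] if u in S)  # each internal edge counted twice
    endpoints = sum(len(adj[v]) for v in S)                     # total edge endpoints in S
    inside_edges = twice_inside // 2
    cut_edges = endpoints - twice_inside
    total = inside_edges + cut_edges
    return inside_edges / total if total else 0.0

# In the "yes" case there is some S of size delta*n with this fraction close to 1
# (e.g. >= 0.99); in the "no" case it is close to 0 for every such S.
```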
And it turns out that we don't know how to do this. The small set expansion conjecture
says that it's hard to distinguish between these two, and it turns out to imply the
unique games conjecture, which is something that everyone is interested in.
But as I stated it, it's easy to see that it's actually a special case of the densest k-subgraph
problem: if you can solve densest k-subgraph, you can solve this efficiently.
So in this sense it's easier than densest subgraph. But I'll try to convince you during
the talk that densest subgraph is a much harder problem than this.
So what do we know about densest subgraph? Let me point out the negative results first.
We know it's NP-hard, because it generalizes [inaudible] trivially. It's also been shown that
there is no polynomial time approximation scheme, so you cannot get a one plus epsilon
approximation for every epsilon.
And further it was proved that if you assume something more than NP-hardness, what's
called the random 3-SAT assumption, then you cannot approximate it to a factor better than 1.5.
The first is Erdos-Wash code [phonetic] and the other one is [inaudible]. So what do we know
with respect to algorithms? The best known algorithm before what I'm going to describe gives
something just close to an N to the one-third factor approximation, due to Feige, Kortsarz,
and Peleg about 15 years ago. Now, notice the big gap between the two results.
Right? On one hand we can rule out a factor 1.5 approximation, even assuming some
fairly fancy hypothesis. On the other hand we know something like N to the one-third
approximation.
And this is true for a reasonably natural problem. So that's what we know. Then what do we
conjecture is true about this problem? We conjecture that it's actually hard to approximate
to some factor N to the C; we think the algorithms are actually tight and the machinery to
prove hardness is just not as well developed.
And in fact one of the messages in this talk will be that getting a factor of N to the
one-fourth is a difficult challenge. There's a natural barrier at N to the one-fourth, is
what I want to say.
Okay. And the other point is that this is a problem which even seems to be hard on
average. So I'll describe a distribution over inputs, over which we don't know how to
solve this problem. And so all the known algorithmic tools fail. And it turns out that
people have even used this as a complexity assumption.
So they've assumed that if densest subgraph is hard, then you can show some other things.
In general, a problem being hard on average is something that algorithms people feel sad
about, but there are some people who feel happy about it, and it turns out that
cryptographers are one such bunch.
And so this -- so there's a public key crypto system which is based on the average case
hardness of densest subgraph. So how does this -- what's the rough idea? So the
crypto system uses a bipartite graph in which you have some edges going across and
somehow the dense subgraph hides the private key. You can think of the graph as being
a public key. Everyone knows that.
But unless you know the dense part, you can't quite decode it. I'm lying a lot here, but
that's the rough idea. Okay. The other paper which uses this is about pricing financial
derivatives. This is recent work by [inaudible] and a few others at Princeton.
So the setting is the following: you have a bunch of assets, and the idea with these
derivatives is that you bundle the assets together and form derivatives out of them.
The problem is that if the person doing the bundling knows which assets are bad, they can
gain by bundling the bad assets together, while saying, look, I bundled them randomly.
If densest subgraph is hard, then it's hard to tell between the two. So he'll tell you,
this is how I bundled the assets and they look random to me, and you can't contradict him.
So this is one other application of
hardness. Okay. So here's a rough outline of the talk. I'm going to give an N to the
one-fourth approximation for the problem. And the key thing about this is not the
improvement in the factor but the fact that we somehow have an average case version
which we can solve and using the ideas from there we can move to the worst case.
So that's something that I think is interesting about this. And okay as I said, we don't
know how to prove hardness for this problem. So one thing we could ask is, okay, so we
know all these techniques, like linear and semidefinite program hierarchies. So do they
at least say something?
And can we say that these things don't work? So that's roughly what I'll speak about.
And I'll also talk about other continuous relaxations that you can write. I'll get to what
these mean in a second.
But, okay, so the last two parts will be a little speculative, with a lot of hand waving
involved. And at this point I should mention my co-authors on a lot of these works:
Moses Charikar, [inaudible], Eden Chlamtac, Uriel Feige, Aravindan Vijayaraghavan, who is
also a fellow student at Princeton, and Venkatesan Guruswami, and [inaudible] too.
But what I want you to remember about this problem is the significance of the average case;
somehow it seems to be the key to understanding the problem. Okay. So without further ado,
let's go into that.
So first consider this simple problem on random graphs. I want you to distinguish between
two classes of graphs. In one, I just give you a random graph G(n, p) with degree N to the
one-third. In the other case, I take G(n, p) with the same probability, and then I take a
bunch of vertices, let's say square root N vertices, and inside them I add something like
N to the one-fourth edges.
Sorry, N to the one-fourth times its size edges. I make the planted degree N to the
one-fourth so that you've actually done something nontrivial: note that in the plain
G(n, p) case, any K-subset has very small density.
Okay. Now, here it looks like a big gap: here any K-subgraph has very small density, and
here the degree is N to the one-fourth. Now, the question is how do you --
>>: Degree, you mean average degree?
>> Aditya Bhaskara: Yes, yes. Let's say I make every degree N --
>>: On the left when you say -- okay.
>> Aditya Bhaskara: Oh, yes, yes. I mean the average degree, edges divided by the number of
vertices. Yes. So how do we tell between these two? One way is -- okay, let's think about
this slightly simpler question.
So I have two random graphs, both on N vertices. In one you have degree N to the one-third;
in the other you have degree N to the one-half. How do you tell? You could look at the
degree and obvious things like that, but I want something more. The question I ask is:
suppose you pick two vertices U and V, how many length-three paths are there between them?
In the first case you can do a simple calculation to show that for any fixed U and V, the
number of length-three paths will only be O(log N) with high probability.
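A minimal sketch of this counting statistic, assuming a numpy adjacency matrix (the threshold value is illustrative): it computes, for every pair, the number of simple length-three paths by correcting the walk count from the cube of the adjacency matrix.

```python
import numpy as np

def length3_path_counts(A):
    """A: n x n 0/1 symmetric numpy array with zero diagonal.
    Returns P with P[u, v] = number of simple length-3 paths u-a-b-v."""
    deg = A.sum(axis=1)
    walks = A @ A @ A                              # length-3 walks
    # remove degenerate walks u-v-b-v and u-a-u-v (these exist only when u ~ v)
    correction = A * (deg[:, None] + deg[None, :] - 1)
    P = walks - correction
    np.fill_diagonal(P, 0)
    return P

def looks_planted(A, threshold):
    # threshold ~ (log n)^2 is illustrative: in G(n, p) with average degree n^{1/3},
    # every pair has only O(log n) length-3 paths w.h.p., while a planted dense part
    # forces some pair with polynomially many.
    return length3_path_counts(A).max() > threshold
```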
While in the second case, in general the number will be something like square root N. In
particular, there exists at least one pair where the number of length-three paths is root N.
Simple calculation. Now, how does this fit in? We have a big graph of degree N to the
one-third, inside which there is something planted.
The key is to observe that this planted graph H, which was a random graph with that degree,
is like a mini copy: it's a graph on K vertices and its degree is K to the half.
So this is kind of like the second example here, except with K instead of N. And luckily for
us, the number of length-three paths is K to the half, which is still much larger than poly
log N.
So it's important that in this test we got log N as the answer here, and some polynomial in
N as the answer here.
So a simple test to distinguish between the two is to check whether there exist some U and V
with log squared N length-three paths between them: in the first case there will be none,
and here, for pairs inside H, there will always be some --
>>: The edges?
>> Aditya Bhaskara: Yes, three edges, four vertices, including U and V. All right. So this
is a simple test. Okay. And, yeah, you can also check that we don't really need H to be
random: because the test we're doing is whether there exist two vertices with many
length-three paths, it turns out by a counting argument that as long as you have average
degree K to the half inside this K-subgraph, there always exists some pair with about K to
the half such paths.
>>: What do you mean the minimum degree to that [inaudible].
>> Aditya Bhaskara: In this case I want the average degree to be that. So we really
need the average degree. But you can think of the minimum degree. It's fine. All right.
So this is the simple planted version. As you can see, this exponent of K is what mattered
here, and that's what we'll make formal in our definition. We define the log density of a
graph to be the log of the average degree divided by the log of the number of vertices.
So if you have a graph on N vertices, you say its log density is rho if the average degree
is N to the rho. And okay, the general result that we'll try to prove is that if in one case
you have a random graph G(n, p) with log density some rational number R over S, and in the
other case you have a random graph with something planted inside whose log density is
slightly more, say R over S plus epsilon, then you can distinguish between these two.
Note that in the previous example it was one-third and one-half.
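As a small aside, the definition just given is essentially a one-liner (illustrative helper):

```python
import math

def log_density(num_vertices, num_edges):
    # log density rho: the average degree equals num_vertices ** rho
    avg_degree = 2 * num_edges / num_vertices
    return math.log(avg_degree) / math.log(num_vertices)

# e.g. a graph on n vertices with average degree n^(1/3) has log density 1/3, while the
# planted part on K vertices with average degree K^(1/2) has log density 1/2.
```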
Right. So in this case, what we could show was that you look for pairs with many
length-three paths: if the planted log density is strictly more than one-third, then you can
distinguish. So one-third was kind of a threshold here; anything more than one-third you can
distinguish, because there will be many length-three paths. Now, the natural question is:
does this threshold phenomenon work for any rho? That will be our claim. Another example
I'll give is log density 3/5. This means the degree is N to the 3/5, and it turns out the
right things to count change. Earlier we were looking at two vertices and counting
length-three paths; now the right thing to do is to look at 4-tuples of vertices
and look at structures of this kind.
So you fix U, V, W and X. You look at how many such structures there are in the graph.
So this is one and maybe this is another and so on.
So you count these structures. And if log density is strictly more than 3/5, then there will
be many such structures. And in general, what we do is count appropriate tree-like
structures; note that for 3/5 we had this particular structure. We count the number that are
supported on a fixed set of leaves. By supported, I just mean that these are the leaves and
there exists some tree of this kind in the graph.
And we pick the structures such that the expected number for any set of leaves is roughly
constant, so they are chosen carefully. But it also turns out we need some kind of
concentration result, because we want to be sure that for any set of leaves this bound is
true: that the number of structures is not much more than poly log N. So this means you
can't choose just any tree; you have to be slightly more careful.
Okay. So in general, what kind of trees work? It turns out that we can take inspiration from
[inaudible], and what are known as caterpillar graphs turn out to work.
Caterpillars are graphs that roughly look like this. I won't describe how they look for
general R over S, but roughly speaking the one for R over S has R plus 1 leaves and a
certain total number of vertices; I won't say more than this. And the properties we will
have are these: if you look at G(n, p) with log density strictly smaller than R over S and
you pick any (R plus 1)-tuple, there will only be poly log N such caterpillars on it. And if
you pick any graph with log density some epsilon more than R over S, then there will exist a
leaf tuple with more than K to the epsilon caterpillars. So this will be the distinguishing
test.
And this leads us to the claim that I stated, the distinguishing claim. So now let me give
an example of how we prove a statement of the first kind: I'll take one caterpillar and show
how we prove properties like this.
It's a simple probability calculation, and I'll just do it for one value of the parameter.
Let's think of R over S being two-fifths; in this case it turns out the caterpillar is like
this. We fix U, V, W, and we want to find how many such structures there are with U, V, W
as the leaves.
And the idea is to bound the number of candidates for each of the internal nodes, by which
I mean A, B and C, because if the number at each is at most about log N, the total number of
structures can only be poly log N.
Okay. So let's look at some node, say C. Now, how do we bound the number of candidates? The
candidates for A have to be neighbors of U, so there are only about N to the 2/5 of them,
since the degree is N to the 2/5. Once you fix U, the candidates for B are two-level
neighbors of U, but they also all have to be neighbors of the fixed vertex V. So the number
of such candidates is only P times the number of two-level neighbors, because you need that
edge to be present, and the number of candidates for B at this point is N to the one-fifth.
Then the number of candidate Cs, using only this information, is N to the 3/5, because
they're all neighbors of this N-to-the-one-fifth-sized set; but of those you only want the
ones that are adjacent to W.
And by the way we chose the caterpillar, it turns out that P times N to the 3/5 equals
exactly 1. You can make each of these statements a high-probability statement, saying that
these neighborhood counts are concentrated around their expected values.
And at the end you get a statement of this kind: the probability that the number of
candidates for B, or for C, is more than 10 log N is polynomially small in N. So by a union
bound this holds for every choice of U, V, W, and this finishes the proof.
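Restating the calculation just sketched in one place, with p = n^{-3/5} (so the average degree is n^{2/5}) and the internal nodes processed left to right (this follows the numbers quoted above; the exact caterpillar in the paper may be drawn differently):

```latex
\begin{align*}
  \#\,\mathrm{cand}(A) &\approx p\,n = n^{2/5}
      && \text{(neighbours of } U\text{)}\\
  \#\,\mathrm{cand}(B) &\approx \#\,\mathrm{cand}(A)\cdot p\,n \cdot p
      = n^{2/5}\cdot n^{2/5}\cdot n^{-3/5} = n^{1/5}
      && \text{(two steps from } U\text{, adjacent to } V\text{)}\\
  \#\,\mathrm{cand}(C) &\approx \#\,\mathrm{cand}(B)\cdot p\,n \cdot p
      = n^{1/5}\cdot n^{2/5}\cdot n^{-3/5} = 1
      && \text{(one more step, adjacent to } W\text{)}
\end{align*}
```

Each of these counts is concentrated up to polylogarithmic factors with high probability, which is what the union bound over all leaf choices uses.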
Okay. So we can do this for any caterpillar R over S, and this is the theorem we get: if you
have something planted with log density slightly more, you can distinguish in time roughly
N to the 1 over epsilon.
Okay. So now what about arbitrary graphs? That was all in the random case; what can we do
for arbitrary graphs? We wanted a rho-factor approximation to densest subgraph; that was our
stated goal. By that we mean the following. Let me fix some parameters here.
Let N be the number of vertices in the full graph, and I'll denote by capital D its degree;
for now I'll think of the graph as regular with degree D, and we can easily reduce to this
case.
And H is the optimum for this problem: it has K vertices and degree little d. So just
remember N and capital D for the full graph, while the optimal solution has K and little d.
H is what I will call the optimum.
And the aim will be to find an H prime with average degree at least little d over rho; that
is the definition of a rho-factor approximation. Now, there are some things we can assume
without loss of generality. We can think of G as being bipartite: we're interested in
densities only up to constants, since we're after an N to the one-fourth approximation and
constants don't really matter, so we can assume the graph is bipartite.
We can also think of the full graph as being D-regular. Another thing we can assume is that
the optimum graph H has minimum degree little d, instead of average degree little d; we can
do the usual thing of removing vertices of degree smaller than d over 2 and so on.
We are also allowed to return a subgraph which is of size smaller than K; that's because we
can just remove it and recurse.
And this assumption is nice because we'll keep looking at sets of vertices and their
neighborhoods, and it's nice if they don't intersect. Another crucial assumption, which I'll
not justify, is that the only interesting case is when K times capital D is N.
Remember, K was a parameter given in the problem; it's the size of the subgraph that you
want. This is the only interesting case from the point of view of the approximation; the
other cases can only give a better factor.
So this I will not justify, but one implication of it is that if you pick a random set of K
vertices, then it has density around a constant, and our aim will be to beat this constant
by some factor.
Okay, so this is an assumption we can make. And now let me state the theorem. Suppose we
have N and D, and let's think of capital D as N to the rho for some rho. The theorem says
that if little d is some C times K to the rho, then I can find a subgraph, on a small number
of vertices, with average degree at least C. So C is the factor by which little d exceeds
K to the rho.
If little d is smaller than this, then there are no guarantees; I'll just return some
arbitrary matching or something like that.
Note that the approximation ratio here is K to the rho, because the optimum has degree C
times K to the rho and you're returning C. And by the way we chose the parameters, K times D
was N, so the approximation ratio works out to N to the rho times (1 minus rho), which is
never worse than N to the 1/4. Well, at this point there's no real surprise that the proof
will involve caterpillars, because as I said it's very much inspired by the random case.
And it will involve the caterpillars that correspond to rho. Now rho need not be a rational
number, which is why we get N to the 1/4 plus epsilon here; it's a technicality, so let's
pretend rho is a rational number.
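Spelling out the arithmetic behind the approximation ratio mentioned above:

```latex
\[
  K \cdot D = N,\quad D = N^{\rho}
  \;\Longrightarrow\; K = N^{1-\rho}
  \;\Longrightarrow\; \text{ratio} \approx K^{\rho} = N^{\rho(1-\rho)} \le N^{1/4},
\]
since $\rho(1-\rho)$ is maximized at $\rho = 1/2$.
```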
So a simple observation: we want to return a subgraph on only K vertices, but it turns out
it's fine even if we return a bipartite subgraph of the right density, one whose number of
edges is at least C times (|A| plus |B|), that is, C times the number of vertices, as long
as one of the sides has size smaller than K. Once you have this, it turns out you can easily
get to both sides being at most K. So this is what we'll do. Okay. So now I'll prove the
theorem that I stated, in the case of one example: R over S being one-third.
This will be a little technical. But, as is often said, a talk is supposed to have one joke
and one proof, and it's good if they're not the same. So I think this will be the proof, and
hopefully there will be other jokes.
Okay. So, all right, R over S is one-third. The setting we have is: the degree of the full
graph is N to the one-third, little d is C times K to the one-third, and K is N to the
two-thirds, because we set K times D to be N.
So this is the setting. And now the idea is the following. Let's start with some vertex U
and look at its set of neighbors.
We have the following: we know that the set of neighbors has size at most D, because the
graph has degree D. And we know that if U was inside H, then since the minimum degree in H
is at least little d, the intersection of the neighborhood of U with H has size at least
little d.
So let's pretend that we guess some U which is inside H, for the purposes of this argument.
Then we have these two bounds. Now I claim that if we look at the neighbors of neighbors of
U, then we have a fairly strong guarantee on the size of S, its intersection with H.
What do we know about the neighbors of neighbors? We know the set has size at most D
squared, because there are D neighbors and they can expand to D squared. And I claim that if
you look at its intersection with H, the optimum, then it has size at least little d squared
over C.
That is, there was this set R, the neighborhood of U inside H, and I want to say that R
almost expands fully in H: inside H the degree is at least little d, so the most R could
have expanded to is something like |R| times d, and I want to say it's close to that.
All right, so why is that true? This is just a copy of the claim. Suppose it is not true,
and look at the subgraph formed by R and S. That subgraph would have a ratio of edges to
vertices of at least C: we assumed S was small, smaller than |R| times d over C, and the
number of edges between the two is at least d times |R|, because the minimum degree inside H
is little d.
So we would have a subgraph of density at least C sitting inside. But how do we find it? We
can solve the max density subgraph problem that I defined earlier; I said it can be solved
in poly time.
The problem there was that it could give graphs with many more than K vertices. But that's
okay here, because K is N to the two-thirds, and the best you can come up with from here is
something with only about N to the one-third vertices on one side. And that's fine.
So if this bound did not hold, then we are done. And this will be the style of the argument:
we'll have some inequalities of this kind, and we'll say that if those bounds do not hold,
then some local solve like this will actually give us a dense subgraph.
So we can continue this line of reasoning. Now work with S and look at its set of neighbors;
if you look at T, the intersection of the neighbors of S with H, H being the optimum
subgraph, then it has to have size at least something like this.
If you work out the parameters, the size of T should be at least little d cubed over C
squared, and T is the intersection of this third neighborhood with H.
Now, if you track the parameters carefully, this can't happen: the size of H is K, while d
cubed over C squared is more than C times K, and T, being the intersection of some set with
H, can't have size more than K. So this means that at some step you must have found a
C-dense subgraph: either one of the bounds fails, in which case you will have found a dense
subgraph, or you get a contradiction.
And all right. So this finishes the proof.
>>: You want the smallest side is obviously smaller than --
>> Aditya Bhaskara: Yes. Right. That's crucial. So we should note that the second side
is -- that one of the sides is smaller than K. And this will hold because of our choice of
parameters.
So all the parameters should be chosen carefully, so that one of the sides will always be
kept smaller than K, and then we do this. Right. Okay. And even for arbitrary caterpillars
this will be the general outline: we'll always have some set of vertices W, we'll do some
simple procedure on it, and if that fails, then we'll come up with a good bound that holds.
And if we continue deriving these bounds, then eventually we get a contradiction, which
means that at some step you must have found a dense subgraph.
All right. So this finishes the algorithm for densest subgraph, or at least this is all I
will say about it. Okay, so now what about lift-and-project methods? I'll define them in a
minute. These are a class of techniques that have been studied pretty intensely of late.
Here's a standard recipe we use for solving problems: we write an integer program,
maximizing some objective subject to the variables being 0-1, and we relax it so that the
variables lie between 0 and 1 and can take fractional values.
If the integrality gap is small, that means the objective value is approximated reasonably
well. It turns out that for many problems the gaps are huge, and one way of dealing with
this is to add more constraints to the linear program.
There are systematic ways of adding constraints, which are called lifting operations. I
won't define them in general, but the idea is that you introduce variables that correspond
to pairs of variables, triples, and so on.
It turns out that the lifted program has an integrality gap that is strictly smaller than,
or at most as big as, that of the original linear program. And there are systematic ways of
writing more and more constraints: think of the relaxation as a polytope that you define,
and you repeatedly chop off parts of it.
There are ways of defining these so that if you do N steps, you actually end up with the
integer program, so all the corner points will be the actual integer points that you started
with. There are many systematic ways of doing this; the two commonly used ones are what are
called the Sherali-Adams hierarchy and the Lasserre hierarchy.
We won't need the details of these, but this is the general outline.
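As an illustration of the kind of starting relaxation being discussed (this is the standard natural LP for densest k-subgraph, not necessarily the exact relaxation used in the results described next), with x_i indicating whether vertex i is chosen and y_{ij} whether both endpoints of an edge are:

```latex
\[
  \max \sum_{\{i,j\}\in E} y_{ij}
  \quad \text{s.t.} \quad
  \sum_{i\in V} x_i \le k, \qquad
  y_{ij} \le x_i,\;\; y_{ij} \le x_j \;\; \forall \{i,j\}\in E, \qquad
  0 \le x_i,\, y_{ij} \le 1 .
\]
```

Lift-and-project hierarchies then add variables and constraints for larger subsets of vertices in a systematic way.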
Okay. And the crucial thing is that the Kth level in this hierarchy can be solved in time
roughly N to the K. That's why it makes sense to ask this question: after a small number of
rounds, do we have a very small integrality gap? Because if, say, after some K rounds we had
a very small integrality gap, then we could solve the problem roughly in time N to the K.
So we asked this question for densest subgraph, starting with the natural relaxation for it,
and it turns out that you can't do this. We showed that even after T rounds of
Sherali-Adams, where T is fairly big, something like log N, so that solving that level would
already need quasi-polynomial time, the integrality gap remains roughly N to the 1/4. N to
the 1/4 is what our algorithm achieves, and somehow we don't seem to be able to do better
than that. Also, the gap remains N to the epsilon even after close to a linear number of
rounds of the Lasserre hierarchy. The Lasserre hierarchy is something like the strongest SDP
hierarchy, and most of our algorithmic tools somehow seem to be captured by a small number
of rounds of it.
The fact that even so many rounds of it do not help is strong evidence that the problem is
actually hard; in fact, it shows that we need really new techniques if the problem can be
done better. Natural things like counting and spectral-based techniques don't work. And it
turns out that the gap instances are actually random graphs, which is interesting, because
it sort of reaffirms our belief that random instances are the difficult ones.
And then let me finish the densest subgraph part with this interesting question. Say we want
to distinguish between these two cases: in one you have a random graph of degree square root
N, and in the other you have a planted subgraph H inside with degree something like N to the
1/4 minus epsilon.
Note that in the random case, if you look at any root-N-sized set, its degree is only at
most about log N (so this should say log N). The question is: can you distinguish in poly
time? This we don't know, and this is roughly why the N to the 1/4 comes into the
approximation. If we could do this better, then we think we could solve the general problem
better. So, what about other continuous relaxations?
One such thing I'll talk about is what are called mixed norms. So let's get into that.
Consider this problem: given a graph, I want you to find a set of vertices S such that the
number of edges inside S, divided by the size of S raised to the one plus delta, is larger
than one.
This is something we just saw; it's like saying find an S of log density more than delta.
But you can rephrase it in another way: try to maximize this quantity. The numerator, X
transpose A X for a 0-1 vector X, measures how many edges there are inside S when X is the
indicator vector of S, and the denominator roughly measures the size of S to the 1 plus
delta. I wrote it this way so that it's invariant under scaling, although that doesn't
matter in this case.
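One plausible way to write the scale-invariant quantity being described (the exact normalization on the slide may differ): for the 0-1 indicator vector x of S,

```latex
\[
  \frac{x^{\top} A x}{\;\|x\|_{2/(1+\delta)}^{2}\;}
  \;=\;
  \frac{2\,e(S)}{|S|^{1+\delta}},
  \qquad\text{since } \|x\|_{r} = |S|^{1/r} \text{ when } x = \mathbf{1}_S \text{ and } r = \tfrac{2}{1+\delta}.
\]
```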
But it turns out that this kind of problem has roughly the same feel as problems of this
kind, which are much more well studied. X transpose A X has the same flavor as the squared
2-norm of A X; in fact the latter is exactly this quantity for the square of the matrix.
But anyway, let's not get into the details. These problems have roughly a similar feel, and
it turns out they are well studied; they're what are called hypercontractive norms. There
you maximize over all of R to the N, but maximizing over 0-1 vectors or over all of R to the
N doesn't matter much: they're the same up to roughly a log N factor.
And if you can show that this norm is small, where A is the adjacency matrix of a graph,
then you can certify that small sets in the graph actually expand.
It basically means that this quantity is small, which means that there are no dense portions
in the graph. Motivated by this, let me mention some work which is a bit different from what
I've been talking about; it's about computing norms of this kind, and it's one other
question that I've tried to study. You're given some matrix, and, a very natural problem,
you want to maximize the P norm of AX divided by the Q norm of X, and you want to find out
when you can do this.
Note that this generalizes the largest singular value, and also the well-studied
Grothendieck problem, which you get for certain values of P and Q. Okay. So the question is:
when can you approximate these?
It turns out this is very badly understood. Let me mention a bit about the one case we do
understand. In its full generality this is maximizing a convex objective over a convex set:
you want to maximize the P norm of AX subject to the Q norm of X being small, so it's
something we can't do by normal convex optimization. But there are some cases where we can
actually do it. This is joint work with Aravindan from last year: if P is smaller than Q and
the entries of A are all nonnegative, like the adjacency matrix of a graph or something,
then you can actually solve this.
It turns out that even though it's not a convex optimization problem, there are some
features of it that you can exploit. For instance, we can prove that the level sets of this
function are actually connected: they're not convex, but they are connected. And they also
have this nice property about local optima, so you can actually solve it.
And it turns out the solution is a fixed-point iteration, which we can show converges fast.
So that is when the entries of A are non-negative. We also know what happens if you allow
arbitrary entries: you can reduce from max cut and show that it's hard to approximate. The
factor looks complicated, but if the exponent had been one it would be polynomial in N, so
it's something that's almost polynomial. This is the kind of hardness you often get via
parallel repetition, if you know that technique.
So we know it's very hard if you allow arbitrary entries, and if you have non-negative
entries we can actually solve it.
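A minimal sketch of the kind of fixed-point, power-iteration-style scheme being described, for maximizing the p-norm of Ax over the q-norm of x when p < q and A is entrywise nonnegative; the update rule below comes from the first-order stationarity conditions and is an illustration under those assumptions, not a verbatim implementation of the algorithm in the paper:

```python
import numpy as np

def q_to_p_norm(A, p, q, iters=200, tol=1e-10):
    """Sketch: maximize ||A x||_p / ||x||_q for nonnegative A and 1 < p < q."""
    A = np.asarray(A, dtype=float)
    assert (A >= 0).all() and 1 < p < q, "sketch assumes nonnegative A and 1 < p < q"
    n = A.shape[1]
    x = np.full(n, n ** (-1.0 / q))       # uniform starting vector with unit q-norm
    val = 0.0
    for _ in range(iters):
        y = A @ x
        grad = A.T @ (y ** (p - 1))        # gradient direction of ||Ax||_p^p
        x = grad ** (1.0 / (q - 1))        # invert the q-norm stationarity map
        x /= np.linalg.norm(x, q)          # renormalize to unit q-norm
        new_val = np.linalg.norm(A @ x, p)
        if abs(new_val - val) < tol:
            val = new_val
            break
        val = new_val
    return val, x

# (With p = q = 2 the same update would reduce to ordinary power iteration for the top
# singular value; here we keep to the p < q regime discussed in the talk.)
```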
Okay. All right. So what are some directions for the future? We would ideally like to
understand the complexity of this sparsity constraint, because it somehow seems to arise
everywhere. It's not just small subgraphs in graphs; there are applications like compressed
sensing, where you want to show that some matrix has, for instance, the RIP property, where
for sparse vectors it acts like an isometry, and things like this.
It's also related to hypercontractive norms, as I was saying, and it would be nice to have a
better understanding of these; for instance, it's not known whether computing these norms is
hard, and that is something that would be nice to know.
The other feature of our algorithm was that we did something in an average case and then
could carry the techniques over to the worst case. So is this kind of paradigm useful in any
other context? That would be nice to find out.
And another thing is that you can try to relate these things to more exotic problems, like
there's this question of tensor maximization that a bunch of people have studied, it also
appears elsewhere. So it would be nice to relate to this.
Okay, since it's kind of an [inaudible] talk, let me briefly mention other things I think
about. One of the questions I was talking about a little earlier is the question of
eigenvectors of G(n, p), when p is reasonably big.
We know a lot about eigenvalues, like the Wigner semicircle law and so on, but people don't
know much about eigenvectors: for instance, whether the entries are well spread out and
things like this. We've tried to take some basic steps on this, and we can do it for some
very simple cases. It's joint work with Sandeep, and it's kind of a foray into this, but we
only have preliminary results. I'll be happy to talk about it.
Okay. And another thing that I've recently worked on is using some tools from convex
geometry to study a question that arises in differential privacy.
Basically, I used to think of differential privacy as something mysterious, but this is one
question I know something about now. What do I mean by unconditional here? There's something
called the hyperplane conjecture in convex geometry. There was a mechanism for privately
answering certain queries which was shown to be good if this conjecture holds, and we showed
that you can find a different mechanism which avoids depending on this conjecture.
So that is work that I did with [inaudible]. Actually, you guys must have seen them
yesterday. But okay. Thanks a lot. [applause].
>> Yuval Peres: Questions?
>> Aditya Bhaskara: Questions?
>>: So you call these caterpillars -- is that necessary, or could you possibly use some
other structures?
>> Aditya Bhaskara: Yes, for the random case you can actually use other things. Suppose you
have a random graph of some bigger log density inside another random graph. You can count a
lot of things, like four-cycles and things like that, and they all have thresholds at which
they start appearing.
So you can do those. But somehow caterpillars, we found, are very useful in the general
case; with those other structures we don't know how to handle it. Also, there was this other
good thing about using trees, which is that we could show that the planted thing need not
actually be random: it could be any subgraph that has log density strictly more than the
original graph.
That works only when it's a tree, because we use this counting argument, and then you get
that there has to be some leaf tuple on which there are lots of structures. That's the
reason we use caterpillars.
>>: That's the reason to use trees. But at most -- not the most general trees.
>> Aditya Bhaskara: Right. But, actually, I guess the fair answer is there might be other
trees. But these were the simplest we could find.
>>: [inaudible].
>> Aditya Bhaskara: Sorry?
>>: Could it perhaps improve the -- if you try to distinguish between two particular powers
then there might be a better tree.
>> Aditya Bhaskara: Yes, that would be nice if --
>>: Complexity.
>> Aditya Bhaskara: Right. Right now we pick some rational number between delta and delta
plus epsilon, and we only have this N to the 1 over epsilon running time. You could hope to
bring it down to something better. That would be nice, but I don't know how to do that.
>>: All trees of certain size.
>>: I guess it would be parametrized by R and S or something like that. So R over S: you
need R leaves and S total nodes. That's the caterpillar --
>> Aditya Bhaskara: Right.
>>: I guess if you pick with any tree with these two things it might work, no?
>> Aditya Bhaskara: If you picked -- no, no, it doesn't, because you need this concentration
bound that I was mentioning. If I can use the board, I can draw one figure. Suppose I picked
trees that are kind of lopsided; this one will look obviously lopsided, but it's an example.
So let's say I pick a tree like this. I claim that this is a wrong tree, while this is the
right one. The point is that if you pick any U, V, W, X -- and this is for log density 3/5,
the example I showed, so P is N to the minus 2/5 because the degree is N to the 3/5 -- then
what happens is that there will be some 4-tuples U, V, W, X for which there are lots of such
trees. In expectation it's still true that the number of trees for any U, V, W, X is only a
constant, but whenever these three guys have a common neighbor, there will be lots of trees.
And because the test we are using is whether there exists some 4-tuple with lots of trees,
we can't use things like this. Somehow we need all these numbers to be in the right range;
in this case the problem is that the expected number of candidates for this node is smaller
than one, and whenever it is nonzero, whenever there is some common neighbor, it turns out
there are a lot more.
>>: [inaudible].
>> Aditya Bhaskara: Right.
>>: Need some balancing.
>> Aditya Bhaskara: Yes. There are still --
>>: Balanced binary tree.
>> Aditya Bhaskara: Right. I think they will work --
>>: R and S liquid depending on R and S [phonetic].
>> Aditya Bhaskara: I think as long as you ensure, when you do the calculation and find the
expected number of candidates for each node, that the number is between 1 and N, so it's not
less than 1 or something, then I think it will work.
>>: The question is, would it go beyond what [inaudible] -- get an improvement from it?
>> Aditya Bhaskara: Yeah, not that I know of. Sorry.
>> Yuval Peres: Okay.
>> Aditya Bhaskara: That would be very nice. Thank you.
[applause]