>> Yuval Peres: Hello everybody. Our speaker today is Nikhil Bansal from IBM Research. Nikhil finished his Ph.D. in 2003. He got it from CMU, working with Avrim Blum. Since then he has been at IBM Research, mainly working on various areas in the theory of algorithms: online algorithms, approximation algorithms and other topics. So he'll talk today on better multiple intents re-ranking.
>> Nikhil Bansal: Thanks for introducing me and thanks to Microsoft for inviting me.
So the title of my talk is better multiple intents re-ranking, and I chose this topic because it's sort of a follow-up on what [inaudible], who has been visiting here for the past two or three years, was doing. I thought this would be interesting to people here.
And this is joint work with Anupam Gupta from CMU and his student Ravishankar Krishnaswamy.
So let me begin by defining what this problem of multiple intents re-ranking is. This problem was originally defined by Azar, Gamzu and Yin in a STOC 2009 paper, so it's quite recent. The problem is very simple, and it's the following: You are given a universe of elements, which I will denote 1 to n, and you're given a collection of sets on this universe, denoted S_1 through S_m. For each set S there is a covering requirement, an integer which I will denote by K_S. So this is your input. The problem is to find an ordering of the universe that minimizes the total cover time of all the sets, where a set S with covering requirement K_S is said to be covered the first time you have seen K_S of its elements in the ordering.
To give you an example, suppose you have three sets S1, S2, S3 on the universe 1, 2, 3, 4, 5, and suppose all have a covering requirement of 2. If I take the ordering of elements 1, 2, 3, 4, 5, then the cover time of the first set is 2, because elements 1 and 2 cover it and its covering requirement is 2. Similarly, the cover time of the second set is 3, because 2 and 3 together cover it, and so on. So the cover times are 2, 3 and 4. Is the problem clear?
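Just to make the objective concrete, here is a minimal Python sketch that computes cover times for a given ordering. The exact sets S1, S2, S3 were not spelled out on the slide, so the ones below are only an illustrative guess that reproduces the cover times 2, 3 and 4.

    def cover_times(ordering, sets, requirements):
        """Cover time of each set under the given ordering: a set with
        requirement k is covered at the first position by which k of its
        elements have appeared."""
        times = {}
        seen = {name: 0 for name in sets}
        for t, e in enumerate(ordering, start=1):
            for name, S in sets.items():
                if name not in times and e in S:
                    seen[name] += 1
                    if seen[name] >= requirements[name]:
                        times[name] = t
        return times

    # Illustrative instance (a guess at the slide's sets, not from the talk).
    sets = {"S1": {1, 2, 4}, "S2": {2, 3, 5}, "S3": {1, 4, 5}}
    reqs = {"S1": 2, "S2": 2, "S3": 2}
    print(cover_times([1, 2, 3, 4, 5], sets, reqs))  # {'S1': 2, 'S2': 3, 'S3': 4}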
One remark: originally, in this paper by Azar et al., the problem was defined slightly more generally. Each set S has a weight vector W_S with integer entries. What this vector means is that the first time you see an element from set S, you pay W_1 times that time; the second time you see an element from S, you pay W_2 times that time; and so on.
So, for example, in the setting I described above, the W_S vectors are just of the form 0, 0, ..., 0, 1, with a single 1 in position K_S: you only pay the time at which you see the K_S-th element of S.
>>: So are these ordered sets? Is there an ordering on the elements within a set?
>> Nikhil Bansal: No, so these sets are unordered. These are just sets in the usual sense. And your job is
to find an ordering of these elements. Okay?
>>: Can you talk about intent? You said intent, and once you get to see K_S documents you've satisfied that intent?
>> Nikhil Bansal: Let me give exactly this example on the next slide, and hopefully it will become clearer why they call it multiple intents re-ranking. The motivation of Azar et al. was the following. Suppose you have Live Search and someone issues the query "giant". Now this query has, say, N results which you can display on the first page. But the query can mean different things: it could stand for the grocery store, or the Giant bicycle company, or the movie Giant, and in general there could be several categories. And there are several different types of users, each interested in a particular set of results. So this user is interested in, I think, the grocery version, this one is interested in bicycles, and so on.
Now your job as a search engine is to display these results in some ranking. Clearly it's not optimal to display all the grocery results first, right? Because then the users who are looking for the bicycles or the movie have to go all the way down the page.
So somehow you want to interleave these things, and this is what this problem is trying to quantify. The goal is to produce a global ordering of these pages so that all users are satisfied as soon as possible.
In general you have some kind of statistical data on each of these user types. For example, I may know that the users looking for bicycles are of the following type: 10% of them are happy when they see the first relevant document, so they go through the list and the first time they see a bike they'll be happy, click on it, and not go further; maybe 30% really like to scroll down further and only get happy when they see two results of their choice; and so on.
So in general you could have this weight vector describing when people are happy, and your goal is to order the pages so that the total cost over all these sets, each with its own preference, is minimized.
So the problem makes sense.
>>: And here the assumption is that each type is equally likely, because you're not weighting the sets?
>> Nikhil Bansal: I forgot to mention -- so let me go back. Good point. I forgot this part about the weights W_S. You can assume without loss of generality, even though they define the problem with general weight vectors, that all vectors are of the 0, ..., 0, 1 type. Why? Because if I have a general weight vector for some set S, I'll make one copy of the set for each entry W_1, W_2, ..., so up to |S| copies. With the first copy I associate the vector W_1, 0, 0, ..., with the second copy the vector 0, W_2, 0, ..., and so on, and it's an easy exercise to see that the problem remains equivalent.
So it's cleanest to think of the problem where each set has a covering requirement K_S, which corresponds to the vector with a single 1 in position K_S.
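A minimal sketch of this copying reduction in Python, assuming sets are given as Python sets and weight vectors as lists (the function and variable names are mine, not from the paper):

    def split_weight_vectors(sets, weights):
        """Reduce general weight vectors to unit covering requirements:
        a set S with weight w_j on its j-th covered element becomes a copy
        of S with covering requirement j and scalar weight w_j, so the
        weighted objective is preserved."""
        new_sets, new_reqs, new_wts = {}, {}, {}
        for name, S in sets.items():
            for j, w in enumerate(weights[name], start=1):
                if w > 0:
                    copy = f"{name}_req{j}"
                    new_sets[copy] = S
                    new_reqs[copy] = j
                    new_wts[copy] = w
        return new_sets, new_reqs, new_wts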
So it's a good time for questions. Okay. In the search setting from before, the pages correspond to the elements that we are trying to order, and the user types correspond to the sets.
This problem is also very interesting in its own right, and various special cases of it have been studied in the past. A very famous special case is when all sets have a covering requirement of one. This is called the min sum set cover problem, and it has various applications in online learning and so on. Here your goal is again to order the elements, and a set is covered the first time you see an element from it.
There is a classic result by Feige, Lovász and Tetali: if you run the greedy algorithm, where you repeatedly choose the element which covers the most uncovered sets, then look at the remaining sets and continue, and output this ordering, then greedy is a 4-approximation. Interestingly, this analysis is due to Feige et al., while the algorithm is implicit in earlier work of Bar-Noy et al., and the proof is actually not so trivial; it's quite slick. Unlike the set cover greedy analyses you may have seen, this one is quite involved.
Feige et al. also showed that 4 is the best you can hope for from any polynomial time algorithm; they prove a hardness of approximation factor of 4.
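Here is a rough Python sketch of that greedy rule for the all-requirements-one case (a sketch of the algorithm as described, not the authors' code):

    def greedy_min_sum_set_cover(universe, sets):
        """Greedy for min sum set cover (all covering requirements equal 1):
        repeatedly place the element contained in the most still-uncovered
        sets. Feige, Lovasz and Tetali show this is a 4-approximation."""
        uncovered = {name: set(S) for name, S in sets.items()}
        remaining = set(universe)
        ordering = []
        while remaining and uncovered:
            # element appearing in the largest number of uncovered sets
            e = max(remaining, key=lambda v: sum(1 for S in uncovered.values() if v in S))
            ordering.append(e)
            remaining.remove(e)
            uncovered = {n: S for n, S in uncovered.items() if e not in S}
        ordering.extend(remaining)  # leftover elements in arbitrary order
        return ordering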
Another special case that's been studied a lot is when the covering requirement of a set equals the cardinality of the set, so a set is covered only when you see all of its elements.
This is known as the min latency set cover problem. There is a 2-approximation known for it, usually attributed to Hassin and Levin because they coined the term min latency set cover, but the problem has been rediscovered several times and the 2-approximation has been known since as early as the '70s in the scheduling literature; it's been worked on by several people since then.
It follows as a special case of the problem of minimizing weighted completion time with precedence constraints. And in a recent upcoming paper we show that you cannot beat this factor of two for that problem. So again, this special case is basically precedence-constrained weighted completion time scheduling, and it's been widely studied.
So what about the general question, when the sets have arbitrary covering requirements? The paper of Azar et al. gave a log n approximation, where n is the size of the universe, and of course they leave the open question: can you get a constant?
Notice that if all the requirements are one, we know greedy is a 4-approximation. So the first thing you could try is: how can you generalize this to larger covering requirements? This turns out to be somewhat tricky. What are the natural alternatives? Greedy says that at each time we choose the element that covers the most uncovered sets. With higher covering requirements, this definition doesn't make much sense, because one element by itself does not cover anything: if every covering requirement is two, you still need two elements before you can cover even a single set.
So another natural thing you can think of is: I'll pick a small collection of elements which covers as many sets as possible, and greedily repeat. But this turns out to be a tricky problem when the requirements are greater than one. In fact, maximizing this expression captures the densest k-subgraph problem which, for those familiar with it, is a much harder problem, and we don't know any good approximations for it.
Just to give a sense of why this is related to densest subgraph, consider the following instance. You are given a graph, and I'll interpret each edge as a set and each vertex as an element. Each edge has a covering requirement of two, so an edge, viewed as a set, is covered if you cover both of its endpoints.
Now if I ask you to pick K vertices which cover as many edges as possible, that is exactly finding a densest subgraph on K vertices.
>>: So actually in this example it's min latency, right, because you have to cover both; the requirement of each set is equal to its cardinality.
>> Nikhil Bansal: Right, exactly.
>>: And I guess they do something different in scheduling.
>> Nikhil Bansal: Exactly, exactly. So they don't go this route. Yep. And this is a very special case, where the K_S are just two. So the point is that greedy is unlikely to give anything good. Interestingly, if you look at the algorithm of Azar et al. that achieves this log n bound, they introduce a very nice idea which they call harmonic interpolation. The punch line is that once you do this harmonic interpolation, which I'll describe in a moment, the greedy algorithm actually performs quite well.
So what is this harmonic interpolation? Remember that with each set we have the weight vector with a single 1 in position K_S. What they do is replace this weight vector by the following: keep the 1, but in the previous position put a half, in the one before that a third, and so on. So you see where the harmonic numbers come in.
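As a sketch of this transformation for the unit-requirement case (my own rendering of the rule just described, not the paper's code):

    def harmonic_interpolation(k, set_size):
        """Harmonic interpolation of the 0,...,0,1 weight vector with the 1
        in position k: position k keeps weight 1, position k-1 gets 1/2,
        position k-2 gets 1/3, and so on."""
        w = [0.0] * set_size
        for j in range(1, k + 1):
            w[j - 1] = 1.0 / (k - j + 1)
        return w

    # e.g. harmonic_interpolation(3, 5)  ->  [1/3, 1/2, 1, 0, 0]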
They show that if you have weight vectors of this interpolated type and you apply the greedy algorithm, it performs quite well. The intuition is that with the original vectors, where you have zeros initially and then a single 1, greedy doesn't know what to pick, because the benefit only comes much later; the 1 is quite far off. But after the harmonic interpolation, greedy gets enough information on how to proceed. I'm doing the result a big injustice with this very high-level description; it's a very cute and interesting idea in itself.
So I would encourage you, if you have time, to look at this paper. The algorithm is based on the following idea: given such weight vectors, you first apply this harmonic interpolation and then run greedy, and they show this achieves a log approximation. Notice you lose this log factor in the harmonic transformation; it's hard to pin down exactly where you lose it, but it seems inherent to this approach: there are bad instances where the algorithm has to incur a gap of log K.
In that paper they have some additional results. If you have non-increasing weight vectors, of which 1, 0, 0, 0 is a special case corresponding to min sum set cover, they show a factor 4; this should be thought of as a generalization of the Feige et al. result, and it's based on similar techniques, using a greedy algorithm and adapting the proof.
In the special case when your weight vectors are non-decreasing, which should be thought of as a generalization of precedence-constrained scheduling, they get a factor 2, and this is based on different techniques of spreading metrics and so on.
So this is sort of what's known on this problem. And the obvious question is: Can you get a constant factor
in the general case?
So I'll describe the following results in this talk. The first result is that for the general problem, where you have arbitrary covering requirements, we give a constant factor approximation algorithm. Our approach is based on solving a linear programming relaxation, and this linear program uses the so-called knapsack cover inequalities, which I'll define later.
The second result we have is of a slightly different flavor. It's an online setting, the so-called non-clairvoyant min sum set cover. What does this mean? In the non-clairvoyant version you don't know the covering requirement of a set. You know the sets up front, but you start producing your ordering and at some point a set tells you: oh, I've met my covering requirement and I'm happy. Until that happens you have no clue what its covering requirement is; it could be 1, or it could be all the way up to the cardinality of that set.
We show that these ideas can be adapted to get a polylog approximation if your requirements are nice powers of 2, and otherwise there is a square-root type algorithm. But I won't focus on this in this talk; I'll only talk about the first result.
So any questions at this point?
Okay, so here is how I will proceed. Our approach is LP-based. First I'll describe the simple naive linear program you would write at first, and show why it doesn't work. Then I'll describe a stronger linear program with the so-called knapsack cover inequalities, explain what these knapsack cover inequalities are, and go through the analysis.
Actually, the algorithm is very, very simple in hindsight, so I plan to give the whole analysis and we'll see how simple it is.
Okay. Here is the first LP formulation you would think of. Again, what's the problem? You want to find an ordering of these elements. So for each time T and element E, I have a variable X_ET which tells you how much of that element is placed at time T. You have a constraint that at each time T you can place at most one element.
The second constraint in this LP says that each element can be placed only once; you can't cheat by placing the same element many times and getting multiple credit for it. So think of the variables X_ET as a fractional matching from elements to times.
Then there is a second set of variables Y_ST, which indicate whether a set S is happy by time T or not; by happy, I mean whether it's covered by time T. Notice what the objective function is doing: for each set, as long as Y is 0 it pays one at every time step, and once the set gets covered Y becomes 1, so the objective is actually capturing the cover time. Is the LP formulation clear?
Okay, good. So it's a straightforward LP. The covering constraint says that for Y_ST to be 1, you'd better have placed K_S elements of S among the first T positions. So this is the first LP you would think of.
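In symbols, the naive LP just described is roughly the following (my transcription of the slide; the exact normalization in the paper may differ slightly):

    \begin{align*}
    \min \ & \sum_{S} \sum_{t} \bigl(1 - y_{S,t}\bigr) \\
    \text{s.t.}\ & \sum_{e} x_{e,t} \le 1 && \forall t, \\
    & \sum_{t} x_{e,t} \le 1 && \forall e, \\
    & K_S \, y_{S,t} \le \sum_{e \in S} \sum_{t' \le t} x_{e,t'} && \forall S,\ \forall t, \\
    & 0 \le x_{e,t},\ y_{S,t} \le 1 .
    \end{align*}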
If you think for a moment, you realize that this is immediately a bad LP, because you can get a lot of partial credit: if a set has covering requirement K and you cover K minus 1 of its elements, the LP gets a lot of credit for it, while an integral solution gets no benefit at all.
You can make this precise and give a bad integrality-gap instance. The idea is very simple. Suppose your instance looks like the following: the universe is A1 through An and B1 through Bn, where the A's form a core group of elements, and each set consists of the whole core group plus one element from the B's.
You have several sets of this type. Clearly any ordering should first place all the core elements and then place the B's in some order, so each of these sets will finish at some time beyond n. But the LP can pay very little, because by time n it has almost covered every set, all except one element, so it pays very little afterwards, which is much less than what the integral solution pays. You can make this precise very easily.
Okay. So now we'll describe a stronger LP relaxation, where we add the so-called knapsack cover inequalities. Let's take a detour and discuss what these knapsack cover inequalities are.
For those who are familiar with this, it will probably be boring, but if you don't know it, it's a cute and interesting thing in its own right. The problem is the following: you are given a knapsack of size B, and you have some items, where item i has a size s_i and a cost p_i. Your job is to choose a minimum-cost collection of items which covers your knapsack, that is, items whose total size is at least the knapsack capacity, while the cost is minimized.
This is the knapsack cover problem. Let's write a linear program for it. The first LP you would think of says: you want to cover the knapsack, so you have the constraint that the total fractional size you pick is at least the capacity of the knapsack, and you minimize the cost.
But notice this is a very bad LP, for a somewhat silly reason. If you have an instance with just one item, and suppose this item has size a thousand times B, then the LP can get away with setting the x_i of this item to 1 over a thousand and paying very little, while any integral solution has to choose the item entirely.
So integrally you pay the full cost and the LP pays only one thousandth of it. It's a silly example, and you can easily fix it by noting that for such large sizes I can just truncate the size of an item to the minimum of its size and B; I never really need an item to have size more than B, so why let the LP cheat this way? That's the first idea, but it turns out this idea is not good enough.
So here is a second example, with two items. Now I have the improved LP where I truncate the size of each item at B. But here's the bad example: suppose I have two items, both of size B minus 1, where one has cost zero and the other has cost 1,000. Notice that truncating at B doesn't help here, because both sizes are already B minus 1. In any integral solution you have to choose both items, because you want to reach capacity B and each has size only B minus 1, so you have to pay the cost of 1,000 for the second item. But what can the LP do? It can choose the first item to extent one, which already covers B minus 1 of the capacity, and then it only needs about a 1 over B fraction of the second item, so it ends up paying very little. It manages to cheat again.
The fix for this is a little more nuanced; these are the so-called knapsack cover inequalities. What you do is write a more intricate LP as follows. I look at all subsets of items and add a constraint corresponding to each subset. Consider a particular subset A of items. The idea is to say in the LP: okay, I'll give you the items in A for free; now cover the residual for me, that is, fill the residual capacity using the remaining items. Formally, look at a particular set A and suppose those items go into your knapsack. You still have to cover at least B minus the total size of the items in A. So even if you get all the items in A for free, you have to cover at least this much residual capacity using items not in A. Now you add the usual covering constraint for this residual knapsack, where you also apply the truncation trick, truncating each size at the residual capacity. You do this for every possible subset A.
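Written out, the knapsack cover inequality for a subset A of items is the following (my rendering of the constraint just described):

    \sum_{i \notin A} \min\bigl\{ s_i,\ B - s(A) \bigr\}\, x_i \ \ge\ B - s(A),
    \qquad \text{where } s(A) = \sum_{i \in A} s_i < B .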
Is the LP clear? It should also be clear why these are valid constraints. Just as a sanity check, let's go back to the previous bad example, where the LP was getting away with packing only a very small fraction of the second item. In this more nuanced LP we have a constraint corresponding to the set A containing just the first item. The residual capacity is B minus (B minus 1), which is 1, so the constraint reads min(s_2, 1) times x_2 is at least 1; it is basically saying that x_2 is at least 1.
This forces you to also choose the second item completely, so for this particular instance the strengthened LP gives an exact solution.
So these are the so-called knapsack cover inequalities, and in general they're very useful. There's a nice theorem by [inaudible] which says that with these inequalities you can get a constant-factor approximation for the general knapsack cover problem. Now there are some issues; for instance, there are exponentially many inequalities, one for every subset, but either you can give a simple approximate separation oracle, or there are ways to write an explicit polynomial-size set of constraints, so let's ignore those issues.
Okay. Any questions at this point about knapsack cover? Okay, good. Before we proceed, let me note that in general these knapsack cover inequalities give you a factor-two approximation for knapsack cover; in fact, there are several proofs of the factor two, based on primal-dual and other techniques. Let me give another constant-factor proof which will be more relevant for our purposes in this talk.
Here is the whole algorithm. Given a solution X to the LP, first consider the items chosen to extent at least half in the LP solution, and take all of them completely; I only lose a factor two, because the LP was anyway paying at least half for each of them. Call this my set A. If A already covers my knapsack, I'm happy and done, right? I only paid twice my LP cost. So let's assume it doesn't cover the knapsack. Now I know that my solution X satisfies all these knapsack cover inequalities for every subset, so in particular it satisfies the constraint for this particular choice of A.
And that constraint, the knapsack cover constraint, says that the remaining items should fractionally cover at least the residual capacity.
Now the algorithm is: choose each item i not in A independently with probability twice x_i, twice what the LP was choosing it with. Let's see what happens. In expectation I pay twice what the LP pays, since I'm choosing with twice the LP values, and the expected total truncated size of the items I choose is at least twice B minus size(A), because the LP satisfied the knapsack cover constraint and I'm sampling with twice its values.
Now my job is just to cover B minus size(A), right, because size(A) is already covered by the items in A; to cover the knapsack I just need B minus size(A) more, and in expectation I'm getting twice that. So you can use your favorite concentration inequality to show that with some constant probability you cover at least B minus size(A), since the expectation is twice as much. And since the expected cost is not too large and you succeed with constant probability, there must exist some outcome which is both feasible and not too expensive.
So that's the analysis. In general this is not the tightest analysis; you can also get a factor two. But this version is the instructive one for our purposes.
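Here is a small Python sketch of one round of this rounding, assuming the LP solution is given as a dictionary x of item values (variable names are mine; this is the analysis-flavored version just described, not necessarily the tightest one):

    import random

    def round_knapsack_cover(x, sizes, costs, B, seed=None):
        """Take every item the LP chooses to extent >= 1/2, then pick each
        remaining item independently with probability 2*x_i (capped at 1).
        Expected cost is at most twice the LP cost, and the residual capacity
        is covered in expectation with room to spare."""
        rng = random.Random(seed)
        A = {i for i, xi in x.items() if xi >= 0.5}
        chosen = set(A)
        for i, xi in x.items():
            if i not in A and rng.random() < min(1.0, 2 * xi):
                chosen.add(i)
        covered = sum(sizes[i] for i in chosen)
        cost = sum(costs[i] for i in chosen)
        return chosen, covered >= B, cost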
Okay, so coming back to our problem of multiple intents re-ranking. Earlier we had the constraint that for a set to be covered, you should have at least K_S contribution among the earlier time steps. I'll strengthen this with the knapsack cover inequalities. Looking at a particular set S whose covering I want to capture, I consider all possible subsets A of that set and say: okay, I give you the elements of A for free, so you still need to cover at least K_S minus |A| elements by time T using elements not in A; only if you manage that can you set Y_ST to 1. So I'm just adding these knapsack cover inequalities for every possible subset of every set.
And it should be clear that these are valid constraints; I'm not cheating, they are feasible constraints on the LP. Okay.
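Putting it together, the strengthened LP is roughly the following (my reading of the slide; since each element contributes at most one unit, the truncation term disappears):

    \begin{align*}
    \min \ & \sum_{S} \sum_{t} \bigl(1 - y_{S,t}\bigr) \\
    \text{s.t.}\ & \bigl(K_S - |A|\bigr)\, y_{S,t} \le \sum_{e \in S \setminus A} \sum_{t' \le t} x_{e,t'}
      && \forall S,\ \forall t,\ \forall A \subseteq S, \\
    & \sum_{e} x_{e,t} \le 1 \ \ \forall t, \qquad \sum_{t} x_{e,t} \le 1 \ \ \forall e,
      \qquad 0 \le x_{e,t},\ y_{S,t} \le 1 .
    \end{align*}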
So this is the strengthened LP relaxation, and we'll prove that it gives a constant factor approximation. Here is the rounding algorithm, and it's very, very simple; it fits entirely here. Here is the LP solution: each unit is a time step and the LP is doing some fractional matching. Here is what I do. I consider intervals of size 2 to the i, all starting from position 1, so these are geometrically increasing prefixes. Let J_i be the interval from 1 through 2 to the i. For each interval I do the following: look at an element e and let x_e(J_i) denote how much of it the LP places inside the interval J_i. I choose the element with probability c times this LP extent, where you can think of c as the constant 4. So if I look at, say, this particular interval, and the LP was placing the element to extent one-tenth, I'll choose it with probability four-tenths, randomly and independently. This gives a random set O_i for each geometric prefix, and then I just output the ordering O_1, O_2, O_3, and so on, concatenated. That's the algorithm.
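A minimal Python sketch of this rounding, assuming the LP solution is given as x[e][t] (the constant c, the horizon cutoff, and all names are mine for illustration):

    import random

    def round_ordering(x, elements, c=4, horizon_log=20, seed=None):
        """For each geometric prefix J_i = {1, ..., 2^i}, sample every element
        independently with probability c times the total LP mass it receives
        inside J_i (capped at 1), then output the sampled groups O_1, O_2, ...
        one after another, dropping repeats."""
        rng = random.Random(seed)
        ordering, placed = [], set()
        for i in range(horizon_log + 1):
            prefix = 2 ** i
            sampled = []
            for e in elements:
                mass = sum(v for t, v in x[e].items() if t <= prefix)
                if rng.random() < min(1.0, c * mass):
                    sampled.append(e)
            for e in sampled:
                if e not in placed:
                    placed.add(e)
                    ordering.append(e)
        # any element never sampled can go at the very end
        ordering.extend(e for e in elements if e not in placed)
        return ordering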
Is the algorithm clear? Okay, good. The analysis will also be very, very simple, just the following two slides. Here is the key observation, the one lemma that I need. Consider some particular set S, and say that in the LP solution it is first covered to an extent more than half within some prefix 1 through 2 to the i. (As long as the LP has not covered it to extent half, we don't care, because the LP is then paying at least half per time step in the objective.) So let's look at the first prefix by which the LP claims it has covered this set to extent half. Suppose we could show that the O_i I construct also covers this set with probability at least one half. Then I claim we are done. Why? Because of what my rounding does: for each such prefix 1 through 2 to the i, I sample elements with four times what the LP was doing, and I just concatenate these sets.
So what is the expected cover time? For each O_i, the expected size of O_i is at most c times 2 to the i, because I'm looking at 2 to the i time steps and sampling with four times the LP values.
Now for the first few geometric prefixes, before 2 to the i, I don't care whether they cover set S or not; I give them away for free. When I reach this prefix of size 2 to the i, I know the LP claims to cover set S to extent half within it, and suppose I can show the lemma that the O_i I construct by random sampling covers S with probability at least some p greater than half. Then let's look at what happens. With probability p I cover the set here, and in that case I only incur about 2 to the i time in my schedule. Or maybe I fail in this interval, with probability 1 minus p, but cover it in the next one with probability p, in which case I pay about twice as much. Maybe I miss in both of these, but then I get it in the next interval with the same probability, and so on.
Notice that in this sum, because you're concatenating all these geometrically increasing intervals, the lengths are increasing exponentially, but the probability that you get that far is going down by a factor better than 2, because p is greater than half.
So the point is, if I manage to show this lemma, then I'm done, because the whole sum converges to a constant times 2 to the i; it's just a geometric progression.
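In symbols, the bookkeeping behind this geometric argument is roughly the following (my reading; the constants are illustrative, not the ones in the paper):

    \mathbb{E}[\text{cover time of } S] \ \lesssim\ \sum_{j \ge i} (1-p)^{\,j-i}\, c\, 2^{\,j+1}
    \ =\ c\, 2^{\,i+1} \sum_{\ell \ge 0} \bigl( 2(1-p) \bigr)^{\ell}
    \ =\ O\!\bigl(2^{\,i}\bigr) \qquad \text{when } p > \tfrac12 .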
>>: So p is strictly greater than a half?
>> Nikhil Bansal: Yes, say two-thirds. And then again this is very simple; the proof of the lemma fits entirely on this single slide.
Consider some set S, and suppose the LP claims to cover it to extent half within the first 2 to the i steps. We produce O_i from the interval 1 through 2 to the i by sampling each element with four times the LP value, and I want to prove the claim that this set O_i covers S with probability strictly greater than half. Here's the whole proof.
First, define the set A to be all the elements of S which the LP places to extent at least one-quarter in this interval. Clearly, since I'm sampling with four times the LP values, all the elements of A are contained in my set O_i with certainty, so I don't have to worry about them.
Now for the remaining elements, all I require is that to cover the set you get K_S minus |A| more elements, because the elements of A are already there for free. Let's look at the knapsack cover constraint for this particular set S and this subset A. We know the LP solution satisfies this knapsack cover inequality: the residual coverage from elements outside A must be at least the residual requirement times Y, and Y was at least half. So what happens if I sample each element with four times x_e(J_i)? Let's multiply this constraint by a factor of 4 on both sides: 4 times the left side is at least two times the residual requirement.
So if I sample with four times x_e(J_i), then in expectation I choose twice the residual requirement, just by multiplying the constraint by four. So in expectation you are getting twice what you want, and hence there's a reasonable chance you get what you want.
If you use your favorite concentration inequality, you can show that this probability is at least half, and that finishes the proof.
Okay, so wrapping up and going over what we did: you have this LP solution, and for each set S you look at the point where the LP claims to have covered the set to about half its extent. The rounding covers that set within a constant times that time in expectation. Since I do that for every set, I pay a constant times the LP cost in expectation and get a constant factor approximation. By refining these ideas one can extend the results to the online case, the non-clairvoyant setting I mentioned where you don't know the covering requirements; we'll skip that in this talk. So that's it.
[applause]
>> Yuval Peres: Questions?
>>: Did you look at the complementary problem? Because that looks more natural [inaudible] case.
>> Nikhil Bansal: So complement would be?
>>: Given K documents, maximize the happiness, so the number of users who would be happy; like in the densest case.
>> Nikhil Bansal: The densest subgraph kind of version.
>>: [indiscernible].
>> Nikhil Bansal: Let me see if I'm understanding it correctly: you want to choose K elements so that you cover as many sets as possible. I claim that's densest k-subgraph hard, because you can look at the very special case where sets have size 2: think of each set as an edge, covered if you choose both vertices corresponding to it. So the vertices are elements and the sets are edges. If your problem is asking for K vertices that cover as many edges as possible, that's exactly densest k-subgraph.
So if you could beat that, then that would be a much more interesting talk.
>>: But is there a version that avoids this densest subgraph hardness? Each person is satisfied with K_S equal to 2; what about K_S equal to 1, or something like a decreasing vector?
>> Nikhil Bansal: Yes. So with K_S equal to 1 it is exactly max k-coverage.
>>: Is it decreasing?
>> Nikhil Bansal: So one thing is, yeah, it's sort of misleading. The reason you can do better here is this metric where you sum over the finish time of each [inaudible]. That's an inherently easier problem than fixing your K elements and asking how many sets you can cover; it's a different beast. The metric we are looking at, the sum of the cover times of all the sets, is inherently much easier because of the nature of the --
>>: That's what I'm asking, how to make the max version easier; it's not the minimum, it's the maximum.
>> Nikhil Bansal: So I don't know; my guess would be to start with a very simple case. But even sets of size 2 with the requirement equal to the cardinality of the set is already densest subgraph. So it would be interesting to see if something is there.
>>: Just another little variant that occurred to me at the beginning: you're thinking of S_1 through S_m as just fixed sets, M types of [inaudible]. Now that I'm looking at it, maybe everyone's different: there's some probability distribution over subsets, and you get a training sample of M people, so S_1 through S_m is a sample. You design your algorithm based on that knowledge, but [inaudible] you have to see how it's doing on the next M sampled sets.
>> Nikhil Bansal: Right.
>>: [indiscernible].
>> Nikhil Bansal: A learning kind of setting, yeah. So it's probably a more relevant problem. Sure.
>> Yuval Peres: Any more questions? If not, let's thank the speaker again.
[applause]