
>> Mohit Singh: I think we can start now. It's good to have Ola Svensson from EPFL. Today he
is going to tell us about the matroid secretary problem.
>> Ola Svensson: You saw my talk. Sasha might have missed it. It's joint work with Moran
Feldman and Rico Zenklusen, who is at ETHZ in Zürich. The talk will be about the matroid
secretary problem, a generalization of one of the classical problems, the classic secretary
problem, so let me start by introducing that, and then I will argue why we are interested in the
[indiscernible] and what the [indiscernible] is. To make it a little bit more interesting, instead of
secretaries let's talk about jobs.
Everybody applies for jobs once in a while, and we want the best strategy to get the best job
offer. You have applied for n jobs, and the goal is to devise the best strategy to select the best
job offer you can get. You normally know how many jobs you applied for, so you know n in
advance, and now you want to select a strategy so that you select the best offer. What are the
rules of the game? You assume that the offers are going to arrive in a random order. That
makes sense: you don't think that the employers will unite against you to make your life
miserable. You also assume that when you see an offer… You disagree? [laughter]. Maybe. I
don't know. I have good experiences. You might have had bad experiences. And then you also
assume that when you get an offer you can assign a value to it; you can compare it to what you
already saw. Of course, the best strategy would be to wait for all offers and take the best one,
but what makes it interesting is that you have to make your decision immediately. Once an
offer arrives you have to immediately decide whether to accept it or not, and your decision is
final, so when you accept an offer the game ends. Here's an example. Mona Lisa applies for a
job; offers come in random order. So the first offer… I'm trying to decide if I want to record
this. [laughter]. So the first offer is maybe a value 7, but at first you can expect something
better, so you reject. Then maybe you get an 8.1, and we also reject. Then Walmart, you're
surprised [laughter], so you reject. Last time I gave this presentation some people were
surprised that I didn't like Walmart. I don't know. In Europe it is… Okay. Anyway. And here I
am nice: Microsoft Research is an 8.2 [laughter]. Maybe accept. What is the best strategy?
>>: Wait for Microsoft Research?
>> Ola Svensson: No. In general, what is an optimal strategy? All right, not even optimal, let's
just say a reasonable strategy. There are many optimal strategies. The intuitive one has two
phases: first see what you can expect. You don't know what to expect in the beginning, so you
sample a fraction of the input to see where you are in the market, and then, based on what you
have seen in the sample phase, you make a selection. In the single secretary problem, what you
do in the sample phase is just look: you reject everything irrespective of value, but you
remember the best offer in the sample. So the best offer in the sample would be seven.
Remember, you're interested in selecting the best possible offer, so the only thing that makes
sense to accept is something greater than seven. So what we will do is accept the first thing
that is greater than seven: if it is less than seven, 6.5, we reject; two, we reject; if it is eight, we
accept.
It's easy to see that this succeeds with constant probability. Suppose that instead of observing
a 1/e fraction we observe a 1/2 fraction. Now I claim that the probability of selecting the best
offer is at least one fourth. Why? Because we are guaranteed to select the best if the second
best is in the sample phase, which happens with probability one half, and the best is in the
second half, which also happens with probability one half; the two events are not completely
independent, but almost, so that gives probability about 1/4. The 1/e case is a little bit more
complicated, but you can see that even if the third best is in the sample there is some
probability of selecting the best one, if it comes before the second best, and so on. It's not too
complicated. More surprising is that 1/e is also the best possible; this was proved in the '60s, by
Dynkin in '63. There is a quite long history, and it has been studied independently in the West
and in the East. Maybe you, this guy, maybe you can explain to me after the talk. This guy
always pops up in different places in my slides when I use [indiscernible]. Since you are here,
maybe you know how to get him out of my slides.
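(To make the 1/4 argument concrete, here is a minimal simulation sketch, not from the talk:
the strategy rejects the sample phase, then accepts the first offer beating the sample's best.
With sample_frac set to 1/e the empirical rate comes out near 1/e instead.)

```python
import random

def secretary_trial(n: int, sample_frac: float = 0.5) -> bool:
    """One run: n distinct offers in random order; reject the sample phase,
    then accept the first offer beating the best sampled one.
    Returns True iff the overall best offer was accepted."""
    offers = list(range(n))              # ranks 0..n-1; n-1 is the best
    random.shuffle(offers)               # offers arrive in random order
    cutoff = int(sample_frac * n)
    best_seen = max(offers[:cutoff], default=-1)
    for v in offers[cutoff:]:
        if v > best_seen:                # first offer beating the sample's best
            return v == n - 1
    return False                         # reached the end without accepting

trials = 100_000
wins = sum(secretary_trial(100) for _ in range(trials))
print(wins / trials)                     # roughly 0.25 or a bit more
```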
We are interested in a more generalized version, and luckily we are not the only ones who have
proposed one. This is because of online mechanism design: Kleinberg was the first to look at a
more general problem, and then Babaioff et al in '07 defined the matroid secretary problem.
Why are they interested in the secretary problem? There is an immediate connection if you
think not of secretaries but of bidders arriving
online. You have customers arriving online and you are selling goods. You have a limited
amount of goods that you can sell to the bidders and you have to immediately decide if you are
going to sell to the customer or not. Think of the single secretary problem as the case where
you have one good that you want to sell: I have a car, and people contact me online in some
random order telling me their bids. Once I sell the car it's sold; I can also reject an offer, but
then the guy might buy some other car from some other provider. That is the single secretary
problem. You can imagine generalizations where you have many cars, and that's the immediate
generalization that Kleinberg considered in 2005. Instead of only one good to sell, you have k
identical goods and n guys coming in random order making bids. In this generalization the goal
is to select k secretaries of maximum total weight, the sum of their values. You are not
interested in only selecting the best; you want total weight. For 1 out of n I can get, in
expectation, a 1/e fraction of the optimal off-line weight, and that's the best I can do. What do
you think of k out of n? Do you think it's easier or harder, intuitively? Is it easier with k out of n
than with 1 out of n to get good weight compared to the optimum off-line?
>>: K can be one.
>> Ola Svensson: Okay. But k grows. Let's say k is a hundred.
>>: [indiscernible]
>> Ola Svensson: It should be easier, but why? Intuitively it should be easier because one
decision is not so important; you are allowed to make some mistakes, and that's the case. His
result says that in this case we can get something like a 1 - O(1/sqrt(k)) fraction. When k is big
you actually approach the [indiscernible]: as k goes to infinity you get close to the off-line
optimum. This is one generalization.
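(As an illustration of why larger k helps, here is a simple threshold rule I am sketching myself; it
is not Kleinberg's actual algorithm, which is recursive. Sample half, set the threshold at the k-th
best sampled value, then accept anything above it, up to k picks.)

```python
import random

def k_secretary_threshold(stream, k):
    """Hypothetical simple rule: observe the first half, set the threshold
    to the k-th largest sampled value, then greedily accept values above it
    (at most k of them). Returns the total accepted value."""
    n = len(stream)
    sample, rest = stream[: n // 2], stream[n // 2:]
    ranked = sorted(sample, reverse=True)
    threshold = ranked[k - 1] if len(ranked) >= k else 0.0
    picked = []
    for v in rest:
        if len(picked) < k and v > threshold:
            picked.append(v)
    return sum(picked)

random.seed(1)
vals = [random.random() for _ in range(10_000)]
print(k_secretary_threshold(vals, 100))        # compare against the
print(sum(sorted(vals, reverse=True)[:100]))   # off-line optimum
```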
Maybe a generalization that is a little bit more interesting, one that actually captured my
interest in this problem, is the following. Suppose you are a provider. Suppose you have some
server, maybe a Netflix server, and you have a network of cables with certain capacities; those
are the blue numbers. Now you have some clients that arrive in a random order and want to be
guaranteed a unit connection to the server. Suppose this blue guy comes: I am willing to pay
four to get a guaranteed unit of bandwidth to the server. I say okay, I will provide that to you
and you pay me four units. This would be one way of guaranteeing it, but it's not the unique
path; I could also give him this one, and it's also not important which path I give him. I could
change it later on, as long as I always guarantee him one unit of bandwidth to the server. Now
the second guy arrives and bids seven, and I say great, I will take you as well. Look, now to
accept him I have to change the path of the red guy; there is no way to connect the blue guy
without changing the path of the red guy, but I can still accept him because rerouting is
allowed. Now this guy comes with six. I already used quite a lot of my capacity, so let's reject
him. Five, reject; nine, accept. Now I have somehow reached the total number of customers I
can accept, because my network supports at most three customers, so the 10, even though it's
more than 9, I have to reject. The question is how well you can do here compared to the
optimum off-line. The optimum off-line is easy to solve in [indiscernible] manner; that's just a
greedy algorithm. The online question is an open problem. We don't know; it's a nice open
problem whether there is a constant competitive algorithm for this problem. A constant
competitive algorithm means that the online algorithm should get at least a constant fraction
compared to the optimum off-line: an online algorithm is c-competitive if it gets a
c fraction of the optimum off-line.
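(To make the connection to matroids concrete: a set of clients is feasible exactly when they can
all be routed to the server simultaneously, which is a max-flow check. A minimal sketch follows,
with a made-up toy network; the node names and capacities are my own, not the ones on the
slide.)

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on a capacity dict {u: {v: capacity}}."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:          # BFS for an augmenting path
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t                                      # push one unit along the path
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= 1
            cap.setdefault(v, {})[u] = cap.get(v, {}).get(u, 0) + 1
            v = u
        flow += 1

def independent(clients, edges, server):
    """Gammoid oracle: can every listed client get its own unit path?"""
    cap = {}
    for u, v, c in edges:                          # undirected capacities
        cap.setdefault(u, {})[v] = c
        cap.setdefault(v, {})[u] = c
    for cl in clients:                             # super-source feeds each client
        cap.setdefault("src", {})[cl] = cap.get("src", {}).get(cl, 0) + 1
    return max_flow(cap, "src", server) == len(clients)

edges = [("a", "s", 1), ("b", "s", 2), ("c", "b", 1), ("d", "a", 1)]
print(independent(["a", "b", "c"], edges, "s"))        # True: 3 units fit
print(independent(["a", "b", "c", "d"], edges, "s"))   # False: d is blocked
```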
>>: Sorry, but off-line is it [indiscernible]?
>> Ola Svensson: I mean greedy; it's a [indiscernible], so you would take 10, then, I don't know
if you can take 9 or not, but you take the next one that you can take.
>>: Is it important that all of their offers are bounded between two constants?
>> Ola Svensson: No. Actually, we do not assume that. If we assumed that, maybe we could do
better: if the values are integers between some numbers, then you can do log of the range or
something by rounding.
>>: [indiscernible] could be that one guy dominates the sum of everyone. You have no idea
[indiscernible]
>> Ola Svensson: Exactly, and that's why you cannot do better than 1 over e for the 1 out of n
in the weight version.
>>: Like I guess that is an alternate version [indiscernible]
>> Ola Svensson: I think our algorithms would work for that case. I think so; I'm not 100
percent sure. They are easier to explain when you have weights, but yes, you can also assume
that you can only compare [indiscernible] items. That's somehow a harder version; I don't
know if it's really harder, but it definitely is not easier. Both of the cases I told you about are
special cases of a [indiscernible] called a gammoid. It's a special type of matroid, and that's why
these two examples are nice; that's what's nice with the matroid secretary problem, it
captures many nice applications. And what is the matroid secretary problem? It's a secretary
problem where we have a matroid constraint on the elements to be selected. One matroid
constraint would be that you can select k out of n secretaries. Another matroid constraint is
that I can select the clients to which I can provide unit bandwidth to the server; that's called a
gammoid. If you don't know what a matroid is you have a problem. Kidding. I will define it on
the next slide, but it takes a couple of presentations to really fill you in on what a matroid is. I
will do all my examples on graphs; the only thing we will worry about is selecting sets of edges
that are acyclic. That I hope everybody has seen. To understand what I'm going to say
afterwards you only need to know what a [indiscernible] graph is.
Why study MSPs? They capture [indiscernible] natural settings, and matroids are defined
exactly so that the greedy algorithm works; the definition guarantees that, so they have a very
nice structure. The hope is that you have somehow found the broadest problem that captures
many settings but at the same time still admits very good online algorithms. The problem is the
second part: we don't really know if the structure is enough to get very good algorithms, and
that's what we are trying to solve and to understand. Let me explain what we know about the
algorithms for these cases, and let me define a little bit what matroids are. As I said, matroids
formalize problems that can be solved by the greedy algorithm. We are looking at graphs, so
here we have a graph, and we are looking at the graphic matroid of this graph; its independent
sets correspond to forests. The elements of this matroid are the edges of the graph, and the
feasible solutions, the independent sets of the matroid, are all forests: all subsets of edges that
are acyclic. That is the graphic matroid. What kind of constraints do we use when we prove
that greedy works? Exactly what you need to make it work is the following. First we have a
ground set; that's the edges in the graphic case. Then we have a family of subsets of this
ground set, the independent sets, satisfying two constraints. The first constraint is quite simple:
if we have a set of edges that is acyclic in the graph, then if I drop an edge it should be acyclic
again. The independent sets are downward closed. The second axiom says that if you have two
acyclic sets of edges I and J, and I has bigger cardinality than J, then there exists an edge in I
that is not in J that I can add to J so that it stays acyclic. Why is this? Think about it: if I has
many edges and is acyclic, it leaves few connected components of the graph, so there must be
more components under J. This means that there must be an edge in I that connects two
components of J, and adding that edge keeps J acyclic. That's what a matroid is. It's not so
intuitive when you see it the first time, but after a while, when you get used to the definition, it
gets intuitive.
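(As a concrete aside, not from the talk: the graphic matroid's independence oracle is just an
acyclicity test, and the greedy algorithm for a maximum weight forest scans elements in
decreasing weight order, keeping whatever stays independent.)

```python
def acyclic(edges, n):
    """Independence oracle for the graphic matroid on vertices 0..n-1:
    an edge set is independent iff it contains no cycle (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:              # endpoints already connected: a cycle closes
            return False
        parent[ru] = rv
    return True

def matroid_greedy(weighted_edges, n):
    """Maximum weight forest: sort by weight, keep an edge if the picked
    set stays acyclic. Matroids are exactly where this greedy is optimal."""
    picked = []
    for w, e in sorted(weighted_edges, reverse=True):
        if acyclic([f for _, f in picked] + [e], n):
            picked.append((w, e))
    return picked

edges = [(7, (0, 1)), (5, (0, 2)), (3, (1, 2)), (2, (2, 3))]
print(matroid_greedy(edges, 4))   # keeps 7, 5 and 2; the 3 would close a cycle
```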
One question one asks is: this looks hard to encode, since you may have exponentially many
independent sets, so how can you encode a matroid? What people assume is that you have an
oracle that tells you whether a set is independent or not. Think of the graph case: you have
exponentially many subsets of edges, but you can efficiently check whether one set is acyclic or
not.
Another example is a linear matroid. Here you have some dimension m, and the ground set E is
a finite set of vectors; the independent sets are the subsets of vectors that are linearly
independent. Let's check that the axioms are satisfied. For the first: suppose I have three
independent vectors; if I drop one, the two that remain are independent. For the second:
suppose I have three independent vectors in I and two in J; then I can add one of them to J to
make it bigger.
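(Again as an aside: the independence oracle for a linear matroid is a rank computation. A
minimal sketch with numpy; the vectors are made up.)

```python
import numpy as np

def linearly_independent(vectors) -> bool:
    """Linear matroid oracle: a set of vectors is independent iff the
    matrix having them as columns has rank equal to the set size."""
    if not vectors:
        return True                    # the empty set is always independent
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

e1, e2, e3 = np.eye(3)
print(linearly_independent([e1, e2]))             # True
print(linearly_independent([e1, e2, e1 + e2]))    # False: spanned by the others
```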
>>: [indiscernible]
>> Ola Svensson: I don't know.
>>: So a constant factor approximation is not known for any of these?
>> Ola Svensson: No. Actually, yeah, if you can get it for [indiscernible] linear matroids I will be
surprised. I don't know. Probably there are some special properties known for linear matroids
but not for general matroids, but I don't know about them.
>>: [indiscernible]
>>: [indiscernible]
>> Ola Svensson: Okay.
>>: Sure but…
>> Ola Svensson: Let's look at what is known. There are a lot of special cases. For the graphic
matroid there is a competitive ratio of 2e, so you only lose a factor of 2 compared to the single
secretary. Transversal matroids, 8. Don't ask me what all of these matroids are, because I
don't know. K sparse, this is the k sparse linear matroid. What is a k sparse linear matroid? I
would guess that it is a linear matroid where the vectors have only k non-zeros. Here you see
this is not really constant when you have a [indiscernible]. This is still open for linear matroids.
Laminar matroids, regular matroids. You see what we are getting at: constants are known for
almost all your favorite matroids, except linear matroids and gammoids; I don't know if you
know any others where we don't have a constant. And [indiscernible] I think it makes sense, so
they made a kind of a bold… I mean, Babaioff et al made a pretty bold conjecture, probably in
2007, but by now I believe in it. What they conjectured is that there exists a constant
competitive algorithm for the MSP on any matroid.
>>: That is actually a special case for it, right?
>> Ola Svensson: Sorry?
>>: The graphic matroid is a special case of k sparse.
>> Ola Svensson: Yes. It's 2 sparse, so this generalizes that. Good.
>>: Although I am not sure how many of these are unsolved satisfactorily in the sense that we
assume your matroid [indiscernible]
>> Ola Svensson: Okay. There are different versions of the matroid secretary problem,
depending on whether you know your universe in advance, so that you can query
independence over the whole universe before seeing the elements, or whether you only learn
where the elements sit in the matroid when they arrive. I think if you could solve the matroid
secretary problem even knowing the whole matroid in advance, but not the values of the
elements, that would already be great; we don't know that. The open conjecture is that there is
a constant, and actually, consistent with our knowledge today, there could even be an
e-competitive algorithm, the same as for the single secretary; that would be nice, and it's not
completely crazy to say that. There is another related problem called prophet inequalities, and
there is an old result in statistics and probability that gives a 2. It was shown that 2 is also the
right answer for any matroid. So somehow the nice thing with matroids is that they capture
many settings but things don't get messier. Well, they get messier because we haven't solved
it, but hopefully they don't get too messy.
>>: So you think [indiscernible]
>> Ola Svensson: No. For each fixed case a constant; we haven't solved this yet. It's true, I
enumerated those and said they were constants; maybe this one is an outlier. What do we
know about the general case? There is a very simple log rank competitive algorithm that I will
explain today; it's Babaioff et al in 2007. Then there is a more complex square root of log rank
competitive algorithm by Chakraborty and Lachish in 2012, and then there was a log log rank
algorithm by Lachish in 2014. What I want to say here is that it's been nontrivial to improve on
log rank. These are very complex algorithms, and one indication that the last one is fairly
complex is that the hidden constant in the [indiscernible] is at least 2 to the 2 to the 32. Why is
the hidden constant so large? Because the algorithm is also of the [indiscernible] type where
you first sample and then you select. You want a large constant because you want to get a lot
of information out of your sample and you want this information to be reliable, so you want
concentration, and to get concentration it's good to have many elements. What we will do is
get a similar ratio, but we will use less information from the sample, and we have a modest
constant: a simple 3000 log log rank competitive algorithm. It's quite simple, very simple
relative to the previous algorithms, but it's not so simple either, so I will give you the ideas, but
maybe you will not understand the whole thing. Hopefully you will, but maybe not.
>>: Are you going to tell us more or less what the rank is?
>> Ola Svensson: Yes, sorry. What is the rank? Good question. That is the maximum number
of elements that you can select: the maximum cardinality of an independent set.
>>: [indiscernible]
>> Ola Svensson: In the linear matroid this corresponds to the usual rank, and for the graphic
matroid it's the number of vertices minus 1 if the graph is connected; I mean, it's the size of a
spanning tree. Basically, the general technique here is that you start with a complex problem
and use a sample and so on to reduce it to a set of very simple problems where we run the
naïve algorithm, and we will see that this works. One thing to notice from when I introduced
the single secretary problem: the only place where we really used the random order, at least
for the 1/4 argument, was to get the sample. After we got the sample, the order could be
adversarial in the selection phase. Not to get 1/e, but to get 1/4 it doesn't matter: as long as I
can get 50 percent of the elements at random, the adversary can give me the remaining ones
as he wants. This is the same for our algorithm: the only place we use the random order is to
access 50 percent of the elements at random; then the adversary can give the elements as he
wants. I think this has some applications that I don't know of. As I promised, we will look at
this in the graphic matroid. I will give three algorithms. The first one is the naïve approach, the
first thing you would try if you asked whether the single secretary strategy would work here.
Then I will give you the log rank one, and then we will see the log log.
What about the naïve approach? What did we do in the single secretary problem? We sampled
50 percent, we remembered the best offer, and we took anything that improved on it. Now we
do the same thing. We sample a set by including each element with probability 0.5; that's
almost the same thing as taking the first 50 percent. Suppose this is the optimum solution in
the sample. Remember, before we remembered seven, the best secretary in the sample; now
we remember the best forest in the sample, the maximum weight forest in the sample. Now
when an element arrives, we should only take it if it would be part of the best solution with
respect to the sample. So 2 would not be part of the optimal forest with respect to the sample,
because it is spanned by heavier elements: it creates a cycle in which all the other elements
have heavier weight, so it would not be part of the maximum weight forest, and we don't take
it. We take an element if it is not spanned by heavier elements. I haven't defined span: a set
spans an element if it creates a cycle with it. 4 we take: 4 is better than 3 and would be part of
the optimal solution with respect to the sample. 7 we should take, because it's not spanned by
heavier elements. 2, should we take it? Yes, because it is better than 1. 11 we would like to
take, but now we cannot take it; that's why online is not too good here, why we lose. This will
be our solution: 4+7+2, which is 13.
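(A sketch of this naïve rule on the graphic matroid, reusing the `acyclic` oracle from the earlier
snippet; the rule is my own paraphrase of the slide.)

```python
import random

def naive_msp(stream, n, seed=0):
    """Naïve rule: sample each element w.p. 1/2 and only observe it; accept
    an arriving edge if it is not spanned by sampled elements of at least
    its weight (so it sits in the sample's max weight forest) and the set
    of accepted edges stays a forest."""
    rng = random.Random(seed)
    sample = {x for x in stream if rng.random() < 0.5}
    picked = []
    for w, e in stream:
        if (w, e) in sample:
            continue                              # sample phase: observe only
        heavier = [f for wf, f in sample if wf >= w]
        if acyclic(heavier + [e], n) and \
           acyclic([f for _, f in picked] + [e], n):
            picked.append((w, e))
    return picked
```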
Why is this not good? We call this example the hat, for some reason; it looks like a hat. You
have one really expensive edge that you really want to take, and then you have tons of vertices
up here with very light edges, so basically the optimal solution is essentially the weight of the
heavy edge. Let's run this algorithm on that. You get some sample. Let's assume that the
heavy edge is not part of the sample, because if it's part of the sample you would not be able to
gain anything in any case. Now we are extremely likely to see many of these light edges before
the heavy guy comes; remember that they arrive in a random order, and there are so many of
these edges. Maybe this one arrives, and it definitely looks like it should be part of the optimal
off-line solution, because it's not even spanned by any edges in the sample. Let's take it. Then
this guy comes, same thing, looks like it should be part. Then this guy comes and again we
should take it. We are extremely likely to see a situation where two of these arrive before the
heavy guy arrives, and this will prevent us from taking the fat edge when it comes, and we are
out of luck. Did I make sense? Now you can argue that this is a special case: there is only one
element that is important, so let's run the single secretary algorithm, which would work for the
hat. But we have the generalized hat: first a very heavy element, then medium heavy elements,
and on these you have another hat, and so on. Here you can already see the logarithmic
structure, basically, and that's why we will next see a simple log algorithm. Basically, here you
have the heaviest guy, then some factor lighter elements, then some factor lighter than that,
and so on. That's their algorithm, and that's why I show this. You group your elements
according to weight. You take
one weight class and run it on that. We classify elements according to weight. Here I make
some assumptions that you can believe or not believe, but they are easy: we know exactly the
number of elements, the rank of the matroid, and the highest weight of an element; these you
can get from the sample. Knowing the highest weight looks dangerous, because in the hat
example you would like to take the heaviest element, but the solution is, with some constant
probability run the single secretary algorithm, and otherwise run your complicated algorithm.
If we know this, now I claim that we can reduce to log rank many weight classes. Why? First of
all, we assume by scaling that the heaviest weight is equal to the rank r. Now you just round
each weight down to the nearest power of 2; this costs you a factor of 2. How many powers of
2 are there between 1 and r? There are log r, and that's what they do. Weight class Ci is the
one with the weights between 2^(i-1) and 2^i: weight class 1 is between 1 and 2, then 2 and 4,
and the last weight class contains all the elements with weight between r/2 and r. Of course,
why don't I have to care about elements with weight below 1 over here? Remember that the
maximum number of elements we can take is r and there exists one element of weight r, so
even if I take all of the elements down here they are not very important for the optimal value.
That's why I can forget about really lightweight elements, and I only have log r many different
weights. After doing this rounding you should think of it as: elements can only have log r
distinct weights, so we have only log r distinct weights. So, log r many weight classes; by losing
a factor of two we can assume all
elements in a class have the same weight. So how do we now get a log rank competitive
algorithm? How would you do it? One thing: now all the elements within a class are
indistinguishable with respect to weight; they have exactly the same weight. Now, you know
that if you want to select a maximum number of linearly independent vectors you can just add
them greedily; it doesn't matter in which order you add things, you will always get a
full-dimensional set. The same thing holds here. Just select one weight class at random, and
then greedily add things from that weight class. And that's what we're going to do. Sample S
by including each element with probability 0.5, and then use this sample to calculate the rank
and the max weight. That's the only thing they use from their sample, only the rank and the
maximum weight. Then they form the weight classes, select one of the log rank weight classes
at random, and greedily select elements from this weight class. Here they use the matroid
property that it's not important in which order you select the elements from the weight class:
you will always get a maximum independent set of elements of that weight class. In the
example, if we selected the blue weight class we would take all of the blue guys when they
arrive, but we would not take anything else. And this is actually a 1 over log rank fraction of the
optimum, so it's log rank competitive: a random weight class contributes, in expectation, at
least optimum over log rank, we select a maximum independent set of it, and we lose a factor
of 2 because of the sampling.
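(A compact sketch of this log rank algorithm on the graphic matroid, again reusing `acyclic`;
the sample is used only for a rank and a max weight estimate, as in the talk, but the details here
are my own.)

```python
import math
import random

def log_rank_msp(stream, n, seed=0):
    """Sample half; from the sample estimate the rank r and max weight;
    pick ONE of the ~log r geometric weight classes uniformly at random;
    greedily accept arriving elements of that class."""
    rng = random.Random(seed)
    sample = {x for x in stream if rng.random() < 0.5}
    if not sample:
        return []
    w_max = max(w for w, _ in sample)
    forest = []                                   # rank estimate: maximal forest
    for w, e in sorted(sample, reverse=True):
        if acyclic([f for _, f in forest] + [e], n):
            forest.append((w, e))
    r = max(len(forest), 2)
    k = rng.randrange(math.ceil(math.log2(r)))    # which weight class to keep
    lo, hi = w_max / 2 ** (k + 1), w_max / 2 ** k
    picked = []
    for w, e in stream:
        if (w, e) in sample:
            continue
        if lo < w <= hi and acyclic([f for _, f in picked] + [e], n):
            picked.append((w, e))
    return picked
```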
I don't know if I confused you. Think about what we are using, as in the spanning tree case:
suppose you are looking to connect the graph. If edges come to you, then you can just take
edges as long as they don't form a cycle; you should always take an edge that decreases the
number of connected components. Even if you didn't understand that, it's not the whole deal.
What you should understand is that we use extremely little information
from the sample: basically we only use it to calculate the rank and the maximum weight. On
the other hand, the hat example really told us that we want to protect heavier elements from
lighter elements: the problem was that we had one really valuable guy, and then these light
guys came and messed up our life. That's what our algorithm will do. It has two steps, as I will
explain later on. The first step is really this: it will try to protect heavy elements from being
messed up by light elements. First we sample 50 percent; this might be our sample. The first
step protects heavier elements from light elements, and the second step is a little bit technical:
it is actually to decompose the [indiscernible] into a set of sub-problems. Let's not worry about
that at the moment.
Let me explain by example. Suppose we have this sample. The green edges are the heaviest
ones that we saw in the sample, the blue ones are the second heaviest, the red is the third
heaviest, and these violet guys are the lightest. Suppose a blue guy arrives. The first check is
that he should not be spanned by heavier elements. What do I mean by that? We should check
that the green guys do not create a cycle with him, because the green guys are the only ones
heavier than him; we should check that the green guys do not contain him in their span. If he is
contained in the span of the green guys, then for sure the optimum will never take him,
because it would first select the green guys. That's a [indiscernible]. The second condition is
there because we want to be able to decompose into independent sub-matroids. What we
should check for that is that he is contained in a cycle if I look at the slightly lighter elements
together with all the heavier elements: I look at the red elements plus the blue elements plus
the green elements, and I check that these elements create a cycle with him. And they do in
this case. That is the second condition: he is spanned by elements of S of slightly smaller or
higher weight. In this case he is spanned, because this creates a cycle, so we pick him. Those
are the two conditions. The first one is natural; the second one is a little bit weird, and it is just
because we want the parts to be
independent. The hat example, how will it work out? This first guy is not spanned by heavier
elements, because right now I only saw the blue guys and they all have the same weight, so he
is fine with condition one. Is he fine with condition two? Yes: he is spanned by elements of the
same weight, I didn't even have to use lighter elements. I take him. Then this guy arrives, and
he is fine with condition one, he is not spanned by heavier elements, but he is not fine with
condition two, because there is no cycle even if I look at all the edges together with him, so I
should not take him. And this is good news, because this was exactly the dangerous situation,
taking two edges like that. Now I will never be able to take two edges like that, because I need
one of them to be in a cycle, as I had over here: there I had two edges, so the second one
closed a cycle and that's why I could pick it; here I will never pick that one, and therefore I will
never create the case where I took two such guys, so the heavy element I can always take. This
gives a constant competitive ratio for the generalized hat. So far so good as far as explaining
this. The slightly smaller weight means that I look at one weight class below me.
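(The two checks in isolation, as I read them, again on the graphic matroid with `acyclic` from
above; "one class lighter" stands in for "slightly smaller weight", and the real algorithm also
tracks the same-class picks.)

```python
def protective_accept(e, cls_e, sample_by_class, n):
    """cls_e is the arriving edge's weight class (0 = heaviest);
    sample_by_class[c] lists the sampled edges of class c.
    Condition 1: e must NOT be spanned by strictly heavier sampled edges.
    Condition 2: e MUST be spanned once same-class and one-class-lighter
    sampled edges are added."""
    heavier = [f for c in range(cls_e) for f in sample_by_class.get(c, [])]
    if not acyclic(heavier + [e], n):
        return False                     # spanned by heavier: optimum skips e
    wider = heavier + sample_by_class.get(cls_e, []) \
                    + sample_by_class.get(cls_e + 1, [])
    return not acyclic(wider + [e], n)   # must close a cycle one class down
```

On the hat, the heavy edge passes both checks (nothing is heavier, and the light rim spans it),
while a second rim edge fails condition 2, which is exactly the fix described above.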
Have you seen matroids before? No one has seen matroids before except Mohit. Okay. Why
we can formulate this algorithm in matroid notation is as follows: we define one matroid for
each weight class. How is the matroid defined? Let's look at matroid M1. M1 is not so
interesting, so let's look at M3. What does this mean? We contract all the elements that are
heavier than the third weight class, that's the big blue set, and then we restrict to what is in the
span of the slightly smaller elements, that's the red set here. And this is the matroid. Basically,
the reason these are independent sub-problems is that you contract what is in the span of the
previous one, and they are nicely nested like an onion.
That was just a side remark for those who know matroids. Let's now analyze this algorithm. To
analyze it, it's really intuitive to think about the two cases: we take an element if it is not
spanned by heavier elements but it is spanned by slightly smaller ones. By definition, you get
the following claim: the algorithm picks an element e with probability at least Pr[e is spanned
by sampled elements of just-smaller weight and higher] minus Pr[e is spanned by sampled
elements of the same or higher weight]. This basically is the definition, because we take an
element that is not spanned by heavier guys, including the guys in my own weight class, and it
should be spanned once the just-smaller weights are included. The two events might be
dependent; that's why I take the difference rather than a [indiscernible]. This difference has to
be large: the first event looks at more elements, including the just-smaller weights. And the
probability is over the sample.
>>: Sometimes you don't pick them because they cause a cycle with what you already picked,
and so you don't pick them.
>> Ola Svensson: Yeah, that's why I have the same or higher. The same, because I will only do it
for every second weight class. I was lying a little bit: I will only pick from every second weight
class, and for that weight class I will check that it is not spanned by the same-weight elements
that I took. Otherwise, it would be only higher here.
>>: Okay. So the sample [indiscernible] I don't see. This is the same probability over the
sample? [indiscernible] element, which element?
>> Ola Svensson: Elements in the sample of smaller weight and higher, minus the probability of
being spanned by elements in the sample of higher weight together with those you picked of
the same weight. But note that we sampled 50 percent of the elements, and in the worst case
we will have taken all of the remaining 50 percent of the same weight before the element
arrives, and this has the same distribution; that's why we took 50 percent. You can think of it
as the probability, over the sample, of being spanned by the smaller weight and higher, minus
the same or higher weight. Let's
now understand why we want this to be high: we want an element to be picked with high
probability. So let's understand why this is good for the hat example. Look at an element and
suppose it's part of weight class i. Here are the graphs: this is the probability of being spanned
if I only look at things of the same weight or higher, and here if I look at slightly lighter
elements and higher. So what happens in the hat example? When I look at my same weight
and higher it's very unlikely to be spanned, but when I look one weight class lighter it's
extremely likely to be spanned; basically it looks as follows. That's why we are very likely to
pick the elements in the hat: they become extremely likely to be spanned once we go one set
of elements lighter. Unfortunately, in the general situation there is no such peak; that's why we
don't get a constant, there are no such nice jumps. Now we have log rank many weight classes
here and we are stuck; we don't know how much more information we can get from the
sample. Now the last algorithm will actually
information we can get from the sample. Now the last probability algorithm is actually to run in
group these weight classes. Now you have log many classes and we will do geometric grouping
again. Here it's a little bit hard to understand but, again, we will geometrically, we will group
these together in a random way and first we will decide how many weight classes we will group
together and we will do this geometrically. Say we group one weight class together to keep the
original weight classes or we take two together or four together or eight together. We will be
log log rank many choices. And then you also have to take a random shift based on the size of
your grouping. This is a pictorial view size of our bucketing and then I will do a random shift.
Now the probability of picking my element will exactly be this, the probability of being spanned
here minus the probability of being spanned there. If I take [indiscernible]. If I was luckier it
would be C open 3 minus C open 1 and C [indiscernible] minus 0.1. And what I can show is this
method will give you exactly log log rank. The intuition is now that it's log log rank many
choices because we first had log rank many weight classes. Now we have log of that many
different bucketing. I think the main message was that I think the log rank competitive
algorithms only use rank of matroid. So we use the matroid structure well the protection of
I think the main message is that the previous log rank competitive algorithm only uses the rank
of the matroid, whereas we use the matroid structure more, the protection of higher weight
elements, and this allows us to get the log log rank competitive algorithm. Let me say what the
difference is. As I understand it, the difference between Lachish and our approach is that we
do this geometric grouping, and this random choice is independent of the sample. What he
does is take the sample, and I think he then engineers the best possible decomposition. For
that you need very high certainty that the best possible decomposition for the sample is also
good for the remaining part. That gets very hairy, I think, with very big hidden constants, but
he claims that his algorithm achieves log log rank with high probability; our algorithm achieves
log log rank in expectation. So what is my belief? Actually, I believe that there is a
decomposition that guarantees a constant fraction. We don't really know what kind of
information we can get out of the sample; we have some sufficient conditions to actually write
down a decomposition, but we don't know if the sample gives us enough confidence in the
properties that we need. You can ask me later on. Thank you.
>>: Do you have an example where the [indiscernible] of what you said is much smaller than 1
over log log of the rank of the [indiscernible]?
>> Ola Svensson: Yes.
>>: Basically, that happens when you pick one particular size of the bucket.
>> Ola Svensson: Yeah, I mean, one trivial example in our case I think is the hat example,
because it has two weight classes, but our algorithm never looks at the fact that there are only
two weight classes: with huge probability we group them together and then you take nothing;
you don't take the heaviest guys.
>>: So basically you get one with probability of one over log rank?
>> Ola Svensson: In that case, yes; when you have this case then you get that. I think he would
try to understand which case you are in [indiscernible].
>>: Your algorithm uses more randomness? Like its own randomness, I believe?
>> Ola Svensson: Yeah.
>>: And the other one, you said, doesn't use that, or does it too?
>> Ola Svensson: I don't know. But I know that he is engineering how the decomposition will
look.
>>: Obvious but the [indiscernible] necessary to [indiscernible]
>> Ola Svensson: Probably not, but I don't know. It sounds weird but you need [indiscernible].
But I don't know.
>>: Where did you get your constant?
>> Ola Svensson: I don't really know. There is some [indiscernible] constant to reduce to the
case where you know the maximum weight and the rank, because you have to assume that you
know the maximum weight, and you have to run the single secretary algorithm with some
constant probability and, you know, I don't know. 3000 is a conservative estimate; in the paper
there are some pluses, so I thought it would be safe to still write 3000.
>> Mohit Singh: Let’s thank the speaker again. [applause]