>>: All right. So it's a pleasure for me to introduce Ilan Cohen who's visiting MSR for today and
tomorrow and he's a PhD student at Tel Aviv University supervised by Yossi Azar and today he'll
speak about joint work with Seny on tight bounds for online vector bin packing.
>> Ilan Cohen: Okay. Hello everyone. My name is Ilan Cohen. I'm going to talk about tight
bounds for online vector bin packing, joint work with Yossi Azar, my advisor, Bruce Shepherd
from [indiscernible] University and Seny Kamara from Microsoft Research. So first we'll talk about
the motivation for the problem. Nowadays the use of cloud computing is increasingly
widespread. Instead of doing the computation at the end station, people are doing the
computations inside the cloud. This results in several problems. One of them is how to
schedule the jobs that are coming to the cloud. So this is the classic scheduling model. So the
scheduler needs to decide on which server to put the job or the task. Each job has a certain
capacity that it takes on the server and so the load on the server will depend on the task that
the scheduler assigns to it. So we are going to expand this model and instead of looking at a one-dimensional server, we're going to look inside the server and see that there are a couple of
resources and the demands can be independent between them. There can be tasks that demand a lot of
CPU and a lot of GPU but not a lot of memory and vice versa. There can be tasks that
demand a lot of memory but not a lot of processing resources, et cetera. So we assume that we
know what each task demands in advance. So a task comes to the system with its demand known and we
need to decide where to assign it. In other words, in cloud computing the scheduler needs to
assign tasks to identical computers, the servers. Each task has a d-dimensional demand, so the
demand for CPU, for memory and so on; one demand can be unrelated to another. And each computer has a d-dimensional capacity constraint that we cannot overflow in any dimension. Okay? And
now our goal is to assign the tasks to the identical computers and we can open a new computer
if necessary. And the goal is to minimize the number of computers used. Okay. If we
normalize every value to be between 0 and 1 we get a vector bin packing problem in which the
vectors are d-dimensional. Vector xi represents the task that arrives to the system; it's a
d-dimensional vector between 0 and 1 because we normalize everything to between 0 and 1, and
we need to pack the vectors into bins. Bins will represent the servers. Now the capacity
constraint of the servers turns into a capacity constraint on the bins: we are saying that each bin
has a d-dimensional capacity constraint. We are going to sum up all the vectors in the bin; this sum is actually d-dimensional. Any question? So we cannot overflow in any of the dimensions. So just a little
example, a two-dimensional example. Let's assume that a vector arrives, so it can be assigned to
the first bin. The second vector arrives; it still can be assigned. The third arrives and still can be
assigned. But if I'm going to assign the fourth vector I'm going to overflow in one dimension,
so I must open a new bin and assign the new vector to the new bin. But I can still use the first
bin: if another vector arrives I can still assign it to the first bin. Okay, I'm going to talk
about the online algorithm, the online version of the problem. In an online algorithm we need to
answer a sequence of requests. In our case a sequence of vectors arrives into the system and
each request must be served immediately. In our case we must assign it to a bin immediately.
And each assignment is irreversible. After we assign a vector to a bin we can't change where
we assigned the vector to. Okay? And for this kind of problem we are going to talk about the
competitive ratio of the algorithm. Simply, if we have an instance I, a sequence of requests,
of vectors in our case, we are going to compare the algorithm's result, the number of bins used
by the algorithm, versus the optimal offline result. We are going to assume the optimal knows
all the vectors in advance and computes the optimal number of bins to assign the vectors to. We look
at the competitive ratio, which is the ratio between the algorithm's result and the optimal result.
The algorithm's result will always be at least as large, so it's some number at least 1, and we are
going to take the [indiscernible] over all instances. And our goal is to bound this ratio. So
what is known about the problem? Any algorithm of the Any Fit type, which opens a new bin only if necessary, will give
you order of d. Garey, Graham, Johnson and Yao analyzed the First Fit algorithm, which is a version
of Any Fit: just put the vector in the first bin that fits it, that's feasible. Open a new bin if there is no
feasible bin. They analyzed this algorithm and they found out it's d plus 0.7. So the
previous lower bound for this problem was 2 by Galambos, so as you can see there is a huge
gap between the upper bound, which is order of d, and the lower bound, which is 2.
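
[Editorial sketch] The First Fit rule described in the talk can be written in a few lines. This is a minimal illustration, not the speaker's code; the function name `first_fit`, the tolerance `eps`, and the sample vectors are assumptions made for the example.

```python
from typing import List, Sequence


def first_fit(vectors: Sequence[Sequence[float]], capacity: float = 1.0) -> List[int]:
    """Online First Fit: put each arriving vector in the first bin where no
    coordinate overflows; open a new bin only if no existing bin is feasible.

    Returns the index of the bin chosen for each vector, in arrival order.
    """
    loads: List[List[float]] = []  # loads[b][k] = load of bin b in coordinate k
    assignment: List[int] = []
    eps = 1e-9                     # tolerance for floating-point sums (assumption)
    for x in vectors:
        for b, load in enumerate(loads):
            if all(l + xi <= capacity + eps for l, xi in zip(load, x)):
                for k, xi in enumerate(x):
                    load[k] += xi
                assignment.append(b)
                break
        else:
            loads.append(list(x))  # no feasible bin: open a new one
            assignment.append(len(loads) - 1)
    return assignment


# Hypothetical two-dimensional input in the spirit of the example in the talk:
# the fourth vector overflows bin 0, the fifth still fits in bin 0.
demo = [(0.5, 0.2), (0.3, 0.3), (0.1, 0.4), (0.2, 0.5), (0.1, 0.1)]
print(first_fit(demo))
```

The assignment is irreversible by construction: a vector's bin is fixed as soon as it is appended to `assignment`.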
>>: [indiscernible]
>> Ilan Cohen: Yeah, also for a randomized algorithm.
>>: You're talking about [indiscernible]
>> Ilan Cohen: Yeah, we're going to talk about that a little while.
>>: [indiscernible]
>>: It's been only fit online.
>> Ilan Cohen: It's been only fit online. Actually it's not a fit NP hard problem. We're going to
see a reduction to graph coloring which means that the upper bound for graph coloring which
will…
>>: But [indiscernible]
>> Ilan Cohen: Yeah. A proposal online, in online reduction but we talk about it. But first of all
we want to close the huge gap between order of d and 2. The first result that we have: we
have a new lower bound of d to the power of 1 minus epsilon for every constant epsilon. That
means you cannot get any sublinear approximation. We also
studied the case of 0-1 vector bin packing, where each vector coordinate is either 0 or 1. And in this
case the lower bound that we proved is d to the power of one half minus epsilon, and we
proved afterward that for First Fit on 0-1 vectors the upper bound is square root of d, so in this case the
lower bound and the upper bound are also tight. So the question, because order
of d, d can be a really large number, so the question is what can we do, because the
upper bound and the lower bound are tight. The only thing that we can do is add another
constraint on the input. If I take an example from real life I can say
that each task that comes to the cloud cannot demand more than 20 percent of the server
memory or more than 20 percent of the total CPU. If I have this constraint on
the input, I expect to get a better competitive ratio for the algorithm and a different lower bound.
So we'll define general vector bin packing to be with the constraint that each demand is at most 1
over B of the capacity, for B some integer larger than 1. There are two ways to formalize it. One
can leave the bin size at 1 as before. That means the load of a bin is smaller or equal to 1 and each
vector is between 0 and 1 over B. You see that there is a factor of B between the maximum
request and the capacity of the bin. In the second formalization we just multiply everything
by B and we say that we have bins of size B, so still the maximum
demand is 1 over B of the capacity of the bin. The same, just multiplied by a factor of B.
Okay. So for the new problem the old lower bound crashes. We found a new lower bound which is
d to the power of 1 over B minus epsilon, okay? As B increases it's getting closer to a constant
lower bound. And the question is whether there is an upper bound that improves as B increases. And
we found another algorithm which gives an approximation [indiscernible] of order of d to the power
of 1 over B-1, log d, where you see a shift by 1 in the exponent. [indiscernible] and we have
matching results also for the 0 or 1 case; it's a [indiscernible] better result than the 0 to 1.
>>: When you are using the version where the bin size is B, B to the d?
>> Ilan Cohen: Yeah there Because you get a much better result. If B is 5 you will get square
root [indiscernible] of log d. But first things first. Let's close the gap with B equal to 1, okay? So
we're going to use a reduction to online graph coloring. I bet most of you know graph coloring.
I'm going to talk about the online version of the problem. Let's assume that there
exists a vector bin packing algorithm that is order of d to the power of 1 minus epsilon competitive. If there
is such an algorithm, I use it to produce an online graph coloring algorithm that is order of n to the power of 1
minus epsilon competitive, and we will see afterward that online graph coloring is actually hard to
approximate. Because online you cannot get this approximation for
online graph coloring, there isn't a [indiscernible] vector bin packing algorithm either. So let's
talk about online graph coloring. Because it's the online version, it is also going to get a sequence of
requests. In our case each request is a vertex, and when vertex vi arrives, it
just reveals its previous neighbors. It reveals its neighbors among v1 to vi minus 1. So you can
verify that you can describe any graph by just revealing, for each vertex, its edges to the vertices that previously arrived.
And as in graph coloring the algorithm needs to set an admissible color for vi. Admissible is a
color that is different from its neighbors' colors, and the goal is to minimize the number of
colors used. Here's a quick example. So vertex a arrives and we need to color it immediately
because it's an online problem. We choose to color it with red. The second vertex, b, arrives and
now it reveals that it is [indiscernible] to a, its previous vertex, so we cannot choose the color
red. We're going to color it green. Then vertex c arrives. It has an edge to a red vertex but not
to a green vertex, so we can use the color green once again. d arrives and it shares an edge with
c and with a, so we can use neither red nor green, so we choose, let's say, blue. Now when vertex e
arrives we can use the color red again, so we color it red. So this is the online
graph coloring problem, and as I said before it is really [inaudible] approximate. In fact, Halldorsson and Szegedy
proved that there exists an adversary that reveals a graph that the optimal can color with
order of log n colors, while any online algorithm will use at least n over log n colors. And it's a
huge gap. So in other words, that means that for any epsilon larger than 0 you cannot
color this graph with n to the power of 1 minus epsilon colors. Okay. So now we do the reduction. So
as I said, for each vertex vi I'm going to produce a vector xi that corresponds to this vertex, and each bin
will correspond to a color. So my demand from the construction is: a subset of vertices
is independent, that means they can have the same color, if and only if the
corresponding vectors for these vertices can be in the same bin, because I want to have a
correspondence between the bin and the color. If I succeed with this condition the chromatic
number will be equal to the optimal number of bins. And the number of bins
used by the algorithm is the same as the number of colors used to color the graph. This
condition looks a bit odd, so I'm telling you that I'm going to use coordinates that are just 0, 1 or 1
over n, where n is the number of vertices, which I know in advance. This time I'm going to
change it to a much clearer condition: if I have an edge between vi and vj, that means
they cannot have the same color, I want to say that they cannot be in the same
bin. That means that there exists a k such that the value of the k-th coordinate of xi plus xj is
larger than 1, so you cannot put them in the same bin. They cannot have the same color. I
want to make the correspondence. So how do I do that exactly? I say that I'm going to produce
a vector xi for each vertex that arrives, but there is a problem with future edges, because I don't know
what the future edges will be. Let's assume that j is larger than i, okay? So I'm going to
put 1 in the i-th coordinate of xi automatically. Okay? For each vertex that arrives I'm going to put 1
automatically, and then, since j is larger than i, if j has an edge to i it will put 1 over n in
the i-th coordinate of xj. Okay, that means that vector i and vector j cannot be in the same bin, and this
is the condition that we had before. And all the other coordinates are going to be 0 so they will
not affect the others. Here's an example. So the first vertex arrives. I'm going to put 1
automatically in the first coordinate and 0 in all the rest, so the vector bin packing needs to
assign it to a bin. Let's say it chooses the red bin. So I'm coloring the corresponding vertex with the
same color, red. Now the second vertex arrives. It has an edge to the first vertex, to a, so in the
first coordinate I'm going to put 1 over n. You can verify that I cannot put the
second vector into the same bin as the first vector, so I need to open a new bin. I open the
green one and I will color the vertex with the same color as the vector. Okay. And let's continue
with the example. The third vector arrives, c. c does not have an edge to the second so I put 0 in
that coordinate. You can check that I can put the second vector and the third vector into the
same bin. So I choose to put them there. There is no overflow in any of the coordinates, so I can go
and color the corresponding vertices with the color of that bin. And the next one, d, arrives. It has
an edge to the third vertex and to a, so in the corresponding coordinates I put 1 over n, so you cannot
use the color green or the color red, so you must use a new color, blue.
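
[Editorial sketch] The construction above (1 in a vertex's own coordinate, 1/n in the coordinate of each earlier neighbor, 0 elsewhere) can be checked mechanically on small graphs. The code below is an illustration under assumptions: the function names and the 0-based vertex numbering (a=0, b=1, c=2, d=3) are not from the talk, and the edge set is the one described in the example (a-b, a-c, c-d, a-d).

```python
from itertools import combinations
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def reduction_vectors(n: int, edges: Set[Edge]) -> List[List[float]]:
    """Vector for vertex i: 1 in coordinate i, 1/n in coordinate j for every
    earlier neighbor j < i, and 0 elsewhere. Vertex i only needs to know its
    edges to vertices that arrived before it, so this works online."""
    adjacent = edges | {(v, u) for (u, v) in edges}
    vectors = []
    for i in range(n):
        x = [0.0] * n
        x[i] = 1.0
        for j in range(i):
            if (i, j) in adjacent:
                x[j] = 1.0 / n
        vectors.append(x)
    return vectors


def fits_in_one_bin(vectors: List[List[float]], members: Iterable[int]) -> bool:
    """True if the chosen vectors fit together in a bin of capacity 1."""
    chosen = list(members)
    d = len(vectors[0])
    return all(sum(vectors[v][k] for v in chosen) <= 1.0 + 1e-9 for k in range(d))


def is_independent(edges: Set[Edge], members: Iterable[int]) -> bool:
    chosen = set(members)
    return not any(u in chosen and v in chosen for (u, v) in edges)


def check_equivalence(n: int, edges: Set[Edge]) -> bool:
    """Brute force: a subset fits in one bin iff it is an independent set."""
    vectors = reduction_vectors(n, edges)
    return all(
        fits_in_one_bin(vectors, s) == is_independent(edges, s)
        for size in range(1, n + 1)
        for s in combinations(range(n), size)
    )


TALK_EDGES = {(0, 1), (0, 2), (2, 3), (0, 3)}  # a-b, a-c, c-d, a-d
print(check_equivalence(4, TALK_EDGES))
```

Because bins and color classes coincide, any online packing of these vectors can be read off directly as an online coloring of the graph.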
>>: So the 1 over n is not important? It's epsilon?
>> Ilan Cohen: Epsilon, yes.
>>: Can you make it [indiscernible]
>> Ilan Cohen: No. If you put something above 1 over n you can't; something below 1 over n
will be enough.
>>: [indiscernible] n-1?
>> Ilan Cohen: What?
>>: 1 over n -1 [indiscernible]
>>: [indiscernible] less than 1 over n.
>> Ilan Cohen: Less than 1 over n because you don't want [indiscernible] for the other bins to
perform. Say d does not have an edge to c, so you want to be able to put all of them
together without overflowing this coordinate, so because it's 1 over n you can put them. Okay.
So I'll quickly show you: in 0-1 vector bin packing you cannot use the trick of 1 over n, so
instead of that we increase the dimension of the vectors that we are producing. Instead of this I'm
going to start the example again. Now I use n squared dimensional vectors, so instead of
designating one coordinate for each vertex I'm going to put a block of 1s, and then x1,2 is going to record whether
there is an edge between 1 and 2. So now when the second vector arrives it has an edge, so we put
1 in the corresponding place and we must put it in another bin. Let's see the last
example. As you saw, there is a block of 1s, put automatically as before. And we
put a 1 in another place if there is an edge between them; because we do it in the block, that
ensures that the two of them are not in the same bin as each other. But because we use n squared
dimensional vectors, the corresponding lower bound, instead of order of d to the power of 1 minus epsilon,
will be square root of d. So we saw a lower bound reduction, and as I said, because online graph
coloring is really [indiscernible] approximate, so is vector bin packing; this concludes vector
bin packing with B equal to 1. So if the online vector bin packing uses alpha bins, we get an online graph
coloring with alpha colors. So we take any graph and we get an online coloring automatically
using vector bin packing. The reason that I show you this simple diagram is because it's going
to be more complicated with B larger than 1. Okay, we don't know how to use vector bin
packing in order to get a coloring directly, but instead we are going to show you how to get alpha
classes of triangle free subgraphs. If you think about what this coloring did, each color
induces a subgraph which is 2-clique free, an independent set. So we used the vector bin packing with
B equal to 1 to get subgraphs that are independent sets; with vector bin packing with B
equal to two we are going to use it to get alpha subgraphs that are triangle free. Each
bin will correspond to vertices which form a triangle free subgraph.
But we don't have any hardness results on taking a graph and partitioning it into triangle free subgraphs.
So we need to show you how to color it. So if you look at the example, the blue part is triangle
free and the red graph is triangle free, but it is not a valid coloring. We have two adjacent
[indiscernible] with the same color. So I will show you a [indiscernible] that knows how to take this
triangle free subgraph and give a valid coloring, so our final color will be the class and the color
that the algorithm gives. As you can see, the algorithm gives this one and this one the same color,
yellow, but the class of this vertex is blue and the class of this vertex is red, so eventually
the coloring will be different because we are going to…
>>: But what is k?
>> Ilan Cohen: k I will explain later. k should be small enough in order to prove that this
problem is still hard. We'll talk about k later; if k is small enough, we don't use too many
colors eventually, because we know that alpha is small. We assume that alpha is small
because the vector bin packing is a good approximation [indiscernible].
>>: Are you saying that triangle free graphs have [indiscernible]
>> Ilan Cohen: Not all of them, but we'll talk about a specific subgraph with a small chromatic
number. But first things first, let's talk about how we use the vector bin packing in
order to divide the graph into triangle free subgraphs. With B equal to 1 we get
an independent set automatically. In this graph, if we have a subset of vertices that is triangle free,
there's no triangle inside it, we want to say that they can be in the same bin. Okay.
Because we're going to use again 0, 1 or 1 over n, it's enough that I show you that
if we've got a triangle, then there is a coordinate, I will talk about what the coordinate is exactly, but
in this coordinate if you sum up the values it's larger than two, so these vectors cannot be in the
same bin. Okay. So what is the coordinate? Let's assume that i is smaller than j, which is smaller than
k. First, in vector i I'm going to put automatically, in index ij, there are n squared such indices, I need
[indiscernible], the vector is an n squared dimensional vector. In ij I'm going to put 1 immediately.
Also when j arrives I'm going to put 1 immediately. So now when vector k arrives I already know if
there is a triangle between i, j and k, because all of the vertices have already arrived. And if there
is a triangle then I'm going to put, in the same coordinate of vector k, 1 over n. This ensures that the
algorithm cannot put i, j and k in the same bin. Let's see an example. As you can see there is a
triangle between a, c and d. There is an edge between a and c. There is an edge between c and
d, and between d and a. So in the designated index 1, 3, which is the index for all the triangles that
contain the first and the third vertex, there is automatically 1 and 1, and because d closes the
triangle I will put 1 over n, and this means that you cannot put 1, 3 and 4 in the same bin, so
what you get is a triangle free subgraph. Okay. You can see that a, b and c is not a
triangle, so in index 1, 2 I have 1 and 1, but for the third vertex I put 0, so they can be in the
same bin. And they are in the same bin, the blue bin, without overflowing any coordinate. So
we use the n squared dimensional vectors in order to split the graph into triangle free subgraphs. So this is how
we would do it to [indiscernible] 1. I'm using n by 3 dimensional vector.
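
[Editorial sketch] The B = 2 encoding just described can also be checked by brute force. In this illustration the vectors are stored sparsely as dictionaries keyed by the pair (i, j) with i < j; that representation, the function names, and the 0-based numbering are assumptions made for the example. Following the talk, vertices i and j each put 1 in coordinate (i, j), and a later vertex k puts 1/n there only if i, j, k form a triangle.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[int, int]
SparseVector = Dict[Tuple[int, int], float]


def triangle_vectors(n: int, edges: Set[Edge]) -> List[SparseVector]:
    """One coordinate per pair (i, j), i < j. Vertices i and j put 1 there;
    a later vertex k puts 1/n there when it closes the triangle i, j, k."""
    adjacent = edges | {(v, u) for (u, v) in edges}
    vectors: List[SparseVector] = [dict() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        vectors[i][(i, j)] = 1.0
        vectors[j][(i, j)] = 1.0
        if (i, j) in adjacent:
            for k in range(j + 1, n):
                if (i, k) in adjacent and (j, k) in adjacent:
                    vectors[k][(i, j)] = 1.0 / n
    return vectors


def fits_in_one_bin(vectors: List[SparseVector], members: Iterable[int],
                    capacity: float = 2.0) -> bool:
    """True if the chosen vectors fit together in a bin of capacity B = 2."""
    load: Dict[Tuple[int, int], float] = {}
    for v in members:
        for coord, value in vectors[v].items():
            load[coord] = load.get(coord, 0.0) + value
    return all(total <= capacity + 1e-9 for total in load.values())


def is_triangle_free(edges: Set[Edge], members: Iterable[int]) -> bool:
    adjacent = edges | {(v, u) for (u, v) in edges}
    return not any(
        (a, b) in adjacent and (b, c) in adjacent and (a, c) in adjacent
        for a, b, c in combinations(sorted(members), 3)
    )


def check_equivalence(n: int, edges: Set[Edge]) -> bool:
    """Brute force: a subset fits in one bin iff it induces no triangle."""
    vectors = triangle_vectors(n, edges)
    return all(
        fits_in_one_bin(vectors, s) == is_triangle_free(edges, s)
        for size in range(1, n + 1)
        for s in combinations(range(n), size)
    )


TALK_EDGES = {(0, 1), (0, 2), (2, 3), (0, 3)}  # a-b, a-c, c-d, a-d
print(check_equivalence(4, TALK_EDGES))
```

In the talk's example the triangle a, c, d overflows coordinate (a, c), while a, b, c share a bin because b and c are not adjacent.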
>>: [indiscernible] for every [indiscernible] of size B you have B -1 and this is representing the
first B -1 [indiscernible]
>> Ilan Cohen: Yeah. Now you understand the -- yeah. But it's not a valid coloring. Now I
need to color the graph. Now I need to show you the second part, to get a valid coloring,
because the result is just about coloring. Okay. First of all, if I'm going to do
it for a general graph, to pick any triangle free subgraph and to color it in the online
way, I don't know how to do it. We know that the chromatic number [indiscernible] n but we
don't know how to color it, and especially in the online fashion. Instead of that we're going to
the lower bound toolbox and going to bring a big [indiscernible] toolbox, okay? Halldorsson
and Szegedy also proved that online graph coloring is hard to approximate even for graphs that have two
properties that I'm going to discuss. First, as before, the adversary produces a sequence of nodes;
vi reveals its previous neighbors and the algorithm needs to set a color for vi. So the first new
property that they showed, it's called the [indiscernible] model or the spit in your face
model, is that after each step in which you decide the color for vi, the adversary will laugh at you
and say what the real color for the vertex is. And the second important property that this graph
has is that if vi is a neighbor of vj then vi is going to be a neighbor of all previous vertices that
have the same color as vj. Okay? And we'll discuss and see how it helps us. So I…
>>: [indiscernible] color or…
>> Ilan Cohen: No. Adversary color, that's why. This is to explore the properties of the graph
and we are going to exploit these properties in order to…
>>: [indiscernible] this is adversary [indiscernible] color?
>> Ilan Cohen: No. You are going to color it first and then eventually you will [indiscernible]
>>: [indiscernible] basically [indiscernible]
>> Ilan Cohen: Yes. That means if I am a neighbor of vj and vj's color is yellow, then I am going
to be a neighbor of all previous vertices whose adversary color is yellow. Okay? But we will see an
example. So Halldorsson and Szegedy proved also for this kind of…
>>: [indiscernible] before you have even chosen your color, right?
>> Ilan Cohen: Yes. The protocol works like this. A vertex arrives and reveals its previous
neighbors. You need to declare a valid color which will be adversarial. Then the adversary will
[indiscernible]
>>: But I'm saying the last properties for those adversary colors [indiscernible] it doesn't matter
what [indiscernible] [multiple speakers]
>> Ilan Cohen: Actually it's online known graph.
>>: [indiscernible] sort of colors at the adversary would choose because it's only the colors for
which the entire set like you would [indiscernible] color [indiscernible] okay?
>> Ilan Cohen: Yeah. Also for graphs with these two known properties, the chromatic
number of the graph that is revealed is order of log n and any online algorithm will use at least n
over log n colors. Actually it's a really nice paper. It's a one-page paper, if you have time, yeah, one page with everything, related work and everything. If you have a spare hour. So now I'm going to
show you how to color a triangle free subgraph. I'm assuming that I'm taking a triangle free subgraph and I'm
coloring it vertex by vertex, in online fashion of course. The coloring protocol will be
really simple. It uses the adversary coloring, and the coloring principle says: color it with the first
neighbor's adversary color in the same bin. Look at the first neighbor in the same bin and
choose…
>>: [indiscernible] so you have this triangle free…
>> Ilan Cohen: I have this triangle free, now I want to in online fashion to color all the blue
subgraphs to get…
>>: [indiscernible] random tree?
>> Ilan Cohen: Triangle free and there are two properties [indiscernible] because the property
is for the on graph, so it's also for the subgraph.
>>: [indiscernible] so why is the, why are they just probably true?
>> Ilan Cohen: They proved also for graph with these two properties it's really hard to
approximate.
>>: Oh, okay. So you are reducing from that…
>> Ilan Cohen: Yeah. I'm not taking any triangle free subgraph. I'm taking a triangle free subgraph that
comes from this kind of graph. It's from the online graph coloring paper.
>>: [indiscernible]
>> Ilan Cohen: Yes. The vector [indiscernible] just gives me the partition; now I show you a valid
coloring with a small number k, which is going to also be log n, so you will see that I don't do that
[indiscernible]. So what is the coloring? The protocol is really simple right now. It's the first
neighbor's adversary color if such a neighbor exists, and it's going to be 0, black, some color that the
adversary doesn't have, some new color, if such a neighbor does not exist. So the k is actually log
n +1, something like that. Okay. So we are bound and we are going to exploit the adversary in
order to color the graph. But just remember the assumption: we assume that all the vertices
arrived from the same bin; that means that they are triangle free. Okay? So the first
vector arrives. According to the protocol I'm going to color the [inaudible] with black because
it has no first neighbor. Now the adversary needs to reveal its coloring, which is red, this guy.
The second vector arrives. According to the protocol I'm going to color it with red, the first
neighbor's adversary color. The adversary needs to pick a valid color, say yellow. The third
vector arrives to the bin, not to the general graph, because we are now coloring just one bin,
but in online fashion. It doesn't have any neighbors. It may have a neighbor from
another bin, but we don't care [indiscernible] make sure. Okay? So [indiscernible] and the next one
arrives and you can see it has an edge to a vertex with adversary color yellow, so it must have an edge to all
yellow ones. So in this case we're going to color it yellow, so the adversary [indiscernible]. Let's
see quickly the proof. It's going to be simpler than what you thought. Let's
assume, for example, we have two adjacent vertices which have the same color. Okay? Let's say it's
black. Black is a special color, but the second one has a first neighbor so it can't be black. Let's say
it's yellow. Okay? So both must have first neighbors whose adversary color is yellow. But
remember the second property: if the second vertex has an edge to a vertex with
color yellow, it must have an edge to all previous vertices with color yellow, and then, as you can see, you get a
triangle. And we assumed that this graph is triangle free, so we got correctness.
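
[Editorial sketch] The first-neighbor rule for coloring one bin can be simulated on the four-vertex walk-through above. This is an illustration only: the function name, the input format (for each vertex, the set of earlier neighbors inside the same bin), and the concrete adversary colors are assumptions chosen to match the spoken example.

```python
from typing import List, Sequence, Set

NO_NEIGHBOR = "black"  # special color that the adversary never uses


def first_neighbor_coloring(earlier_neighbors: Sequence[Set[int]],
                            adversary_colors: Sequence[str]) -> List[str]:
    """Color the vertices of one bin online, in arrival order.

    earlier_neighbors[v] holds the neighbors of v that arrived before v in the
    same bin. adversary_colors[u] is the color the adversary revealed for u
    after u was colored, so it is already known when a later vertex arrives.
    """
    colors: List[str] = []
    for nbrs in earlier_neighbors:
        if nbrs:
            first = min(nbrs)  # earliest-arriving neighbor in this bin
            colors.append(adversary_colors[first])
        else:
            colors.append(NO_NEIGHBOR)
    return colors


def is_valid(earlier_neighbors: Sequence[Set[int]], colors: Sequence[str]) -> bool:
    """Check that no edge inside the bin has both endpoints colored alike."""
    return all(colors[v] != colors[u]
               for v, nbrs in enumerate(earlier_neighbors) for u in nbrs)


# Walk-through: vertex 1 is adjacent to vertex 0, vertex 2 has no neighbor in
# the bin, vertex 3 is adjacent to vertex 1 (the only earlier yellow vertex).
bin_neighbors = [set(), {0}, set(), {1}]
adversary = ["red", "yellow", "red", "red"]  # hypothetical adversary colors
print(first_neighbor_coloring(bin_neighbors, adversary))
```

The validity check relies on the bin being triangle free together with the second property of the adversary's graph; for arbitrary inputs the rule can produce conflicts.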
[indiscernible]. Okay. Quick analysis. So we use n squared
[indiscernible] vectors, and we use k which is log n. Log n is a small number compared
to that, so we still get a lower bound. Now we leave B equal to 1 and B equal to 2, and now B
equal to 3. If we go to the general vector [indiscernible]
>>: [indiscernible]
>> Ilan Cohen: It's…
>>: So you still have multiplied by the number of [indiscernible]
>> Ilan Cohen: Of k, yeah. So of log n, yeah.
>>: [indiscernible]
>> Ilan Cohen: So if you get as I said, if you get alpha coloring which is dr…
>>: Then you going to get k alpha.
>> Ilan Cohen: Yeah. But k is small. That's why we use the lower bound is 1 minus epsilon
and…
>>: [indiscernible]
>> Ilan Cohen: Yeah, yeah.
>>: [indiscernible] [multiple speakers]
>> Ilan Cohen: Okay. Great. So now, how do I do it for [indiscernible]? Actually the protocol
that I showed you doesn't just take a triangle free graph and turn it into a valid coloring,
which is 2-clique free. Actually what it does is take any B-clique free graph and split it into k classes
that are (B-1)-clique free, and the proof is just the same: if we assume by negation that we have a (B-1)-clique in one class,
its vertices all have first neighbors with the same adversary color, so we get a B-clique, and we assumed that this graph is B-clique
free. So this is how we get, as I said, we are going to use n to the power of
[indiscernible] dimensional vectors. This is why the lower bound is with d to the power of 1 over B. And
the general picture is: we first use the vector bin packing in order to get (B+1)-clique
free classes, and for each class we use this [indiscernible] in order to get B-clique
free classes, until we get 2-clique free, which is a valid coloring. So we are just multiplying by k +1
in every separation, and B is some constant. The total
number is k +1 to the power of B, multiplied by k. Okay. Because k is small and B is constant it's still
impossible to get this result, and this is how I got the lower bound. Okay? So this
concludes all the lower bounds. But the question now, and we can take 5 seconds to
breathe, is about the upper bound. So we showed, as we expected, that the lower bound decreases
as B gets larger. The question is can we find an upper bound which [indiscernible] gets
better [indiscernible], because we have a much stronger constraint on the input. And if
you take the [indiscernible] fit algorithm or any Any Fit algorithm you will find out that, compared to the
adversary, the ratio stays order of d and is not getting better as B increases. So we presented a
two-stage algorithm which uses some techniques from [indiscernible], from online unrelated
machines and so on. So it's a two-stage algorithm and in the first stage I'm going to pack into
virtual bins; I'm going to stretch the bin capacity, that's why it's a virtual bin. I'm
going to expand it by c log d. c is some constant. Log d is staying log d. But afterwards I need a
second stage, because in the solution each vector needs to be assigned to a real bin, so I'm
going to use the second stage in order to assign. So this is the animation. So this is all the vectors
[indiscernible]. Let's assume we know opt in advance; we can easily [indiscernible] to get a good
approximation up to a constant using doubling. So if we assume that we know opt in advance,
we know that all of the vectors can be assigned into these opt bins, but instead we're going to
allow each bin coordinate to go up to cB log d, okay? And each vector that arrives I am going to assign
to one of the bins and I want to make sure that at the end the maximum load in each coordinate is
cB log d. What I know about the input is that, using the optimum solution,
it can be packed into bins with maximum capacity of B. So this is the first stage. But you can
see that it's not a real solution, so I need the second stage. So each virtual bin I'm going
to handle on its own. Each virtual bin is going to have a group of real bins, the algorithm's bins, and
each vector that arrives to this virtual bin I'm going to assign to one of the real bins associated
with this virtual bin. So, because we've got opt virtual bins, each vector that arrives I'm going to put in
one of the r bins. Because the number of virtual bins was, up to a constant, opt, r will
be the [indiscernible] of the algorithm. So…
>>: [indiscernible]
>> Ilan Cohen: Yeah. I will tell you. [indiscernible]. So what we know about the input of the
second stage is that the vectors can be assigned into bins with capacity cB log d and we need to
assign them to r bins with capacity of B. So let's return to the first stage. We assume that we
know the number of bins. We need to assign each vector to one of the bins. The goal is to
minimize the stretch factor of the bins. Actually we can think about this as another problem:
you get the same vectors and you want to minimize the stretch factor, or the maximum load,
and actually this improves the result of the log [indiscernible] and this gets order of log d,
which is an improvement of the off-line result. How do we do it? We use a potential function and
some techniques from unrelated machine scheduling, if you know it. We use
some constant a and we use a potential function: we are taking, for every coordinate, the load on the
coordinate, which we know in online fashion, and the potential is the constant to the power of the load,
summing over all jobs, over all the machines, over all coordinates. So our goal in this step is just
to minimize the potential function. In other words it is just taking the machine
with the minimal this sum. So if you just…
>>: [indiscernible] you don't need that sum [indiscernible]
>> Ilan Cohen: What?
>>: You don't need that sum [indiscernible] it just [indiscernible]
>> Ilan Cohen: Yeah. It's essentially this one and this one. Okay. If we use just the techniques
from [indiscernible] machine, you get order of log dm, where m is the number of machines. But
with a careful analysis we can remove the m, because we don't want to depend on the number of
machines, and get order of log d.
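
The first-stage rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the algorithm from the talk: the potential is taken to be the sum over bins and coordinates of a^(load/B), the number of virtual bins is known in advance, and the names `assign_greedy`, `num_bins`, `a` and `B` are hypothetical. Each arriving vector goes to the bin whose potential increases the least.

```python
def assign_greedy(vectors, num_bins, B=1.0, a=2.0):
    """Assign each vector online to the bin with the smallest potential increase.

    Assumed potential: sum over bins j and coordinates k of a ** (load[j][k] / B).
    """
    d = len(vectors[0])
    loads = [[0.0] * d for _ in range(num_bins)]
    assignment = []
    for v in vectors:
        best_bin, best_inc = None, None
        for j in range(num_bins):
            # Increase of the potential if v were placed in bin j.
            inc = sum(
                a ** ((loads[j][k] + v[k]) / B) - a ** (loads[j][k] / B)
                for k in range(d)
            )
            if best_inc is None or inc < best_inc:
                best_bin, best_inc = j, inc
        for k in range(d):
            loads[best_bin][k] += v[k]
        assignment.append(best_bin)
    return assignment, loads


if __name__ == "__main__":
    vecs = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
    print(assign_greedy(vecs, num_bins=2))
```

Because only the chosen bin changes, minimizing the increase is the same as minimizing the total potential after the step. Ties are broken toward the lowest bin index, which is an arbitrary choice of this sketch.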
>>: So you [indiscernible] I consider d [indiscernible] if B is small and m is large, then you can't
get log d, right?
>> Ilan Cohen: No. You can always get order of log d. There's a difference from the unrelated
machines. The main difference is…
>>: All the machines are identical.
>> Ilan Cohen: Yes. That's why: all the machines are identical. This is the second expression.
Whichever machine I choose, I can bound each coordinate by the potential of another machine, so
I use it in order to bound. This second term is a telescoping sum that we also have in
unrelated machines, if you know it. So this concludes the proof of the stretch factor of
B log d. Now I just want to show you the second part. The second part will be very simple. As we
said, there is an online stream of vectors that can be packed into virtual bins with capacity
cB log d. What I'm going to do is take r, some parameter that we will set, and distribute the
vectors uniformly at random. For each vector that arrives at this virtual bin, I'm going to
choose a bin uniformly at random and assign the vector to it. The question is what r has to be
for succeeding with constant probability. Using a Chernoff bound we prove that it is enough to
use r of theta of d^(1/(B-1)) log d, and some other stuff. As you can see this is a
[indiscernible] algorithm, so it says that r will be the competitive [indiscernible]. You get
that the probability of overload is less than half. If it did overload then you can just open r
new bins, because you are not allowed to overload. If you see that you are about to overload,
you just open r new bins, because the probability of overload is small.
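
The randomized second stage for a single virtual bin can be sketched as follows. This is a simplified illustration, assuming every single vector fits in an empty bin and that vectors are always routed to the most recently opened group of r bins. The function name `spread_over_real_bins` and the `seed` parameter are hypothetical, and the value of r from the Chernoff argument is not computed here but passed in.

```python
import random


def spread_over_real_bins(vectors, r, B=1.0, seed=0):
    """Spread the vectors of one virtual bin over groups of r real bins.

    Each vector picks a bin uniformly at random in the current group. If that
    bin would overload in some coordinate, r fresh bins are opened instead.
    """
    rng = random.Random(seed)
    d = len(vectors[0])
    bins = [[0.0] * d for _ in range(r)]  # first group of r real bins
    for v in vectors:
        if any(x > B for x in v):
            raise ValueError("a single vector exceeds the capacity B")
        # Uniform choice among the r bins of the latest group.
        j = len(bins) - r + rng.randrange(r)
        if any(bins[j][k] + v[k] > B for k in range(d)):
            # About to overload: open r new bins and choose among them.
            bins.extend([0.0] * d for _ in range(r))
            j = len(bins) - r + rng.randrange(r)
        for k in range(d):
            bins[j][k] += v[k]
    return bins


if __name__ == "__main__":
    vecs = [[0.5, 0.25], [0.25, 0.5], [0.5, 0.5], [0.25, 0.25]] * 5
    result = spread_over_real_bins(vecs, r=4, B=1.0, seed=1)
    print(len(result), "real bins used")
```

No real bin ever exceeds B in any coordinate, and the number of bins used is always a multiple of r, which is what makes the expected count a constant times r when each group overloads with probability below one half.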
>>: [indiscernible] can you overload it?
>> Ilan Cohen: Any other round and encoding it. [indiscernible] So if a bin overloads then we
just open r new bins. So the expected number of bins, because the probability of overload is
less than half, is theta of r. Okay? So this actually concludes the algorithm, but what we got
now is a randomized algorithm for distributing the vectors. So we want to de-randomize this
algorithm, and we use some [indiscernible] using the technique of de-randomization of online
algorithms. We need to de-randomize it in online fashion. And now we do it also by a potential
function. The potential function has two elements. Again, for each bin and coordinate I'm going
to have some potential function, and it is going to depend on the load, as before, up to this
step. And it is also going to depend on the vectors that arrived until now. Now our goal is to
bound the potential: we can bound this potential function, and using that bound we can bound
the maximum load on each coordinate. So how do we bound this potential function? We use a
deterministic technique that we call a random test. We prove that in each step the potential
will not increase. How do we do it? For any given vector xi, we do a random test and check the
expected value of this test. Okay. We choose a bin uniformly at random and assign the vector to
it. Just as a test; we are not really going to do it. The algorithm is deterministic; this is
just for the proof. If we do this action we get an expected value of the potential, and we
prove that, whatever the potential was before and whatever xi is, the expected value is always
at most what we started with. Then that means that there must be a bin which does not increase
the potential function.
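
The averaging step behind the random test can be illustrated with a short sketch. The potential used here is a stand-in (the sum of a^load over all bins and coordinates), not the two-part potential from the talk, and the names `potential` and `random_test_step` are hypothetical. The sketch only shows why a deterministic choice exists: the best bin is never worse than the average over a uniformly random bin.

```python
def potential(loads, a=2.0):
    """Stand-in potential: sum of a ** load over all bins and coordinates."""
    return sum(a ** x for bin_load in loads for x in bin_load)


def random_test_step(loads, v, phi=potential):
    """Evaluate the hypothetical uniform-random placement of v without doing it.

    Returns (best_bin, potential_at_best_bin, expected_potential), where the
    expectation is over a bin chosen uniformly at random.
    """
    outcomes = []
    for j in range(len(loads)):
        trial = [list(b) for b in loads]
        trial[j] = [x + y for x, y in zip(trial[j], v)]
        outcomes.append(phi(trial))
    expected = sum(outcomes) / len(outcomes)
    best = min(range(len(outcomes)), key=outcomes.__getitem__)
    return best, outcomes[best], expected


if __name__ == "__main__":
    print(random_test_step([[1.0, 0.0], [0.0, 0.0]], [1.0, 0.0]))
```

A deterministic algorithm would place the vector in `best_bin`. If the analysis shows that the expected potential does not exceed the potential before the step, the same holds for the bin actually chosen.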
>>: The expectation is only over your one-step random test?
>> Ilan Cohen: Yeah. So for any step, I'm taking the potential before. If I do this test, the
potential after, in expectation, will not increase.
>>: [indiscernible]
>> Ilan Cohen: Yeah. This is exactly the test. So we know the potential function at the
beginning, and we can use it in order to bound the maximum load, because the potential is some
expression in the load. If we know the potential at the beginning, and at the end it is at most
the same, then we can use it to bound the maximum load, and we have a bound of B, so we get the
same approximation [indiscernible]. So we didn't lose anything from de-randomizing the
algorithm. This concludes the upper bound: we have a deterministic algorithm with this
approximation [indiscernible], which, as I said, is getting better as B [indiscernible]. So
let's just conclude what we showed today. First of all, we closed the gap for vector bin packing
with B equal to 1. We studied vector bin packing for larger bin size, B larger than 1. We showed
a new lower bound for this kind of problem, d^(1/B - epsilon). We showed an almost matching
upper bound, d^(1/(B-1)) log d. And we extended these bounds also to 0-1 vector bin packing.
>>: [indiscernible]
>> Ilan Cohen: Yeah, because it is a lower bound for [indiscernible] randomized algorithms,
using Yao's theorem [indiscernible]. As for open questions, we still have this annoying shift
by one between the upper bound and the lower bound, and if B is order of log d then the lower
bound becomes constant, but the upper bound is still order of log d. So we want to close that
gap. And thank you. [applause]
>>: Can you think of your algorithm in the same [indiscernible] because the algorithm
[indiscernible]
>> Ilan Cohen: Yeah but I think that we actually fix exponential weight which is, you can think
about exponential weight as parameter [indiscernible]
>>: [indiscernible] it's usually cleaner to take the [indiscernible]
>> Ilan Cohen: It depends. If you asked my advice it would be to [laughter] [indiscernible]
>>: [indiscernible]
>> Ilan Cohen: Some would prefer exponential [indiscernible] but this is essentially the same
idea beyond the [indiscernible] okay.
>>: Thanks. [applause]