>> Jin Li: It's a great pleasure for us to have Di Niu come here to give a talk. Di got his
PhD this year from the University of Toronto, but since last September he has already
held an assistant professorship at the University of Alberta.
Di was a recipient of the NSERC (Natural Sciences and Engineering Research Council)
Postgraduate Scholarship from 2010 to 2012, and a recipient of the Alexander
Graham Bell Canada Graduate Scholarship from 2006 to 2008.
His research interests span multimedia delivery, cloud computing, data
mining, machine learning, social networks, network economics, network coding, et
cetera.
Today he's going to talk about his work on the economics of resource sharing in the cloud.
>> Di Niu: Thank you, Dr. Jin Li, for hosting my talk. My name is Di Niu, from the University
of Alberta, and the title of my talk is The Economics of Resource Sharing in Cloud
Computing.
The essence of cloud computing is really a pool of shared resources. This is not a
term I coined myself; I borrowed it from some online websites.
Because cloud computing is enabled by virtualization technology, it can bundle
different resources like CPU, memory, disk, and network into virtual machines.
Because of virtualization, we can autoscale a service. Instead of
investing in infrastructure up front, we just rent virtual machines from cloud
providers, and we can launch more instances when demand goes up and
shut down instances when demand goes down. This provides a lot of
elasticity for a service.
And recently there has been some research on bandwidth reservation in cloud computing
systems. Bandwidth reservation has become available for the traffic flowing out
of a virtual machine or between virtual machines. A lot of this research,
for example the first two references, was done at Microsoft Research.
A natural question is: how do we price cloud resources? Look at the current status.
Pay-as-you-go is the status quo for pricing cloud resources. It charges the
user by the instance hours it uses.
That means if you use the cloud, you are charged for the hours you use. If you don't
use it, you don't pay for it.
And there are various payment options, including on-demand, reserved, and spot
instances, et cetera.
For example, this is exemplified by Amazon Web Services.
The questions are: from the cloud side, how do we price the shared resources fairly
and effectively, and is pay-as-you-go enough? And from the client
side, how do we best utilize the different payment options, given that these
options are not very informative to a client that has no expertise in pricing and demand
estimation?
The outline of this talk is given below. We want to frame the two
questions raised above as computational problems: how to better
price the shared resources in cloud computing, and how to better exploit the shared
resources.
We take two case studies in our investigation. The first case study is the cloud network
reservation problem. In that case study we introduce network sharing and
analyze how to price it.
The second case study is the computing instance reservation problem, in which
we introduce computing instance sharing.
In the end we will talk about the lessons we can learn from these case studies.
Let's look at the first case, cloud network reservation pricing. The current method is
to reserve and pay for an absolute amount of bandwidth, either flowing out of a VM or
between VMs. This is the model proposed in the industry references I showed
on the previous slide.
But the shortcomings are apparent. First of all, a client has no expertise in how much
bandwidth it is going to need; that is, it has no expertise in bandwidth estimation.
Second, bandwidth demand varies over time, so even if you reserve a fixed
amount of bandwidth it's not going to work well, because you may waste some bandwidth
during off-peak hours.
Lastly, and most important, if you do individual reservations for each
tenant or client, it will be very wasteful, because many of them are not fully utilizing
their resources at the same time, so we cannot exploit the multiplexing
gain.
So we want to propose a new model for cloud network reservation, given that the data
center engineering techniques are already in place.
In our model, the client simply pays the cloud for a guaranteed service
provided by the cloud, and it does not worry about anything else. Everything else is
taken care of by the cloud provider. That is, given the demand of the tenants, the cloud
reserves the bandwidth automatically from the data centers for the clients. We
assume the clients run quality-of-service-sensitive applications, for example video
services or MapReduce services that require performance guarantees.
So how does the cloud actually reserve the bandwidth? Because it has a lot of
workload data stored in its database, it can retrieve that data, analyze it,
and characterize or predict the workload in order to make a bandwidth
reservation that closely matches the client's demand. Such a service is
intelligent on the cloud side, and the client does not worry about any of the algorithms
itself.
The first benefit is that the cloud has more expertise in estimating the workload, because it
has the data and the intelligence to analyze it. The second benefit is that
because it can do this estimation, it can also autoscale the reservation so that
utilization goes up.
More importantly, the cloud can multiplex the demand of several tenants. So
whereas before each client had to reserve bandwidth individually, now the
cloud can take the demand of several clients and reserve bandwidth
for the whole group, with a performance guarantee provided to each one.
Because of this, the cost of serving all of them is reduced, and the lower
expense can be passed back to the clients as reduced prices.
Now, let's look at the theoretical model in a bit more detail. We assume that
each tenant has a random bandwidth demand, D_i, in a short time period, for example
every 10 minutes.
And we assume this bandwidth demand is a Gaussian variable with some mean and
variance that can be estimated. Also, the clients' demands have some covariance matrix,
because they might be correlated with each other. For example, if they are video services,
they may be correlated depending on their genre.
And we assume the statistics of this demand, D_i, can be characterized or estimated
using some data mining techniques.
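The per-client demand statistics assumed here, a mean vector and a covariance matrix estimated from workload history, can be sketched as follows; the function name and the data layout (one sample of per-client demands per 10-minute period) are illustrative, not from the talk:

```python
# Hypothetical sketch: estimate each client's mean demand and the cross-client
# covariance matrix from historical workload samples.
def demand_statistics(history):
    """history: list of samples; each sample is a list of per-client bandwidth
    demands observed in one 10-minute period.  Returns (means, covariance)."""
    n_samples = len(history)
    n_clients = len(history[0])
    means = [sum(s[i] for s in history) / n_samples for i in range(n_clients)]
    cov = [[0.0] * n_clients for _ in range(n_clients)]
    for s in history:
        for i in range(n_clients):
            for j in range(n_clients):
                cov[i][j] += (s[i] - means[i]) * (s[j] - means[j])
    for i in range(n_clients):
        for j in range(n_clients):
            cov[i][j] /= n_samples - 1   # unbiased sample covariance
    return means, cov
```

A negative off-diagonal entry of the covariance matrix indicates anti-correlated clients, which is exactly what later makes multiplexing attractive.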
>>: So is the covariance between the users, the clients, or across time?
>> Di Niu: It's between users. There's nothing across time here. Let's just look at a
particular 10 minutes, a particular short time period. During this
period there are several clients, and we assume each of them has a random
demand. They might be correlated with each other, and the correlation is
learned from the workload history.
>>: [inaudible] using bandwidth as an example. I think both Amazon and Microsoft
basically charge for bandwidth usage. [inaudible] pricing. So basically depending on how
much traffic you send and receive, they give you a deal.
>> Di Niu: Yeah.
>>: Whereas for VMs, virtual machines, there's two models basically, on demand and
basic --
>> Di Niu: Okay.
>>: [inaudible].
>> Di Niu: Right. Right.
>>: Reserved. I mean --
>> Di Niu: Right.
>>: Can you basically comment a little bit about --
>> Di Niu: Right.
>>: I mean, what [inaudible] you see basically that you try to reserve bandwidth?
>> Di Niu: Okay.
>>: I mean, versus on-demand paying?
>> Di Niu: Right. Traffic is charged based on usage in the current Amazon Web
Services. It's charged based on how many gigabytes you actually transferred in a
given month.
But there's no bandwidth guarantee in Amazon Web Services. That's why there's this
kind of research proposing bandwidth guarantees for users. And I think charging for
a bandwidth reservation is different from charging for bandwidth usage.
Because if we care about the users' quality of service, then we want to give them
a guarantee, so we need to reserve a certain bandwidth, and the charging model
should be different from usage-based charging. Yeah. So, right, it will become clear
later, I'm sure.
Because we are focusing on quality-of-service-sensitive clients, we need to define
a client utility. In our model we let each client specify a guaranteed portion of its
service. For example, if it's a video service, the videos are stored in the data center and at
certain times need to be streamed out of the data center. We define a
guaranteed portion W_i that the client needs. For example, say Netflix
hosts its videos in AWS and says, I need 96% of my streaming service
guaranteed. So that's the only variable the client chooses. It does not choose an absolute
amount of bandwidth; it chooses how much of its demand is guaranteed.
That's why this is different from Amazon's charging model for traffic: there
Amazon charges for usage, but here we propose to charge for the
bandwidth guarantee, the guaranteed service.
Now, when a client requires a guaranteed portion W_i, say 96%, there's a
utility associated with that client. If you require a larger W_i, the utility is higher. It also
depends on the demand of the client, for example the requests
coming from end users. So this is something to be measured, but here it is a
performance metric specified by the client.
On the cloud side, we want to charge the client a price, and this price also
depends on the guaranteed portion. So basically the utility and the price both
depend on the guaranteed portion, not on the absolute bandwidth.
And now, very naturally, client i will choose its guaranteed portion to
maximize its utility minus the price, once the utility and the pricing function are determined.
This is actually the profit of the client. Say the client is Netflix: Netflix will choose
a certain guaranteed portion to maximize its utility minus the cost of doing so, and this
cost is given by the pricing function specified by the cloud.
Yes?
>>: [inaudible] after the guarantee, the 96% [inaudible] utility to drop [inaudible] or --
>> Di Niu: After the 96%?
>>: Yes. So the client wants 96% of the bandwidth to be guaranteed.
>> Di Niu: Yeah. Right.
>>: And what is the utility after that? [inaudible] function of the client?
>> Di Niu: Oh, yeah. I will give you an example of a utility function in the next few
slides.
>>: [inaudible] let's basically maybe [inaudible], I mean, basically this becomes a model
[inaudible] because I mean [inaudible] in cloud computing we have these important
resources, computation, memory.
>> Di Niu: Yeah.
>>: [inaudible] coupled.
>> Di Niu: Yeah.
>>: [inaudible] separate these two, right? And the storage and network. I may argue
network is the most elastic portion.
>> Di Niu: Yeah.
>>: Of the, basically, I mean, bandwidth; it can change a lot.
>> Di Niu: Yeah, right.
>>: And also network, I will argue, is basically the component that the user does not
want, basically does not argue for a guarantee, at least -- I mean, think about my usage
of Azure as a service.
>> Di Niu: Yeah.
>>: I don't need to say, okay, the bandwidth you have to be -- you get [inaudible] you
guarantee 80% of 800 nanobit. Because if the bandwidth is not available, because it's
600K, 600 [inaudible].
>> Di Niu: Yeah.
>>: [inaudible] basically just adapt, right?
>> Di Niu: Yeah. Yeah.
>>: So I have two questions --
>> Di Niu: Yeah.
>>: I mean --
>> Di Niu: Right.
>>: Basically if you actually basically put out this as a pricing --
>> Di Niu: Uh-huh, uh-huh --
>>: How do users, I mean, look at -- the plan has to be attractive enough --
>> Di Niu: Okay.
>>: -- for the customers to be willing to use it.
>> Di Niu: Oh, yeah. Right. Right. So I think it's a different business model. Now
the model is to [inaudible] the network but just to charge the user for the traffic,
the amount of traffic.
But my model says we need to carefully give this portion of bandwidth to the user, and
we charge a fee for this reservation. And we can provide you with a guarantee, say
96% or 98%. Then we can safely use the rest of the bandwidth for other purposes,
and we can sell that to --
>>: [inaudible] this pricing model is not attractive from basically a customer point of
view. It's going to be very difficult --
>> Di Niu: Okay.
>>: Amazon or Microsoft.
>> Di Niu: Okay.
>>: To put out a pricing model like this.
>> Di Niu: Uh-huh.
>>: You need to be more simple.
>> Di Niu: Okay.
>>: All right? I mean --
>> Di Niu: Yeah, right. This is somewhat, yeah, pretty complicated. And we need the
cloud to estimate the bandwidth statistics, and then we will provide a
guarantee based on those statistics, right. So it only applies to
workloads that are very estimatable, that are very predictable.
But for other computing requirements, it's very hard to give a guarantee, actually.
Yeah. But, yeah, here I will give a --
>>: [inaudible] push it back because I think this is the aspect because I think basically --
>> Di Niu: Right, right, right. Yeah. Okay. Sure. So our objective is to maximize the
social welfare of the system, that is, the total utility of everybody minus the cost of
serving them. However, the cloud does not know the utility of the clients, and the
clients do not know the cost function of the cloud. So how can we solve this
problem? There's no way to solve it directly.
However, we can solve it using the pricing idea. We just charge each client a price for
using the service, and each client gives me back a guaranteed portion by maximizing its
surplus. Once I get that, I evaluate whether this is good for social welfare. If it is, I
stay; if it is not, I update the pricing policy to increase the social welfare.
So the pricing actually becomes an iterative solution to the social welfare maximization
problem, and that's why pricing is a computational problem here.
Let me make this more concrete with an example utility
function. Say I have a client, say it is Netflix or some video channel, and its utility
has two parts. It is a function of the guaranteed portion. The first part
is W_i times D_i, multiplied by a coefficient alpha_i. W_i times D_i is the usage, because I
guarantee a W_i fraction of my demand.
So there's a linear revenue from serving this traffic. However, the remaining part is not
guaranteed, and this incurs a utility loss due to the demand not guaranteed. This
loss is a convex function.
For example, say four percent of my clients are not satisfied. Then I may lose some
reputation because of that portion, and the more demand I cannot guarantee, the more I lose.
So the loss is a convex function of the portion I did not guarantee.
So in the end, the utility can be modeled as a concave function. We are also looking
at the expected utility, because the demand is a random
variable. So we take the expected utility: it's the expectation of D_i
multiplied by this term, and after some calculation the later
part takes this form.
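The two-part utility just described, a linear revenue term on the guaranteed usage minus a convex loss in the unguaranteed portion, can be sketched as follows; the quadratic loss shape and the coefficients alpha and beta are my own illustrative choices, not the exact form from the talk:

```python
# Hedged sketch of a two-part client utility: linear revenue on the guaranteed
# share of expected demand, minus a convex reputation loss in the unguaranteed
# share.  alpha and beta are illustrative coefficients.
def expected_utility(w, mean_demand, alpha=1.0, beta=2.0):
    """w: guaranteed portion in [0, 1]; mean_demand: E[D] for this client."""
    revenue = alpha * w * mean_demand
    loss = beta * mean_demand * (1.0 - w) ** 2   # convex in (1 - w)
    return revenue - loss
```

With this shape, guaranteeing 96% of a demand of 100 units costs very little utility relative to a full guarantee, which is why a client may rationally choose a guaranteed portion below one.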
Now let's look at the cost function; it has a form like this. K(W) is the price of the
bandwidth reserved by the cloud for all the clients, and this is the unit price of a unit of
bandwidth.
Let me explain this term on the next slide. For example, some of the parameters can be
like this, and this term here is the percentage I did not guarantee.
>>: [inaudible] for you.
>> Di Niu: Okay.
>>: So I'm not really sure I understand the W versus the D.
>> Di Niu: Okay.
>>: [inaudible].
>> Di Niu: Okay.
>>: What is demand if it's not --
>> Di Niu: Oh, yeah.
>>: Guaranteed.
>> Di Niu: Yeah. D_i is the demand from end users. For example, one thousand users
of Netflix each incur some bandwidth at this moment, and the total demand from
these one thousand users is the actual demand from the client -- from the end users.
>>: [inaudible].
>> Di Niu: Right.
>>: [inaudible]. You know, the actual bit rate that's [inaudible].
>> Di Niu: Yeah, the actual bit rate that needs to be provisioned to satisfy all of them.
For example, if there are one thousand users requesting videos from the server, then you
need to provision one thousand videos' worth of bandwidth to satisfy all of them. But now
say the guaranteed portion is 98%. Then I guarantee just 98% of them, and the
other two percent is best effort.
>>: Okay. So sometimes I think of, you know, the utility as being, you know, not
concave but convex, right? So it's useless unless you can give me what I'm asking,
what my demand is. So I don't understand the --
>> Di Niu: The utility's a concave function here.
>>: Right.
>> Di Niu: Yeah.
>>: Sometimes --
>> Di Niu: Okay.
>>: -- think it might --
>> Di Niu: Okay.
>>: In some cases it's unique, you know, that it's zero --
>> Di Niu: Okay.
>>: -- until you can supply me with what I am asking.
>> Di Niu: Okay.
>>: Like if I can't -- if I'm watching a video and you can't supply the bandwidth --
>> Di Niu: Oh, yeah. Right. Right.
>>: -- then it's not going to be useful.
>> Di Niu: So the client here is a company like Netflix, but there's another set of clients,
Netflix's end users. Say there are 1,000 end users.
If I satisfy 98%, let's say 980, of all these users, I still get some revenue from
the service. But for the other 20 users I neglect, there's a loss of reputation: they
might experience some interruption of the service because of that.
And if we ignore more users, if we ignore 30% of the users, then the loss is even bigger.
Yeah. So that's how I get this.
>>: I mean, basically you always think in the network paradigm --
>> Di Niu: Okay.
>>: [inaudible] networking utilization. Have you [inaudible] a different paradigm,
let's say basically MapReduce, a batch processing paradigm? So here it's basically this,
right. I mean, you have a -- I mean, a cloud which is operating MapReduce,
there's all the users, this can be [inaudible], can be Azure, can be basically [inaudible].
And then the resource here is a number of [inaudible] you want to --
>> Di Niu: Right, right, right.
>>: Right? I mean, and the utility can be measured by the latency.
>> Di Niu: Okay.
>>: By the time it basically takes to complete the job. So basically here, say, I mean,
let's say basically for the same job --
>> Di Niu: Okay.
>>: If I decide to use a hundred computing instances, this may finish, let's say, in 50
seconds. If I use 200 instances, maybe it will finish in 20 seconds or something.
>> Di Niu: Yeah.
>>: I mean, so basically you can get a curve which is the amount of resource you recruit for
this job.
>> Di Niu: Okay.
>>: Versus the time that it completes [inaudible].
>> Di Niu: Right. Right. Right.
>>: The thing about this -- I mean --
>> Di Niu: Okay. Yeah.
>>: Because otherwise, I mean, in our work, we feel that this -- basically this utility curve
doesn't seem to be too realistic.
>> Di Niu: Okay.
>>: I mean, even for networking basically [inaudible].
>> Di Niu: Okay. Okay. Yeah. That's a very good idea, right. Right. I think the major
reason I focus on the network scenario is because I have the data traces for
verification from some video services. But, right, to model the utility as a function of
the instance number, that would be a very good idea. Right. Okay.
Now let's look at the cost function. The cloud needs to reserve a
total amount of bandwidth to satisfy all the submitted guaranteed portions. The total
demand is the sum of the W_i D_i terms, where each term is one client's guaranteed
demand. And the total demand should exceed the reservation only with a very small
probability; that's the violation probability.
If we assume the demands are Gaussian, this can be expressed through
the mean and the variance of the sum of the W_i D_i: it is the
mean vector of the D_i multiplied by the guaranteed portions, plus some function of the
covariance matrix.
And with some mathematical manipulation, we can see that this function is actually a cone
centered at zero. So the cost function looks like a cone centered at zero.
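The Gaussian reservation just described can be sketched in code: to guarantee the portions w with violation probability at most eps, the cloud reserves mu'w plus a Gaussian safety factor times sqrt(w' Sigma w), which is positively homogeneous, i.e. a cone centered at zero. Function and parameter names are my own:

```python
from math import sqrt
from statistics import NormalDist

# Sketch of the Gaussian bandwidth reservation: mean part plus a safety
# margin proportional to the group's standard deviation.  theta is the
# standard-normal quantile for the allowed violation probability eps.
def reservation(w, mu, sigma, eps=0.02):
    """w: guaranteed portions; mu: mean demands; sigma: covariance matrix."""
    theta = NormalDist().inv_cdf(1.0 - eps)
    mean_part = sum(wi * mi for wi, mi in zip(w, mu))
    quad = sum(w[i] * sigma[i][j] * w[j]
               for i in range(len(w)) for j in range(len(w)))
    return mean_part + theta * sqrt(quad)
```

Both properties mentioned in the talk are visible here: scaling all portions by two doubles the reservation (the cone), and one reservation for two independent clients needs less than two separate reservations (the multiplexing gain).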
We find the major challenge here is that, because of the multiplexing, the cost is
coupled among all the W_i, the guaranteed portions of all the clients. So we
cannot solve the optimization problem easily; we need some fast iterative
algorithms.
The existing method is very straightforward. It's called dual decomposition with the
subgradient method. The original problem is like this, and we convert it into
an equivalent problem by introducing another variable
called V. These are the same problem.
Then we write the Lagrangian of this problem, where the Lagrange multiplier is
K_i, and the dual problem is to minimize Q(K), where Q(K) is the supremum of
the Lagrangian function over W and V.
And we find that the Lagrange multiplier here can be interpreted as a
pricing function. So this is actually linear pricing: the first part is the utility
minus the price, which is the surplus of each client, and the second part is the price minus
the cost, which is the income of the cloud.
The algorithm says that we need to update the price iteratively in every step. This is
the gradient of Q(K), where this term is the maximizer of the first part under a
given K, and this term is the maximizer of the second part under a given K.
However, the algorithm has some weaknesses. Let's look at the procedure again.
First, the cloud provider charges some price, and based on that each client
maximizes its surplus and returns its guaranteed portion to the cloud provider.
Based on this, the cloud provider updates the price using the subgradient algorithm to
increase the social welfare.
However, this can be very slow because it's an iterative process, and there's the issue
of the step size choice: how do we update the price in each iteration? The step size is
very hard to choose. If we choose a small step size, convergence is very slow; if we
choose a large one, it may not converge at all.
Also, it's a synchronous algorithm in nature, so we need to update all the
clients at the same time. However, that's not possible, because the message passing
delays between them are different.
Our theoretical contribution here is a new algorithm called equation updates. So
instead of using the subgradient algorithm, we look at the KKT conditions of this
optimization problem, and we find they take a very simple form: the utility
derivative is equal to the cost derivative.
Based on that, we do the same thing as before, but the only difference is the
price update. Instead of updating the price with some small step size, we
just set the price to the derivative of the cost at that particular guaranteed portion.
Then the price is passed back to each client, and everything else is the
same.
Why do we call this equation updates? Because we can transform the KKT conditions into
a system of nonlinear equations. Here, solving for W_i tilde is equivalent to solving this
equation: U_i's derivative equal to the price. So in the end, we can iterate these two
equations to get the optimal pricing solution.
Such an update is performed for each client, and it does not need to be
synchronous. We find that linear pricing, that is, charging a linear price, would
suffice to lead to the optimal solution. We also allow relaxation; that is, the price
can be a weighted sum of this value and the previous price. It can also handle some box
constraints. So that's our contribution to the algorithm.
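The equation-update iteration with relaxation can be illustrated on a toy single-client instance; the utility U(w) = a*log(w) and cost C(w) = c*w^2/2 are stand-ins I chose so that both best responses have closed forms (the client solves U'(w) = k, the cloud sets the new price from C'(w)):

```python
# Toy single-client "equation update" pricing with relaxation.
# Client best response to price k:  U'(w) = a/w = k  =>  w = a/k.
# Cloud's unrelaxed price update:   k <- C'(w) = c*w.
# With relaxation gamma, the new price is a weighted sum of the two.
def equation_update_pricing(a=4.0, c=1.0, gamma=0.5, iters=50):
    k = 1.0                                     # initial price
    for _ in range(iters):
        w = a / k                               # client's best response
        k = gamma * (c * w) + (1 - gamma) * k   # relaxed price update
    return k, a / k

# At the fixed point the price equals the marginal cost: k* = sqrt(a*c).
```

Note that without relaxation (gamma = 1) this particular toy instance oscillates between two prices forever, while gamma = 0.5 converges quickly; this mirrors the talk's point that relaxation helps convergence.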
Let's look at the convergence of this. For example, consider a single-client case.
The optimal price is here and the optimal guaranteed portion is
here, where the derivative of the cost equals the derivative of the utility.
In our algorithm, we start with some guaranteed portion
here, and the first price K_i equals the cost derivative there. Then we pass
that back to the client, which gives me a number W_i tilde, which is here. And it goes
like this, so the convergence can be very fast. The subgradient method
starts from the same point but updates with some small step size; that's why its
convergence is very slow, or it may not converge at all. So that's the theoretical contribution.
We actually provide a number of convergence conditions in several earlier works.
One of the conditions is this: the second-order
derivative of the cost is smaller than the second-order
derivative of the utility, taking the minimum over all the utilities.
Because, as I said, the cost function is a cone centered at zero, its second-order
derivative is very small away from zero. It's a cone, so if you go away from
zero, this term goes to zero. That's why this side is always greater than that side, so the
condition can be satisfied.
All right. Let's look at some simulations based on real-world traces. We
take a video service demand data set. There are some 1,700 videos in this data set, and it
contains the bandwidth usage of each video channel collected from the users.
We assume each video channel is a client of the cloud. We want to
guarantee the performance of each video channel, and we want to charge a price for
that channel.
For example, this is a typical --
>>: [inaudible].
>> Di Niu: Yeah.
>>: I know you see the peer-to-peer service.
>> Di Niu: Yeah.
>>: The demand is the -- the peer demand of the [inaudible], right?
>> Di Niu: Yeah, right. Right.
>>: Are you assuming, basically, I mean, that this load is served by the servers to
basically [inaudible]?
>> Di Niu: Yeah. Yeah. I assume, right -- I assume the demand is served from
a centralized server. Right. Right.
So, for example, this is the bandwidth demand of a particular video channel. Say it's a
sports video, right? This is the bandwidth demand over time, where each time step is 10
minutes, so 144 steps is a day. You can see a daily [inaudible] pattern here. To
model this as a Gaussian demand we need some statistical tools. Let's look at
each 10-minute period: we want to predict the demand mean in that 10
minutes and the demand variation in that 10 minutes, and we need to do this step by
step, rolling forward.
We find that if we use what's called a seasonal ARIMA model, we can
actually predict the next 10-minute demand mean very accurately; it's not far from
the real demand.
However, there's some error -- there's still some error --
>>: [inaudible] basically numbers minute or --
>> Di Niu: Oh, yeah, it's the time period. So each time period is 10 minutes from the
next time period.
>>: So 1680 basically -- so 1680 is one time -- one 10-minute time?
>> Di Niu: Yeah.
>>: And 1681 is the next 10-minute time, right?
>> Di Niu: Yes. So it's 10 minutes between these two steps. Right.
>>: Basically in the previous slide, slide 19, you are trying to predict a demand --
>> Di Niu: Right.
>>: Can you get back to --
>> Di Niu: Yeah. Oh, sorry. Yes, here.
>>: You are basically trying to predict the 10-minute demand one step later.
>> Di Niu: Yeah.
>>: So just use the current 10-minute time. If you just use the current demand to predict
the [inaudible], what is the performance difference?
>> Di Niu: Okay. So --
>>: I mean, you are basically using some kind of filtering basically to do prediction.
>> Di Niu: Yeah, yeah, right.
>>: It's [inaudible] just --
>> Di Niu: Yeah. Right. So, right, that's a very good question. If we just use the current
demand as an estimator of the next demand, then the curve looks almost the
same as this one, so the error won't be too large.
However, the error may not be modeled as a Gaussian variable that way; there's
some minor variation there. So, right, that's also an option, and it's the
simplest one. But the model here yields a slightly better result, and the error looks like this.
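The rolling one-step-ahead forecast discussed here can be illustrated with a much simpler seasonal-persistence predictor, a stand-in for the seasonal ARIMA model in the talk (the function name and the correction term are my own choices):

```python
# Simplified stand-in for a seasonal forecast: predict the next 10-minute
# demand from the value one season ago (144 slots = 1 day), corrected by the
# most recent deviation from that seasonal pattern.
def one_step_forecast(series, season=144):
    """series: past demand samples, one per 10-minute slot."""
    if len(series) <= season:
        return series[-1]                      # not enough history: persist
    seasonal = series[-season]                 # same slot one season ago
    drift = series[-1] - series[-1 - season]   # current deviation from pattern
    return seasonal + drift
```

A real seasonal ARIMA model would fit autoregressive and moving-average terms on top of the seasonal differencing, but the rolling, step-by-step structure of the prediction is the same.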
And we need to estimate the error standard deviation; that will serve as the
variation of the Gaussian variable in each short period.
We find we can use a model called GARCH from the econometrics literature, so that
when the demand fluctuates a lot, we predict a larger error standard deviation,
and when it calms down, when it becomes very tranquil, we predict a very small error
standard deviation.
What's more, this error series can be modeled as a Gaussian process. If we
look at the QQ plots, the errors of a particular channel appear Gaussian, and for
all the channels combined they also appear Gaussian.
So the power of this seasonal ARIMA model is that it produces an error series with
Gaussian behavior, so that we can really model the demand as a Gaussian variable,
although the statistics of that Gaussian variable change every 10 minutes.
We can also reduce the prediction complexity by using the PCA technique. For
example, if we take the three biggest channels of all the 452 channels, the demand looks
like this. We can use PCA to find the common trends behind these
series.
First, there's a daily trend. Second, there's a downward trend after a video is
released, because the video's popularity is decreasing. And third, there's a time-of-day
trend: at different times of day there's different popularity.
We find that with just eight components we can explain most of the data variance.
That's why we don't need to train a model for each channel: we just train
models for the eight components, predict each of
these components, and transform those predictions into per-channel predictions using a
linear combination.
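The dimension-reduction idea, extracting a few common components rather than training one model per channel, can be illustrated by finding the top principal component of a demand covariance matrix via power iteration (a minimal stand-in for a full PCA; names and the iteration count are mine):

```python
from math import sqrt

# Power iteration for the leading eigenvector of a covariance matrix: the
# direction of the single strongest common trend across channels, together
# with the variance it explains (the Rayleigh quotient).
def top_component(cov, iters=200):
    n = len(cov)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    var = sum(v[i] * cov[i][j] * v[j] for i in range(n) for j in range(n))
    return v, var
```

Repeating this after deflating the found component would yield the next components; keeping the top eight, as in the talk, means only eight component series need to be forecast.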
So that's how the statistical part is done. After we get the statistics, we can
plug the mu_i and sigma_i into our theoretical framework to do the pricing and resource
allocation.
Now let's look at the performance of our algorithm here, compared to the
gradient algorithm. Algorithm one is our equation update. Gamma equal to one
means there's no relaxation, and this is with relaxation; that is, the price is a weighted
sum of the previous price and the current price.
We look at all these channels over 81 time periods, each one being 10
minutes. So for each time period we predict the demand statistics and solve the
optimization, and we want to know the performance of the optimization algorithm.
Let's look at the gradient method first. We find that for all variants of the gradient
method there are times when it does not converge, and even when it converges, the
number of iterations is high: the mean is around 30 to 40 iterations. However, our
algorithm converges very fast, below 10 iterations.
And with relaxation in particular, it converges in all the cases, that is, in all 81
experiments.
We also find that our algorithm finds better solutions than the traditional
algorithm: the final computed objective, the expected social
welfare, is always higher than with the gradient method. So it converges faster and it
achieves better optimization performance.
And let's look at what the price looks like. If each tenant requires a guaranteed portion
of one, that is, all of my demand is always guaranteed, then when we solve the
optimization we find that the optimal price looks like this. The first term is the mean of
the demand. And this is the standard deviation of the demand, and this is a constant
depending on the violation probability. And what's important is here.
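The price structure he describes -- demand mean plus a constant times the standard deviation, with the constant set by the violation probability -- could be sketched like this, assuming (purely for illustration) Gaussian demand; the function name and unit price are mine, not from the talk:

```python
from statistics import NormalDist

def reservation_price(mu, sigma, epsilon, unit_price=1.0):
    """Per-tenant price for a fully guaranteed portion: demand mean plus
    a safety margin proportional to the standard deviation, where the
    constant depends on the allowed violation probability epsilon."""
    c = NormalDist().inv_cdf(1.0 - epsilon)   # ~1.64 for epsilon = 0.05
    return unit_price * (mu + c * sigma)

# A tenant with mean demand 100, standard deviation 20, 5% violations:
p = reservation_price(100.0, 20.0, 0.05)      # about 132.9
```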
Let's look at the price discounts due to the multiplexing, that is, the price computed here
as against the price of serving the tenants individually. We find a mean discount of 44%
with a total saving of 35%.
And the majority of the users enjoy a discount of 30%, while some users can enjoy a
higher discount because they are anti-correlated with the market. That means when
other users do not use the service, their demand goes up; and when other users' demand
goes up, their demand goes down, so they would try to quit the system. That's why we
need to give them more discounts.
And we also have some additional results. We show what happens if the cloud
maximizes its profits. In such a model, I assume the cloud maximizes its profits by
controlling the guaranteed portions, while each client can submit any pricing function as
a bid to the cloud system.
And if each client requires a guaranteed portion of 1, we find that the computed price is
also the unique Nash equilibrium of the market.
So let's look at a summary of part 1. We showed how to price cloud bandwidth
reservation as against pricing network usage. We find that if each client requires a
guaranteed portion of 100%, the optimal price is a simple function of the demand
statistics. And if each tenant requires a guaranteed portion less than one, the price
needs to be computed using an iterative algorithm, and we propose an efficient
algorithm here.
So the take-away message is that pay-as-you-go is not enough. The second one is that
pricing should depend on the workload statistics. And also it is a computational
problem.
Okay.
>>: So these iterations go on very quickly --
>> Di Niu: Yeah.
>>: -- or are they every 10 minutes one iteration or --
>> Di Niu: Yeah. It goes on very quickly. So in each 10 minutes we do the demand
estimation and then we do the pricing. And for each pricing computation there are
several rounds of iterations to get the optimal price.
>>: So you would imagine like negotiating with NetFlix.
>> Di Niu: Yeah.
>>: They say here's a price and I say okay, I'll take that, and then you say oh, but if
you take that, my price is going to go up, then you give them --
>> Di Niu: Right. Right.
>>: And then you keep doing that until you converge?
>> Di Niu: Right. Right. I'll assume NetFlix has a function there. So I just give them
the price and instantly they will return a guaranteed portion to me. And then I change
the price to them. So after several iterations --
>>: Sorry, sorry, they change their mind like that?
>> Di Niu: Yeah. Right. So I assume -- right. So it's not a human that controls the
choosing function there. Right. Okay.
>>: What if they don't want to reveal all that?
>>: And pricing is tricky. So [inaudible] nowadays let's say if you have [inaudible] that
product price will go a fraction smaller. So, I mean, basically let's say basically
[inaudible] for 29.99. That's a fairly [inaudible] number. One, check Amazon. They will
give you 29.70, 29.60, 28.99, 20 [inaudible].
>> Di Niu: Yeah. So --
>>: [inaudible].
>>: So real people actually do this too. All right.
>>: So I think there are people trying to work out something.
>> Di Niu: Yeah. I assume this pricing happens in a computer right away, so the
computer sets the price instantly. And the choice of the guaranteed portion also
changes instantly [inaudible] the computer, the NetFlix. So that's the assumption here.
Okay. So the natural idea is: how do we utilize a similar idea for computing instance
reservation? And it's inspired by this thought. On-demand instances are charged by
instance hours. However, the major concern here is that a partial hour is charged as a
full hour. For example, in the AWS service, if you use an instance for 10 minutes, it's
charged as if you used it for one hour.
So there's an inefficiency there.
And reserved instances allow you to pay a one-time fee and then enjoy discounts, or
just use that reserved instance for free during the reservation period. However, you
need to highly utilize the reserved instance to gain the benefit of that.
So both of them are not so perfect for the end users, the clients. That's why we
propose this model. We assume that there's a broker. Actually this is a profitable
service. The broker can reserve lots of instances from the IaaS service. And because
the reserved instances have a discount, it can serve these instances to different users
with a price discount.
So when the users launch these instances, they are actually launching on-demand
instances from the broker. And when there are not enough instances, the broker will
launch some on-demand instances from the IaaS cloud to satisfy the additional
demand. And the money flow is like this: the client pays the broker, and the broker pays
the cloud provider.
So the broker can actually exploit the pricing gap between reserved instances and
on-demand instances to turn a profit, because it gets a cheap deal from the cloud
provider due to the discount, so it can make some money. And if the broker does not
want to make money, it can just pass the discount on to the users. So the cost of using
the cloud is lowered.
>>: Let me basically say one thing. Basically the difficulty of this.
>> Di Niu: Okay.
>>: So if we look at today, I mean, basically Amazon does have reserve price versus
on-demand price.
>> Di Niu: Yeah.
>>: However, the price is fixed. So the reserved price, let's say, is half a cent and the
on-demand price is one cent.
>> Di Niu: Okay.
>>: Always [inaudible].
>> Di Niu: Okay.
>>: [inaudible] always [inaudible].
>> Di Niu: Okay.
>>: The on-demand price does not change.
>> Di Niu: Okay.
>>: Let's say we use this model. The difficulty is for the user, right? I mean, basically
cloud computing is still on the verge of basically stopped.
>> Di Niu: Okay.
>>: Right? I mean, if I want to basically compute instance [inaudible] right, I need to
know the price.
>> Di Niu: Yeah. Yeah.
>>: And if this price basically just changes every hour [inaudible] something.
>> Di Niu: Okay.
>>: It's [inaudible] for the user --
>> Di Niu: Okay.
>>: -- to make a decision of whether to use that service.
>> Di Niu: Okay.
>>: I mean, currently for example Amazon will give you a clear message --
>> Di Niu: Okay.
>>: -- that says if you reserve this computer for the whole year --
>> Di Niu: Okay.
>>: -- you basically get charged 50 cents.
>> Di Niu: Oh, yeah. Yeah.
>>: Let's say?
>> Di Niu: Right.
>>: If you basically want to turn on and off on-demand --
>> Di Niu: Okay.
>>: Let's say it's double or triple the price, okay?
>> Di Niu: Right. Right. Right.
>>: But if that's the model --
>> Di Niu: Okay.
>>: -- users can easily understand --
>> Di Niu: Okay.
>>: -- right? And the user can plan its usage --
>> Di Niu: Yeah.
>>: -- basically -- I mean, based on that.
>> Di Niu: Okay.
>>: The thing is that you want this price to fluctuate.
>> Di Niu: Okay.
>>: Once it fluctuates, it becomes very taxing from the user's end to utilize this model. I
mean, basically put that away. The user will not know whether to use this service or
not. We're spot on that.
>> Di Niu: Right. Right. I think that's why we need a cloud broker. Because the users
have no idea of how to utilize different payment options. The broker can do some
smart calculation, some intelligence, and then just [inaudible].
>>: [inaudible] broker should be basically [inaudible]. Because Azure itself should be
the broker. So you will need to solve the pricing problem yourself.
>> Di Niu: Yeah. So, I mean, yeah. I mean, this is very complicated. If Amazon is the
broker, or Microsoft is the broker, then it can just set a very simple pricing scheme for
the users. And the rest of it is taken care of by Microsoft or Amazon. So what I think is
that the user should be blind to the different pricing options.
So say Microsoft offers a very easy pricing option, and the users may turn to Microsoft,
and then Amazon will follow the move. So that may be a case.
So the idea here is that if the broker -- say it's a third-party company -- uses this idea
and it can save some cost for the users, then the users will turn to the broker instead of
turning to the Amazon Web Service. And then Amazon Web Service will maybe follow
such a move.
So I think there might be a potential benefit to the users.
>>: I'm trying to check for Amazon elastic pricing.
>> Di Niu: Okay.
>>: See what its current elastic pricing model is.
>> Di Niu: Like, in the current model I just checked, there are three options. One is the
on-demand, the other is reserved, and the third one's --
>>: [inaudible].
>> Di Niu: Yes, spot instances. And it's very hard for a user to know how to utilize
these differences.
>>: The spot instance is still a fixed price. A fixed amount.
>> Di Niu: The spot price is changing every hour actually.
>>: Really? Okay.
>> Di Niu: Yeah. It depends on the user bids and the availability of the additional
computing instances.
>>: [inaudible].
>> Di Niu: Yeah.
>>: Okay.
>> Di Niu: And the user -- if the user bids below the spot price, it cannot use the
service. So there's no way for the user to make sure whether its service can be
guaranteed or not, because of the changing spot price. Yeah.
>>: So --
>> Di Niu: Yeah?
>>: So for the spot instance, the clients' demand is [inaudible] elastic [inaudible], so if
the price is high, I'm going to not use as many; if it's priced low, then [inaudible].
>> Di Niu: Yeah.
>>: Are you doing the same assumption here, you're stating the price [inaudible].
>> Di Niu: I do not assume that. Right. Right. So, right, that's a different story there.
Okay. So let's look at the benefits of such a broker. The first one is that it can better
exploit the reserved instances, because when you aggregate the demands of different
clients, the total is more stable. And reserved instances favor stable demand, because
if the demand is bursty, then sometimes you are not utilizing the reserved instance, and
you cannot gain the benefit.
And the second one is that we can multiplex the partial usage into full usage. The idea
is here. Without the broker, say these are two different users. User one will use an
on-demand instance like this, but its computing demand is only here and here -- say
within an hour there's idle time here.
And the second user will just use instance two here. So in this case, we can put both
users onto the same computing instance. We load user one onto the computing
instance here, and then we load user two here, and user one on this instance again.
So now we just need the cost of one instance hour instead of two instance hours.
And if such things happen frequently in actual user demand, we can actually gain a
benefit.
So the major question here is for the broker to decide when to reserve instances and
how many to reserve. And that becomes a computational problem.
And we actually take a look at the math details. The input is the total number of
instances demanded at time t, over a horizon from 1 to big T. This D_t means the total
number of computing instances I need to serve all the clients.
And the output is the number of instances to be reserved at each time, using some
smart algorithms.
And an instance reservation is effective from that time until tau periods later. Each
period is a billing cycle, say one hour. So a reservation is only effective for tau time
periods.
So the optimization problem is to minimize the cost of using the cloud service. Here
gamma is the one-time reservation fee. So at time t we reserve r_t instances and we
pay this amount of money, and over the whole horizon we pay the total reservation fee
here. And this p is the usage fee of the on-demand instances.
That means at time t we have demand D_t, and this is the number of reserved
instances that are already effective. So this is the additional number of instances we
need to launch to satisfy the demand, and this is the on-demand usage fee.
And if we take a look at this n_t, the number of reserved instances that remain effective
at time t, it is equal to r_i summed over these time periods, because a reservation is
only available starting from its own time, and the reservation period is tau.
And we assume there's no usage discount, because for a reservation we just assume it
pays a one-time fee and we can use it for the entire reservation period. An example is
provided by the AWS service: the heavy utilization instances use this model.
Some other models assume there's a discount for using such a reserved instance. But
we just assume there's only a one-time reservation fee here.
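The cost model just described can be written as a small sketch; the variable names and the demand numbers below are mine, for illustration:

```python
def total_cost(demand, reservations, gamma, p, tau):
    """Cost of a reservation plan: gamma is the one-time reservation fee,
    p the on-demand rate per billing cycle, tau the reservation period in
    cycles. demand[t] and reservations[t] are per-cycle counts."""
    T = len(demand)
    cost = gamma * sum(reservations)          # total reservation fees
    for t in range(T):
        # n_t: reservations made in the last tau cycles are still effective.
        n_t = sum(reservations[max(0, t - tau + 1): t + 1])
        cost += p * max(demand[t] - n_t, 0)   # extra on-demand instances
    return cost

# Reserving one instance up front vs. going pure on-demand:
d = [3, 2, 3, 1, 2, 3]
print(total_cost(d, [1, 0, 0, 0, 0, 0], gamma=2.5, p=1.0, tau=6))  # 10.5
print(total_cost(d, [0] * 6, gamma=2.5, p=1.0, tau=6))             # 14.0
```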
>>: So are you chopping the hour up into many slots, or are you taking many hours?
>> Di Niu: Yeah. I'm taking each hour as an individual hour. So each T here is an
hour.
>>: [inaudible].
>> Di Niu: Yeah. Each little T is an hour. Say I need to do this planning for the future
10 hours.
>>: Okay.
>> Di Niu: Yeah.
>> Di Niu: And we find such a problem is an integer program, and it's very hard to
solve. We formulated it as a dynamic program, and we find this has exponential
complexity. So our contribution is to provide some simple algorithms that have a
performance guarantee. That means they are not far away from the optimal cost.
And also we propose an online algorithm that does not rely on knowledge of the future
demand. Say I do not need to know the user's requirements in the next 10 hours. This
one also has a performance guarantee; we are currently working on that.
And let's look at the first algorithm, called the periodic decision; we call it the heuristic
later. We segment the whole horizon into intervals of length tau. Say the reservation
period tau is 6; so we segment the horizon into these intervals of 6. And the idea is to
reserve L instances at the beginning of each interval, and only at the beginning of each
interval, such that the utilization at that level is just above the reservation fee divided by
the usage fee.
So for example, if the reservation fee is 2.5 and the usage fee is 1, we reserve these
two instances. That is because if we reserve the instance here, the utilization is too
low -- it's only 2 -- so it's not good enough. And if we reserve here, it's 3. So we just
reserve up to here, because the threshold is 2.5.
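A possible reading of this periodic heuristic, as a sketch (my reconstruction from the description; the demand example is made up):

```python
def periodic_reservation(demand, gamma, p, tau):
    """At the start of each tau-interval, reserve up to the highest level
    whose utilization within the interval exceeds the break-even point
    gamma / p (reservation fee over on-demand rate)."""
    plan = [0] * len(demand)
    threshold = gamma / p
    for start in range(0, len(demand), tau):
        window = demand[start:start + tau]
        level = 0
        # A level is worth reserving if it is busy in more cycles than
        # the reservation fee would buy on demand.
        while sum(1 for d in window if d > level) > threshold:
            level += 1
        plan[start] = level
    return plan

# Fee 2.5, usage fee 1: reserve only levels used more than 2.5 times.
demand = [2, 3, 1, 2, 0, 2]
print(periodic_reservation(demand, gamma=2.5, p=1.0, tau=6))  # [2, 0, 0, 0, 0, 0]
```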
And, yeah, this is very intuitive, and the details can be worked out easily. But it has a
limitation, because we only reserve at the beginning of each tau period. So we need to
improve that by allowing reservations during the middle times. So we propose this
greedy algorithm. We do a top-down, level-by-level dynamic programming. We look at
the first level at the top here, and we do a dynamic program to decide when to reserve
and how much to reserve -- actually just whether to reserve one instance at this level.
And after we decide that, if the demand at that level is not covered by the reservation,
it's passed down to the next level, and we do a DP again there.
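A simplified sketch of this level-by-level idea (my reconstruction: here each unit-height level is handled by an independent DP, whereas the talk passes uncovered demand down between levels):

```python
def greedy_level_dp(demand, gamma, p, tau):
    """Peel the demand curve into unit-height levels; for each level run
    a small DP choosing, cycle by cycle, between paying on demand (rate
    p) and opening a reservation (fee gamma, effective for tau cycles)."""
    T = len(demand)
    total, plan = 0.0, [0] * T
    for level in range(1, max(demand) + 1):
        busy = [d >= level for d in demand]
        # C[t] = cheapest cost to serve cycles t..T-1 of this level.
        C = [0.0] * (T + 1)
        reserve_here = [False] * T
        for t in range(T - 1, -1, -1):
            if not busy[t]:
                C[t] = C[t + 1]
                continue
            on_demand = p + C[t + 1]
            reserved = gamma + C[min(t + tau, T)]
            C[t] = min(on_demand, reserved)
            reserve_here[t] = reserved < on_demand
        total += C[0]
        t = 0
        while t < T:                 # replay the optimal decisions
            if busy[t] and reserve_here[t]:
                plan[t] += 1
                t += tau
            else:
                t += 1
    return total, plan

# Flat demand of height 2 over 6 cycles: reserve 2 instances up front.
cost, plan = greedy_level_dp([2, 2, 2, 2, 2, 2], gamma=2.5, p=1.0, tau=6)
print(cost, plan)   # 5.0 [2, 0, 0, 0, 0, 0]
```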
So this is better than the previous one, because we allow reservations during the
middle times here. And we find both algorithms are 2-competitive: we prove that both
algorithms at most double the optimal cost. And this requires some theoretical
analysis.
And both algorithms have low complexity. The complexity is at most the horizon
length -- say we look at 10 hours -- multiplied by the peak demand; the peak demand
here is five. So this is very practical and can be implemented.
And let's look at the simulation. How do we do the simulation? We look at the Google
cluster usage trace. It has 180 gigabytes of resource usage data of more than 900
users over about one month on 12,000 physical servers.
>>: Where do you get that usage data?
>> Di Niu: Oh, this is publicly available; it was released about one year ago.
>>: How about -- I mean, why only -- there's only 180 gigabytes?
>> Di Niu: Oh, so, it's because they're in total for 00 gigabytes, but we just used part of
them.
>>: Oh. So it's basically size of the usage data is 180 gigabytes?
>> Di Niu: Yeah, right. Right. Yeah. But that's Google's private cluster. So how do we
study this problem?
So that trace actually reflects the computing demands of Google engineers and
services, but it can represent the demands of public cloud users to some degree. Say
these Google services and engineers are served using the IaaS model. So how do we
do that?
So we map the demands for these different resources into the demand for a number of
computing instances served from the web service.
And such a mapping is actually nontrivial, but I'm going to neglect the details. It
considers some constraints of applications, such as in MapReduce, where different
reduce jobs -- different map jobs -- cannot be on the same machine.
But there are some other OS constraints that we did not consider. And this is actually a
scheduling algorithm, a scheduling problem in itself.
And we set the on-demand hourly rate to be this. This eight cents is the on-demand
rate of the Amazon Web Service small instance back at the time we did this research.
And the reservation period is one week, because we have only one month of data. If
we had one year of data, we could assume the reservation period is longer.
And the usage discount of a reserved instance is 50%. But it's equivalent to saying we
pay a one-time fee, because for the entire reservation period we pay half price for that
instance.
And the demand curves of several users look like this. This is a bursty user, this is a
medium-fluctuation user, and this is a low-fluctuation user.
And we group the users into different groups based on their demand fluctuation. We
want to see how the users in these different groups can benefit from our service --
which group is the best fit for such a service. We calculate the demand standard
deviation and the demand mean over the one-month period for each user, and we plot
each user as a circle here. And we find that these are the high-fluctuation users, this is
the medium, and this is the low fluctuation. These thresholds are set by ourselves
artificially.
And we want to do a preliminary exploration of how the aggregation can suppress the
fluctuation and bring benefits. These are the users, and if we aggregate all the user
demands and look at the aggregate curve, the fluctuation is suppressed, because the
demand standard deviation divided by the mean goes down.
And also, if we aggregate all of them, we find the wasted instance hours -- that's the
total time that is not used but is actually charged, the total partial hours -- are reduced.
And this is most significant for the medium-fluctuation users. And over all the users it
can reduce up to 23% of all the partial usage.
So that means we might gain from such an aggregation. And let's look at how our
algorithms work.
So this figure plots the total service cost with and without the broker in different user
groups under three different algorithms. For each algorithm, we compare --
>>: [inaudible].
>> Di Niu: Yeah?
>>: [inaudible] can it cost?
>> Di Niu: Yeah. So I set the on-demand rate to be $0.08, and that's the price of the
Amazon Web Service. And the reservation discount is half price. And the reservation
fee we just take from the Amazon Web Service.
>>: So the --
>> Di Niu: Yeah?
>>: Basically you assume if you just ask for spot, the spot price is always the same,
which is twice the reservation; is that right?
>> Di Niu: Yeah. I only considered on-demand and reserved instances here -- the
reserved and on-demand instances -- but did not consider the spot instances yet.
>>: So the question is what's the on-demand price?
>> Di Niu: The on-demand price is the small instance rate. That's the eight cents per
hour.
>>: Okay.
>> Di Niu: Yeah.
>>: And the reserve price is?
>> Di Niu: The reserve price is four cents per hour. And the reservation fee is -- it's in
the paper, but I need to check. It's just copied from the Amazon Web Service.
Yeah, so for each algorithm, this one is with the broker; the broker does the reservation
using this algorithm. And this cost is without the broker; that means each user uses
this algorithm individually. So we can do a fair comparison here. And we find that
there's a large cost reduction, like 5,000 here.
However, the different algorithms are about the same, especially when the users do the
reservations themselves, because when the users do it themselves, there's actually
almost no reservation -- the demand has high fluctuation, so they will decide to use
on-demand instances anyway.
>>: What is the cost?
>> Di Niu: The cost is the total cost of using the cloud service. That's the --
>>: [inaudible].
>> Di Niu: Yeah. The total cost paid to the cloud.
>>: Okay.
>> Di Niu: Yeah. So that's the reservation fee plus the on-demand fee.
And for the medium-fluctuation users, we see that there's even more saving, especially
for this greedy algorithm. And the online one is a little bit worse.
>>: [inaudible].
>> Di Niu: Uh-huh?
>>: For medium fluctuation -- how come the saving -- can you get to the slide of the --
>> Di Niu: Yeah, sure.
>>: -- of the medium fluctuation.
How come basically the price difference without the broker and with the broker is
basically more than a factor of two? Because between reservation and basically
on-demand pricing, the difference is only a factor of two, right?
>> Di Niu: Okay.
>>: So if I just basically choose on-demand always --
>> Di Niu: That's a good question, because it starts from 50 here. So it's not more than
double, yeah. It's about -- let's see. It's 75 here. So it's less than double, but almost
double.
>>: It's almost double.
>> Di Niu: Yeah, almost double is right. Yeah. But it could not be more than double.
And for the low fluctuation there is little benefit with or without the broker, because the
low-fluctuation users are stable anyway. So if you do the aggregation, they are using
the reserved service; if you do not, they are still using the reserved service.
And over all the users we find there's also a saving of 100K. The total cost is 800K,
and you can save 100K. So the saving is about 10% to 15% in total.
So let's look at the percentage in detail here. Over all users, it's between 10 and 20%,
about 15%. For medium, it's the highest; high is a little bit lower; and low is the lowest.
Because for low fluctuation, you are using the reserved instances anyway, and for high,
you are not using the reserved instances anyway -- you are using the on-demand
instances. And the most benefit is in the medium-fluctuation group.
And let's look at the user benefit of such a service. We assume a simple pricing
scheme, that is, the charges of the broker to the users. Yeah?
>>: So without -- without the broker, what algorithms do you [inaudible]?
>> Di Niu: Oh, without the broker, they're using the same algorithms. So we assume
each user uses the heuristic, or each user uses the greedy, and so on. Yeah.
And the simple pricing of the broker is to let the users share the aggregate cost in
proportion to their instance hours. That means we just charge each user based on its
instance hours; that's the on-demand way to do the pricing. And we find in these
curves that 70% of all the users receive a discount of more than 25%, and on average
each user receives about a 30% discount. So this can also benefit the users.
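This proportional cost-sharing rule is simple enough to sketch directly (the numbers are made up):

```python
def share_cost(total_bill, instance_hours):
    """Split the broker's aggregate bill among users in proportion to
    the instance hours each one consumed."""
    used = sum(instance_hours)
    return [total_bill * h / used for h in instance_hours]

# Three users; the broker's aggregate bill for the period is 90.
print(share_cost(90.0, [10, 20, 30]))   # [15.0, 30.0, 45.0]
```

Each user's discount then follows by comparing this share with what that user would have paid the cloud alone.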
And how does the picture change when the reservation period changes? We just
considered one week before. But now if we consider different reservation periods --
say if there's no reservation at all, then the saving is very little, below 10%. That means
all the savings are coming from multiplexing the partial usage; there's no benefit from
reservation. But when the reservation period goes up, the saving goes even higher.
So that's good news.
In reality, we have longer reservation periods, and we can get more benefits.
>>: [inaudible] something like half a year. Reservation period is half [inaudible].
>> Di Niu: Oh, yeah, yeah. Half a year, right. Or there's one year, something like that.
>>: [inaudible] this is corresponding to the side cost --
>> Di Niu: Yeah.
>>: They can put the hardware [inaudible].
>> Di Niu: [inaudible]. I see. Right. Okay.
And the second question is: what if a billing cycle is not an hour? If it's a day, what
happens? And actually this model is taken by a cloud provider called VPS.NET. It's a
small cloud provider.
But there's a possibility of taking a billing cycle as a day. And we find the total saving
goes up, also, because there's more partial usage in this case.
And this is the histogram of the savings of individual users. We see the mean saving is
higher.
So what are the lessons learned from this case study? We find the cost savings come
from several aspects. First, from the pricing gap between reserved instances and
on-demand instances. The broker can use this to gain benefits.
And the second one is that when we aggregate all the clients' demands instead of
serving them individually, we have a smoother demand curve. Then we can use the
reserved instances with high utilization. So the aggregation has some benefits.
And the third one is that if we coordinate all the clients together, we can do this
instance multiplexing instead of serving them individually.
And the fourth one is the one we didn't study in our research, because we just assumed
each user is smart enough. But in reality, like Dr. Li said, each user is not smart
enough, and they may not use these high-intelligence strategies to make the
reservations. If we assume the users are naive and our service is smart, then we can
gain even more benefits.
Also, if we serve them all together, we can get a volume discount because of the high
demand the broker brings to the cloud provider. However, the concern is that we must
check whether loading a new user onto a new instance opens a new billing cycle.
Because in multiplexing the partial usage, we switch the users on the same instances
all the time, and we need to make sure loading a different user onto the same instance
does not open a new billing cycle, to avoid additional charges.
And the good news here is that this is not the case on the EC2 heavy utilization
instances, so we can actually do this on such instances, or on some other instances,
like from this cloud provider, ElasticHosts.
The final problem is: how do we estimate the demand? I just assumed knowledge of
the future demand, this small d_t. But in reality, that might not exist. Sometimes the
users can submit their demand beforehand, but sometimes not. In this case, we can do
two things. The first thing is that if we aggregate all the clients' demands, there might
be a better trend for us to predict. And the second thing is that if we do not know the
future, we can use the online algorithm and still get some benefits.
And a little bit of related work here. Workload characterization and prediction has been
done by several people, like Silva and Gursun, and there's some spike characterization
work done by some Microsoft people.
And from the distributed optimization literature, there's the famous primal/dual
decomposition, and there's a good survey paper here by Chiang, et al.
And there's the book by Bertsekas and John Tsitsiklis on parallel and distributed
methods, with these kinds of algorithms to solve distributed optimization.
And the third thing that we didn't touch here is auction-type pricing. That has also
gained popularity recently. For example, this year [inaudible] by IBM people; they are
trying to design truthful auction markets for cloud computing instances.
And there's also some work on optimal bidding strategies: bidding on spot instances so
as to guarantee service. So this is also some interesting work.
And some concluding remarks. The first thing I need to mention is that cloud
computing enables algorithmic pricing and capacity planning. So now with cloud
computing we can do this computerized pricing and planning. And the idea is to design
efficient algorithms to handle very large systems with a lot of input data. So that's the
right way to go, I think.
And also, we can design some complicated pricing schemes based on options, futures,
and other derivatives. Because now everything is done in the computer and online with
a web user interface, we can allow more intelligent and computerized pricing and
bargaining.
And the second thing is that demand statistics are very important in cloud computing,
and we need to rely on them for pricing.
And that's why the use of machine learning and statistical techniques is very important
here.
And finally, I think the fair pricing of shared resources requires some knowledge of
game theory. What we studied here is distributed optimization and a little bit of game
theory. However, for real fair pricing, we might need to use cooperative game theory,
such as the Shapley value. However, that is also very challenging to compute here.
And if users collude, the economic results could change. So that would also be a very
promising direction.
And thank you for your attention. I will be open for any questions from you.
Thank you.
[applause].
>>: Very interesting talk.
>> Di Niu: Thank you. Thank you.
>>: Just to [inaudible] -- in the second part of your talk you constructed a synthetic
workload based on some Google data --
>> Di Niu: Yeah.
>>: And then some assumptions about what -- and then you ran that against the
Amazon pricing.
>> Di Niu: Right. Right. Right.
>>: Okay.
>>: [inaudible] assuming, let's say, there is [inaudible] which basically sits between
Amazon and the user, and the Google user --
>>: So using that workload, you showed some advantage?
>> Di Niu: Yeah.
>>: Of having a broker. Let's see. Does the broker look to the users like -- I mean, it
also has a reserved and --
>> Di Niu: The broker will try to predict demand from the users.
>>: But from the users' point of view, when they're dealing with the broker, they also --
>> Di Niu: Yeah.
>>: -- can have reserved --
>> Di Niu: No. They --
>>: What's the pricing model?
>> Di Niu: The pricing model of the broker is like the on-demand model. So the user
just pays a rate for each hour it uses. Yeah.
>>: [inaudible] one price --
>> Di Niu: Just one price.
>>: This is the price for this hour?
>> Di Niu: Yeah. Right. That's like a single pricing scheme. So, yeah, the whole idea
is to eliminate the complicated pricing structure of the cloud and just to provide a better
interface for the user to use the cloud.
So the broker actually does this reservation or on-demand launching from the cloud,
but the user just pays a fee like this. So this is a simple pricing strategy: let the users
share the aggregate cost in proportion to their instance hours.
>>: They just -- so every hour the price changes? So they can't do reserve. They
don't --
>> Di Niu: Yeah. Yeah, yeah. Every hour --
>>: Is it arguable that that's more complex?
>> Di Niu: Yeah, every hour the price changes.
>>: For the user cloud, how many users pay more with the broker than basically
without a broker?
>> Di Niu: Yeah. That's a very interesting question. So in this scheme, no user pays
more. All the users pay less, actually. Because now we take a look at the whole
demand curve, and each user just pays its portion under that curve.
>>: [inaudible].
>> Di Niu: Yeah.
>>: [inaudible] let's say I'm a user with a very fixed signal.
>> Di Niu: Okay.
>>: Right?
>> Di Niu: Okay.
>>: I can pay reserve price directly.
>> Di Niu: Yeah. Right.
>>: To basically Amazon.
>> Di Niu: Yeah.
>>: And that cost should be minimal, right?
>> Di Niu: Yeah.
>>: I mean, basically --
>> Di Niu: Yeah.
>>: How -- I wouldn't be able to save [inaudible] in that case, right?
>> Di Niu: Right, right, right, right, right. Yeah. Yeah. Actually, yeah, this problem is
very interesting.
So if we consider the real fair pricing strategy, it's going to be the Shapley value. And
under that strategy, some users pay even more under the broker. But here, as I just said,
of course, every user pays less than without the broker. But it's not fair --
>>: [inaudible] low-fluctuation user, my price will be lower, right? The price is not the
same on-demand for all users. I will have a lower price --
>> Di Niu: Yeah.
>>: Because of lower fluctuation.
>> Di Niu: Yeah, right, right, right. Yeah.
>>: For different users?
>> Di Niu: Yeah.
>>: Can you go back to the slide to see where basically that's reflected?
>> Di Niu: Oh, yeah. It's here. It's actually here. So here I will assume each user pays
the same fee for every instance hour.
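For a small example, the Shapley value that came up above can be computed exactly by averaging each user's marginal cost over all join orders. The sketch below assumes a hypothetical peak-based cost function (serving a coalition costs the peak of its aggregate demand), which is only a stand-in for the talk's actual model; it shows two anti-correlated users splitting the multiplexed cost:

```python
from itertools import permutations

T = 2  # time slots in the toy example
demand = {              # hypothetical, anti-correlated demand curves
    "night": [8, 0],
    "day":   [0, 8],
}

def cost(coalition):
    """Toy coalition cost: peak of the members' summed demand."""
    if not coalition:
        return 0
    return max(sum(demand[u][t] for u in coalition) for t in range(T))

def shapley(users):
    """Average each user's marginal cost over all join orders."""
    shares = {u: 0.0 for u in users}
    orders = list(permutations(users))
    for order in orders:
        so_far = []
        for u in order:
            shares[u] += cost(so_far + [u]) - cost(so_far)
            so_far.append(u)
    return {u: s / len(orders) for u, s in shares.items()}

print(shapley(["night", "day"]))  # {'night': 4.0, 'day': 4.0}
```

Alone, each user would pay 8 (its own peak); together the aggregate peak is still 8, and the Shapley value splits that shared peak evenly, so each pays 4.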
>>: [inaudible] actually use that [inaudible].
>> Di Niu: Yeah.
>>: [inaudible] aggregate price calculated [inaudible] for that hour's use. And
then next hour it will be different.
>>: Yeah, but in that case the on-demand is always satisfied. You see how many
EC2 instances and then it's always --
>>: You don't know the price before you --
>>: You don't know. But it's going to be cheaper than --
[brief talking over].
>>: Your first argument is any user [inaudible] than basically the old scheme. If this is
basically the pricing strategy [inaudible] I don't think this property holds. Because there
may be users [inaudible] right? I mean, [inaudible].
>>: [inaudible] your own reservation.
>>: [inaudible].
>>: You are saying if you [inaudible].
>>: [inaudible] your own reservation.
>>: Right. Yeah.
>>: [inaudible]. Your benefit gets [inaudible].
>>: Here basically the reason I'm arguing this.
>> Di Niu: Yeah.
>>: Because potentially all the users with no savings, they will not join the system. They
will not join the broker.
>> Di Niu: Right, right, right.
>>: Right?
>> Di Niu: Right.
>>: And if you remove those users for the remaining user, will you still be able to save?
>> Di Niu: If you remove the users --
>>: If you remove all those smooth users.
>> Di Niu: Okay. Okay. They do not have an incentive to join, right? Okay.
>>: [inaudible] reservation and then [inaudible] lower than this one. Because this one
actually does some on-demand thing, right?
>> Di Niu: Yeah. Right. Okay. Yeah. This is the -- yeah. The problem -- I was
confused before submitting the paper, actually.
>>: [inaudible] maybe let's discuss this off --
>> Di Niu: Yeah.
>>: Another quick question is for your trace, do you always assume user starts at the
hour boundary, or they can actually start anywhere?
>> Di Niu: They can start anywhere.
>>: [inaudible].
>> Di Niu: Yeah.
>>: So --
>> Di Niu: So that's why there's some partial usage.
>>: [inaudible]. How much does the partial usage contribute to that [inaudible]? Instead
of doing hourly, think you can do 10-minute [inaudible].
>> Di Niu: Okay. Okay. How much is the -- okay.
[brief talking over].
>> Di Niu: So if there's no reservation --
>>: In my office.
>> Di Niu: Then the partial usage saving is here. So that's the amount of partial saving -- partial usage saving. Yeah.
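The partial usage saving being discussed comes from the billing granularity: under hourly billing a partial hour is charged as a full hour, while finer-grained billing (such as the 10-minute granularity the questioner suggests) charges closer to actual runtime. A small sketch with a hypothetical rate:

```python
import math

RATE_PER_HOUR = 0.10   # hypothetical on-demand rate, $/hour

def billed_cost(runtime_minutes, granularity_minutes):
    """Charge for runtime rounded up to the billing granularity."""
    units = math.ceil(runtime_minutes / granularity_minutes)
    return units * granularity_minutes / 60 * RATE_PER_HOUR

job = 70  # a job that runs 1 hour 10 minutes
print(billed_cost(job, 60))  # hourly billing charges two full hours
print(billed_cost(job, 10))  # 10-minute billing charges only 70 minutes
```

The gap between the two charges is exactly the partial usage saving for that job.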
>>: Here's the schedule for the day.
>> Di Niu: Sure.
>>: I'm sure you are in very good hands. And then it's Ken and after that it's basically
Jay. [inaudible].
>> Di Niu: Okay. Okay. Thank you so much.
>>: I think basically --
>> Di Niu: Okay.