Dynamic Scaling of Virtual Clusters with Bandwidth Guarantee in Cloud Datacenters

Lei Yu
Zhipeng Cai
Department of Computer Science, Georgia State University, Atlanta, Georgia
Email: [email protected], [email protected]
Abstract—Network virtualization with bandwidth guarantee is
essential for the performance predictability of cloud applications
because of the shared multi-tenant nature of the cloud. Several
virtual network abstractions have been proposed for the tenants
to specify and reserve their virtual clusters with bandwidth
guarantee. However, they require pre-determined fixed cluster
size and bandwidth, and do not support the scaling of the cluster
in size and bandwidth requirements. On the other hand, the
existing works on virtual cluster scaling focus on dynamically
adjusting the cluster size without considering any bandwidth
guarantee targeted by current network abstractions. To fill
the gap, this paper considers the problem of scaling up a
virtual network abstraction with bandwidth guarantee. Efficient
algorithms are proposed to find the valid allocation for the scaled
cluster abstraction with optimization on the VM locality of the
cluster. We also point out the case where a virtual cluster cannot be
scaled without changing its original VM placement, and propose
an optimal allocation algorithm that exploits the VM migration
to address this issue while minimizing the total migration cost
for the virtual cluster scaling. Extensive simulations demonstrate
the effectiveness and efficiency of our algorithms.
I. Introduction
With modern virtualization technologies, cloud computing
offers infrastructure as a service that provisions virtualized
computing resources over the Internet with a pay-as-you-go
pricing model, such as Amazon EC2 [1]. The resource virtualization in the cloud datacenter enables efficient resource
sharing and multiplexing across multiple tenants as well as
on-demand resource provision. However, a major concern for
moving applications to the cloud is the lack of performance
guarantee in the cloud. The applications of different tenants
compete for the shared datacenter network resources, and the
lack of bandwidth guarantees in today's public cloud platforms causes
them to suffer from unpredictable performance [2].
To efficiently virtualize the network resource with bandwidth guarantee in the cloud, several virtual network abstractions [3]–[7] have been proposed, which are used to express
the network requirements of the tenants. In these abstractions,
a tenant can specify the number of virtual machines (VMs), the
topology of the virtualized network, and bandwidth requirements between VMs. Most of these abstractions are variants
of the hose model, in which all VMs are connected to a
central virtual switch by a bidirectional link with the specified
bandwidth, like Oktopus [4], TIVC [5] and SVC [6]. Other
network abstractions either specify bandwidth requirements
between every pair of VMs as virtual pipes [3], or are derived
based on the true application communication structure [7]. All
these abstractions not only provide a simple and accurate way
for the tenants to specify their network demands, but also
facilitate the network resource allocation for the datacenter.
Various algorithms are proposed along with them for allocating
VM and bandwidth resources in the datacenter to realize
corresponding network abstractions.
The resource virtualization of the datacenters provides elasticity [8], a key feature of the cloud, which enables capacity
scaling up and down on demand for the cloud applications to
adapt to the changes in workloads. It is achieved through on-demand dynamic resource scaling either at the VM level [9]–
[12], namely, on the CPU, memory and I/O resources allocated
to a VM, or at the cluster level [12]–[14], namely, on the
number of VMs or VM types with different capacities in the
virtual cluster. Such resource scaling is not only an efficient
approach to maintaining the performance of cloud applications
under varying workloads but also improves the resource utilization of the datacenter.
The existing works on virtual cluster scaling [12]–[14],
however, overlook the network requirement targeted by virtual
network abstractions. They focus on dynamically adjusting the
number of VMs in a virtual cluster, but without maintaining
any bandwidth guarantee along with the cluster scaling. On
the other hand, existing virtual network abstractions [3]–[7]
use pre-determined fixed bandwidth and number of VMs, and
they cannot support the elastic scaling of the cluster in size
and bandwidth requirements. Given a virtual cluster that has
been deployed in the datacenter, the shrinking of its size and
bandwidth requirement can be trivially performed by releasing
the unneeded VMs and bandwidths. However, its expansion is
not trivial. The increase of cluster size and VM bandwidth in
a virtual network abstraction is limited by available VM slots
and network bandwidth in the datacenter. The cloud might
not be able to accommodate the scaled network abstraction
without changing its original VM placement. It is critical
to properly and efficiently allocate/re-allocate the VMs for
scaling while maintaining the bandwidth guarantee. To the best
of our knowledge, no previous works have been proposed to
address the scaling of virtual network abstractions. Therefore,
we aim to fill this gap in this paper.
In this paper, we address the problem of scaling up a virtual
network abstraction with bandwidth guarantee. We consider
the common hose model based virtual cluster abstraction [4]–
[6], denoted by < N, B >, in which N VMs are connected to a
virtual switch with links each having specified bandwidth B.
We focus on its scaling in size from < N, B > to < N′, B >
(N′ > N) with unchanged bandwidth B. We identify the
challenging issues arising from such scaling, and propose
efficient allocation algorithms to determine the VM placement
for the scaled cluster. Specifically, we first propose a dynamic
programming algorithm to search for the placement of the N′ − N
additional VMs in the abstraction that can still ensure the
bandwidth B for each VM through appropriately reserving
bandwidth on network links. The algorithm is further improved
to maximize the VM locality such that the newly added VMs
can be placed as close as possible to the pre-existing VMs
to reduce the communication latency and network overhead.
Furthermore, we point out that a virtual cluster might not be
scaled without making changes to the original VM placement
and in this case the above algorithm cannot find a valid
placement. To address this issue, we exploit the VM migration and develop an optimal algorithm to allocate the scaled
cluster that minimizes the total VM migration cost. Extensive
simulations demonstrate the effectiveness and efficiency of
our algorithms. The results indicate that the scalability of
virtual clusters is seriously hindered by their pre-existing VM
placement but can be significantly improved at small VM
migration cost. Last but not least, in the discussion part,
we show that our algorithms can be easily used to solve other
kinds of virtual cluster scaling, including the scaling from
< N, B > to < N, B′ > (B′ > B) and from < N, B > to
< N′, B′ > (N′ > N, B′ > B).
The rest of this paper is organized as follows. Section II
introduces the related work. Section III describes the problem
and presents our first scaling allocation algorithm. Section
IV and V present the scaling allocation algorithms with
locality optimization, and with allowing VM migration while
optimizing its total cost, respectively. Section VI shows our
simulation results. Section VII presents our discussion and
Section VIII concludes the paper.
II. Related Work
A. Virtual Network Abstraction with Bandwidth Guarantee
Several virtual network cluster abstractions [3]–[7] have
been proposed for efficient network resource allocation with
bandwidth guarantee. Oktopus [4] proposes a virtual cluster
abstraction based on the hose model, denoted by < N, B >,
in which N VMs are connected to a central virtual switch by
a bidirectional link with the specified bandwidth B. Based on
that, a temporally interleaved virtual cluster model TIVC [5] is
further proposed, which allows different bandwidth specifications during different time intervals to address the time-varying
bandwidth requirements of many cloud applications. SVC [6]
incorporates the information of the statistical distributions
of bandwidth demand into the virtual cluster abstraction to
address the demand uncertainty. CloudMirror [7] proposes a
network abstraction that represents the true application communication structure and distinguishes bandwidth requirements
for different components and their inter-communication in
an application. Secondnet [3] proposes a virtual datacenter
abstraction that describes the bandwidth requirements between
each pair of VMs and uses a central controller to determine
the flow rate and the path for each VM-to-VM pair. To
achieve bandwidth guaranteed virtual to physical mapping,
various VM allocation algorithms and bandwidth enforcement
mechanisms have been proposed along with these network abstractions.
B. Elastic Resource Scaling
A number of works have been proposed to dynamically
scale the resources either at VM level or at cluster level or both
TABLE I: Notations
C, C′ — The original cluster and the corresponding scaled cluster
N, N′ — The original cluster size (the number of VMs) and the new size (N′ > N)
T_v — The subtree rooted at the vertex v
M_v — The allocable #VM set of the vertex v
T_v[k] — The subtree consisting of the root v and its first k child subtrees
S_v[k] — The set containing the numbers of VMs that can be allocated into the subtree T_v[k] regardless of v's uplink bandwidth
m_v — The number of VMs of the original cluster C located in the subtree T_v
to meet the performance requirement of cloud applications
under workload changes. Padala et al. [9] proposed a feedback control system to dynamically allocate CPU and disk
I/O resources to VMs based on an online model that captures
the relationship between the allocation of resource shares
and the application performance. Shen et al. [10] proposed
a prediction-driven automatic resource scaling system, which
uses adaptive estimation error correction for resource demand
to minimize the impact of under-estimation error, and support
multi-VM concurrent scaling with resolving scaling conflicts
on the same PM by VM migration. Gong et al. [11] proposed
a lightweight online prediction scheme to predict the resource
usage of VMs in an application and perform resource scaling
based on the prediction results. AGILE [13] is a resource
scaling system at the cluster level, which dynamically adjusts
the number of VMs assigned to a cloud application with live
VM replication to meet the application's service
level objectives. Herodotou et al. [14] proposed an automatic
system to determine the cluster size and VM instance type
required to meet desired performance and cost for a given
workload. Han et al. [12] proposed an approach that integrates
fine-grained resource scaling at the VM level
with cluster size scaling. These cluster-level scaling approaches
do not consider the bandwidth guarantee targeted by the virtual
network abstractions.
III. Virtual Cluster Scaling
A. Problem description
In this paper, we consider the hose model based virtual
cluster abstraction [4]–[6], which consists of N homogenous
VMs, with bandwidth B for each VM, denoted by < N, B >.
There are three types of problem instances for scaling up a
virtual cluster < N, B >, which are either to increase the cluster
size from N to N′ (N′ > N), or to increase the bandwidth
from B to B′ (B′ > B), or to increase both. We mainly focus
on the scaling of the abstraction < N, B > to < N′, B >, and in
Section VII we show that the algorithms proposed for it can
also efficiently solve the other two types of scaling problems. As
in previous works [4]–[6], we assume a typical tree topology for
the datacenter, from physical machines at level 0 to the root at
level H. Each physical machine is divided into multiple slots
where tenant VMs can be placed. For clarity, Table I lists
the notations used in this paper.
B. Scaling Allocation Algorithm
Suppose that a virtual cluster C = < N, B > has been deployed
in a datacenter.

Fig. 1: An example showing the difference in valid allocation between virtual cluster scaling and new virtual cluster allocation. Before scaling, 2 VMs and 1 VM sit on the two sides of link L (reservation 100); after adding Δ_L^l = 1 and Δ_L^r = 3 VMs, the required reservation is 300, while R_L = 100.

Consider a link L in the network; it connects two separate network components in the tree topology.
Suppose that the virtual cluster has m (N ≥ m ≥ 0) VMs in one
component and hence N − m VMs in the other. Because each
VM cannot send or receive at a rate more than B, the maximum
bandwidth needed on link L is min{m, N − m} ∗ B. Therefore,
the amount of bandwidth reservation for the virtual cluster on
link L should be min{m, N −m}∗ B in order to guarantee access
bandwidth B for each VM.
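As a minimal sketch (Python, with hypothetical names; not code from the paper), the hose-model reservation rule can be written as:

```python
def hose_reservation(m: int, n: int, b: float) -> float:
    """Bandwidth to reserve on a link that splits an <N, B> hose-model
    cluster into components of m and n - m VMs: traffic across the link
    is bounded by the smaller side sending/receiving at full rate b."""
    return min(m, n - m) * b

# A cluster <N=4, B=100> split 1/3 across a link needs min{1,3}*100 = 100;
# split 2/2 it needs min{2,2}*100 = 200.
print(hose_reservation(1, 4, 100), hose_reservation(2, 4, 100))
```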
When the size of the virtual cluster needs to be increased
to N′ (N′ > N), Δ = N′ − N new VMs have
to be allocated in the datacenter. Let the numbers of new
VMs allocated to the two components connected by link
L be Δ_L^r (Δ_L^r ≥ 0) and Δ_L^l (Δ_L^l ≥ 0), respectively. Then,
the bandwidth reservation on link L should be increased to
min{m + Δ_L^r, N − m + Δ_L^l} ∗ B, which obviously is not less than the
previous bandwidth reservation min{m, N − m} ∗ B.
If min{m + Δ_L^r, N − m + Δ_L^l} > min{m, N − m}, the residual
bandwidth on link L has to be enough for the increase of
the bandwidth reservation. The increment depends on the
allocation of new VMs, which results in different Δ_L^r and Δ_L^l.
If L does not have enough bandwidth for the increment of
bandwidth reservation, such allocation is not valid. Therefore,
the problem of increasing the size of a virtual cluster can be
solved through finding the valid allocation for Δ VMs to be
added. We can regard these VMs as a complete new virtual
cluster and search its valid allocation in the datacenter with
a dynamic programming approach similar to the methods in
TIVC [5] and SVC [6]. Let the residual bandwidth of link
L be RL . The key difference compared to the allocation for a
new virtual cluster is that, the validity of VM allocation across
any link L should be determined by the following condition
R_L ≥ (min{m + Δ_L^r, N − m + Δ_L^l} − min{m, N − m}) ∗ B    (1)
rather than R_L ≥ min{Δ_L^r, Δ_L^l} ∗ B. These two conditions are not
equivalent, as exemplified in Figure 1. It shows a virtual cluster
of size 3 having 2 VMs and 1 VM in two subtrees respectively
before scaling. The link L has 100Mbps residual bandwidth.
Suppose that the size is increased from 3 to 7, and Δ_L^l = 1
and Δ_L^r = 3 VMs are placed into the two subtrees, respectively.
Considering new VMs as a separate virtual cluster, 100Mbps
residual bandwidth on L is sufficient for such allocation.
However, for scaling, the resulting allocation for the scaled
virtual cluster requires 300Mbps bandwidth on link L and
hence 200Mbps increment of bandwidth reservation for the
virtual cluster, which cannot be satisfied by link L.
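The contrast between condition (1) and the naive check can be replayed with the numbers of this example (an illustrative sketch; function and parameter names are ours):

```python
def valid_scaling_on_link(m, n, d_same, d_other, b, residual):
    """Condition (1): the link must absorb the growth of the hose-model
    reservation, not just the traffic between the new VMs themselves.
    m VMs of the original n-VM cluster sit on one side of the link;
    d_same new VMs join that side, d_other join the other side."""
    old = min(m, n - m) * b
    new = min(m + d_same, (n - m) + d_other) * b
    return new - old <= residual

# Figure 1: cluster of 3 split 2/1, B = 100 Mbps, 100 Mbps residual on L.
# One new VM joins the 2-VM side, three join the 1-VM side:
# reservation grows from 100 to 300, so the 200 increment does not fit.
print(valid_scaling_on_link(2, 3, 1, 3, 100, 100))   # prints False
# The naive per-new-VM check min{1, 3} * 100 <= 100 would wrongly accept.
print(min(1, 3) * 100 <= 100)                        # prints True
```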
Based on (1), we introduce the dynamic programming based
search algorithm for allocating additional VMs for cluster
scaling. The algorithm requires the information of the network
topology, the number of empty slots in each physical machine
(PM, for short), the residual bandwidth of each link, and
the current placement of the virtual cluster to be scaled. The
algorithm starts from the leaves (i.e., PMs) and traverses the
tree topology level-by-level in bottom-up manner. The vertices
at the same level are visited in left-to-right order. For each
vertex v, the algorithm determines the number of VMs that can
be allocated into the subtree rooted at v based on the recorded
results for each of its children. There may be multiple feasible
numbers of VMs that can be allocated into a subtree. Thus, the algorithm
computes an allocable #VM set for each vertex that contains
all the feasible numbers. Given a vertex v, its allocable
#VM set is denoted by Mv . We first introduce the computation
process at the leaves and then describe the general procedure
at a vertex that has children. Consider a virtual cluster C of
size N and VM bandwidth B, with Δ new VMs to
be added for scaling.
(1) Suppose that v is a leaf at level 0 (i.e., a PM), which
has c_v empty slots and m_v VMs from the virtual
cluster C. The number of VMs that can be allocated in v,
denoted by Δ_v, is constrained by both the available VM slots
and the residual bandwidth on the link L connecting v to the
upper level. Thus, the first condition for the allocability is that
Δ_v ≤ c_v. If Δ_v VMs are placed at v, Δ − Δ_v VMs are placed on
the other side of link L. As a result, link L splits the scaled
virtual cluster into two components with Δ_v + m_v VMs and
N − m_v + Δ − Δ_v VMs, respectively. Accordingly, the bandwidth
reservation required on link L is min{Δ_v + m_v, N − m_v + Δ − Δ_v} ∗ B.
Based on (1), the second condition for the allocability is
(min{Δ_v + m_v, N − m_v + Δ − Δ_v} − min{m_v, N − m_v}) ∗ B ≤ R_L    (2)
where RL is the residual bandwidth of link L.
To compute v's allocable #VM set M_v, the algorithm checks
condition (2) for each number from 0 to min{Δ, c_v}, and
adds the number to M_v if the condition holds.
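The leaf-level computation can be sketched as follows (illustrative names; we also check 0, since even placing no new VMs on a PM must satisfy condition (2) with respect to its uplink):

```python
def leaf_allocable_set(delta, c_v, m_v, n, b, r_l):
    """Allocable #VM set of a PM leaf v: every d with 0 <= d <= min(delta, c_v)
    whose reservation growth on v's uplink fits the residual bandwidth r_l,
    per condition (2). m_v is the number of pre-existing VMs of the
    n-VM cluster on v; delta is the total number of VMs being added."""
    allocable = set()
    for d in range(0, min(delta, c_v) + 1):
        grow = (min(d + m_v, n - m_v + delta - d) - min(m_v, n - m_v)) * b
        if grow <= r_l:
            allocable.add(d)
    return allocable

# Leaf with 2 free slots holding 1 VM of a 3-VM cluster (B = 100),
# scaling by delta = 2, 100 Mbps residual uplink bandwidth:
print(leaf_allocable_set(2, 2, 1, 3, 100, 100))   # prints {0, 1, 2}
```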
(2) Suppose that v is a vertex that has n children,
denoted by v_1, v_2, . . . , v_n. The corresponding allocable #VM
set of each child vk is Mvk . Similar to previous works [5], [6],
the algorithm visits all of v’s children in left-to-right order.
For each child vk , the algorithm iteratively computes a set,
denoted by S v [k], that contains the number of VMs that can
be allocated in the subtree consisting of v and the first k child
subtrees, without considering the uplink bandwidth constraint
of v. Figure 2 shows the part counted by S v [k], computed as
S_v[k] = {m | m = a + b, where a ∈ M_{v_k} and b ∈ S_v[k − 1]}    (3)
where S_v[0] = {0}. Finally, we can obtain S_v[n]. Each number
in S_v[n] is a candidate for Δ_v, i.e., the number of VMs that
can be allocated in the subtree T v rooted at v, regardless of
the bandwidth constraint of v’s uplink that connects v to the
upper level. Then, for each candidate, the algorithm checks
its validity by verifying the condition (2), given RL being the
residual bandwidth of v’s uplink and mv being the number of
preexisting VMs of the virtual cluster C in T v . If the value
passes the validation, it is added to Mv .
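The set combination in Formula (3) folds the children's allocable sets together one at a time, in the style of a subset-sum table; a compact sketch (illustrative names):

```python
def combine_child_sets(child_sets):
    """S_v[k] from Formula (3): all pairwise sums of each child's
    allocable #VM set with the running set, starting from S_v[0] = {0}."""
    s = {0}
    for m_child in child_sets:
        s = {a + b for a in m_child for b in s}
    return s

# Two children that can take {0, 1} and {0, 2} VMs respectively:
print(sorted(combine_child_sets([{0, 1}, {0, 2}])))   # prints [0, 1, 2, 3]
```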
During the search process, the algorithm memorizes the
element a contributing to m in (3) for each m in S v [k]. Such
information is then used by backtracking to find the number
Fig. 2: The computation of S_v[k] is based on the subtree T_v[k].
of VMs that should be allocated to vertex v_k. The search
continues until it finds a vertex v whose M_v contains Δ,
i.e., the number of VMs to be added. Such a vertex is referred
to as an allocation point. After that, starting from Δ at the
allocation point, a backtracking algorithm uses the memorized
information to find the number of VMs allocated to each
subtree level-by-level in top-down manner. The backtracking
procedure is the same as in previous works [5], [6]. Eventually,
we can obtain the number of VMs to be added on each PM.
The search algorithm has a time complexity of O(Δ²|V|D),
where |V| is the number of vertices in the network and D is
the maximum number of children of any vertex. We
also note that this algorithm can be reduced to an algorithm
for allocating a complete virtual cluster, similar to those in [5],
[6], by regarding Δ as the size of the virtual cluster and zero VMs
as its existing allocation.
IV. Scaling with Locality Optimization
The previous algorithm adds new VMs to the datacenter
without considering the existing placement of the virtual
cluster. This may cause the newly added VMs to be distant
from the locations of previously allocated VMs. However,
good spatial locality is desired for the allocation of a virtual
cluster because it not only reduces the communication latency
among VMs but also can conserve the bandwidth of the links
in the upper levels. In this section, we consider how to scale
a virtual cluster with maximizing its VM spatial locality.
We measure the spatial locality of a virtual cluster by
the average distance (in hops) between any pair of VMs in
the cluster. Let C be a virtual cluster of size N and C′ be
the scaled C with larger size N′, and let V_C, V_C′ be their VM
sets, respectively. The VM set added for the scaling can be
represented by V_C′ \ V_C. The average VM-pair distance of C′,
denoted by avgd_C′, is calculated as

avgd_C′ = (2 / (N′(N′ − 1))) · Σ_{i, j ∈ V_C′} h(i, j)
        = (2 / (N′(N′ − 1))) · ( Σ_{i, j ∈ V_C} h(i, j) + Σ_{i ∈ V_C′ \ V_C, j ∈ V_C′} h(i, j) )

where h(i, j) is the number of hops between the two physical
machines where VMs i and j are placed. A smaller avgd_C′
indicates higher spatial locality. Σ_{i, j ∈ V_C} h(i, j) is determined
by the pre-existing placement of C. Thus, to minimize avgd_C′,
we need to find the optimal placement of the new VMs that
minimizes the total distance between each new VM and all the
other VMs in C′.
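For concreteness, the locality metric can be computed as follows (a sketch with hypothetical data structures; `hops` would be derived from the tree topology):

```python
from itertools import combinations

def avg_pair_distance(placement, hops):
    """avgd: mean hop distance over unordered VM pairs. `placement` maps
    each VM id to its host PM; `hops[p][q]` is the PM-to-PM hop count.
    Equals 2/(N(N-1)) times the sum over all unordered pairs."""
    vms = list(placement)
    n = len(vms)
    total = sum(hops[placement[i]][placement[j]]
                for i, j in combinations(vms, 2))
    return 2 * total / (n * (n - 1))

# Three VMs, two co-located on PM 'A', one on PM 'B' two hops away:
# pair distances are 0, 2, 2, so the average is 4/3.
placement = {1: 'A', 2: 'A', 3: 'B'}
hops = {'A': {'A': 0, 'B': 2}, 'B': {'A': 2, 'B': 0}}
print(avg_pair_distance(placement, hops))
```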
A straightforward approach is to use the algorithm in the
previous section to find all the feasible allocations for the
scaling, calculate the average VM pair distance of the scaled
cluster for every allocation and choose the one with the
minimum avgd_C′. However, the number of feasible allocations
can be huge. By observing Formula (3), we can see that the
different combinations of a and b can result in the same value
of m, like m = 3 for both (a = 1, b = 2) and (a = 2, b = 1). This
indicates that there may be many valid choices for determining how
the VMs are allocated into each child subtree at every vertex.
From an allocation point down to the physical machines, the
combinations of different choices at each vertex are multiplied
level-by-level, each leading to different spatial locality. Thus,
this approach needs to examine all these combinations, which
is inefficient. Instead, we can show that the problem has the
optimal substructure and can be efficiently solved by dynamic programming.
Consider placing Δ_v new VMs into v's subtree T_v for scaling
C. Let V′_{T_v} be this new VM set and V_{T_v} be the VM set
containing C's pre-existing VMs in T_v. Then, given a feasible
allocation A, the sum of the distances between each new VM in
V′_{T_v} and all the other VMs in V_{T_v} ∪ V′_{T_v} is denoted by d_{T_v}(Δ_v, A),
i.e., d_{T_v}(Δ_v, A) = Σ_{i ∈ V′_{T_v}, j ∈ V_{T_v} ∪ V′_{T_v}} h(i, j). Let d*(T_v, Δ_v) be the
minimum of d_{T_v}(Δ_v, A) over every feasible allocation A
that places Δ_v VMs into T_v. Now we consider the subtree that
consists of the vertex v and its first k child subtrees, as shown
in the dashed box in Figure 2, denoted by T_v[k]. Similarly, given
an allocation A that places e new VMs into T_v[k], d_{T_v[k]}(e, A)
represents the sum of the distances between each new VM
and all the other VMs of the scaled cluster in T_v[k], and
d*(T_v[k], e) is the minimum over all feasible allocations.
Suppose that an allocation A assigns x new VMs to the
(k+1)-th child subtree T_{v_{k+1}} and e new VMs to the subtree
T_v[k]. Then, it is easy to see that d_{T_v[k+1]}(e′, A), where e′ = x + e,
is the sum of the distances counted within T_v[k] and within T_{v_{k+1}},
plus those of the VM pairs between the x new VMs in T_{v_{k+1}} and the VMs in T_v[k].
If vertex v is at level l, the distance between any two VMs
from two different child subtrees of v is 2l. The
number of VM pairs between T_{v_{k+1}} and T_v[k] is x · (e + Σ_{i=1}^{k} m_{v_i}),
where m_{v_i} is the number of C's pre-existing VMs in the child
subtree T_{v_i}. Thus, we have

d_{T_v[k+1]}(e′, A) = d_{T_v[k]}(e, A) + d_{T_{v_{k+1}}}(x, A) + 2l · x · (e + Σ_{i=1}^{k} m_{v_i})    (5)
For each element e′ ∈ S_v[k + 1], according to (3), there can
exist multiple combinations of the VM numbers allocated
to the subtrees T_{v_1} through T_{v_{k+1}} that sum up to e′.
Based on the above equation (5), we can derive the following
recursive equation:
d*(T_v[k + 1], e′) = min_{(x, e) ∈ Ψ(e′)} { d*(T_v[k], e) + d*(T_{v_{k+1}}, x) + 2l · x · (e + Σ_{i=1}^{k} m_{v_i}) }
d*(T_v[1], e) = d*(T_{v_1}, e)

where Ψ(e′) = {(x, e) | x + e = e′, x ∈ M_{v_{k+1}}, e ∈ S_v[k]}; i.e., the
min is taken over all pairs of x and e whose
sum equals e′, with x ∈ M_{v_{k+1}} and e ∈ S_v[k]. For each x ∈ M_v,
d*(T_v, x) = d*(T_v[n], x), where n is the number of v's children.
The above formula indicates the optimal substructure of our
problem, and based on that we can easily derive a dynamic
programming algorithm to solve the problem.
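A compact sketch of that dynamic programming step (illustrative; it mirrors the recursion above, folding in one child subtree at a time):

```python
def min_added_distance(child_tables, pre_vms, level):
    """Sketch of the recursion for d*(T_v[k], e). child_tables[i] maps a
    feasible number x of new VMs in child i to d*(T_{v_i}, x); pre_vms[i]
    is m_{v_i}, the pre-existing VMs of C in child i; v sits at `level`.
    Returns a dict mapping e to d*(T_v[n], e)."""
    d = {0: 0}          # d*(T_v[0], 0): no children folded in yet
    seen_pre = 0        # running sum of m_{v_1..v_k}
    for table, m_vi in zip(child_tables, pre_vms):
        nxt = {}
        for x, dx in table.items():
            for e, de in d.items():
                # cross pairs: x new VMs in the next child vs. the e new
                # and seen_pre old VMs already in T_v[k], each 2*level hops
                cost = de + dx + 2 * level * x * (e + seen_pre)
                if cost < nxt.get(x + e, float("inf")):
                    nxt[x + e] = cost
        d = nxt
        seen_pre += m_vi
    return d

# Two children, each able to take 0 or 1 new VM at internal cost 0;
# the first child already holds one VM of C; v is at level 1.
print(min_added_distance([{0: 0, 1: 0}, {0: 0, 1: 0}], [1, 0], 1))
```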
To scale a virtual cluster with bandwidth guarantee while
achieving optimal locality, we combine the above algorithm
with the algorithm for searching a valid allocation in the
previous section. Actually, during the computation of the
allocable #VM set Mv of a vertex v, we can simultaneously
compute d∗ (T v , e) for each e ∈ Mv . Besides, note that there can
be multiple allocation points in the tree and each may lead to
different average VM-pair distance for the scaled cluster. Thus,
to find the optimal allocation among all possible allocation
points, the algorithm traverses the whole tree, discovers all the
allocation points, and chooses the one with minimum d∗ (T v , Δ)
where v is an allocation point and Δ = N′ − N. For clarity, the
pseudocode of the complete algorithm is given in Algorithm
1. The dynamic programming for maximizing the locality
is in Lines 14–23, and the backtracking information is
recorded in D_v[x + e, i]. It is easy to see that Algorithm 1 has
the same time complexity O(Δ²|V|D) as the previous algorithm
in Section III.
V. Scaling with VM Migration
The algorithm proposed in the previous section aims to
properly allocate VMs to be added for scaling a virtual cluster.
However, it may fail to find a feasible solution even
when the datacenter has the resources to accommodate the virtual
cluster with the scaled size. Figure 3 shows an example where a
network consists of four PMs, each with four VM slots and
connected by 1Gbps links. Two virtual clusters VC1 and VC2
are deployed. VC1 has 5 VMs and the bandwidth for each is
500Mbps. VC2 has 2 VMs with bandwidth 200Mbps each. On link
L, the bandwidth reservation for VC1 is 500Mbps and for VC2 is
400Mbps, and thus the residual bandwidth on L is 100Mbps. Suppose
that a new VM needs to be added to scale VC2 up to 6 VMs.
It can be easily verified that, no matter to which empty slot it
is allocated, the scaled virtual cluster needs 600Mbps of bandwidth on
link L. This requires a 200Mbps increment of the bandwidth reservation
for VC2 on L, more than the 100Mbps residual bandwidth of L.
That is, no feasible allocations can be found for the new VM
and thus VC2 cannot be scaled up.
However, in Figure 3(a), it is obvious that the subtree T u2
can accommodate the whole virtual cluster VC2 with scaled
size 6. This indicates that existing VM placement could hinder
the scalability of a virtual cluster. To address this issue, we
exploit VM migration, with which it is possible to change
the layout of existing VMs to accommodate new VMs. As an
example, in Figure 3(b), after a VM is migrated from T u1 to
T u2 , VC2 can add another new VM in T u2 without increasing
the bandwidth reservation on link L. Thus, in this section
we consider VM allocation that allows VM migration for
scaling a virtual cluster. Because VM migration incurs VM
service downtime [15] while performance isolation among
different virtual clusters is desired in cloud, we only consider
migrating VMs from the virtual cluster that is to be scaled,
rather than VMs in other virtual clusters, such that the scaling
of a virtual cluster does not interrupt other virtual clusters.
Since VM migration incurs significant traffic and service
downtime [15], [16], it is essential to minimize the VM migration cost for virtual cluster scaling. Thus, only if no proper
Algorithm 1: Scaling Allocation Algorithm
Input: Tree topology T, a virtual cluster C = < N, B > to be
scaled to < N′, B >, bandwidth allocation information on
each link.
 1: Δ ← N′ − N;
 2: AllocationPointSet ← ∅;
 3: for level l ← 0 to Height(T) do
 4:   for each vertex v at level l do
 5:     n ← the number of v's children;
 6:     if l = 0 then  // leaf v is a PM; c_v is its number of empty VM slots
 7:       S_v[0] ← {0, 1, . . . , min{c_v, Δ}}; d*(T_v[0], e) ← 0 for each e ∈ S_v[0];
 8:     else
 9:       S_v[0] ← {0};
10:     for i from 1 to n do
11:       S_v[i] ← ∅;
12:       for each x ∈ M_{v_i} do
13:         for each e ∈ S_v[i − 1] do
14:           if x + e ∉ S_v[i] then
15:             d*(T_v[i], x + e) ← ∞;
16:             S_v[i] ← S_v[i] ∪ {x + e};
17:           if i = 1 then
18:             d(T_v[i], x + e) ← d*(T_{v_1}, x);
19:           else
20:             d(T_v[i], x + e) ← d*(T_v[i − 1], e) + d*(T_{v_i}, x) + 2 · l · x · (e + Σ_{k=1}^{i−1} m_{v_k});
21:           if d*(T_v[i], x + e) > d(T_v[i], x + e) then
22:             d*(T_v[i], x + e) ← d(T_v[i], x + e);
23:             D_v[x + e, i] ← x;
24:     M_v ← ∅;
25:     for each e ∈ S_v[n] do
26:       if (min{e + m_v, N − m_v + Δ − e} − min{m_v, N − m_v}) ∗ B ≤ R_{L_v} then  // R_{L_v} is the residual bandwidth of v's uplink
27:         M_v ← M_v ∪ {e};
28:         d*(T_v, e) ← d*(T_v[n], e);
29:     if M_v = ∅ then
30:       return false;
31:     if Δ ∈ M_v then
32:       AllocationPointSet ← AllocationPointSet ∪ {(v, d*(T_v, Δ), D_v)};
33: if AllocationPointSet ≠ ∅ then
34:   Choose the allocation point v with the minimum d*(T_v, Δ);
35:   Alloc(v, Δ, D_v);
36:   return true;
37: return false;

Procedure Alloc(v, x, D_v)
  if v is a machine then
    allocate x VMs into v;
  else
    for v's child i from n to 1 do
      Alloc(v_i, D_v[x, i], D_{v_i});
      Update the bandwidth allocation information on v's downlink to its i-th child;
      x ← x − D_v[x, i];
allocation can be found for the new VMs while keeping the existing
VM placement do we turn to the allocation approach that
exploits VM migration. The complete solution is described
by answering "when to migrate" and "how to migrate".
A. When to migrate
Algorithm 1 can find a feasible allocation for new VMs if it
exists. If the solution does not exist, the algorithm fails, which
occurs in the following two cases:
Fig. 3: Figure (a) shows the case where no feasible allocation
exists for scaling a virtual cluster; Figure (b) shows that the
virtual cluster VC2 can be scaled with one VM migration.
(VC1 VMs have 500Mbps bandwidth, VC2 VMs have 200Mbps, and the link capacity is 1Gbps.)
(a) The search reaches the root, and the algorithm finds no
vertex (including the root) in the tree whose allocable
#VM set contains Δ, i.e., the number of VMs to be added.
(b) During the search, if any vertex is found to have an
empty allocable #VM set, the solution does not exist and the
algorithm terminates. Given a vertex v, its allocable #VM set
M_v is generated by filtering out of the candidate set
(i.e., S_v[n] in Formula (3)) any elements that do not satisfy
v's uplink bandwidth constraint. M_v = ∅ means that the uplink
bandwidth constraint cannot be satisfied for any placement of
new VMs in the tree. In this case, even placing zero VMs
in v’s subtree T v is not a valid choice, which indicates an
essential difference between placing a complete virtual cluster
and scaling an existing cluster. Zero VMs in T v imply that all
new VMs are placed outside of T v but additional bandwidth
needs to be reserved for the traffic to and from existing VMs
in T v .
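This zero-VM case is just condition (2) evaluated at Δ_v = 0, which can be checked directly (an illustrative sketch with our own names):

```python
def zero_vm_feasible(m_v, n, delta, b, r_l):
    """Condition (2) with Delta_v = 0: even with no new VMs inside T_v,
    the delta VMs placed outside can raise the min-cut reservation on
    v's uplink beyond its residual bandwidth r_l."""
    grow = (min(m_v, n - m_v + delta) - min(m_v, n - m_v)) * b
    return grow <= r_l

# 3 VMs of a 5-VM cluster (B = 500) sit in T_v, 2 outside; adding 2 VMs
# outside raises the uplink reservation from 2*500 to 3*500, i.e. by 500,
# which a 100 Mbps residual cannot absorb.
print(zero_vm_feasible(3, 5, 2, 500, 100))   # prints False
```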
If either of the above cases happens, we terminate the previous
algorithm by returning false at line 30 or 37 in Algorithm
1 and turn to VM allocation with migration.
B. How to migrate
In this section we propose our allocation approach that
allows VM migration. We first look into the following fact:
Proposition 1: Assume that a virtual cluster C with size N
is to be scaled up to C′ with larger size N′ in the datacenter DC. If
a valid allocation can be found for the virtual cluster C′ in DC
where the original cluster C and its corresponding bandwidth
reservation are hypothetically removed, then C must be able
to scale up to size N′ with VM migration allowed, and vice versa.
This proposition is straightforward, since we can always
migrate C and the additional VMs to the slots where C′ is
hypothetically allocated. If we cannot find any solution for
C′, then C cannot be scaled up even with VM migration. Based
on that, our allocation algorithm first hypothetically removes
the virtual cluster C and releases its bandwidth reservation,
and finds valid allocations for the scaled cluster C , from
which it derives the final solution. This also guarantees that as
long as the datacenter has enough resource to accommodate
the scaled virtual cluster, our algorithm always finds a valid
allocation and will not falsely reject a scaling request that can
be satisfied. On the other hand, the allocation should minimize
the migration of VMs considering its significant overhead. To
do that, our algorithm aims to find the allocation that has
maximum overlap with existing VM placement of the virtual
cluster, which let the most VMs of C stay the same. We first
introduce the overlap measurement as follows:
Definition 1 (Allocation overlap size): Given a set of PMs
{P_1, P_2, ..., P_l}, suppose that one allocation assigns a_i VMs to
each P_i and another allocation assigns b_i VMs to each P_i.
Then the overlap size between the two allocations is
Σ_{i=1}^{l} min{a_i, b_i}.
For example, one allocation assigns 2, 3 and 1 VMs to P_1, P_2
and P_3 respectively, and another assigns 1, 2 and 2 VMs
to P_1, P_2 and P_3. Their overlap sizes on the individual machines are
1, 2 and 1 respectively, so the overlap size between them
over all three machines is 4. If the latter allocation
represents a virtual cluster C's existing VM placement and the
former represents its VM placement after scaling,
then 4 VMs of C can stay in place and only one VM on P_3
needs to be moved out.
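Definition 1 is a per-machine minimum summed over the PMs; a one-line sketch (the function name is ours):

```python
def overlap_size(alloc_a, alloc_b):
    # Definition 1: sum over PMs of the per-machine overlap min(a_i, b_i).
    return sum(min(a, b) for a, b in zip(alloc_a, alloc_b))
```

Applied to the example above, `overlap_size([2, 3, 1], [1, 2, 2])` gives 1 + 2 + 1 = 4.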
Next we describe how to find the allocation having the
largest overlap size with the original allocation of the cluster
C and how to migrate. Because we are now considering the
allocation of a whole cluster instead of only the new VMs, for
a vertex v, its allocable #VM set M_v is redefined as the set
containing the numbers of VMs of C′ that can be allocated
to v's subtree T_v.
1) Finding the allocation with the largest overlap size: For
each element x ∈ M_v, there are one or multiple valid ways to
allocate x VMs to the PMs in the subtree T_v. Each allocation
can have a different overlap size with the VM placement of the
original cluster C in T_v. Let O_{T_v}[x] be the maximum overlap
size among all these allocations that place x VMs in the
subtree T_v. Our algorithm calculates O_{T_v}[x] for each x ∈ M_v.
The procedure is similar to that for computing the maximum
locality in Section IV. Recall that T_v[k] is the subtree
consisting of v and v's first k child subtrees rooted at v_1, ..., v_k,
and that v has n child subtrees in total. Let O_{T_v[k]}[e] be the maximum
overlap size among all the feasible allocations that place e
VMs in T_v[k], where e ∈ S_v[k]. Then, for each element e′ ∈
S_v[k+1],

O_{T_v[k+1]}[e′] = max_{(x,e) ∈ Ψ(e′)} ( O_{T_v[k]}[e] + O_{T_{v_{k+1}}}[x] )

O_{T_v[1]}[e] = O_{T_{v_1}}[e]

where Ψ(e′) = {(x, e) | x + e = e′, x ∈ M_{v_{k+1}}, e ∈ S_v[k]}.
Finally, for each element e′ ∈ S_v[n], we have O_{T_v[n]}[e′]. If e′
satisfies the uplink bandwidth constraints, it is added into M_v
and the corresponding O_{T_v[n]}[e′] is kept along with it, referred
to as O_{T_v}[e′].
The algorithm starts the search from the leaf vertices (i.e.,
the PMs). For a leaf u, T_u is u itself. For each x ∈ M_u, O_{T_u}[x] is
the smaller of x and the number of VMs that the cluster C
originally has on the machine u. Then, level by level, for each
vertex, the algorithm computes its allocable #VM set and the
maximum overlap size associated with each element in the
set. As opposed to the previous algorithm in Section III, this
algorithm needs to find an allocation point that not only has N′ in its
allocable #VM set but also achieves the largest
overlap size, because there can be multiple allocation points
with different maximum overlap sizes associated with N′.
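The leaf initialization and the level-by-level merge above can be sketched as two small functions. This is a sketch under our own representation (dictionaries mapping a VM count to its maximum overlap size); the function names are ours and bandwidth filtering is omitted for brevity.

```python
def leaf_table(candidates, existing_on_pm):
    """O_{T_u}[x] for a leaf PM u: of x placed VMs, at most
    min(x, #VMs C already has on u) can overlap the old placement."""
    return {x: min(x, existing_on_pm) for x in candidates}

def merge_child(acc_table, child_table):
    """One DP step: combine the table for T_v[k] (count -> max overlap)
    with child v_{k+1}'s table to obtain the table for T_v[k+1]:
    O[e'] = max over x + e = e' of O_k[e] + O_child[x]."""
    merged = {}
    for e, oe in acc_table.items():
        for x, ox in child_table.items():
            merged[e + x] = max(merged.get(e + x, -1), oe + ox)
    return merged
```

For instance, merging a leaf table {0: 0, 1: 1} with a sibling table {0: 0, 1: 1, 2: 1} yields {0: 0, 1: 1, 2: 2, 3: 2}: placing 2 VMs across both leaves can keep both original VMs in place.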
Take Figure 3 as an example. At the level of u_1 and u_2, the
algorithm can find that u_2 is an allocation point. However, the
allocation point with the largest overlap size, 4, shown in Figure
3(b), actually exists in the subtree at the upper level.
Fig. 4: The allocation point with the largest maximum overlap
size can appear at the root. The boxes filled with backslashes
represent VMs from other virtual clusters.
Let T_{r_C} be the lowest-level subtree that contains the original
cluster C, rooted at r_C. Let S_C be the set that contains
the vertices of T_{r_C} and the ancestors of r_C. Obviously,
only the allocation points in S_C may have allocations
overlapping with C; any allocation point not in S_C must have
zero maximum overlap size. To find the optimal allocation
that achieves the largest overlap size, the algorithm needs to
determine all the allocation points in S_C and their maximum
overlap sizes. The reason to examine all the vertices in S_C is
that the optimal allocation point can be at any level. See Figure
4 for an example, in which 3 more VMs are added to extend
a virtual cluster of 3 VMs (represented by blue boxes) shown in
Figure 4(a). The two downlinks of u_1 have residual bandwidths 200
and 100. We assume sufficient bandwidth on the other links. The
allocation point u_1 with the allocation shown in Figure 4(b) has
a maximum overlap size of one, but the allocation point u at
the upper level achieves a higher maximum overlap size, two,
as shown in Figure 4(c). This example indicates that the
algorithm has to traverse the tree up to the root to determine
the optimal allocation point. Denote by S_C^a the set of allocation
points in S_C with non-zero maximum overlap size.
S_C^a = ∅ if either no allocation points exist in S_C or all the
allocation points in it have zero maximum overlap size. Then,
if S_C^a ≠ ∅, the algorithm chooses the allocation point with the largest
maximum overlap size in S_C^a as the final allocation solution
for C′; otherwise, the algorithm chooses the allocation point
at the lowest level among all allocation points in the whole
tree as the final solution.
Locality vs. overlap size: The above algorithm aims to maximize the overlap size with the original cluster C, regardless
of VM locality. Good locality implies that a virtual cluster
should be placed in a subtree as low as possible, but the
largest overlap size may appear at a higher level. We can see
that the allocation in Figure 4(c) achieves the largest overlap
size, but the VMs of C′ are distributed over different subtrees T_{u_1}
and T_{u_2}. In contrast, Figure 4(b) has a smaller overlap size but
higher locality, with all VMs in the same subtree T_{u_1}. To
make a trade-off, we propose a simple solution that evaluates
an allocation point by the weighted sum of locality and
overlap size. Given an allocation point v at level L in a tree
of height H, let O_{T_v} be its maximum overlap size for C′'s
allocation in its subtree. Then v's goodness is measured by

α · O_{T_v} / N + (1 − α) · (H − L) / H    (9)

where N is the size of the original cluster C and 0 ≤ α ≤ 1.
The weights can be determined by the cloud provider, and
different weights indicate different preferences for either mitigating the migration cost or compacting the virtual cluster.
If α = 0, the goal is to maximize the locality of the scaled
cluster, regardless of the migration cost; if α = 1, the migration
cost is the only focus. Based on that, we adapt the above
algorithm to choose the allocation point with the maximum
goodness among all allocation points as the final solution.
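The goodness measure is a simple weighted sum; as a sketch we assume the form α·(overlap/N) + (1−α)·(H−L)/H, where the locality term (H−L)/H is our own normalization of "lower level is better" (the paper's Formula (9) may normalize differently):

```python
def goodness(overlap, cluster_size, level, height, alpha):
    """Weighted sum trading off migration avoidance and locality.
    alpha = 1 -> only overlap (fewest migrations) matters;
    alpha = 0 -> only locality (lower allocation point) matters.
    The locality term (height - level)/height is our assumed
    normalization, not necessarily the paper's exact form."""
    return (alpha * overlap / cluster_size
            + (1 - alpha) * (height - level) / height)
```

With α = 1, an allocation point overlapping all N original VMs scores 1 regardless of its level; with α = 0, only the level of the allocation point matters.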
The algorithm for finding the optimal allocation of C′ is similar
to Algorithm 1 in that the search is also based on dynamic
programming, but it uses a different goodness measurement and
searches for an allocation of N′ VMs. It has time
complexity O(N′²|V|D).
2) Determining how to migrate VMs: Given the allocation
for the cluster C′, the next step is to decide how to migrate
the non-overlapped VMs of C to the slots allocated to C′.
Migrating a VM to different PMs may go through routing paths
of different lengths and hence take different durations. To reduce the
service downtime and network overhead, the mapping between
VMs and slots should minimize the total number of hops of
the VM migrations. To this end, we transform the problem into the minimum
weight perfect matching problem in a bipartite graph [17], which
is to find a subset of edges with minimum total weight such
that every vertex has exactly one incident edge.
Suppose the datacenter has N_D PMs. The allocation for C′ can be represented by {m′_i | 1 ≤ i ≤ N_D}, which means the
allocation assigns m′_i VMs to PM P_i. Similarly, the original
allocation for C is {m_i | 1 ≤ i ≤ N_D}. We construct the
corresponding bipartite matching problem as follows. Let V
and S be two sets of vertices representing VMs and slots,
respectively; initially both are empty. For each PM P_i, we
compute δ_i = m_i − m′_i. If δ_i > 0, which means that δ_i VMs
need to be migrated out of P_i, then δ_i vertices are added
to V, each labeled with its PM identifier P_i; if δ_i < 0, which
means that |δ_i| empty slots on P_i are allocated to C′'s VMs,
then |δ_i| vertices are added to S, each also labeled with P_i.
After that, Δ = N′ − N vertices are added to V to represent the
new VMs added by the scaling, labeled with “New”. It is
easy to prove that V and S have the same number of vertices,
and that no vertex in S has the same label as any vertex in
V. An edge is added for each pair of vertices u (u ∈ V) and
v (v ∈ S), so we finally obtain a complete bipartite graph. If
u's PM label is P_u and v's is P_v, then the weight of edge (u, v)
is the number of hops between P_u and P_v, i.e., the hops for
migrating the VM at u to the slot at v. If u's label is “New”, the
edge between u and any vertex in S has zero weight because
no migration is needed for newly added VMs.
Based on the complete bipartite graph constructed above,
we can use the well-known Hungarian algorithm [17] to find its
minimum weight perfect matching, where each edge indicates
the assignment of a VM in V to a slot in S and the total number
of hops for the VM migrations is minimized under this assignment.
Note that, since we assume that all VMs in a virtual cluster are
homogeneous, we do not distinguish C's VMs that are placed
on one PM from each other, nor do we distinguish the newly added
VMs. That is, for an assignment (u, v), we can migrate any
one of C's VMs on P_u to v, or add any one of the newly
added VMs to v if u's label is “New”. In the worst case, where
the overlap size is zero, all N VMs of C need to be migrated and the
Hungarian algorithm takes O(N′³) time.
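The bipartite construction and matching described above can be sketched as follows. For brevity the sketch enumerates permutations instead of running the Hungarian algorithm, so it only works on tiny instances; the function and variable names are ours.

```python
from itertools import permutations

def build_migration_instance(old_alloc, new_alloc, hops):
    """Build V (VMs to move, labeled by source PM; None marks a new VM)
    and S (empty slots allocated to C', labeled by destination PM), plus
    the hop-count cost matrix. hops[i][j] is the path length between PMs."""
    vms, slots = [], []
    for i, (m_old, m_new) in enumerate(zip(old_alloc, new_alloc)):
        d = m_old - m_new
        if d > 0:
            vms += [i] * d            # d VMs migrate out of PM i
        elif d < 0:
            slots += [i] * (-d)       # |d| empty slots on PM i for C'
    vms += [None] * (sum(new_alloc) - sum(old_alloc))   # Delta new VMs
    cost = [[0 if u is None else hops[u][v] for v in slots] for u in vms]
    return vms, slots, cost

def min_weight_matching(cost):
    """Exhaustive minimum-weight perfect matching; tiny instances only.
    A real implementation would use the Hungarian algorithm (O(n^3))."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
```

Note that |V| = |S| holds by construction, since the positive deficits plus Δ new VMs exactly balance the negative deficits.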
VI. Evaluation
In this section we evaluate the effectiveness and efficiency
of our algorithms for scaling a virtual cluster through simulations.
A. Simulation Setup
Datacenter topology We simulate a datacenter of three-level
tree topology with no path diversity. A rack consists of 10
machines each with 4 VM slots and a 1Gbps link to connect
to a Top-of-Rack (ToR) switch. Every 10 ToR switches are
connected to a level-2 aggregation switch and 5 aggregation
switches are connected to the datacenter core switch. There
are 500 machines in total at level 0. The oversubscription of the
physical network is 2, which means that the link bandwidth
between a ToR switch and an aggregation switch is 5Gbps
and the link bandwidth between an aggregation switch and
the core switch is 25Gbps.
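The stated link capacities follow directly from the topology parameters and the oversubscription factor; a small sketch (constant names are ours):

```python
# Deriving the simulated topology's link capacities from the setup above.
MACHINES_PER_RACK, SLOTS_PER_MACHINE = 10, 4
RACKS_PER_AGG, NUM_AGG = 10, 5
HOST_LINK_GBPS, OVERSUBSCRIPTION = 1, 2

# ToR uplink: 10 x 1 Gbps of host traffic, oversubscribed by 2 -> 5 Gbps.
tor_uplink_gbps = MACHINES_PER_RACK * HOST_LINK_GBPS / OVERSUBSCRIPTION
# Aggregation uplink: 10 x 5 Gbps, oversubscribed by 2 -> 25 Gbps.
agg_uplink_gbps = RACKS_PER_AGG * tor_uplink_gbps / OVERSUBSCRIPTION
num_machines = MACHINES_PER_RACK * RACKS_PER_AGG * NUM_AGG   # 500
total_slots = num_machines * SLOTS_PER_MACHINE               # 2000
```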
Workload Our simulations are conducted under the scenario
of dynamically arriving tenant jobs. A job specifies a virtual
cluster abstraction < N, B >. The jobs dynamically arrive
over time, and if a job cannot be allocated upon its arrival,
it is rejected. Each job runs for a random duration and its
virtual cluster is removed from the datacenter when the job
is completed. The number of VMs N in each virtual cluster
request, i.e., the job size, is exponentially distributed with
a mean of 20 by default. The bandwidth B of each job
is randomly chosen from [100, 200] Mbps. The job arrivals follow
a Poisson process with rate λ; the load on a datacenter
with M VM slots is then λN̄T_c/M, where N̄ is the mean job size and
T_c is the mean running time of all jobs. The running time of
each job is randomly chosen from [200, 500].
During the job arrival process, before the arrival of the next job
we randomly choose a job running in the datacenter and, with a
given probability Pr_scal, generate a scaling request for it.
The size increment for a virtual cluster <N, B> is determined
by a given percentage increase %inc; the new size is then N′ =
(1 + %inc)N. We simulate the arrival process of 500 tenant
jobs while varying the load, the scaling probability Pr_scal, and
the percentage increase of the cluster size %inc.
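The workload model described above can be sketched as a small generator. This is our own illustrative implementation of the stated distributions (function and parameter names are ours; the bandwidth unit is assumed to be Mbps):

```python
import random

def generate_jobs(num_jobs, load, total_slots=2000, mean_size=20,
                  mean_runtime=350, seed=0):
    """Sketch of the workload: from load = lam * mean_size * mean_runtime
    / M we derive the Poisson arrival rate lam; job sizes are exponential
    with mean 20, and bandwidth/runtime are uniform in the stated ranges."""
    rng = random.Random(seed)
    lam = load * total_slots / (mean_size * mean_runtime)
    t, jobs = 0.0, []
    for _ in range(num_jobs):
        t += rng.expovariate(lam)                 # Poisson inter-arrivals
        jobs.append({
            "arrival": t,
            "N": max(1, round(rng.expovariate(1.0 / mean_size))),
            "B": rng.uniform(100, 200),           # Mbps (assumed unit)
            "runtime": rng.uniform(200, 500),
        })
    return jobs
```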
B. Simulation Results
We evaluate and compare our scaling algorithms in terms
of the rejection rate of the scaling requests, the locality of
the scaled cluster, and the number of VM migrations for the
scaling. The figures present the average across multiple runs.
For brevity, we refer to the scaling algorithm in Section III as
“Scaling”, the algorithm with locality optimization in Section
IV as “Scaling-L” and the algorithm allowing VM migration
in Section V as “Scaling-M”.
1) Rejection rate of the scaling requests: Figures 5 and
6 show the rejection rate with different %inc under light
and heavy load settings, where load = 0.2 and 0.6,
respectively. In Figure 5, the rejection rates of Scaling-M
are zero. Apart from that, the rejection rate increases with %inc
for all three algorithms. It can be seen that Scaling-L
achieves a lower rejection rate than Scaling under light load. The
reason may be that the new VMs are placed as close as possible to the
pre-existing placement thanks to the locality optimization, which
reduces fragmentation and saves more contiguous
space to accommodate incoming job requests. Under
heavy load, however, limited resources are the main bottleneck and the locality
optimization does not help; thus Scaling-L and Scaling have
close rejection rates in Figure 6. Compared with Scaling-M,
both Scaling-L and Scaling have much higher rejection rates,
up to 50% under the heavy load. This indicates that the pre-existing
VM placement can seriously hinder the scalability
of virtual clusters. By allowing VM migration for
the scaling, Scaling-M achieves a much lower rejection rate.
We also vary Pr_scal to change the arrival rate of the scaling
requests and show the change of the rejection rate in Figure
7. The rejection rate increases with Pr_scal, and Scaling-M still
has the lowest rejection rate. From the evaluation below, we
will see that Scaling-M achieves such a low rejection rate
at only a very small VM migration cost.
2) Locality: Scaling-L searches for the allocation with
the minimum average VM-pair distance when scaling a virtual
cluster. To verify its effectiveness, we consider how much
improvement Scaling-L obtains compared with
Scaling. Using Scaling and Scaling-L to scale a cluster can
lead to different average VM-pair distances. Thus, given the
same workload and scaling requests under load = 0.2, we calculate
the difference of the average VM-pair distances between
the two scaled clusters resulting from Scaling and Scaling-L, respectively,
for the same job j, that is, avgd^{C′_j}_{Scaling} − avgd^{C′_j}_{Scaling-L}.
Figure 8 shows the empirical cumulative distribution of this
difference under %inc = 20% and %inc = 30%, respectively.
As we can see, 80% of the differences are larger than
zero and 20% are smaller. The 20 percent below zero
indicate that Scaling-L does not always generate an allocation
for the scaled cluster with a smaller average VM-pair distance
than the allocation obtained by Scaling. This is to be expected,
because the two algorithms lead to different VM layouts and
bandwidth usage in the datacenter during the dynamic job
arrival process; even for the same job, Scaling-L and Scaling
may work on different pre-existing placements and network
status. Nevertheless, in terms of the overall effect, Scaling-L
reduces the average VM-pair distance for the large majority
of the scaled jobs. Besides, from the distribution we can tell
that the improvement by Scaling-L becomes more pronounced
under higher %inc: for %inc = 30%, 20 percent of the distance
reductions fall in [0.2, 0.6], but for %inc = 20% only 10% do.
3) Migration Cost: The Scaling-M algorithm minimizes the
number of VM migrations by maximizing the overlap size with
pre-existing cluster placement. To show its effectiveness, we
compare it with the cluster allocation algorithm in [5], [6],
referred to as “C-A”, which searches for the allocation
of the scaled cluster in the datacenter where the pre-existing
cluster is hypothetically removed, without considering the
overlap with the pre-existing VM placement. We replay the same
workload and scaling request sequences and measure the total
number of VM migrations incurred by these algorithms for all
the scaling operations. We set load = 0.6 and Pr_scal = 0.2. From
Table II we can see that without considering the overlap, the
number of VM migrations can be prohibitive, from 748 to 915.
Scaling-M significantly reduces the total number of migrated
VMs for the scaling by at least 90%. Combining with Figure
6, it can be concluded that Scaling-M can greatly decrease the
rejection rate at very small migration cost.
Fig. 5: The rejection rate under load = 0.2, Pr_scal = 0.2.
Fig. 6: The rejection rate under load = 0.6, Pr_scal = 0.2.
TABLE II: Total number of VM migrations for scaling with different %inc.
TABLE III: The number of VM migrations and average VM-pair distance with different α.
We also evaluated the effect of α in Formula (9) by varying
it from 0 to 1 in steps of 0.2. Given a workload with load = 0.6,
%inc = 30% and Pr_scal = 0.2, we examine the total number
of VM migrations for the scaling and the sum of all the
scaled clusters’ average VM pair distance under different α.
The result is shown in Table III. Both measurements vary
with α significantly and, overall, monotonically.¹ The overlap
size increases with α, leading to a decreasing number of VM
migrations as α increases; the average VM-pair distance also
increases with α. These results verify the effectiveness of our
weighted-sum design in (9), in which α acts as a control knob
to capture users' preferences.
VII. Discussion
In this paper we mainly focus on scaling the virtual
cluster abstraction in size, i.e., from <N, B> to <N′, B>
(N′ > N). Nevertheless, we can use the algorithms in this paper
to solve two other types of scaling that increase the bandwidth
requirement. To scale a cluster C from <N, B> to <N, B′>
(B′ > B), we first hypothetically increase the bandwidth of
C's VMs to B′ and check whether there are any links that do
not have enough residual bandwidth for the increment of C's
bandwidth reservation. If no such links exist, C can be scaled
in place by accordingly increasing the bandwidth reservation
on the links; otherwise, C's original placement has to be changed
for the scaling, and we use the allocation algorithm with VM
migration in Section V to find the allocation of the new cluster
abstraction <N, B′> that has the largest overlap size with
C's original placement and conduct the corresponding VM
migrations. For the scaling from <N, B> to <N′, B′>,
we first try to scale the cluster in place in two steps: first
<N, B> → <N, B′> and then <N, B′> → <N′, B′>. We
use the algorithm in Section III or IV for the second step. If
both steps succeed, the cluster can be scaled in place; if
either step fails, we turn to the algorithm with VM
migration to allocate <N′, B′> while minimizing the number of
VM migrations.
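The in-place check for the first step can be sketched as follows, assuming the hose-model reservation min(m, N − m)·B on each link for m cluster VMs below it; the function name and the (vms_below, residual) link representation are ours.

```python
def can_scale_bandwidth_in_place(links, n_total, old_bw, new_bw):
    """Check step one of <N,B> -> <N,B'>: whether B can grow to B'
    without moving any VM. Each link is a (vms_below, residual_bw)
    pair; the hose-model reservation for m cluster VMs below a link
    is min(m, N - m) * B, so the increment needed on that link is
    min(m, N - m) * (B' - B), which must fit the residual bandwidth."""
    return all(min(m, n_total - m) * (new_bw - old_bw) <= residual
               for m, residual in links)
```

If the check fails on any link, the placement must change, and the migration-aware algorithm of Section V is invoked as described above.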
¹ In Table III, 134 is an exception; a particular workload may have such a
small variance. Observations from multiple random workloads indicate
that the overall trend is monotonic.
Fig. 7: The rejection rate under load = 0.6, %inc = 30%.
Fig. 8: CDF of avgd^{C′_j}_{Scaling} − avgd^{C′_j}_{Scaling-L} (the difference of average VM-pair distance).
VIII. Conclusion
This paper addresses the scaling problem of virtual cluster
abstraction with bandwidth guarantee. We propose efficient
algorithms to scale up the size of the virtual cluster with
and without VM migration. Besides demonstrating
the efficiency of these algorithms, the simulation results also
indicate that the pre-existing VM placement can seriously
hinder the scalability of virtual clusters and that this scalability can
be significantly improved at a small VM migration cost.
Acknowledgement: This research is supported by NSF
References
[1] “Amazon EC2,” http://aws.amazon.com/ec2/.
[2] J. Schad, J. Dittrich, and J.-A. Quiané-Ruiz, “Runtime measurements in
the cloud: observing, analyzing, and reducing variance,” Proc. of VLDB
Endow., vol. 3, no. 1-2, pp. 460–471, Sep. 2010.
[3] C. Guo, G. Lu, H. J. Wang, S. Yang, C. Kong, P. Sun, W. Wu, and
Y. Zhang, “Secondnet: a data center network virtualization architecture
with bandwidth guarantees,” in Proc. of Co-NEXT. New York, NY,
USA: ACM, 2010, pp. 15:1–15:12.
[4] H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, “Towards
predictable datacenter networks,” in Proc. of ACM SIGCOMM. New
York, NY, USA: ACM, 2011, pp. 242–253.
[5] D. Xie, N. Ding, Y. C. Hu, and R. R. Kompella, “The only constant
is change: incorporating time-varying network reservations in data
centers,” in Proc. of ACM SIGCOMM, 2012, pp. 199–210.
[6] L. Yu and H. Shen, “Bandwidth guarantee under demand uncertainty in
multi-tenant clouds,” in ICDCS, 2014, pp. 258–267.
[7] J. Lee, Y. Turner, M. Lee, L. Popa, S. Banerjee, J.-M. Kang, and
P. Sharma, “Application-driven bandwidth guarantees in datacenters,”
in Proc. of ACM SIGCOMM, 2014, pp. 467–478.
[8] N. R. Herbst, S. Kounev, and R. Reussner, “Elasticity in cloud computing: What it is, and what it is not,” in ICAC. San Jose, CA: USENIX,
2013, pp. 23–27.
[9] P. Padala, K.-Y. Hou, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal,
and A. Merchant, “Automated control of multiple virtualized resources,”
in Proceedings of the 4th ACM European conference on Computer
systems. ACM, 2009, pp. 13–26.
[10] Z. Shen, S. Subbiah, X. Gu, and J. Wilkes, “Cloudscale: elastic resource
scaling for multi-tenant cloud systems,” in Proceedings of the 2nd ACM
Symposium on Cloud Computing. ACM, 2011, p. 5.
[11] Z. Gong, X. Gu, and J. Wilkes, “Press: Predictive elastic resource scaling
for cloud systems,” in Network and Service Management (CNSM), 2010
International Conference on. IEEE, 2010, pp. 9–16.
[12] R. Han, L. Guo, M. M. Ghanem, and Y. Guo, “Lightweight resource
scaling for cloud applications,” in ACM/IEEE CCGrid, 2012, pp. 644–
[13] H. Nguyen, Z. Shen, X. Gu, S. Subbiah, and J. Wilkes, “Agile: Elastic
distributed resource scaling for infrastructure-as-a-service,” in ICAC.
San Jose, CA: USENIX, 2013, pp. 69–82.
[14] H. Herodotou, F. Dong, and S. Babu, “No one (cluster) size fits all:
Automatic cluster sizing for data-intensive analytics,” in Proc. of ACM
SOCC. New York, NY, USA: ACM, 2011, pp. 18:1–18:14.
[15] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt,
and A. Warfield, “Live migration of virtual machines,” in Proc.of NSDI.
Berkeley, CA, USA: USENIX Association, 2005, pp. 273–286.
[16] L. Yu, L. Chen, Z. Cai, H. Shen, Y. Liang, and Y. Pan, “Stochastic
load balancing for virtual resource management in datacenters,” IEEE
Transactions On Cloud Computing, 2016.
[17] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization:
Algorithms and Complexity. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1982.