Dynamic Scaling of Virtual Clusters with Bandwidth Guarantee in Cloud Datacenters
Lei Yu, Zhipeng Cai
Department of Computer Science, Georgia State University, Atlanta, Georgia
Email: lyu13@student.gsu.edu, zcai@gsu.edu
Abstract—Network virtualization with bandwidth guarantee is essential for the performance predictability of cloud applications because of the shared multi-tenant nature of the cloud. Several virtual network abstractions have been proposed for tenants to specify and reserve their virtual clusters with bandwidth guarantee. However, they require a pre-determined, fixed cluster size and bandwidth, and do not support scaling the cluster in size or bandwidth requirements. On the other hand, the existing works on virtual cluster scaling focus on dynamically adjusting the cluster size without considering the bandwidth guarantee targeted by current network abstractions. To fill this gap, this paper considers the problem of scaling up a virtual network abstraction with bandwidth guarantee. Efficient algorithms are proposed to find a valid allocation for the scaled cluster abstraction while optimizing the VM locality of the cluster. We also point out that a virtual cluster may not be scalable without changing its original VM placement, and propose an optimal allocation algorithm that exploits VM migration to address this issue while minimizing the total migration cost for the virtual cluster scaling. Extensive simulations demonstrate the effectiveness and efficiency of our algorithms.
I. Introduction
With modern virtualization technologies, cloud computing offers infrastructure as a service that provisions virtualized computing resources over the Internet with a pay-as-you-go pricing model, such as Amazon EC2 [1]. Resource virtualization in the cloud datacenter enables efficient resource sharing and multiplexing across multiple tenants as well as on-demand resource provisioning. However, a major concern for moving applications to the cloud is the lack of performance guarantees. The applications of different tenants compete for the shared datacenter network resources, and the lack of bandwidth guarantees in today's public cloud platforms causes them to suffer from unpredictable performance [2]. To efficiently virtualize the network resource with bandwidth guarantee in the cloud, several virtual network abstractions [3]–[7] have been proposed, which are used to express the network requirements of the tenants. In these abstractions, a tenant can specify the number of virtual machines (VMs), the topology of the virtualized network, and the bandwidth requirements between VMs. Most of these abstractions are variants of the hose model, in which all VMs are connected to a central virtual switch, each by a bidirectional link with the specified bandwidth, such as Oktopus [4], TIVC [5] and SVC [6]. Other network abstractions either specify the bandwidth requirement between every pair of VMs as virtual pipes [3], or are derived from the true application communication structure [7]. All these abstractions not only provide a simple and accurate way for tenants to specify their network demands, but also facilitate network resource allocation in the datacenter. Various algorithms have been proposed along with them for allocating VM and bandwidth resources in the datacenter to realize the corresponding network abstractions.
The resource virtualization of datacenters provides elasticity [8], a key feature of the cloud, which enables on-demand scaling of capacity up and down so that cloud applications can adapt to changes in workloads. It is achieved through on-demand dynamic resource scaling either at the VM level [9]–[12], namely, on the CPU, memory and I/O resources allocated to a VM, or at the cluster level [12]–[14], namely, on the number of VMs or the VM types with different capacities in the virtual cluster. Such resource scaling is not only an efficient approach to maintaining the performance of cloud applications under varying workloads, but it also improves the resource utilization of the datacenter. The existing works on virtual cluster scaling [12]–[14], however, overlook the network requirement targeted by virtual network abstractions. They focus on dynamically adjusting the number of VMs in a virtual cluster, but without maintaining any bandwidth guarantee along with the cluster scaling. On the other hand, existing virtual network abstractions [3]–[7] use a pre-determined, fixed bandwidth and number of VMs, and they cannot support elastic scaling of the cluster in size and bandwidth requirements. Given a virtual cluster that has been deployed in the datacenter, shrinking its size and bandwidth requirement can be trivially performed by releasing the unneeded VMs and bandwidth. However, its expansion is not trivial. The increase of cluster size and VM bandwidth in a virtual network abstraction is limited by the available VM slots and network bandwidth in the datacenter. The cloud might not be able to accommodate the scaled network abstraction without changing its original VM placement. It is critical to properly and efficiently allocate or re-allocate the VMs for scaling while maintaining the bandwidth guarantee. To the best of our knowledge, no previous work has addressed the scaling of virtual network abstractions. Therefore, we aim to fill this gap in this paper.
In this paper, we address the problem of scaling up a virtual network abstraction with bandwidth guarantee. We consider the common hose-model-based virtual cluster abstraction [4]–[6], denoted by < N, B >, in which N VMs are connected to a virtual switch by links each having the specified bandwidth B. We focus on its scaling in size from < N, B > to < N', B > (N' > N) with unchanged bandwidth B. We identify the challenging issues arising from such scaling, and propose efficient allocation algorithms to determine the VM placement for the scaled cluster. Specifically, we first propose a dynamic programming algorithm to search for a placement of the N' − N additional VMs that still ensures bandwidth B for each VM by appropriately reserving bandwidth on network links. The algorithm is further improved to maximize VM locality, such that the newly added VMs are put as close as possible to the pre-existing VMs to reduce the communication latency and network overhead. Furthermore, we point out that a virtual cluster might not be scalable without making changes to the original VM placement, and in this case the above algorithm cannot find a valid placement. To address this issue, we exploit VM migration and develop an optimal algorithm to allocate the scaled cluster while minimizing the total VM migration cost. Extensive simulations demonstrate the effectiveness and efficiency of our algorithms.
The results indicate that the scalability of virtual clusters is seriously hindered by their pre-existing VM placement but can be significantly improved at a small VM migration cost. Last but not least, in our discussion we show that our algorithms can easily be used to solve other kinds of virtual cluster scaling, including the scaling from < N, B > to < N, B' > (B' > B) and from < N, B > to < N', B' > (N' > N, B' > B).
The rest of this paper is organized as follows. Section II introduces the related work. Section III describes the problem and presents our first scaling allocation algorithm. Sections IV and V present the scaling allocation algorithms with locality optimization and with VM migration while optimizing its total cost, respectively. Section VI shows our simulation results. Section VII presents our discussion and Section VIII concludes the paper.
II. Related Work
A. Virtual Network Abstraction with Bandwidth Guarantee
Several virtual network cluster abstractions [3]–[7] have been proposed for efficient network resource allocation with bandwidth guarantee. Oktopus [4] proposes a virtual cluster abstraction based on the hose model, denoted by < N, B >, in which N VMs are connected to a central virtual switch by bidirectional links with the specified bandwidth B. Based on that, a temporally interleaved virtual cluster model, TIVC [5], is further proposed, which allows different bandwidth specifications during different time intervals to address the time-varying bandwidth requirements of many cloud applications. SVC [6] incorporates information on the statistical distribution of bandwidth demand into the virtual cluster abstraction to address demand uncertainty. CloudMirror [7] proposes a network abstraction that represents the true application communication structure and distinguishes the bandwidth requirements of different components and their inter-communication in an application. Secondnet [3] proposes a virtual datacenter abstraction that describes the bandwidth requirements between each pair of VMs and uses a central controller to determine the flow rate and the path for each VM-to-VM pair. To achieve a bandwidth-guaranteed virtual-to-physical mapping, various VM allocation algorithms and bandwidth enforcement mechanisms have been proposed along with these network abstractions.
B. Elastic Resource Scaling
A number of works have been proposed to dynamically scale resources at the VM level, at the cluster level, or both, to meet the performance requirements of cloud applications under workload changes.
TABLE I: Notations
C, C' : the original cluster and the corresponding scaled cluster
N, N' : the original cluster size (the number of VMs) and the new size (N' > N)
T_v : the subtree rooted at the vertex v
M_v : the allocable #VM set of the vertex v
T_v[k] : the subtree consisting of the root v and its first k child subtrees
S_v[k] : the set containing the numbers of VMs that can be allocated into the subtree T_v[k], regardless of v's uplink bandwidth constraint
m_v : the number of VMs of the original cluster C located in the subtree T_v
Padala et al. [9] proposed a feedback control system to dynamically allocate CPU and disk I/O resources to VMs based on an online model that captures the relationship between the allocation of resource shares and the application performance. Shen et al.
[10] proposed a prediction-driven automatic resource scaling system, which uses adaptive estimation error correction for resource demand to minimize the impact of under-estimation errors, and supports concurrent scaling of multiple VMs by resolving scaling conflicts on the same PM through VM migration. Gong et al. [11] proposed a lightweight online prediction scheme to predict the resource usage of the VMs in an application and perform resource scaling based on the prediction results. AGILE [13] is a cluster-level resource scaling system, which dynamically adjusts the number of VMs assigned to a cloud application with live VM replication to meet the application's service level objectives. Herodotou et al. [14] propose an automatic system to determine the cluster size and VM instance type required to meet the desired performance and cost for a given workload. Han et al. [12] proposed an approach that integrates fine-grained resource scaling at the VM level with cluster size scaling. These cluster-level scaling approaches do not consider the bandwidth guarantee targeted by the virtual network abstractions.
III. Virtual Cluster Scaling
A. Problem description
In this paper, we consider the hose-model-based virtual cluster abstraction [4]–[6], which consists of N homogeneous VMs with bandwidth B for each VM, denoted by < N, B >. There are three types of problem instances for scaling up a virtual cluster < N, B >: increasing the cluster size from N to N' (N' > N), increasing the bandwidth from B to B' (B' > B), or increasing both. We mainly focus on scaling the abstraction < N, B > to < N', B >, and in Section VII we show that the algorithms proposed for it can also efficiently solve the other two types of scaling problems. As in previous works [4]–[6], we assume a typical tree topology for the datacenter, from physical machines at level 0 to the root at level H. Each physical machine is divided into multiple slots where tenant VMs can be placed. For clarity, Table I lists the notations used in this paper.
B. Scaling Allocation Algorithm
Suppose that a virtual cluster C = < N, B > has been deployed in a datacenter. Consider a link L in the network; it connects two separate network components in the tree topology.
Fig. 1: This example shows the difference in valid allocations between virtual cluster scaling and virtual cluster allocation.
Suppose that the virtual cluster has m (N ≥ m ≥ 0) VMs in one component and hence N − m VMs in the other. Because each VM cannot send or receive at a rate higher than B, the maximum bandwidth needed on link L is min{m, N − m} ∗ B. Therefore, the amount of bandwidth reserved for the virtual cluster on link L should be min{m, N − m} ∗ B in order to guarantee access bandwidth B for each VM. When the size of the virtual cluster needs to be increased to N' (N' > N), Δ = N' − N new VMs have to be allocated in the datacenter. Let the numbers of new VMs allocated to the two components connected by link L be Δ_L^r (Δ_L^r ≥ 0) and Δ_L^l (Δ_L^l ≥ 0), respectively. Then, the bandwidth reservation on link L should be increased to min{m + Δ_L^r, N − m + Δ_L^l} ∗ B, which obviously is not less than the previous bandwidth reservation min{m, N − m} ∗ B. If min{m + Δ_L^r, N − m + Δ_L^l} > min{m, N − m}, the residual bandwidth on link L has to be large enough for the increase of the bandwidth reservation.
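To make this reservation arithmetic concrete, the following minimal sketch (illustrative Python, not the paper's implementation; all names are ours) computes the reservation increment required on a link for a candidate split of the new VMs and checks it against the residual bandwidth, which is exactly the per-link condition formalized as condition (1) below.

# Illustrative sketch: hose-model bandwidth reservation on a link L for the cluster <N, B>.
# m VMs of the cluster sit on one side of L and N - m on the other; delta_same new VMs
# are added on the m side and delta_other on the other side.
def reservation_increment(N, B, m, delta_same, delta_other):
    old_reservation = min(m, N - m) * B
    new_reservation = min(m + delta_same, (N - m) + delta_other) * B
    return new_reservation - old_reservation

def link_can_host(N, B, m, delta_same, delta_other, residual_bw):
    # The candidate split is only valid on L if the residual bandwidth covers the increment.
    return reservation_increment(N, B, m, delta_same, delta_other) <= residual_bw

# Figure 1 scenario: N=3, B=100, m=2; placing 1 new VM on the 2-VM side and 3 on the other
# raises the reservation from 100 to 300, so 100 Mbps of residual bandwidth is not enough:
# reservation_increment(3, 100, 2, 1, 3) == 200; link_can_host(3, 100, 2, 1, 3, 100) == False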
The increment depends on the allocation of the new VMs, which results in different values of Δ_L^r and Δ_L^l. If L does not have enough bandwidth for the increment of the bandwidth reservation, such an allocation is not valid. Therefore, the problem of increasing the size of a virtual cluster can be solved by finding a valid allocation for the Δ VMs to be added. We can regard these VMs as a complete new virtual cluster and search for a valid allocation in the datacenter with a dynamic programming approach similar to the methods in TIVC [5] and SVC [6]. Let the residual bandwidth of link L be R_L. The key difference compared to the allocation of a new virtual cluster is that the validity of a VM allocation across any link L should be determined by the condition

R_L ≥ (min{m + Δ_L^r, N − m + Δ_L^l} − min{m, N − m}) ∗ B    (1)

rather than R_L ≥ min{Δ_L^r, Δ_L^l} ∗ B. These two conditions are not equivalent, as exemplified in Figure 1. It shows a virtual cluster of size 3 having 2 VMs and 1 VM in two subtrees, respectively, before scaling. The link L has 100Mbps residual bandwidth. Suppose that the size is increased from 3 to 7, and Δ_L^l = 1 and Δ_L^r = 3 VMs are placed in the two subtrees, respectively. Considering the new VMs as a separate virtual cluster, the 100Mbps residual bandwidth on L is sufficient for such an allocation. However, for scaling, the resulting allocation for the scaled virtual cluster requires 300Mbps of bandwidth on link L and hence a 200Mbps increment of the bandwidth reservation for the virtual cluster, which cannot be satisfied by link L.
Based on (1), we introduce the dynamic-programming-based search algorithm for allocating the additional VMs for cluster scaling. The algorithm requires the network topology, the number of empty slots in each physical machine (PM, for short), the residual bandwidth of each link, and the current placement of the virtual cluster to be scaled. The algorithm starts from the leaves (i.e., the PMs) and traverses the tree topology level by level in a bottom-up manner. The vertices at the same level are visited in left-to-right order. For each vertex v, the algorithm determines the numbers of VMs that can be allocated into the subtree rooted at v based on the recorded results for each of its children. There may be multiple feasible numbers of VMs that can be allocated into a subtree. Thus, the algorithm computes an allocable #VM set for each vertex that contains all the feasible numbers of VMs. Given a vertex v, its allocable #VM set is denoted by M_v. We first introduce the computation at the leaves and then describe the general procedure at a vertex that has children. Consider a virtual cluster C of size N with VM bandwidth B, and Δ new VMs to be added for scaling.
(1) Suppose that v is a leaf at level 0 (i.e., a PM), which has c_v empty slots and m_v VMs from the virtual cluster C. The number of VMs that can be allocated on v, denoted by Δ_v, is constrained by both the available VM slots and the residual bandwidth on the link L connecting v to the upper level. Thus, the first condition for allocability is Δ_v ≤ c_v. If Δ_v VMs are placed on v, then Δ − Δ_v VMs are placed on the other side of link L. As a result, link L splits the scaled virtual cluster into two components with Δ_v + m_v VMs and N − m_v + Δ − Δ_v VMs, respectively. Accordingly, the bandwidth reservation required on link L is min{Δ_v + m_v, N − m_v + Δ − Δ_v} ∗ B. Based on (1), the second condition for allocability is

(min{Δ_v + m_v, N − m_v + Δ − Δ_v} − min{m_v, N − m_v}) ∗ B ≤ R_L    (2)

where R_L is the residual bandwidth of link L.
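A minimal sketch of this leaf-level check is given below (illustrative Python, not the paper's implementation); following Algorithm 1, the candidate value 0 is included as well, since placing zero new VMs on a PM must also satisfy the uplink constraint.

# Sketch: allocable #VM set M_v of a leaf PM v under condition (2).
# delta: total number of new VMs to add; c_v: empty slots on v;
# m_v: VMs of the original cluster C already hosted on v; N: original cluster size;
# B: per-VM bandwidth; residual_uplink: residual bandwidth of v's uplink L.
def leaf_allocable_set(delta, c_v, m_v, N, B, residual_uplink):
    M_v = set()
    for d in range(0, min(delta, c_v) + 1):
        old = min(m_v, N - m_v) * B
        new = min(m_v + d, (N - m_v) + (delta - d)) * B
        if new - old <= residual_uplink:      # condition (2)
            M_v.add(d)
    return M_v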
To compute v's allocable #VM set M_v, the algorithm checks condition (2) for each number from 0 to min{Δ, c_v} and adds the number to M_v if the condition holds.
(2) Suppose that v is a vertex with n children, denoted by v_1, v_2, . . ., v_n, where the allocable #VM set of each child v_k is M_{v_k}. Similar to previous works [5], [6], the algorithm visits all of v's children in left-to-right order. For each child v_k, the algorithm iteratively computes a set, denoted by S_v[k], that contains the numbers of VMs that can be allocated into the subtree consisting of v and its first k child subtrees, without considering the uplink bandwidth constraint of v. Figure 2 shows the part counted by S_v[k], which is computed as

S_v[k] = {m | m = a + b, where a ∈ M_{v_k} and b ∈ S_v[k − 1]}    (3)

where S_v[0] = {0}. Finally, we obtain S_v[n]. Each number in S_v[n] is a candidate for Δ_v, i.e., the number of VMs that can be allocated in the subtree T_v rooted at v, regardless of the bandwidth constraint of v's uplink that connects v to the upper level. Then, for each candidate, the algorithm checks its validity by verifying condition (2), with R_L being the residual bandwidth of v's uplink and m_v being the number of pre-existing VMs of the virtual cluster C in T_v. If the value passes the validation, it is added to M_v. During the search process, the algorithm memorizes the element a contributing to m in (3) for each m in S_v[k]. This information is then used by backtracking to find the number of VMs that should be allocated to vertex v_k.
Fig. 2: The computation of S_v[k] is based on the subtree T_v[k].
The search continues until it finds a vertex v whose M_v contains Δ, i.e., the number of VMs to be added. Such a vertex is referred to as an allocation point. After that, starting from Δ at the allocation point, a backtracking procedure uses the memorized information to find the number of VMs allocated to each subtree, level by level in a top-down manner. The backtracking procedure is the same as in previous works [5], [6]. Eventually, we obtain the number of VMs to be added on each PM. The search algorithm has a time complexity of O(Δ² |V| D), where |V| is the number of vertices in the network and D is the maximum number of children of any vertex. We also note that this algorithm reduces to an algorithm for allocating a complete virtual cluster, similar to those in [5], [6], by regarding Δ as the size of the virtual cluster and zero VMs as its existing allocation.
IV. Scaling with Locality Optimization
The previous algorithm adds new VMs to the datacenter without considering the existing placement of the virtual cluster. This may cause the newly added VMs to be distant from the locations of the previously allocated VMs. However, good spatial locality is desired for the allocation of a virtual cluster, because it not only reduces the communication latency among VMs but also conserves the bandwidth of the links at the upper levels. In this section, we consider how to scale a virtual cluster while maximizing its VM spatial locality. We measure the spatial locality of a virtual cluster by the average distance (in hops) between any pair of VMs in the cluster. Let C be a virtual cluster of size N and C' be the scaled C with larger size N', and let V_C, V_{C'} be their VM sets, respectively. The VM set added for the scaling can be represented by V_{C'} \ V_C.
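As a concrete reading of this locality metric, the sketch below computes the average VM-pair distance formalized next; the PM-to-PM hop count is assumed to be available from the topology, and the helper names are ours.

# Sketch: average pairwise distance (in hops) over all VM pairs of a cluster.
from itertools import combinations

def average_vm_pair_distance(vm_to_pm, hops):
    """vm_to_pm: dict mapping VM id -> hosting PM id; hops(p, q): #hops between PMs p and q."""
    pairs = list(combinations(vm_to_pm, 2))
    total = sum(hops(vm_to_pm[i], vm_to_pm[j]) for i, j in pairs)
    return total / len(pairs)    # len(pairs) = N'(N'-1)/2, the unordered-pair form of Eq. (4)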
The average VM-pair distance of C', denoted by avgd_{C'}, is calculated as

avgd_{C'} = (2 / (N'(N' − 1))) Σ_{i,j ∈ V_{C'}} h(i, j)
          = (2 / (N'(N' − 1))) ( Σ_{i,j ∈ V_C} h(i, j) + Σ_{i ∈ V_{C'} \ V_C, j ∈ V_{C'}} h(i, j) )    (4)

where h(i, j) is the number of hops between the two physical machines on which VMs i and j are placed. A smaller avgd_{C'} indicates higher spatial locality. The term Σ_{i,j ∈ V_C} h(i, j) is determined by the pre-existing placement of C. Thus, to minimize avgd_{C'}, we need to find the placement of the new VMs that minimizes the total distance between each new VM and all the other VMs in C'.
A straightforward approach is to use the algorithm in the previous section to find all the feasible allocations for the scaling, calculate the average VM-pair distance of the scaled cluster for every allocation, and choose the one with the minimum avgd_{C'}. However, the number of feasible allocations can be huge. From Formula (3) we can see that different combinations of a and b can result in the same value of m, e.g., m = 3 for both (a = 1, b = 2) and (a = 2, b = 1). This indicates that there may be many valid choices for how the VMs are allocated into each child subtree at every vertex. From an allocation point down to the physical machines, the combinations of the different choices at each vertex multiply level by level, each leading to a different spatial locality. Thus, this approach needs to examine all these combinations, which is inefficient. Instead, we show that the problem has optimal substructure and can be efficiently solved by dynamic programming.
Consider placing Δ_v new VMs into v's subtree T_v for scaling C. Let V'_{T_v} be this new VM set and V_{T_v} be the VM set containing C's pre-existing VMs in T_v. Then, given a feasible allocation A, the sum of the distances between each new VM in V'_{T_v} and all the other VMs in V'_{T_v} ∪ V_{T_v} is denoted by d_{T_v}(Δ_v, A), i.e., d_{T_v}(Δ_v, A) = Σ_{i ∈ V'_{T_v}, j ∈ V'_{T_v} ∪ V_{T_v}} h(i, j). Let d*(T_v, Δ_v) be the minimum of d_{T_v}(Δ_v, A) over every feasible allocation A that places Δ_v VMs into T_v. Now consider the subtree that consists of the vertex v and its first k child subtrees, shown in the dashed box in Figure 2 and denoted by T_v[k]. Similarly, given an allocation A that places e new VMs into T_v[k], d_{T_v[k]}(e, A) represents the sum of the distances between each new VM and all the other VMs of the scaled cluster in T_v[k], and d*(T_v[k], e) is the minimum over all feasible allocations. Suppose that an allocation A assigns x new VMs to the (k+1)-th child subtree T_{v_{k+1}} and e new VMs to the subtree T_v[k], with e' = x + e. Then, d_{T_v[k+1]}(e', A) is the sum of the distances of the VM pairs within T_v[k] and within T_{v_{k+1}}, plus the distances of the VM pairs between the x new VMs in T_{v_{k+1}} and the VMs in T_v[k]. If vertex v is at level l, the distance between any two VMs from two different child subtrees of v is 2l. Thus, the distance of any VM pair spanning T_{v_{k+1}} and T_v[k] is 2l. The total number of VM pairs between the x new VMs in T_{v_{k+1}} and T_v[k] is x · (e + Σ_{i=1}^{k} m_{v_i}), where m_{v_i} is the number of C's pre-existing VMs in the child subtree T_{v_i}. Thus, we have

d_{T_v[k+1]}(e', A) = d_{T_v[k]}(e, A) + d_{T_{v_{k+1}}}(x, A) + 2 · l · x · (e + Σ_{i=1}^{k} m_{v_i})    (5)

For each element e' ∈ S_v[k+1], according to (3), there can exist multiple combinations of the VM numbers allocated to the subtrees T_{v_1} through T_{v_{k+1}} that sum to e'.
Based on equation (5), we can derive the following recursive equation:

d*(T_v[k+1], e') = min_{(x,e) ∈ Ψ(e')} { d*(T_v[k], e) + d*(T_{v_{k+1}}, x) + 2 · l · x · (e + Σ_{i=1}^{k} m_{v_i}) }
d*(T_v[1], e) = d*(T_{v_1}, e)    (6)

where Ψ(e') = {(x, e) | x + e = e', x ∈ M_{v_{k+1}}, e ∈ S_v[k]}; that is, the minimum is taken over all pairs (x, e) whose sum equals e', with x ∈ M_{v_{k+1}} and e ∈ S_v[k]. For each x ∈ M_v, d*(T_v, x) = d*(T_v[n], x), where n is the number of v's children. This formula exhibits the optimal substructure of our problem, and based on it we can easily derive a dynamic programming algorithm to solve the problem.
To scale a virtual cluster with bandwidth guarantee while achieving optimal locality, we combine the above algorithm with the algorithm for searching for a valid allocation in the previous section. In fact, during the computation of the allocable #VM set M_v of a vertex v, we can simultaneously compute d*(T_v, e) for each e ∈ M_v. Besides, note that there can be multiple allocation points in the tree, and each may lead to a different average VM-pair distance for the scaled cluster. Thus, to find the optimal allocation among all possible allocation points, the algorithm traverses the whole tree, discovers all the allocation points, and chooses the one with the minimum d*(T_v, Δ), where v is an allocation point and Δ = N' − N. For clarity, the pseudo code of the complete algorithm is given in Algorithm 1. The dynamic programming for maximizing the locality is between Lines 14 and 23, and the backtracking information is recorded in D_v[x + e, i]. It is easy to see that Algorithm 1 has the same time complexity O(Δ² |V| D) as the previous algorithm in Section III.

Algorithm 1: Scaling Allocation Algorithm
Input: Tree topology T, a virtual cluster C = < N, B > to be scaled to < N', B >, and the bandwidth allocation information on each link.
1:  Δ ← N' − N;
2:  AllocationPointSet ← ∅;
3:  for level l ← 0 to Height(T) do
4:      for each vertex v at level l do
5:          n ← the number of v's children;
6:          if l = 0 then   // leaf v is a PM
7:              S_v[0] ← {0, 1, . . . , min{c_v, Δ}};   // c_v is the number of empty VM slots of v
8:          else
9:              S_v[0] ← {0};
10:         for i from 1 to n do
11:             S_v[i] ← ∅;
12:             for each x ∈ M_{v_i} do
13:                 for each e ∈ S_v[i − 1] do
14:                     if x + e ∉ S_v[i] then
15:                         d*(T_v[i], x + e) ← inf;
16:                         S_v[i] ← S_v[i] ∪ {x + e};
17:                     if i = 1 then
18:                         d(T_v[i], x + e) ← d*(T_{v_1}, x);
19:                     else
20:                         d(T_v[i], x + e) ← d*(T_v[i − 1], e) + d*(T_{v_i}, x) + 2 · l · x · (e + Σ_{k=1}^{i−1} m_{v_k});
21:                     if d*(T_v[i], x + e) > d(T_v[i], x + e) then
22:                         d*(T_v[i], x + e) ← d(T_v[i], x + e);
23:                         D_v[x + e, i] ← x;
24:         M_v ← ∅;
25:         for each e ∈ S_v[n] do
26:             if (min{e + m_v, N − m_v + Δ − e} − min{m_v, N − m_v}) ∗ B ≤ R_{L_v} then   // R_{L_v} is the residual bandwidth of v's uplink
27:                 M_v ← M_v ∪ {e};
28:                 d*(T_v, e) ← d*(T_v[n], e);
29:         if M_v = ∅ then
30:             return false;
31:         if Δ ∈ M_v then
32:             AllocationPointSet ← AllocationPointSet ∪ {(v, d*(T_v, Δ), D_v)};
33: if AllocationPointSet ≠ ∅ then
34:     Choose the allocation point v with the minimum d*(T_v, Δ);
35:     Alloc(v, Δ, D_v);
36:     return true;
37: return false;
38: Procedure Alloc(v, x, D_v)
39:     if v is a machine then
40:         allocate x VMs into v;
41:     else
42:         for v's child i from n to 1 do
43:             Alloc(v_i, D_v[x, i]);
44:             Update the bandwidth allocation information on v's downlink to its i-th child;
45:             x ← x − D_v[x, i];

V. Scaling with VM Migration
The algorithm proposed in the previous section aims to properly allocate the VMs to be added for scaling a virtual cluster. However, it may not be able to find a feasible solution, even when the datacenter has enough resources to accommodate the virtual cluster with the scaled size. Figure 3 shows an example in which a network consists of four PMs, each with four VM slots and connected by 1Gbps links. Two virtual clusters, VC1 and VC2, are deployed. VC1 has 5 VMs and the bandwidth for each is 500Mbps. VC2 has 2 VMs with bandwidth 200Mbps. On link L, the bandwidth reservation is 500 for VC1 and 400 for VC2, so the residual bandwidth on L is 100. Suppose that a new VM needs to be added to scale VC2 up to 6 VMs. It can easily be verified that, no matter which empty slot it is allocated to, the scaled virtual cluster needs 600 of bandwidth on link L. This requires a 200 increment of the bandwidth reservation for VC2 on L, more than the 100 residual bandwidth of L. That is, no feasible allocation can be found for the new VM, and thus VC2 cannot be scaled up. However, in Figure 3(a), it is obvious that the subtree T_u2 can accommodate the whole virtual cluster VC2 with the scaled size 6. This indicates that the existing VM placement can hinder the scalability of a virtual cluster. To address this issue, we exploit VM migration, with which it is possible to change the layout of the existing VMs to accommodate new VMs. As an example, in Figure 3(b), after a VM is migrated from T_u1 to T_u2, VC2 can add another new VM in T_u2 without increasing the bandwidth reservation on link L. Thus, in this section we consider VM allocation that allows VM migration for scaling a virtual cluster.
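The migration-aware search below reuses the same per-vertex merge as Algorithm 1, only with a different objective. For concreteness, a minimal sketch of that merge (Eq. (3) for the sets, Eq. (6) for the distances, corresponding to Lines 10-23 of Algorithm 1) is given here; it is an illustrative sketch with our own names, not the paper's implementation, and leaves are assumed to contribute d* = 0.

import math

# Sketch of the merge at an internal vertex v (cf. Algorithm 1, Lines 10-23).
# child_sets[i]: allocable #VM set M of the (i+1)-th child; child_dstar[i][x]: d* of placing
# x new VMs in that child's subtree; child_m[i]: pre-existing VMs of C in that subtree;
# level: level l of vertex v in the tree.
def merge_children(child_sets, child_dstar, child_m, level):
    S = {0: 0.0}        # S_v[0] = {0}; maps e -> current best d*(T_v[i], e)
    D = {}              # backtracking info: D[(e, i)] = #new VMs given to the i-th child
    prefix_m = 0        # m_{v_1} + ... + m_{v_{i-1}}
    for i, (M_i, dstar_i) in enumerate(zip(child_sets, child_dstar), start=1):
        new_S = {}
        for x in M_i:
            for e, d_prev in S.items():
                cand = d_prev + dstar_i[x] + 2 * level * x * (e + prefix_m)   # Eq. (6)
                if cand < new_S.get(x + e, math.inf):
                    new_S[x + e] = cand
                    D[(x + e, i)] = x
        S = new_S
        prefix_m += child_m[i - 1]
    return S, D          # S holds S_v[n] with d*(T_v[n], e); it is then filtered by condition (2)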
Because VM migration incurs VM service downtime [15] and performance isolation among different virtual clusters is desired in the cloud, we only consider migrating VMs from the virtual cluster that is to be scaled, rather than VMs in other virtual clusters, so that the scaling of a virtual cluster does not interrupt other virtual clusters. Since VM migration incurs significant traffic and service downtime [15], [16], it is essential to minimize the VM migration cost for virtual cluster scaling. Thus, only if no proper allocation can be found for the new VMs while keeping the existing VM placement do we turn to the allocation approach that exploits VM migration. The complete solution is described by answering "when to migrate" and "how to migrate".
Fig. 3: Figure (a) shows a case in which no feasible allocation exists for scaling a virtual cluster. Figure (b) shows that the virtual cluster VC2 can be scaled with one VM migration.
A. When to migrate
Algorithm 1 can find a feasible allocation for the new VMs if one exists. If no solution exists, the algorithm fails, which occurs in the following two cases:
(a) The search reaches the root, and the algorithm finds no vertex (including the root) in the tree whose allocable #VM set contains Δ, i.e., the number of VMs to be added.
(b) During the search, a vertex is found to have an empty allocable #VM set; in this case the solution does not exist and the algorithm terminates. Given a vertex v, its allocable #VM set M_v is generated by filtering out of the candidate set (i.e., S_v[n] in Formula (3)) any element that does not satisfy v's uplink bandwidth constraint. M_v = ∅ means that the uplink bandwidth constraint cannot be satisfied for any placement of the new VMs in the tree. In this case, even placing zero VMs in v's subtree T_v is not a valid choice, which indicates an essential difference between placing a complete virtual cluster and scaling an existing cluster.
Zero VMs in T_v implies that all the new VMs are placed outside of T_v, but additional bandwidth still needs to be reserved on v's uplink for the traffic to and from the existing VMs in T_v. If either of the above cases happens, we terminate the previous algorithm (returning false at Lines 30 and 37 of Algorithm 1) and turn to VM allocation with migration.
B. How to migrate
In this section we propose our allocation approach that allows VM migration. We first state the following fact:
Proposition 1: Assume that a virtual cluster C with size N is scaled up to C' with larger size N' in the datacenter DC. If a valid allocation can be found for the virtual cluster C' in DC where the original cluster C and its corresponding bandwidth reservation are hypothetically removed, then C must be able to scale up to size N' when VM migration is allowed, and vice versa.
This proposition is straightforward, since we can always migrate C and place the additional VMs into the slots where C' is hypothetically allocated. If we cannot find any solution for C', then C cannot be scaled up even with VM migration. Based on this, our allocation algorithm first hypothetically removes the virtual cluster C and releases its bandwidth reservation, then finds valid allocations for the scaled cluster C', from which it derives the final solution. This also guarantees that, as long as the datacenter has enough resources to accommodate the scaled virtual cluster, our algorithm always finds a valid allocation and never falsely rejects a scaling request that can be satisfied. On the other hand, the allocation should minimize the migration of VMs, considering its significant overhead. To do so, our algorithm aims to find the allocation that has the maximum overlap with the existing VM placement of the virtual cluster, which lets the most VMs of C stay in place. We first introduce the overlap measurement as follows:
Definition 1 (Allocation overlap size): Given a set of PMs {P_1, P_2, . . ., P_l}, suppose that one allocation assigns a_i VMs to each P_i and another allocation assigns b_i VMs to each P_i. The overlap size between the two allocations is

Σ_{i=1}^{l} min{a_i, b_i}    (7)

For example, suppose one allocation assigns 2, 3 and 1 VMs to P_1, P_2 and P_3, respectively, and another assigns 1, 2 and 2 VMs to P_1, P_2 and P_3. Their overlap sizes on the individual machines are 1, 2 and 1, respectively, and hence the overlap size between them with respect to all three machines is 4. If the latter allocation represents a virtual cluster C's existing VM placement and the former represents its VM placement after being scaled, then 4 VMs of C can stay in place and only one VM on P_3 needs to be moved out.
Next we describe how to find the allocation with the largest overlap size with the original allocation of the cluster C, and how to migrate. Because here we are considering the allocation of a whole cluster instead of only the new VMs, for a vertex v, its allocable #VM set M_v is redefined as the set containing the numbers of VMs of C' that can be allocated into v's subtree T_v.
1) Finding the allocation with the largest overlap size: For each element x ∈ M_v, there are one or multiple valid ways to allocate x VMs to the PMs in the subtree T_v. Each allocation can have a different overlap size with the VM placement of the original cluster C in T_v. Let O_{T_v}[x] be the maximum overlap size among all these allocations that place x VMs in the subtree T_v. Our algorithm calculates O_{T_v}[x] for each x ∈ M_v. The procedure is similar to that for computing the maximum locality in Section IV.
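A minimal sketch of the overlap measure in Definition 1, with allocations given as PM-to-count maps (illustrative names, not the paper's code):

# Sketch of Definition 1: overlap size between two allocations over a set of PMs.
def overlap_size(alloc_a, alloc_b):
    pms = set(alloc_a) | set(alloc_b)
    return sum(min(alloc_a.get(p, 0), alloc_b.get(p, 0)) for p in pms)

# Example from Definition 1: {P1: 2, P2: 3, P3: 1} vs. {P1: 1, P2: 2, P3: 2} gives 1 + 2 + 1 = 4.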
Recall that T_v[k] is the subtree consisting of v and v's first k child subtrees rooted at v_1, . . ., v_k, and that v has n child subtrees in total. Let O_{T_v[k]}[e] be the maximum overlap size among all the feasible allocations that place e VMs into T_v[k], where e ∈ S_v[k]. Then, for each element e' ∈ S_v[k+1],

O_{T_v[k+1]}[e'] = max_{(x,e) ∈ Ψ(e')} { O_{T_v[k]}[e] + O_{T_{v_{k+1}}}[x] }
O_{T_v[1]}[e] = O_{T_{v_1}}[e]    (8)

where Ψ(e') = {(x, e) | x + e = e', x ∈ M_{v_{k+1}}, e ∈ S_v[k]}. Finally, for each element e' ∈ S_v[n], we obtain O_{T_v[n]}[e']. If e' satisfies the uplink bandwidth constraint, it is added into M_v and the corresponding O_{T_v[n]}[e'] is kept along with it, referred to as O_{T_v}[e'].
The algorithm starts the search from the leaf vertices (i.e., the PMs). For a leaf u, T_u is u itself; for each x ∈ M_u, O_{T_u}[x] is the smaller of x and the number of VMs that the cluster C originally has on the machine u. Then, level by level, for each vertex, the algorithm computes its allocable #VM set and the maximum overlap size associated with each element in the set.
As opposed to the previous algorithm in Section III, the algorithm needs to find an allocation point that not only has N' in its allocable #VM set but also achieves the largest overlap size, because there can be multiple allocation points with different maximum overlap sizes associated with N'. Take Figure 3 as an example. At the level of u1 and u2, the algorithm finds that u2 is an allocation point. However, the allocation point with the largest overlap size, 4, shown in Figure 3(b), actually exists in the subtree at the upper level.
Fig. 4: The allocation point with the largest maximum overlap size can appear at the root. The boxes filled with backslashes represent VMs from other virtual clusters.
Let T_{r_C} be the lowest-level subtree that contains the original cluster C, rooted at r_C, and let S_C be the set that contains the vertices of T_{r_C} and the ancestors of r_C. Obviously, only the allocation points in S_C may have allocations overlapping with C; any allocation point not in S_C must have zero maximum overlap size. To find the optimal allocation that achieves the largest overlap size, the algorithm needs to determine all the allocation points in S_C and their maximum overlap sizes. The reason to examine all the vertices in S_C is that the optimal allocation point can be at any level. See Figure 4 for an example, in which 3 more VMs are added to extend a virtual cluster of 3 VMs (represented by the blue boxes) shown in Figure 4(a). The two downlinks of u1 have residual bandwidths of 200 and 100, and we assume sufficient bandwidth on the other links. The allocation point u1, with the allocation shown in Figure 4(b), has a maximum overlap size of one. But the allocation point u at the upper level achieves a higher maximum overlap size, two, as shown in Figure 4(c). This example indicates that the algorithm has to traverse the tree up to the root to determine the optimal allocation point.
Denote by S_C^a the set of allocation points in S_C with non-zero maximum overlap size. S_C^a = ∅ if either no allocation points exist in S_C or all the allocation points in it have zero maximum overlap size. If S_C^a ≠ ∅, the algorithm chooses the allocation point with the largest maximum overlap size in S_C^a as the final allocation solution for C'; otherwise, the algorithm chooses the allocation point at the lowest level among all allocation points in the whole tree as the final solution.
Locality vs. overlap size: The above algorithm aims to maximize the overlap size with the original cluster C, regardless of VM locality. Good locality implies that a virtual cluster should be placed in a subtree at as low a level as possible, but the largest overlap size may appear at a higher level. The allocation in Figure 4(c) achieves the largest overlap size, but the VMs of C are distributed across the different subtrees T_u1 and T_u2. In contrast, Figure 4(b) has a smaller overlap size but higher locality, with the VMs being in the same subtree T_u1. To make a trade-off, we propose a simple solution that evaluates an allocation point by a weighted sum of locality and overlap size. Given an allocation point v at level L in a tree of height H, let O_{T_v} be the maximum overlap size for C''s allocation in its subtree. Then v's goodness is measured by

α · (O_{T_v} / N) + (1 − α) · ((H − L) / H)    (9)

where N is the size of the original cluster C and 0 ≤ α ≤ 1. The weights can be determined by the cloud provider, and different weights indicate different preferences for either mitigating the migration cost or compacting the virtual cluster. If α = 0, the goal is to maximize the locality of the scaled cluster, regardless of the migration cost; if α = 1, the migration cost is the only focus. Based on this, we adapt the above algorithm to choose the allocation point with the maximum goodness as the final solution. The algorithm for finding the optimal allocation of C' is similar to Algorithm 1 in that the search is also based on dynamic programming, but it uses a different goodness measurement and searches for the allocation of N' VMs. It has time complexity O(N'² |V| D).
2) Determining how to migrate VMs: Given the allocation for the cluster C', the next step is to decide how to migrate the non-overlapped VMs of C to the slots allocated to C'. Migrating a VM to different PMs may traverse routing paths of different lengths and hence take different durations. To reduce the service downtime and network overhead, the mapping between VMs and slots should minimize the total number of hops of the VM migrations. To this end, we transform the problem into the minimum weight perfect matching problem in a bipartite graph [17], which is to find a subset of edges with minimum total weight such that every vertex has exactly one incident edge. Suppose the datacenter has N_D PMs. The allocation for C' can be represented by {m'_i | 1 ≤ i ≤ N_D}, meaning that the allocation assigns m'_i VMs to PM P_i. Similarly, the original allocation for C is {m_i | 1 ≤ i ≤ N_D}. We construct the corresponding bipartite matching problem as follows. Let V and S be two sets of vertices representing VMs and slots, respectively; initially both are empty. For each PM P_i, we compute δ_i = m_i − m'_i. If δ_i > 0, which means that δ_i VMs need to be migrated out of P_i, then δ_i vertices are added to V, each labeled with its PM identifier P_i; if δ_i < 0, which means that |δ_i| empty slots on P_i are allocated to C''s VMs, then |δ_i| vertices are added to S, each also labeled with P_i. After that, Δ = N' − N vertices are added to V to represent the new VMs to be added for scaling, labeled "New". It is easy to prove that V and S have the same number of vertices and that no vertex in S has the same label as a vertex in V. An edge is added for each pair of vertices u (u ∈ V) and v (v ∈ S), and finally we obtain a complete bipartite graph.
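A minimal sketch of this construction, together with the edge weights and the matching step described in the next paragraph, is given below. SciPy's linear_sum_assignment (an implementation of the Hungarian method) is an assumption on our part; the paper only requires some minimum-weight perfect matching solver, and all helper names are illustrative.

import numpy as np
from scipy.optimize import linear_sum_assignment

# Sketch: plan the VM-to-slot moves for scaling C (old_alloc) into C' (new_alloc).
# old_alloc / new_alloc: dict PM -> number of VMs of C / C' on that PM;
# hops(p, q): number of hops between PMs p and q.
def plan_migrations(old_alloc, new_alloc, hops):
    sources, slots = [], []
    for p in set(old_alloc) | set(new_alloc):
        delta = old_alloc.get(p, 0) - new_alloc.get(p, 0)
        if delta > 0:
            sources += [p] * delta            # delta VMs must leave PM p
        elif delta < 0:
            slots += [p] * (-delta)           # |delta| slots on p receive VMs
    sources += ["New"] * (len(slots) - len(sources))   # the Delta newly added VMs
    cost = np.array([[0 if s == "New" else hops(s, t) for t in slots]
                     for s in sources])
    rows, cols = linear_sum_assignment(cost)  # minimum-weight perfect matching
    return [(sources[r], slots[c]) for r, c in zip(rows, cols)]

Each returned pair either migrates one of C's VMs from its source PM to the destination slot, or, for a "New" source, places a newly added VM there at zero cost.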
If u's PM label is P_u and v's is P_v, then the weight of edge (u, v) is the number of hops between P_u and P_v, i.e., the number of hops for migrating the VM at u to the slot at v. If u's label is "New", the edge between u and any vertex in S has zero weight, because no migration is needed for a newly added VM. Based on the complete bipartite graph constructed above, we can use the well-known Hungarian algorithm [17] to find its minimum weight perfect matching, in which each edge indicates the assignment of a VM in V to a slot in S, and the total number of hops for the VM migrations is minimized by this assignment. Note that, since we assume that all VMs in a virtual cluster are homogeneous, we do not distinguish C's VMs placed on the same PM from each other, nor do we distinguish the newly added VMs. That is, for an assignment (u, v), we can migrate any one of C's VMs on P_u to v, or add any one of the newly added VMs to v if u's label is "New". In the worst case, where the overlap size is zero, N VMs need to be migrated and the Hungarian algorithm takes O(N'³) time.
VI. Evaluation
In this section we evaluate the effectiveness and efficiency of our algorithms for scaling a virtual cluster through simulations.
A. Simulation Setup
Datacenter topology: We simulate a datacenter with a three-level tree topology and no path diversity. A rack consists of 10 machines, each with 4 VM slots and a 1Gbps link to a Top-of-Rack (ToR) switch. Every 10 ToR switches are connected to a level-2 aggregation switch, and 5 aggregation switches are connected to the datacenter core switch. There are 500 machines in total at level 0. The oversubscription of the physical network is 2, which means that the link bandwidth between a ToR switch and an aggregation switch is 5Gbps and the link bandwidth between an aggregation switch and the core switch is 25Gbps.
Workload: Our simulations are conducted under a scenario of dynamically arriving tenant jobs. A job specifies a virtual cluster abstraction < N, B >. Jobs arrive dynamically over time, and if a job cannot be allocated upon its arrival, it is rejected. Each job runs for a random duration, and its virtual cluster is removed from the datacenter when the job completes. The number of VMs N in each virtual cluster request, i.e., the job size, is exponentially distributed with a mean of 20 by default. The bandwidth B of each job is chosen uniformly at random from [100, 200] Mbps. Job arrivals follow a Poisson process with rate λ; the load on a datacenter with M VM slots is then λN̄T_c/M, where N̄ is the mean job size and T_c is the mean running time of the jobs. The running time of each job is chosen uniformly at random from [200, 500]. During the job arrival process, before the arrival of the next job we randomly choose a job running in the datacenter and generate a scaling request for it with a given probability Pr_scal. The size increment for a virtual cluster < N, B > is determined by a given percentage increase %inc, i.e., the new size is N' = (1 + %inc)N. We simulate the arrival of 500 tenant jobs while varying the load, the scaling probability Pr_scal, and the percentage increase %inc of the cluster size.
B. Simulation Results
We evaluate and compare our scaling algorithms in terms of the rejection rate of the scaling requests, the locality of the scaled clusters, and the number of VM migrations for the scaling. The figures present averages across multiple runs.
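The following sketch generates such a workload under the stated parameters (the distribution choices follow the description above; the total slot count of 2000 is 500 machines times 4 slots, and the function and field names are ours, not the simulator actually used).

import random

# Sketch: dynamically arriving tenant jobs <N, B> for the simulation scenario above.
def generate_jobs(num_jobs=500, load=0.2, total_slots=2000,
                  mean_size=20, mean_runtime=350.0):
    # load = lambda * mean_size * mean_runtime / M  =>  arrival rate lambda:
    lam = load * total_slots / (mean_size * mean_runtime)
    jobs, t = [], 0.0
    for _ in range(num_jobs):
        t += random.expovariate(lam)                               # Poisson arrivals
        size = max(1, round(random.expovariate(1.0 / mean_size)))  # job size, mean 20
        jobs.append({"arrival": t,
                     "N": size,
                     "B": random.uniform(100, 200),                # per-VM bandwidth (Mbps)
                     "runtime": random.uniform(200, 500)})         # running time
    return jobs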
For brevity, we refer to the scaling algorithm in Section III as "Scaling", the algorithm with locality optimization in Section IV as "Scaling-L", and the algorithm allowing VM migration in Section V as "Scaling-M".
1) Rejection rate of the scaling requests: Figures 5 and 6 show the rejection rate for different values of %inc under light-load and heavy-load settings, with load = 0.2 and 0.6, respectively. In Figure 5, the rejection rate of Scaling-M is zero. Apart from that, the rejection rate increases with %inc for all three algorithms. It can be seen that Scaling-L achieves a lower rejection rate than Scaling under light load. The reason is that, with the locality optimization, the new VMs are put as close as possible to the pre-existing placement, which reduces fragmentation and preserves more contiguous space to accommodate incoming job requests. Under heavy load, the limited resources are the main bottleneck and the locality optimization does not help; thus Scaling-L and Scaling have close rejection rates in Figure 6. Compared with Scaling-M, both Scaling-L and Scaling have much higher rejection rates, up to 50% under the heavy load. This indicates that the pre-existing VM placement can seriously hinder the scalability of virtual clusters. By allowing VM migration for the scaling, Scaling-M achieves a much lower rejection rate. We also vary Pr_scal to change the arrival rate of the scaling requests and show the resulting rejection rates in Figure 7. The rejection rate increases with Pr_scal, and Scaling-M still has the lowest rejection rate. From the evaluation below, we will see that Scaling-M achieves such a low rejection rate at only a very small VM migration cost.
2) Locality: Scaling-L searches for the allocation with the minimum average VM-pair distance when scaling a virtual cluster. To verify its effectiveness, we consider how much improvement Scaling-L obtains compared with Scaling. Using Scaling and Scaling-L to scale a cluster can lead to different average VM-pair distances. Thus, given the same workload and scaling requests under load = 0.2, we calculate, for each job j, the difference between the average VM-pair distances of the two scaled clusters produced by Scaling and Scaling-L, that is, avgd^j_{C',Scaling} − avgd^j_{C',Scaling-L}. Figure 8 shows the empirical cumulative distribution of this difference under %inc = 20% and %inc = 30%. As we can see, 80% of the differences are larger than zero and 20% are smaller. The 20 percent below zero indicate that Scaling-L does not always generate an allocation for the scaled cluster with a smaller average VM-pair distance than the allocation obtained by Scaling. This is to be expected, because the two algorithms lead to different VM layouts and bandwidth usage in the datacenter during the dynamic job arrival process; even for the same job, Scaling-L and Scaling may work on different pre-existing placements and network states. Nevertheless, in terms of the overall effect, Scaling-L reduces the average VM-pair distance for the large majority of scaled jobs. Moreover, the distribution shows that the improvement by Scaling-L becomes more pronounced at higher %inc: for %inc = 30%, 20 percent of the distance reductions fall in [0.2, 0.6], versus only 10 percent for %inc = 20%.
3) Migration Cost: The Scaling-M algorithm minimizes the number of VM migrations by maximizing the overlap size with the pre-existing cluster placement.
To show its effectiveness, we compare it with the cluster allocation algorithm of [5], [6], referred to as "C-A", which is used to search for the allocation of the scaled cluster in the datacenter where the pre-existing cluster is hypothetically removed, without considering the overlap with the pre-existing VM placement. We replay the same workload and scaling request sequences and measure the total number of VM migrations incurred by these algorithms over all the scaling operations, with load = 0.6 and Pr_scal = 0.2. From Table II we can see that without considering the overlap, the number of VM migrations can be prohibitive, from 748 to 915. Scaling-M reduces the total number of migrated VMs for the scaling by at least 90%. Combined with Figure 6, it can be concluded that Scaling-M greatly decreases the rejection rate at a very small migration cost.
We also evaluated the effect of α in Formula (9) by varying it from 0 to 1 in steps of 0.2. Given a workload with load = 0.6, %inc = 30% and Pr_scal = 0.2, we examine the total number of VM migrations for the scaling and the sum of all the scaled clusters' average VM-pair distances under different α. The results are shown in Table III. Both measurements vary with α significantly and, overall, monotonically.¹ The overlap size increases with α, leading to a decreasing number of VM migrations as α increases, while the average VM-pair distance increases with α. These results verify the effectiveness of our weighted-sum design in (9), in which α acts as a control knob to capture users' preferences.
¹ In Table III, 134 is an exception; for a particular workload such a small variation may occur. Observations from multiple random workloads indicate that the overall trend is monotonic.

Fig. 5: The rejection rate under load = 0.2, Pr_scal = 0.2.
Fig. 6: The rejection rate under load = 0.6, Pr_scal = 0.2.

TABLE II: Total number of VM migrations for scaling with different %inc.
%inc        20%   30%   40%   50%
Scaling-M    34    38    70    34
C-A         748   820   852   915

TABLE III: The number of VM migrations and the average VM-pair distance with different α.
α                         0     0.2    0.4    0.6    0.8    1
#VM migrations (total)   1193  1057   841    581    300    38
Avg pair distance          73    97   125    134    128   133

VII. Discussion
In this paper we mainly focus on scaling the virtual cluster abstraction in size, i.e., from < N, B > to < N', B > (N' > N). Nevertheless, the algorithms in this paper can be used to solve the two other types of scaling, which increase the bandwidth requirement. To scale a cluster C from < N, B > to < N, B' > (B' > B), we first hypothetically increase the bandwidth of C's VMs to B' and check whether any link lacks enough residual bandwidth for the increment of C's bandwidth reservation. If no such link exists, C can be scaled in place by increasing the bandwidth reservation on the links accordingly; otherwise, C's original placement has to be changed for the scaling, and we use the allocation algorithm with VM migration in Section V to find the allocation of the new cluster abstraction < N, B' > that has the largest overlap size with C's original placement, and then conduct the corresponding VM migrations. For the scaling from < N, B > to < N', B' >, we first try to scale the cluster in place in two steps: first < N, B > → < N, B' > and then < N, B' > → < N', B' >. We use the algorithm in Section III or IV for the second step.
If both steps succeed, the cluster can be scaled in place; if either step fails, we turn to the algorithm with VM migration to allocate < N', B' > while minimizing the number of VM migrations.

Fig. 7: The rejection rate under load = 0.6, %inc = 30%.
Fig. 8: CDF of the difference of the average VM-pair distances, avgd^j_{C',Scaling} − avgd^j_{C',Scaling-L}.

VIII. Conclusion
This paper addresses the scaling problem of the virtual cluster abstraction with bandwidth guarantee. We propose efficient algorithms to scale up the size of a virtual cluster with and without allowing VM migration. Besides demonstrating the efficiency of these algorithms, the simulation results also indicate that the pre-existing VM placement can seriously hinder the scalability of virtual clusters and that the scalability can be significantly improved at a small VM migration cost.
Acknowledgement: This research is supported by NSF CNS-1252292.
References
[1] "Amazon EC2," http://aws.amazon.com/ec2/.
[2] J. Schad, J. Dittrich, and J.-A. Quiané-Ruiz, "Runtime measurements in the cloud: observing, analyzing, and reducing variance," Proc. of VLDB Endow., vol. 3, no. 1-2, pp. 460–471, Sep. 2010.
[3] C. Guo, G. Lu, H. J. Wang, S. Yang, C. Kong, P. Sun, W. Wu, and Y. Zhang, "Secondnet: a data center network virtualization architecture with bandwidth guarantees," in Proc. of Co-NEXT. New York, NY, USA: ACM, 2010, pp. 15:1–15:12.
[4] H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, "Towards predictable datacenter networks," in Proc. of ACM SIGCOMM. New York, NY, USA: ACM, 2011, pp. 242–253.
[5] D. Xie, N. Ding, Y. C. Hu, and R. R. Kompella, "The only constant is change: incorporating time-varying network reservations in data centers," in Proc. of ACM SIGCOMM, 2012, pp. 199–210.
[6] L. Yu and H. Shen, "Bandwidth guarantee under demand uncertainty in multi-tenant clouds," in ICDCS, 2014, pp. 258–267.
[7] J. Lee, Y. Turner, M. Lee, L. Popa, S. Banerjee, J.-M. Kang, and P. Sharma, "Application-driven bandwidth guarantees in datacenters," in Proc. of ACM SIGCOMM, 2014, pp. 467–478.
[8] N. R. Herbst, S. Kounev, and R. Reussner, "Elasticity in cloud computing: What it is, and what it is not," in ICAC. San Jose, CA: USENIX, 2013, pp. 23–27.
[9] P. Padala, K.-Y. Hou, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, and A. Merchant, "Automated control of multiple virtualized resources," in Proceedings of the 4th ACM European Conference on Computer Systems. ACM, 2009, pp. 13–26.
[10] Z. Shen, S. Subbiah, X. Gu, and J. Wilkes, "Cloudscale: elastic resource scaling for multi-tenant cloud systems," in Proceedings of the 2nd ACM Symposium on Cloud Computing. ACM, 2011, p. 5.
[11] Z. Gong, X. Gu, and J. Wilkes, "Press: Predictive elastic resource scaling for cloud systems," in Network and Service Management (CNSM), 2010 International Conference on. IEEE, 2010, pp. 9–16.
[12] R. Han, L. Guo, M. M. Ghanem, and Y. Guo, "Lightweight resource scaling for cloud applications," in ACM/IEEE CCGrid, 2012, pp. 644–651.
[13] H. Nguyen, Z. Shen, X. Gu, S. Subbiah, and J. Wilkes, "Agile: Elastic distributed resource scaling for infrastructure-as-a-service," in ICAC. San Jose, CA: USENIX, 2013, pp. 69–82.
[14] H. Herodotou, F. Dong, and S. Babu, "No one (cluster) size fits all: Automatic cluster sizing for data-intensive analytics," in Proc. of ACM SOCC. New York, NY, USA: ACM, 2011, pp. 18:1–18:14.
[15] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield, "Live migration of virtual machines," in Proc. of NSDI. Berkeley, CA, USA: USENIX Association, 2005, pp. 273–286.
[16] L. Yu, L. Chen, Z. Cai, H. Shen, Y. Liang, and Y. Pan, "Stochastic load balancing for virtual resource management in datacenters," IEEE Transactions on Cloud Computing, 2016.
[17] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1982.