A Provident Resource Defragmentation Framework for Mobile Cloud Computing

Weigang Hou, Rui Zhang, Wen Qi, Kejie Lu, Senior Member, IEEE, Jianping Wang, Lei Guo

Abstract—To facilitate mobile cloud computing, a cloud service provider must dynamically create and terminate a large number of virtual machines (VMs), which leaves fragmented resources that cannot be further utilized. To solve this problem proactively, most existing studies rely on server consolidation, with the main objective of minimizing the number of active servers. Although this approach can minimize resource fragmentation at a particular time, it may be overly aggressive, at the price of too-frequent VM migration and low system stability. To address this issue, we propose a novel provident resource defragmentation framework that is revenue-oriented, with the goal of reducing unnecessary VM migration. Within the proposed framework, we formulate an optimization problem for resource defragmentation at a particular time epoch, taking into account the future impact of any VM migration. We then develop an efficient heuristic algorithm that can obtain near-optimal results. Extensive numerical results confirm that our framework can provide the highest profit and can significantly reduce the VM migration cost in practical scenarios.

Index Terms—Cloud computing, Virtual machine (VM), VM migration, defragmentation

1 INTRODUCTION

In recent years, we have witnessed the significant growth of the global mobile economy, with a rapidly increasing number of smartphones and explosively expanding data traffic. According to a recent study by Cisco [1], 439 million new smartphones were connected to the mobile network in 2014, and total mobile data traffic increased 69 percent worldwide, to 2.5 exabytes per month by the end of 2014. Such a trend has motivated the emergence of mobile cloud computing [2], [3], with which a smartphone can dynamically offload computationally intensive tasks to remote cloud service providers so as to improve both performance and battery life.

Despite the promising future of mobile cloud computing, there are still many challenges to be solved [2], [3], [4], [5], [6], [7]. One of the key challenges cloud providers face is that, to accommodate the diverse requests from a large number of smartphones, a cloud data center must create and terminate a large number of virtual machines (VMs) dynamically. Such frequent VM creation and termination can lead to the resource fragmentation problem, because residual resources (such as CPU, memory, and I/O capacity) may not be fully utilized at a later time [8], [9], [10].

To illustrate this phenomenon, Fig. 1(a) shows a simple example, in which two servers S1 and S2 are currently hosting three VMs and the resource capacity of each server

• Weigang Hou and Lei Guo are with the School of Information Science and Engineering, Northeastern University, China.
• Rui Zhang, Wen Qi, and Jianping Wang are with the Department of Computer Science, City University of Hong Kong.
• Kejie Lu is the corresponding author. He is with the School of Computer Engineering, Shanghai University of Electric Power, Shanghai, China, and with the Department of Electrical and Computer Engineering, University of Puerto Rico at Mayagüez, PR, USA.

The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 120612).

[Fig. 1. An example of resource fragmentation and consolidation: (a) before server consolidation; (b) after server consolidation.]

is 8 units. Furthermore, VM1 and VM3 each occupy 2 units of resources, and VM2 occupies 3 units. In this example, if a new VM requires 7 units of resources, it cannot be accommodated with the current VM placement. In this paper, we refer to this problem as the fragmentation problem because it occurs due to fragmented resources in different servers.

Existing approaches dealing with fragmentation in the cloud can be classified into two categories: (1) VM migration upon new VM request arrivals and (2) proactive server consolidation. In the first category, when a new VM request arrives, the cloud data center determines whether it can accommodate the request with the existing available resources [11], [12], [13]. If such a placement is not possible, the data center may migrate one or more existing VMs and then place the new VM into the system [14]. Although this approach is viable, when hundreds of VMs need to be provisioned in a short time, many VM migrations may be invoked, which will consume extra bandwidth and cause service degradation to existing VMs. In the literature, some recent studies have shown that VM migrations can lead to significant migration cost and delay [15], [16].

In the second category, server consolidation can be performed before the arrival of new VM requests. In recent years, server consolidation has been investigated extensively in the literature [11], [13], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29]. Usually, one outcome of server consolidation is that existing VMs are packed into the minimal number of servers, which can reduce resource fragments. Another outcome is that servers that do not host any VM can be shut down to conserve energy. In Fig. 1(b), with server consolidation, VM3 is migrated to server S1, so that S2 can accommodate the new VM request that requires 7 units of resources.

Despite the importance of server consolidation, we observe two major issues that have not yet been addressed. First, shutting down a server may not be desirable if the same server must be restarted after a short period due to increasing demands. Second, and more important, server consolidation may lead to excessive VM migrations, because some residual resources may be utilized in the near future. For instance, in Fig. 1(a), if two new VMs that require 5 and 3 units of resources, respectively, are about to arrive, they can be directly assigned without any VM migration. In other words, migrating VM3 is unnecessary in such a case.
Motivated by the observation above, in this paper we propose a provident resource defragmentation framework, whose main idea is to migrate as few VMs as necessary, based on an estimation of future demands. In our framework, a major challenge is how to make two inter-correlated decisions, namely, when to conduct defragmentation and which VMs need to be re-provisioned in one defragmentation. To address this challenge, we formulate an optimization problem to avoid unnecessary VM migrations, given a set of existing VMs and an expected set of incoming and terminating VMs. After showing that the problem is NP-hard, we develop an efficient heuristic algorithm that can obtain a near-optimal solution.

To the best of the authors' knowledge, this is the first study that systematically investigates the defragmentation problem in a cloud data center. Within this scope, our main contributions are summarized as follows.

• We propose a general framework for resource defragmentation with the objective of avoiding unnecessary VM migrations. In our framework, defragmentation can be fine-tuned by determining when to trigger the defragmentation operation and how to migrate existing VMs. Our framework is general in that it can also be applied to other operations, such as VM placement and server consolidation.
• Within the proposed framework, we formulate an optimal resource defragmentation problem for an arbitrary time epoch. To avoid unnecessary VM migrations, we consider that each VM is associated with an accommodation revenue and a migration cost. We then formulate the problem to maximize the profit (i.e., total revenue minus costs) with a prediction of future demands.
• In our formulation and algorithm design, we consider a demand prediction that requires only an estimate of the total number of arrivals and the request distribution of different VM instances. Our numerical results show that such an approach can lead to reasonably good performance even if we do not have an accurate estimation of the number of arriving VMs.

The rest of the paper is organized as follows. First, we discuss related work in Section 2. We then elaborate on the framework for provident resource defragmentation in Section 3. Based on the proposed framework, we formulate the optimal provident resource defragmentation problem in Section 4. Since the problem is hard by nature, we develop an efficient heuristic algorithm in Section 5. Finally, we present numerical results in Section 6, before concluding the paper.

2 RELATED WORK

In this section, we review the related work on VM placement, VM migration, and server consolidation. The problem of VM placement is essentially multi-dimensional vector packing [30]. Given a set of VM requests and vacant cloud resources, VM placement aims to maximize the overall utility. Some VM provisioning schemes also consider live VM migration [14]. Server consolidation has been widely considered as an approach to optimize various utility measurements, such as balancing server workload [21], [22], [23], maintaining the reliability of hardware devices [24], enhancing the scalability of the consolidation algorithm [25], preserving fairness [26], and reducing operational cost [31]. Although live VM migration gives cloud vendors more freedom in optimizing their utility measurements, it also induces extra operational cost and may disrupt the service of existing VMs. Reducing such migration cost is the major focus of this paper.
By leveraging reliable predictions of future VM requests, unnecessary VM migrations may be avoided. He et al. [12] used a multi-dimensional normal distribution of historical VM requests to predict the resource requirement of future VM requests. When the residual resources on a server are sufficient to satisfy the mean resource requirement of a VM request, that server is not considered for server consolidation. In reality, however, a cloud service provider may offer several types of VM instances, so using the mean resource requirement of VM requests may still lead to unnecessary VM migrations. In the literature, there are several other models for predicting changes in the future resource demands of VMs [22]. In this study, we adopt a much weaker assumption: for each type of VM, we only use the expected number of arrivals at a future time epoch.

3 A PROVIDENT RESOURCE DEFRAGMENTATION FRAMEWORK

In this section, we propose a provident resource defragmentation framework to address the resource fragmentation issue. We first briefly explain the main idea of the proposed framework. We then elaborate on its important components.

[Fig. 2. A defragmentation framework, consisting of demand estimation, fragmentation monitoring, defragmentation, and VM placement components operating over the servers in the data center.]

3.1 The Main Idea

As introduced previously, the main idea of our framework is to perform as few VM migrations as necessary. Here "necessary" first means that the defragmentation operation shall aim to improve the resource utilization of a data center. Moreover, it also means that some fragments can be kept without defragmentation if they can be used to accommodate expected future VM requests. For instance, in Fig. 1(a), we do not need to migrate VM3 if we predict that an incoming VM will require up to 6 units of resources. Beyond the discussion above, we also consider that defragmentation shall be performed over time as a process, because the status of the data center varies with the arrival and termination of VMs.

3.2 The Framework

To address the fragmentation problem, we propose a framework that includes the following key components, which are also shown in Fig. 2; a sketch of how the components interact follows the list.

• Demand estimation: This component observes and records the demands of VM arrival and departure. Based on historical data and instantaneous demands, it can estimate and predict future VM arrival events. In this module, various estimation schemes and algorithms can be adopted. The output of this module is the arrival and departure models used by the fragmentation monitoring module.
• Fragmentation monitoring: This component keeps monitoring the fragmentation condition of the data center and determines when the defragmentation operation shall be performed. If defragmentation is necessary, a signal is forwarded to the defragmentation module.
• Defragmentation: This component determines how to minimize the VM migration cost while obtaining the best possible system revenue by accommodating expected VMs. One output of this module is the schedule for migrating some existing VMs in the data center. If some VMs have been migrated, the updated VM allocation is sent to the VM placement module.
• VM placement: This component terminates VMs on demand, and also determines how to migrate existing VMs and how to place arrived VMs into the data center so as to maximize the overall profit of the data center.
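To make this interaction concrete, the following is a minimal control-loop sketch in Python. It is purely illustrative: the class, function names, and the greedy one-move "defragmentation" stub are our own scaffolding under the stated assumptions, not an implementation from the paper (the paper's actual defragmentation logic is the PRD algorithm of Section 5).

```python
# A minimal, illustrative control loop for the framework in Fig. 2. All names
# and the toy policies below are our own scaffolding, not the paper's code.
from dataclasses import dataclass

@dataclass
class Server:
    capacity: int
    load: int = 0
    @property
    def free(self):          # F_j = O_j - L_j
        return self.capacity - self.load

R = {1: 16, 2: 8, 3: 2}      # resource requirement r_c per VM type (cf. Sec. 6)

def estimate_demand(history):
    """Demand estimation: expected arrivals and per-type proportions."""
    n_expected = max(1, round(sum(history) / max(1, len(history))))
    return n_expected, {1: 0.2, 2: 0.5, 3: 0.3}

def fragmented(servers, phi):
    """Fragmentation monitoring: some expected type fits on no server."""
    return any(all(s.free < R[c] for s in servers) for c in phi)

def defragment(servers):
    """Defragmentation (stub): one greedy move from the fullest to the
    emptiest server; the paper's PRD algorithm (Sec. 5) goes here."""
    src = max(servers, key=lambda s: s.load)
    dst = min(servers, key=lambda s: s.load)
    moved = min(src.load, dst.free)
    src.load -= moved
    dst.load += moved

servers = [Server(40, 38), Server(40, 30), Server(40, 26)]
n, phi = estimate_demand([8, 12, 10])
if fragmented(servers, phi):
    defragment(servers)
print([s.free for s in servers])   # residuals after the control step
```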
4 AN OPTIMAL PROVIDENT RESOURCE DEFRAGMENTATION PROBLEM

In this section, we first introduce the system model and key notations. For an arbitrary time epoch, we then formulate the problem and discuss its complexity.

4.1 The System Model

In our study, we consider a general cloud data center that consists of S networked physical servers. For each server j (1 ≤ j ≤ S), we let O_j be its fixed capacity; we also consider that the resource utilization is time-varying, denoted as L_j (L_j ≤ O_j); we finally define F_j = O_j − L_j, the remaining resources in server j. We note that, for the more general case in which a server has various types of resources (e.g., disk I/O, CPU, and memory), we merely need to replace the current real variables with multi-dimensional vectors, namely, $\vec{O}_j$, $\vec{L}_j$, and $\vec{F}_j$.

Following common practice (such as Amazon EC2), we assume that the cloud data center accepts C types of VM requests. Each VM request can be represented by a 3-tuple ⟨c, e_c, r_c⟩, where c is the type index, e_c is the payment, and r_c is the resource requirement. We also consider that r_c is a scalar with r_1 > r_2 > · · · > r_C and e_1 > e_2 > · · · > e_C; that is, a request consuming more resources generates a higher revenue, which is a very reasonable assumption. Similarly, for diverse resource requirements, r_c becomes $\vec{r}_c$.

At a particular time epoch t, there are some existing VMs in the cloud data center, and there can also be some newly arrived VMs. In our problem, we further assume that an estimate of the VM arrivals at a future time epoch t′ is available. In other words, we consider a two-period problem, i.e., a current time t and a future time t′, where some VM assignment and migration decisions have to be made at time t with consideration of their future impact at time t′.

4.2 Notation Definitions

To facilitate further discussions, we summarize the important notations below in alphabetical order; a small illustrative sketch of the model follows the list.

• A (with subscripts and superscripts): a two-dimensional VM assignment matrix.
• C: the total number of VM types.
• e_c: the revenue of accommodating a type-c VM.
• F_j: the remaining resources in the j-th server.
• G_s: the set of servers that contain VMs to be migrated.
• G_t: the set of servers that have available resources.
• i: the index of a VM.
• j: the index of a server.
• ℓ_r: the migration cost coefficient (0 < ℓ_r < 1).
• ℓ_u: the server cost coefficient (0 < ℓ_u < 1).
• L_j: the resource utilization of the j-th server.
• m_c: the number of existing type-c VMs in the system.
• n_c: the number of type-c VMs arriving at t.
• n′: the total number of VMs to arrive at t′.
• n′_c: the number of type-c VMs to arrive at t′.
• O_j: the fixed capacity of the j-th server.
• ω_j: a Boolean variable that indicates whether the j-th server is hosting VMs.
• p_c(n′_c): the probability that n′_c type-c VM requests arrive at t′.
• φ_c: the proportion of type-c VM requests over all incoming VM requests at t′.
• Φ: a one-dimensional array of φ_c, i.e., Φ = {φ_1, φ_2, · · · , φ_C}.
• r_c: the resource requirement of each type-c VM request.
• S: the total number of servers in the system.
• t: the current time epoch.
• t′: a future time epoch.
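The system model above maps directly onto a few small data structures. The sketch below is illustrative only; the class and field names are ours, under the scalar-resource assumption.

```python
# Illustrative encoding of the system model in Sec. 4.1; names are ours.
from dataclasses import dataclass

@dataclass(frozen=True)
class VMType:
    c: int      # type index
    e: float    # revenue e_c for accommodating one such VM
    r: int      # resource requirement r_c (a scalar here; a vector in general)

@dataclass
class ServerState:
    O: int      # fixed capacity O_j
    L: int      # current utilization L_j
    @property
    def F(self):            # remaining resources F_j = O_j - L_j
        return self.O - self.L

# Three types with r_1 > r_2 > r_3 and e_1 > e_2 > e_3, as assumed in Sec. 4.1
types = [VMType(1, 16, 16), VMType(2, 8, 8), VMType(3, 2, 2)]
servers = [ServerState(O=40, L=28), ServerState(O=40, L=36)]
print([s.F for s in servers])   # residual fragments per server
```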
4.3 The Problem Formulation

With the system model and assumptions above, we can now investigate the optimal defragmentation problem at a particular time t. We first consider a basic VM placement problem with migration. We then discuss how to formulate the defragmentation issue.

4.3.1 The Optimal VM Placement with Migration Problem

At time t, we let an m_c × S matrix A^{t−}(c) record the assignment of all existing type-c VMs. Specifically, the element of this matrix, denoted as a(c)_{ij}, is 1 if the i-th type-c VM is placed in the j-th server; otherwise a(c)_{ij} = 0. We thus have

\sum_{j=1}^{S} a(c)_{ij} = 1, \quad \forall i \in [1, m_c],\ \forall c \in [1, C],   (1)

and

\sum_{c=1}^{C} r_c \left[ \sum_{i=1}^{m_c} a(c)_{ij} \right] \le O_j, \quad \forall j \in [1, S].   (2)

To accommodate newly arrived VMs, some existing VMs may be migrated. To formulate this case, we let an m_c × S matrix A^{t,\alpha}(c), with elements α(c)_{ij}, record the assignment of all existing type-c VMs after VM migrations, where the definition of α(c)_{ij} is similar to that of a(c)_{ij}. With such a definition, we have

a(c)_{ij} \cdot [1 - \alpha(c)_{ij}] = 1   (3)

if and only if the i-th type-c VM has been migrated away from server j.

We further consider that, at time t, there are n_c type-c VM arrivals. We let an n_c × S matrix A^{t,\beta}(c), with elements β(c)_{ij}, record the assignment of the newly arrived type-c VMs, where the definition of β(c)_{ij} is similar to that of a(c)_{ij}. Consequently, we can first calculate the total revenue by

\pi = \sum_{c=1}^{C} e_c \sum_{i=1}^{n_c} \sum_{j=1}^{S} \beta(c)_{ij}.   (4)

Next, the total migration and server costs can be represented by

\kappa = \sum_{c=1}^{C} \sum_{i=1}^{m_c} \sum_{j=1}^{S} \ell_r \cdot r_c \cdot a(c)_{ij} \cdot [1 - \alpha(c)_{ij}] + \ell_u \sum_{j=1}^{S} \omega_j,   (5)

where, for all j ∈ [1, S],

\omega_j = \begin{cases} 0, & \text{if } \sum_{c=1}^{C} \left( \sum_{i=1}^{m_c} \alpha(c)_{ij} + \sum_{i=1}^{n_c} \beta(c)_{ij} \right) = 0; \\ 1, & \text{otherwise.} \end{cases}   (6)

To formulate the problem, the above matrices shall satisfy a number of constraints:

\sum_{j=1}^{S} \alpha(c)_{ij} = 1, \quad \forall i \in [1, m_c],\ \forall c \in [1, C],   (7)

\sum_{j=1}^{S} \beta(c)_{ij} \le 1, \quad \forall i \in [1, n_c],\ \forall c \in [1, C],   (8)

and

\sum_{c=1}^{C} r_c \left[ \sum_{i=1}^{m_c} \alpha(c)_{ij} + \sum_{i=1}^{n_c} \beta(c)_{ij} \right] \le O_j, \quad \forall j \in [1, S].   (9)

Here, Eq. (7) ensures that no VM can be divided. Eq. (8) indicates that some VM requests may be blocked even with VM migrations. Eq. (9) indicates that the number of accommodated VMs is constrained by the server capacity.

With the definitions and constraints above, we can formulate the problem of VM placement and migration with the following objective function:

\text{Maximize } (\pi - \kappa).   (10)

Here, we note that the migration cost can be dropped if VM migrations are not allowed, in which case the problem degenerates to one similar to the classic bin-packing problem. A small sketch that evaluates this objective for a candidate assignment is given below.
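For a small instance, the profit in Eq. (10) can be evaluated directly from the assignment matrices. The following sketch is illustrative: the nested-dict layout is ours, and the server cost ℓ_u is omitted (as it also is in the algorithm of Section 5).

```python
# Illustrative evaluation of profit (10) = revenue (4) - cost (5) for given
# assignment matrices; the nested-dict layout is ours, not the paper's.
def profit(a, alpha, beta, r, e, l_r, O):
    """a[c][i][j], alpha[c][i][j], beta[c][i][j] are 0/1 entries as in
    Eqs. (1)-(9); r[c], e[c] per type; O[j] per server; l_u omitted."""
    S = len(O)
    pi = sum(e[c] * beta[c][i][j]                               # Eq. (4)
             for c in beta for i in range(len(beta[c])) for j in range(S))
    kappa = sum(l_r * r[c] * a[c][i][j] * (1 - alpha[c][i][j])  # Eq. (5)
                for c in a for i in range(len(a[c])) for j in range(S))
    for j in range(S):                                          # Eq. (9)
        used = sum(r[c] * alpha[c][i][j] for c in alpha
                   for i in range(len(alpha[c]))) + \
               sum(r[c] * beta[c][i][j] for c in beta
                   for i in range(len(beta[c])))
        assert used <= O[j], f"capacity violated on server {j}"
    return pi - kappa

# One existing type-2 VM migrated from server 0 to 1; one arrival placed on 0.
a     = {2: [[1, 0]]}
alpha = {2: [[0, 1]]}
beta  = {1: [[1, 0]]}
print(profit(a, alpha, beta, r={1: 16, 2: 8}, e={1: 16, 2: 8},
             l_r=0.01, O=[16, 8]))   # 16 revenue - 0.08 migration cost
```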
4.3.2 The Optimal Provident Resource Defragmentation Problem

In our framework, we predict the demands that will arrive at a future time epoch t′ under a much weaker assumption: we merely need to estimate the total number of VM arrivals and the proportion of each type of VM instance. Based on this information, provident resource defragmentation can be executed so as to improve the utilization and to reduce the waiting time before assigning the expected VMs at t′. Nevertheless, with provident resource defragmentation, some existing VMs may be migrated, which leads to a higher migration cost that shall be taken into account.

To formulate the optimal provident resource defragmentation problem, we first assume that, by virtue of some simple predictions, we can obtain p_c(n′_c), the probability that n′_c expected type-c VMs arrive at t′. We take this probability to reflect the prediction accuracy; for example, p_c(n′_c) = 1 means that the prediction accuracy is 100%. We then define an n′_c × S matrix A^{t′,β′}(c) that records the assignment of expected type-c VMs. Similar to previous definitions, we let β′(c)_{ij} be the element of A^{t′,β′}(c), with β′(c)_{ij} = 1 if the i-th expected type-c VM is placed in the j-th server, and β′(c)_{ij} = 0 otherwise. With this definition, we can derive the estimated revenue as

\pi' = \sum_{c=1}^{C} e_c \sum_{n'_c=0}^{\infty} \left\{ p_c(n'_c) \sum_{i=1}^{n'_c} \sum_{j=1}^{S} \beta'(c)_{ij} \right\}.   (11)

Since the cost is associated only with existing VMs, it can still be represented by Eq. (5). Similar to the previous problem, the placement of VMs shall satisfy Eq. (7) and the following constraints:

\sum_{j=1}^{S} \beta'(c)_{ij} \le 1, \quad \forall i \in [1, n'_c],\ \forall c \in [1, C],   (12)

and

\sum_{c=1}^{C} r_c \left[ \sum_{i=1}^{m_c} \alpha(c)_{ij} + \sum_{i=1}^{n'_c} \beta'(c)_{ij} \right] \le O_j, \quad \forall j \in [1, S].   (13)

Clearly, an optimization problem can be formulated with the objective

\text{Maximize } (\pi' - \kappa).   (14)

Note that Eq. (11), as a generic formulation, includes an infinite number of states. In practice, the number of VM requests at any time is bounded. Thus, to study a practical version of the problem, with probability \prod_{c=1}^{C} p_c(n'_c) we can obtain an estimated total number of VMs to arrive at t′, denoted as n′. Similarly, when \prod_{c=1}^{C} p_c(n'_c) = 1, the prediction accuracy is 100%. We then obtain a vector Φ = {φ_c}_{c∈[1,C]}, where φ_c is the proportion of type-c VM requests over all incoming VM requests at t′ and \sum_{c=1}^{C} \varphi_c = 1. In this manner, n′_c can be simply estimated by

n'_c = n' \times \varphi_c.   (15)

Consequently, we can keep all the constraints in Eq. (7), Eq. (12), and Eq. (13), and modify the revenue to

\tilde{\pi}' = \sum_{c=1}^{C} e_c \sum_{i=1}^{n'_c} \sum_{j=1}^{S} \beta'(c)_{ij}.   (16)

Finally, based on the estimates of the total number of VM arrivals and the per-type proportions, the optimal provident resource defragmentation problem can be formulated with the following objective function:

\text{Maximize } (\tilde{\pi}' - \kappa).   (17)

We analyze the complexity of this problem in the theorem below; first, a small solver-based sketch of formulation (17) is given.
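For small instances, formulation (17) can be solved exactly with an off-the-shelf integer-programming solver. The following is a minimal sketch assuming the open-source PuLP package; the instance data, variable layout, and helper names are illustrative, not from the paper.

```python
# A minimal ILP sketch of objective (17) with constraints (7), (12), (13),
# using PuLP; instance sizes and all helper names are illustrative.
import pulp

S = 3                      # servers
O = [8, 8, 8]              # capacity O_j of each server
r = {1: 3, 2: 2}           # resource requirement r_c per VM type
e = {1: 3, 2: 2}           # revenue e_c per VM type (e_c = r_c here)
l_r = 0.01                 # migration cost coefficient
existing = {1: [0], 2: [1, 1]}   # current server of each existing VM
n_exp = {1: 1, 2: 2}       # expected arrivals n'_c = n' * phi_c (Eq. 15)

prob = pulp.LpProblem("provident_defrag", pulp.LpMaximize)
# alpha[c][i][j]: existing VM i of type c hosted on server j after migration
alpha = {c: [[pulp.LpVariable(f"a_{c}_{i}_{j}", cat="Binary") for j in range(S)]
             for i in range(len(existing[c]))] for c in existing}
# beta[c][i][j]: expected VM i of type c placed on server j
beta = {c: [[pulp.LpVariable(f"b_{c}_{i}_{j}", cat="Binary") for j in range(S)]
            for i in range(n_exp[c])] for c in n_exp}

# Objective (17): expected revenue (16) minus migration cost (5), l_u = 0.
revenue = pulp.lpSum(e[c] * beta[c][i][j] for c in beta
                     for i in range(n_exp[c]) for j in range(S))
migration = pulp.lpSum(l_r * r[c] * (1 - alpha[c][i][existing[c][i]])
                       for c in alpha for i in range(len(existing[c])))
prob += revenue - migration

for c in alpha:                                 # Eq. (7): VMs are not divided
    for i in range(len(existing[c])):
        prob += pulp.lpSum(alpha[c][i]) == 1
for c in beta:                                  # Eq. (12): arrivals may be blocked
    for i in range(n_exp[c]):
        prob += pulp.lpSum(beta[c][i]) <= 1
for j in range(S):                              # Eq. (13): server capacity
    prob += (pulp.lpSum(r[c] * alpha[c][i][j] for c in alpha
                        for i in range(len(existing[c]))) +
             pulp.lpSum(r[c] * beta[c][i][j] for c in beta
                        for i in range(n_exp[c]))) <= O[j]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))
```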
Theorem 1. The optimal provident resource defragmentation problem is NP-hard.

Proof. Consider the VM placement and migration problem formulated in Eq. (10). If we drop the migration cost, this problem degenerates into the classic NP-hard bin-packing problem, whose objective function has C · n_c · S variables (see Eq. (4)) and whose constraints (see Eq. (1) and Eq. (2)) involve (S + C · m_c) terms; the storage complexity of this classic NP-hard problem is therefore approximately (C · n_c · S) + (S + C · m_c). For example, if C = 10, n_c = m_c = 10^3, and S = 10^2, this problem has 1,010,100 variables. Our optimal provident resource defragmentation problem, whose objective function (see Eq. (17)) has C · (n′_c + m_c) · S variables and whose constraints (Eq. (7), Eq. (12), and Eq. (13)) involve [S + S + C · (m_c + n′_c)] terms, would have 2,020,200 variables when n′_c = 10^3. Since our problem thus contains the NP-hard bin-packing problem as a special case, while involving even more variables, it is also NP-hard.

5 AN EFFICIENT ALGORITHM FOR PROVIDENT RESOURCE DEFRAGMENTATION

In the previous section, we formulated the optimal provident resource defragmentation problem. Since the problem is NP-hard, in this section we develop an efficient algorithm to solve it. Because we focus on the migration cost, we ignore the server cost in the algorithm. In the rest of this section, we first present the main procedure of the algorithm and then elaborate on its details.

5.1 The Provident Resource Defragmentation (PRD) Algorithm

In our PRD algorithm, which is illustrated in Algorithm 1, we consider that the following parameters are given:

• A^{t−}: the allocation matrix for the current status of the cloud data center. Specifically, A^{t−} consists of all A^{t−}(c) defined in the last section.
• Φ: the vector that consists of the proportion of each type of VM.
• n′: the expected number of VMs to arrive at t′.

Algorithm 1: The PRD Algorithm.
  Input: (A^{t−}, Φ, n′)
  Output: (A^{t,α}, A^{t′,β′})
  A^{t,α}(0, 0) = A^{t−};
  (A^{t′,β′}(0, 0), n^t) ← BVF(A^{t−}, Φ, n′);
  if all n′ VMs have been allocated then
      return (A^{t,α}(0, 0), A^{t′,β′}(0, 0));
  end
  for c = 1, 2, . . . , C do
      if n^t_c < n′ · φ_c then
          Execute the maximum profit defragmentation algorithm (MaxPD) for expected type-c VMs:
          (A^{t,α}(c, k*_c), A^{t′,β′}(c, k*_c)) ← MaxPD(A^{t,α}(c−1, k*_{c−1}), A^{t′,β′}(c−1, k*_{c−1}), n′_c, n^t_c);
      else
          k*_c = n^t_c;
          A^{t,α}(c, k*_c) = A^{t,α}(c − 1, k*_{c−1});
          A^{t′,β′}(c, k*_c) = A^{t′,β′}(c − 1, k*_{c−1});
      end
  end
  return ({A^{t,α}(c, k*_c)}_{c∈[1,C]}, {A^{t′,β′}(c, k*_c)}_{c∈[1,C]});

[Fig. 3. An illustration of the PRD algorithm: for each VM type, the revenue grows linearly in k_c, while, owing to the migration cost, the profit peaks at k*_c.]

With the above inputs, we first execute a biggest volume first (BVF) algorithm, in which we place the expected VMs into servers one by one, starting from the VM with the largest volume. The output of BVF is a tentative allocation matrix for the expected VMs, denoted as A^{t′,β′}(0, 0). We also obtain a vector n^t = {n^t_c}_{c∈[1,C]}, where element n^t_c represents the number of expected type-c VMs that can be assigned. The revenue and cost after this step are

\tilde{\pi}_{0,0} = \sum_{c=1}^{C} e_c \cdot n_c^t,   (18)

and

\kappa_{0,0} = 0.   (19)
If all expected VMs can be allocated, we stop the algorithm. Otherwise, starting from the smallest index c where n^t_c < n′_c = φ_c × n′, we execute an iterative algorithm, namely, maximum profit defragmentation (MaxPD). Each time we execute the MaxPD algorithm, we obtain the maximum profit by migrating existing VMs and allocating expected VMs, based on the best allocation matrices obtained in the previous step. In particular, we let A^{t,α}(c, k_c) and A^{t′,β′}(c, k_c) be the VM assignments for existing and expected VMs if k_c (0 ≤ k_c ≤ n′_c − n^t_c) additional expected type-c VMs can be accommodated after the BVF assignment. For each pair of A^{t,α}(c, k_c) and A^{t′,β′}(c, k_c), we can determine the revenue and cost, denoted as π̃_{c,k_c} and κ_{c,k_c}, respectively. We further define k*_c as the k_c that leads to the maximum profit v_{c,k_c} = π̃_{c,k_c} − κ_{c,k_c}. Note that we can calculate π̃_{c,k_c} recursively as

\tilde{\pi}_{c,k_c} = \tilde{\pi}_{c-1,k^*_{c-1}} + k_c \times e_c,   (20)

where k*_0 = 0. On the other hand, κ_{c,k_c} can be obtained by comparing A^{t,α}(c, k_c) with A^{t−}(c):

\kappa_{c,k_c} = \sum_{\forall i} \sum_{j=1}^{S} \ell_r \cdot r_c \cdot a(c)_{ij} \, [1 - \alpha(c, k_c)_{ij}],   (21)

where α(c, k_c)_{ij} is the element of A^{t,α}(c, k_c). Finally, after executing the PRD algorithm, we can determine the best allocation matrices A^{t,α} and A^{t′,β′}. Here, A^{t,α} consists of all A^{t,α}(c, k*_c) defined above, and A^{t′,β′} is defined similarly.

Fig. 3 illustrates the main idea of the PRD algorithm. We can observe that, for each type of VM, the revenue increases linearly with k_c. However, due to the migration cost, the overall profit may not increase for some k_c. Therefore, we design the MaxPD algorithm to find the best solution k*_c that leads to the maximal profit. In each iteration of MaxPD, we try to accommodate one expected type-c VM by using a minimum cost migration (MinCM) algorithm, until all n′_c − n^t_c expected VMs are placed or the residual resources are insufficient to support more type-c VMs.

5.2 The Biggest Volume First (BVF) Algorithm

The BVF algorithm is shown in Algorithm 2. In this algorithm, starting from the VM type with the largest volume, we go through all types of expected VMs in increasing order of the type index (i.e., decreasing order of volume). Specifically, for a particular type c, we traverse all possible servers and determine whether we can accommodate some expected type-c VMs in each server, until all servers have been evaluated or all VMs have been assigned.

Algorithm 2: The BVF algorithm.
  Input: (A^{t−}, Φ, n′)
  Output: (A^{t′,β′}(0, 0), n^t)
  Calculate all F_j based on A^{t−} according to the definition;
  Initialize an n′ × S empty matrix A^{t′,β′}(0, 0);
  for c = 1, 2, . . . , C do
      n_0 = n′ · φ_c; n^t_c = 0;
      for j = 1, 2, . . . , S do
          if F_j ≥ r_c and n^t_c < n′ · φ_c then
              n_1 = min(⌊F_j / r_c⌋, n_0);
              n^t_c ← n^t_c + n_1; F_j ← F_j − n_1 · r_c; n_0 ← n_0 − n_1;
              update A^{t′,β′}(0, 0) by placing n_1 expected type-c VMs into server j;
              if n^t_c = n′ · φ_c then break;
          end
      end
  end
  Return (A^{t′,β′}(0, 0), n^t);

To illustrate the operation of the BVF algorithm, we show a simple example in Fig. 4(a) and (b), where C = 3, r_1 = 16, φ_1 = 0.2, r_2 = 8, φ_2 = 0.5, r_3 = 2, φ_3 = 0.3, and O_j = 40 for all j ∈ [1, 8]. Assume that the status A^{t−} of the cloud is as shown in Fig. 4(a) and that n′ = 10. With the BVF algorithm, we determine n^t_1, n^t_2, and n^t_3 in order. Fig. 4(b) shows that n^t_1 = 0 due to fragmented resources. Next, we try to place the expected type-2 VMs and notice that all of them can be accommodated, i.e., n^t_2 = n′ · φ_2 = 5. Finally, we try to assign the expected type-3 VMs, and all of them can be assigned as well, i.e., n^t_3 = n′ · φ_3 = 3. Therefore, we have n^t_1 = 0 < n′ · φ_1, n^t_2 = n′ · φ_2, and n^t_3 = n′ · φ_3. A compact sketch of the BVF pass is given below.
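The following is a minimal Python rendering of the BVF pass, assuming scalar resources. The function returns how many expected VMs of each type fit into the current fragments; the list-based state and the concrete residuals in the example are illustrative, not taken from Fig. 4.

```python
# A minimal sketch of the BVF pass (Algorithm 2), assuming scalar resources;
# names and the simple list-based state are illustrative, not the paper's.
def bvf(residual, r, quota):
    """residual: list of free units F_j per server;
    r: dict type -> requirement r_c (types indexed 1..C, r_1 largest);
    quota: dict type -> expected arrivals n'_c = n' * phi_c."""
    placed = {c: 0 for c in r}                 # n^t_c
    assignment = []                            # (type, server) pairs
    for c in sorted(r):                        # increasing index = decreasing volume
        remaining = quota[c]
        for j in range(len(residual)):
            if remaining == 0:
                break
            k = min(residual[j] // r[c], remaining)   # how many fit in server j
            residual[j] -= k * r[c]
            placed[c] += k
            remaining -= k
            assignment.extend((c, j) for _ in range(k))
    return placed, assignment

# Eight servers with O_j = 40; the residuals below are illustrative values
# chosen so that, as in the example, type 1 fits nowhere while types 2 and 3
# are fully placed.
residual = [12, 14, 6, 6, 6, 8, 8, 8]
placed, _ = bvf(residual, {1: 16, 2: 8, 3: 2}, {1: 2, 2: 5, 3: 3})
print(placed)   # {1: 0, 2: 5, 3: 3}
```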
5.3 The Maximum Profit Defragmentation (MaxPD) Algorithm

As explained previously, in this algorithm we try to find the maximum profit when assigning expected type-c VMs. The details can be found in Algorithm 3. It shall be noted that, during the VM assignment process, some existing VMs may be migrated. Therefore, it is important to avoid the ping-pong effect, in which one VM is migrated back and forth. In our design, we avoid the ping-pong effect by enforcing a one-way migration policy that prevents any VM from being migrated back to its original server.

[Fig. 4. An example of VM placement and migration over eight servers of capacity 40 hosting VMs of sizes 2, 8, and 16: (a) existing VMs; (b) after BVF (n^t_1 = 0, n^t_2 = 5, n^t_3 = 3); (c) after the first round of MinCM.]

Algorithm 3: The MaxPD Algorithm.
  Input: (A^{t,α}(c − 1, k*_{c−1}), A^{t′,β′}(c − 1, k*_{c−1}), n′_c, n^t_c)
  Output: (A^{t,α}(c, k*_c), A^{t′,β′}(c, k*_c))
  k*_c = 0;
  v_{c,k*_c} = v_{c−1,k*_{c−1}};
  A^{t,α}(c, k*_c) = A^{t,α}(c − 1, k*_{c−1});
  A^{t′,β′}(c, k*_c) = A^{t′,β′}(c − 1, k*_{c−1});
  Initialize G_s and G_t;
  for k_c = 1, 2, . . . , (n′_c − n^t_c) do
      Execute the MinCM algorithm to assign the k_c-th expected type-c VM:
      (A^{t,α}(c, k_c), A^{t′,β′}(c, k_c), G_s, G_t) ← MinCM(A^{t,α}(c, k_c − 1), A^{t′,β′}(c, k_c − 1), G_s, G_t);
      if we can accommodate the k_c-th expected type-c VM then
          Calculate π̃_{c,k_c} and κ_{c,k_c}; v_{c,k_c} = π̃_{c,k_c} − κ_{c,k_c};
          if v_{c,k_c} > v_{c,k*_c} then k*_c = k_c;
      else
          break;
      end
  end
  Return (A^{t,α}(c, k*_c), A^{t′,β′}(c, k*_c));

To understand this algorithm, we first prove the following lemma.

Lemma 1. After executing the BVF algorithm, if there are expected VMs that cannot be assigned, there exists c* such that one or more expected type-c* VMs cannot be assigned and, for all j (1 ≤ j ≤ S), F_j < r_{c*}.

Proof. This lemma can be proved by contradiction. If there exists any server j with F_j ≥ r_{c*}, then BVF can assign an expected type-c* VM to this server. Continuing this process, eventually all expected type-c* VMs can be assigned, which contradicts the definition of c*.

With the introduction of c* above, we can now initialize two sets: (1) G_s, which consists of all servers that contain VMs to be migrated; and (2) G_t, which consists of all servers that have available resources. Note that, based on Lemma 1, only type-c′ VMs (c* < c′ ≤ C) can be migrated in our algorithm. These two sets are initialized as

G_t = \{ j \mid F_j \in [r_C, r_{c^*}) \};   (22)

G_s = \{ j \mid \exists \text{ type-}c' \text{ VM(s) in server } j,\ c^* < c' \le C \}.   (23)

In the example of Fig. 4(b), we can observe that c* = 1, and the remaining resources of every server are less than r_{c*} = 16, so only type-2 and type-3 VMs can be migrated. In addition, we have G_s = {1, 2, 3, 4, 5, 6} and G_t = {2, 3, 6, 7, 8}. A small sketch of this initialization follows.
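The set initialization of Eqs. (22)–(23) is a direct filter over the server state. The sketch below is illustrative, assuming scalar resources; the `hosts` mapping and all names are ours.

```python
# A minimal sketch of Eqs. (22)-(23), assuming scalar resources. `hosts` maps
# each server to the VM types it currently hosts; all names are illustrative.
def init_sets(residual, hosts, r, c_star):
    """residual[j] = F_j; r[c] = r_c with r_1 > ... > r_C; c_star per Lemma 1."""
    C = max(r)
    # Eq. (22): target servers whose free space can hold the smallest VM type
    # but not a type-c* VM.
    G_t = {j for j, f in residual.items() if r[C] <= f < r[c_star]}
    # Eq. (23): source servers hosting at least one migratable VM
    # (only types c' with c* < c' <= C may move, by Lemma 1).
    G_s = {j for j, types in hosts.items() if any(c > c_star for c in types)}
    return G_s, G_t

# Toy usage: server 6 has 8 free units and hosts type-2 VMs, server 7 is empty
# space of 16 units; with c* = 1, server 6 is both a source and a target.
G_s, G_t = init_sets({6: 8, 7: 16}, {6: {2}, 7: {1}}, {1: 16, 2: 8, 3: 2}, 1)
print(G_s, G_t)   # {6} {6}
```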
5.4 The Minimum Cost Migration (MinCM) Algorithm

The purpose of the MinCM algorithm is to assign one expected type-c VM with the smallest migration cost. The main idea is to identify a server in G_s such that we can make space for this expected type-c VM by migrating VMs from this server to servers in G_t. The details can be found in Algorithm 4.

Algorithm 4: The MinCM Algorithm.
  Input: (A^{t,α}(c, k_c − 1), A^{t′,β′}(c, k_c − 1), G_s, G_t)
  Output: (A^{t,α}(c, k_c), A^{t′,β′}(c, k_c), G_s, G_t)
  while |G_s| ≠ 0 do
      j* = arg max_{j∈G_s} F_j;
      if j* ∈ G_t then G′_t ← G_t − {j*} else G′_t ← G_t;
      Determine f_{c′} for all c′, c* < c′ ≤ C;
      if we can further accommodate one expected type-c VM then
          Execute the IFF algorithm to accommodate the VM:
          (A^{t,α}(c, k_c), A^{t′,β′}(c, k_c)) ← IFF(A^{t,α}(c, k_c − 1), A^{t′,β′}(c, k_c − 1), j*, G′_t, f_{c′});
          G_t = G′_t;
      else
          G_s ← G_s − {j*};
      end
      if server j* no longer qualifies as a target server (see Eq. (22) and the condition in Eq. (25)) then G_t ← G_t − {j*};
  end
  Return (A^{t,α}(c, k_c), A^{t′,β′}(c, k_c), G_s, G_t);

As shown in this algorithm, we evaluate the servers in G_s one by one, in decreasing order of their remaining resources. In particular, in each round we find the server j* in G_s that has the largest vacant space. Next, if j* is in G_t, we define G′_t = G_t − {j*}. To determine whether we can accommodate the expected type-c VM in server j*, we first count the number of existing VMs that can be migrated, i.e., the number of existing type-c′ VMs in server j*, denoted as f_{c′}(j*), c* < c′ ≤ C.

Next, we consider whether there are enough remaining resources in G′_t. Supposing we had a sufficiently large number of type-c′ VMs (c* < c′ ≤ C), we could use the BVF algorithm to determine the maximum number of type-c′ VMs that can be assigned to G′_t, denoted as f_{c′}(G′_t), for all c* < c′ ≤ C. With f_{c′}(G′_t) and f_{c′}(j*), we can determine the maximum number of type-c′ VMs that can be moved out of server j*, denoted as f_{c′} = min[f_{c′}(j*), f_{c′}(G′_t)]. We can then judge whether we can accommodate the expected type-c VM by checking

\left[ \sum_{c'=c^*+1}^{C} f_{c'} \cdot r_{c'} \right] + F_{j^*} \ge r_c.   (24)

If the inequality above holds, we execute an increasing first fit (IFF) algorithm, shown in Algorithm 5, to migrate existing VMs from server j* to servers in G′_t. Otherwise, we let G_s = G_s − {j*} and continue the while loop until G_s is empty.

As mentioned above, server j* migrates out some type-c′ VMs (c* < c′ ≤ C) so that it has enough space to carry a type-c VM. If server j* could afterwards become the target server receiving the same group of type-c′ VMs it previously migrated out, the ping-pong effect would occur in the PRD algorithm.
Therefore, to avoid the ping-pong effect, we add a condition such that server j* no longer functions as a target server in the subsequent process, i.e., the residual resources of server j* must become smaller than r_C after it receives the type-c VM(s). (Recall that the qualification for being a target server is given in Eq. (22).) Formally, the condition for avoiding the ping-pong effect is

\left( \left[ \sum_{c'=c^*+1}^{C} f_{c'} \cdot r_{c'} \right] + F_{j^*} \right) \bmod r_c < r_C.   (25)

We now use the example in Fig. 4(b) and (c) to illustrate the operation of the MinCM algorithm. In Fig. 4(b), we can identify j* = 6. According to our definition, f_2(6) = 2 and f_3(6) = 0. On the other hand, since server 6 is in G_t, we temporarily remove it, so that G′_t = {2, 3, 7, 8}. Consequently, we can calculate f_2(G′_t) = 2 and f_3(G′_t) = 4; clearly, f_2 = 2 and f_3 = 0. Since F_6 + f_2 · r_2 = 8 + 2 × 8 = 24 > r_1 = 16, an expected type-1 VM can be accommodated. Using the IFF algorithm, we migrate one type-2 VM from server 6 to server 7 and then place an expected type-1 VM into server 6, as shown in Fig. 4(b). In a similar way, we can accommodate another expected type-1 VM by identifying j* = 2, migrating a type-2 VM from server 2 to server 8, and migrating a type-3 VM from server 2 to server 3, as illustrated in Fig. 4(c).

Algorithm 5: The IFF algorithm.
  Input: (A^{t,α}(c, k_c − 1), A^{t′,β′}(c, k_c − 1), j*, G_t, f_{c′})
  Output: (A^{t,α}(c, k_c), A^{t′,β′}(c, k_c))
  for c′ = c* + 1, c* + 2, . . . , C do
      Initialize the set VR*: for all i in j*, VR* ← {i | r_i = r_{c′}};
      while |VR*| ≠ 0 do
          i = VR*.top;
          while |G_t| ≠ 0 do
              j0 ← G_t.top;
              if F_{j0} < r_{c′} then G_t.pop;
              else F_{j0} ← F_{j0} − r_{c′}; f_{c′} ← f_{c′} − 1; break;
          end
          F_{j*} ← F_{j*} + r_{c′}; f_{c′}(j*) ← f_{c′}(j*) − 1; update A^{t,α}(c, k_c);
          if f_{c′} = 0 or F_{j*} ≥ r_c then break;
          else VR*.pop;
      end
      if F_{j*} ≥ r_c then
          update A^{t′,β′}(c, k_c); break;
      end
  end
  Return (A^{t,α}(c, k_c), A^{t′,β′}(c, k_c));

With the algorithms above, we can prove the following theorem.

Theorem 2. The ping-pong effect does not exist in the PRD algorithm.

Proof. As shown in Eq. (25), we have added the condition for avoiding the ping-pong effect to MinCM (Algorithm 4). This condition is not very restrictive because: (1) for every j*, F_{j*} does not decrease after executing the MinCM algorithm, and (2) for every j ∈ G′_t, F_j does not increase after executing the MinCM algorithm.

5.5 Algorithm Complexity

The complexity of the PRD algorithm mainly depends on the number of times the sub-algorithm IFF is executed, whose complexity is about O(C · |G_t|) ≈ O(C · S), as shown in Algorithm 5. The MinCM algorithm (Algorithm 4) executes IFF at most |G_s| ≈ S times, while the MaxPD algorithm (Algorithm 3) executes MinCM at most k_c times. Since the PRD algorithm executes MaxPD at most C times in the main procedure (Algorithm 1), the overall complexity is about O(k_c · C² · S²). A sketch of the feasibility test used inside MinCM follows.
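The two scalar tests at the heart of MinCM, Eq. (24) and the ping-pong condition Eq. (25), can be expressed in a few lines. The sketch below is illustrative, assuming scalar resources; all names are ours.

```python
# A minimal sketch of the MinCM feasibility and ping-pong tests in
# Eqs. (24)-(25), assuming scalar resources; all names are illustrative.
def can_host_after_eviction(F_j, f, r, c, C):
    """F_j: free space on candidate server j*;
    f: dict c' -> f_{c'} = min(f_{c'}(j*), f_{c'}(G'_t));
    r: dict type -> r_c; c: type of the expected VM; C: smallest VM type."""
    freeable = sum(f[cp] * r[cp] for cp in f)          # space gained by evictions
    feasible = freeable + F_j >= r[c]                  # Eq. (24)
    # Eq. (25): after packing type-c VMs, the leftover must be smaller than
    # r_C, so j* can never re-qualify as a target server (no ping-pong).
    no_ping_pong = (freeable + F_j) % r[c] < r[C]
    return feasible, no_ping_pong

# The example of Fig. 4(b): j* = 6 with F_6 = 8, f_2 = 2, f_3 = 0:
print(can_host_after_eviction(8, {2: 2, 3: 0}, {1: 16, 2: 8, 3: 2}, 1, 3))
# (True, False): 24 >= 16 holds, while 24 % 16 = 8 is not below r_3 = 2,
# so server 6 keeps enough residual space to remain a potential target.
```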
6 NUMERICAL RESULTS

To evaluate the proposed provident resource defragmentation framework, we have conducted extensive simulation experiments. In this section, we summarize the important results that bring insights to the design. First, we introduce the simulation settings. We then compare the performance of our design with existing approaches under various conditions.

6.1 Simulation Settings

In our experiments, we consider a homogeneous cloud data center, where the capacity of a server can be selected from {40, 60, 80}. To simulate dynamic demands, we consider a certain duration that has been partitioned into equal-sized time slots [14]. We assume that VMs can arrive only at the beginning of a time slot. We also assume that each VM lasts for a number of slots chosen uniformly at random in [1, 4]. Moreover, we consider C = 3 types of VMs, with r_1 = 16, r_2 = 8, and r_3 = 2, as illustrated in Fig. 4. We also let e_c = r_c. As to the proportion of VMs, we assume that Φ = (0.2, 0.5, 0.3).

In our experiments, we evaluate four performance metrics: (1) the total profit in 20 slots, (2) the average migration cost per operation, (3) the average service delay per operation, and (4) the running time. We present results for two scenarios:

• Scenario 1: S = 35, and the average number of arrived VM requests n is in {150, 200, 250, 300, 350}.
• Scenario 2: S ∈ {35, 40, 45, 50, 55}, and n increases from 150 to 300; in particular, n = 150 in the first five slots, then n = 200 in the next five slots, and so on.

To evaluate the proposed algorithm, we assume that each VM can only be terminated in the middle of a time slot, after which we can perform the PRD algorithm. In our experiments, we compare our algorithm with (1) pure VM placement without defragmentation or consolidation, and (2) periodic consolidation with a fixed period T ∈ {1, 2, 3, 4, 5, 6, 10, 20} slots. In the rest of this section, we first compare the total profit, average migration cost, and service delay of the different schemes in the two scenarios. We then measure the running time of our algorithm to demonstrate that the operation duration is acceptable. We finally discuss impact factors such as n′ and ℓ_r.
6.2 The Total Profit

In Table 1, we compare the total profit of the different schemes. In this experiment, we choose ℓ_r = 0.01 and let n′ = n. We first observe that, for each approach, the total profit increases with n. Moreover, T can be an important parameter for periodic consolidation, because there exists an optimal T that leads to the best profit. Nevertheless, we can observe from the table that the proposed PRD algorithm outperforms all existing approaches under the different n settings. To demonstrate its optimality, we also present an upper bound for each case in Table 1, obtained by assuming that all servers are replaced by one single server with the aggregated resources, so that there is no fragmentation. As the comparison shows, our PRD algorithm achieves near-optimal results, on average within about 6% of the upper bound, across the different n cases.

To show the robustness of our method, in Table 2 we compare the different approaches in Scenario 2, where we choose ℓ_r = 0.01 but let n′ = 230, which means that our estimation of n′ is not fully accurate. We can observe from the results that the total profit increases with S, which is reasonable because a larger S means more resources to accommodate VM requests. The numerical results also show that, even with an inaccurate estimation of n′, our algorithm still yields the best profit performance, close to the upper bound.

TABLE 1
The total profit in Scenario 1.

VMs              150        200        250        300        350
Pure placement   16434      16816      16806      17036      16858
T = 1            16473.26   16818.5    16797.88   17042.86   16870.78
T = 2            16495.44   16821.16   16761.12   17033.34   16865.18
T = 3            16410.06   16759.5    16782.82   16975.12   16796.08
T = 4            16504.72   16822.58   16763.56   17019.18   16822.92
T = 5            16358.52   16834.72   16823.96   17006.18   16796.24
T = 6            16442      16806.36   16788.26   17027.28   16831.82
T = 10           16472      16828.24   16812.18   17032.92   16847.4
T = 20           16445.22   16826.18   16816.26   17047.44   16867.06
Our method       16608.16   16989.6    17003.7    17241.74   17063.72
Upper bound      18064.8    17518.8    17277      17495.4    17300.4

* Bold-faced numbers are the maximum for periodic consolidation.

TABLE 2
The total profit in Scenario 2.

S                35         40         45         50         55
Pure placement   16592      18480      20312      22648      24536
T = 1            16657.06   18447.62   20371.24   22705.54   24685.56
T = 2            16635.16   18495.72   20373.46   22601.78   24622.6
T = 3            16638.62   18452.74   20297.6    22595.84   24517.48
T = 4            16636.7    18482.76   20400.44   22651.26   24624.98
T = 5            16586.32   18511.26   20342.14   22691.44   24645.44
T = 6            16598.16   18468.18   20300.54   22690.84   24584.68
T = 10           16629.2    18516.22   20329.46   22664.28   24575.88
T = 20           16603.42   18496.86   20329.18   22664.12   24552.22
Our method       16789.76   18677.68   20533.44   22821.92   24710.1
Upper bound      17846.4    20085      22815      26075.4    29117.4
[Fig. 5. The average migration cost in Scenario 1 (vs. the number of VMs) and Scenario 2 (vs. the number of servers), for our method and for consolidation with T = 1 and T = 5.]

[Fig. 6. The average service delay (in hours) in Scenario 1 and Scenario 2, for our method and for consolidation with T = 1 and T = 5.]

6.3 The Average Migration Cost and Service Delay

To understand the impact of migration overheads, we define the average migration cost as the total migration cost during the designated duration divided by the total number of times that the defragmentation (or consolidation) operation is performed. For example, our PRD algorithm is performed in each time slot, so its average migration cost is the total cost in 20 slots divided by 20. On the other hand, for periodic server consolidation with T = 5, the average migration cost is the total cost in 20 slots divided by 20/5 − 1 = 3. The average service delay is defined analogously; a small worked example of these per-operation averages is given at the end of this subsection.

Here we note that, in practice, the total migration delay depends on many factors, as shown in [32]. In this study, to simplify the discussion, we consider that the service delay is 40 seconds for the precopy-based migration of 512 MB of memory, an experimental result measured in a data center network with GB-level bandwidth. In our experiments, we assume that 512 MB of memory is one unit of resource, so the service delays for migrating the three types of VMs are 80, 320, and 1280 seconds, respectively. Moreover, we assume that VMs are migrated consecutively.

In Fig. 5(a) and (b), we compare the average migration cost of our scheme with that of periodic consolidation, for Scenarios 1 and 2, respectively. Since the average costs are similar for periodic consolidation with different T, we only show the results for T = 1 and T = 5. We can observe that the proposed algorithm significantly reduces the average migration cost (e.g., to less than 2% of that of periodic consolidation), which meets our design objective. These results suggest that, with the proposed PRD algorithm, the cloud data center can accommodate more VM requests while performing far fewer VM migrations.

Another interesting observation from Fig. 5 is that, in terms of the average migration cost, the performance of our algorithm is not sensitive to n and S. By comparison, the average cost of periodic consolidation increases with S. This result implies that the proposed PRD algorithm is more scalable for a large number of servers with various VM arrivals.

In Fig. 6(a) and (b), we compare the average service delay of our scheme with that of periodic consolidation, for Scenarios 1 and 2, respectively. We can see that periodic consolidation results in service delays at the hour level, while the proposed algorithm brings this delay down to the minute level. Therefore, with the proposed PRD algorithm, the cloud data center can also achieve its goal with lower service interruption. Moreover, the simulation results for the average service delay demonstrate the desirable scalability of the proposed scheme.
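The per-operation arithmetic above is simple enough to state in a few lines; the cost figures in this sketch are illustrative.

```python
# A tiny worked example of the per-operation averages defined in Sec. 6.3:
# over a horizon of 20 slots, PRD runs every slot, while consolidation with
# period T runs 20/T - 1 times. The total-cost values are illustrative.
def average_per_operation(total_cost, horizon, period=None):
    ops = horizon if period is None else horizon // period - 1  # PRD vs. periodic
    return total_cost / ops

print(average_per_operation(2.0, 20))             # PRD: 20 operations
print(average_per_operation(24.0, 20, period=5))  # consolidation: 3 operations
```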
In this experiment, we analyze the total profit, the average migration cost and the average service delay of our method with different n′ , n′ ∈ {50, 100, 150, 200, 250}, when n = 150. The results are shown in Table 3, where we can see that the total profit can be improved if the estimation is more accurate. In particular, higher profit can be achieved when n′ = 150 and n′ = 250. Nevertheless, by comparing Table 1 and Table 3, we can observe that our algorithm can still lead to a better profit than pure VM placement and periodic server consolidation (except for n′ = 50). On the This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TETC.2015.2477778, IEEE Transactions on Emerging Topics in Computing 11 TABLE 3 The impact of n′ on the total profit and cost in Scenario 1. n′ total profit average migration cost average service delay (minute) 60 [ Our method, n'=150 [ Consolidation, n'=150 [ Our method, n'=200 [ Consolidation, n'=200 40 Average migration cost Average migration cost 50 30 20 10 0 0.01 0.03 0.05 0.00 Cost coefficient (a) 0.0 50 40 50 16434 0 0 100 16536.38 0.081 5.4 [2] Our method S=353 Consolidation S=353 Our method S=403 Consolidation S=403 [3] 30 20 [4] 10 0 0 01 0 03 0 05 00 Cost coefficient (b) 0 0 [5] Fig. 8. The average migration cost vs. ℓr for Scenarios 1 and 2. [6] other hand, we observe that the average cost/service delay of our algorithm may increase with n′ in this experiment. 6.6 The Impact of ℓr [7] [8] Finally, since the service delay has the similar results to that of migration costs, we investigate the impact of ℓr on the average migration cost only. In Fig. 8, we illustrate the cost vs. ℓr for Scenarios 1 and 2. We can clearly observe that, with the increase of ℓr , the cost of both server consolidation and our scheme increase. However, the cost of our scheme increases much slower than that of the periodic server consolidation. [9] [10] [11] 7 C ONCLUSIONS AND F UTURE WORK In this paper, we have systematically investigated the resource fragmentation problem in cloud data centers that may limit the accommodation of dynamic VM demands in mobile cloud computing. In particular, we first identified the resource fragmentation problem due to fragmented resources in different servers. We then proposed a novel framework to minimize unnecessary VM migrations. Within the proposed framework, we formulated an optimization problem for resource defragmentation at a particular time epoch. We then developed an efficient heuristic algorithm that can obtain near-optimal results. To evaluate the proposed framework, we have also conducted extensive numerical results that demonstrate the advantages of our framework in practical scenarios. In this paper, we have only considered the defragmentation of a single type of resources. In our future study, we will investigate the fragmentation problem with multidimensional resources, such as computing, memory, disk I/O, and network I/O. In addition, we will also investigate the defragmentation issue when VM requests are correlated. R EFERENCES [1] Cisco, “Cisco visual networking index: Global mobile data traffic forecast update, 20142019,” Cisco, Tech. Rep., 2015. 
[12] [13] [14] [15] [16] [17] [18] [19] 150 16608.16 0.092 6.133 200 16608.1 0.095 6.333 250 16655.88 0.106 7.067 N. Fernando, S. W. Loke, and W. Rahayu, “Mobile cloud computing: A survey,” Future Generation Computer Systems, vol. 29, no. 1, pp. 84–106, 2013. Z. Sanaei, S. Abolfazli, A. Gani, and R. Buyya, “Heterogeneity in mobile cloud computing: taxonomy and open challenges,” Communications Surveys & Tutorials, IEEE, vol. 16, no. 1, pp. 369– 392, 2014. P. Shu, F. Liu, H. Jin, M. Chen, F. Wen, and Y. Qu, “etime: energy-efficient transmission between cloud and mobile devices,” in INFOCOM, 2013 Proceedings IEEE. IEEE, 2013, pp. 195–199. F. Liu, P. Shu, H. Jin, L. Ding, J. Yu, D. Niu, and B. Li, “Gearing resource-poor mobile devices with powerful clouds: architectures, challenges, and applications,” Wireless Communications, IEEE, vol. 20, no. 3, pp. 14–22, 2013. F. Liu, P. Shu, and J. Lui, “Appatp: An energy conserving adaptive mobile-cloud transmission protocol,” IEEE Transactions on Computers, vol. PP, no. 99, pp. 1–14, 2015. T. Zhang, X. Zhang, F. Liu, H. Leng, Q. Yu, and G. Liang, “etrain: Making wasted energy useful by utilizing heartbeats for mobile data transmissions,” in Distributed Computing Systems (ICDCS), 2015 IEEE 34th International Conference on. IEEE, pp. 1–10. M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The case for vm-based cloudlets in mobile computing,” Pervasive Computing, IEEE, vol. 8, no. 4, pp. 14–23, 2009. B.-G. Chun, S. Ihm, P. Maniatis, M. Naik, and A. Patti, “Clonecloud: elastic execution between mobile device and cloud,” in Proceedings of the sixth conference on Computer systems. ACM, 2011, pp. 301–314. S. Abolfazli, Z. Sanaei, E. Ahmed, A. Gani, and R. Buyya, “Cloudbased augmentation for mobile devices: motivation, taxonomies, and open challenges,” Communications Surveys & Tutorials, IEEE, vol. 16, no. 1, pp. 337–368, 2014. M. Mishra and A. Sahoo, “On theory of VM placement: Anomalies in existing methodologies and their mitigation using a novel vector based approach,” in 2011 IEEE International Conference on Cloud Computing (CLOUD), 2011, pp. 275–282. S. He, L. Guo, M. Ghanem, and Y. Guo, “Improving resource utilisation in the cloud environment using multivariate probabilistic models,” in Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on, 2012, pp. 574–581. Y. Wang and X. Wang, “Virtual batching: Request batching for server energy conservation in virtualized data centers,” IEEE Transactions on Parallel and Distributed Systems, vol. Early Access Online, 2012. N. Calcavecchia, O. Biran, E. Hadad, and Y. Moatti, “VM placement strategies for cloud scenarios,” in 2012 IEEE 5th International Conference on Cloud Computing (CLOUD), 2012, pp. 852–859. F. Xu, F. Liu, L. Liu, H. Jin, B. Li, and B. Li, “iaware: Making live migration of virtual machines interference-aware in the cloud,” IEEE Transactions on Computers, vol. 63, no. 12, pp. 3012–3025, 2014. F. Xu, F. Liu, H. Jin, and A. V. Vasilakos, “Managing performance overhead of virtual machines in cloud computing: a survey, state of the art, and future directions,” Proceedings of the IEEE, vol. 102, no. 1, pp. 11–31, 2014. G. Khanna, K. Beaty, G. Kar, and A. Kochut, “Application performance management in virtualized server environments,” in Network Operations and Management Symposium, 2006. NOMS 2006. 10th IEEE/IFIP, 2006, pp. 373–381. N. Bobroff, A. Kochut, and K. 
7 Conclusions and Future Work

In this paper, we have systematically investigated the resource fragmentation problem in cloud data centers, which may limit the accommodation of dynamic VM demands in mobile cloud computing. In particular, we first identified the resource fragmentation problem caused by fragmented residual resources across different servers. We then proposed a novel framework that minimizes unnecessary VM migrations. Within the proposed framework, we formulated an optimization problem for resource defragmentation at a particular time epoch and developed an efficient heuristic algorithm that can obtain near-optimal results. To evaluate the proposed framework, we have conducted extensive numerical experiments that demonstrate its advantages in practical scenarios.

In this paper, we have considered the defragmentation of only a single type of resource. In our future work, we will investigate the fragmentation problem with multi-dimensional resources, such as computing, memory, disk I/O, and network I/O. In addition, we will investigate the defragmentation issue when VM requests are correlated.
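As a preview of the multi-dimensional setting mentioned above, the sketch below shows a feasibility test in which a VM fits on a server only if every resource dimension fits simultaneously, so a server can be fragmented in one dimension (e.g., memory) while remaining idle in another (e.g., CPU). The dimension names and capacities are illustrative assumptions.

```python
# A VM fits on a server only if every resource dimension fits at once,
# so a server can be fragmented in memory while its CPU sits idle.
# Dimension names and capacities are illustrative assumptions.

def fits(residual, demand):
    """True iff the demand fits the residual capacity in every dimension."""
    return all(residual[dim] >= demand[dim] for dim in demand)

residual = {"cpu": 6, "memory": 2, "disk_io": 5, "net_io": 5}
demand = {"cpu": 2, "memory": 3, "disk_io": 1, "net_io": 1}
print(fits(residual, demand))  # False: memory alone blocks placement
```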
Weigang Hou received the Ph.D. degree in communication and information systems from Northeastern University, Shenyang, China, in 2013. He is currently an associate professor in the College of Information Science and Engineering, Northeastern University, Shenyang, China. His research interests include green optical networks, cloud computing, and network virtualization. He has published over 20 technical papers in these areas.

Rui Zhang received his B.S. and M.S. degrees from the Department of Mathematics, Imperial College London, London, in 2010 and 2011, respectively. He is currently a Ph.D. student in the Department of Computer Science, City University of Hong Kong. His main research interest is cloud computing security.

Wen Qi received his B.E. degree from the Department of Automation, Nankai University, in 2009, and the M.S. degree in computer science from the City University of Hong Kong in 2013. He is currently a Ph.D. student in the Department of Computer Science, City University of Hong Kong. His research interests include cloud computing, networking, and security.

Kejie Lu (S'01-M'04-SM'07) received the B.Sc. and M.Sc. degrees in telecommunications engineering from Beijing University of Posts and Telecommunications, Beijing, China, in 1994 and 1997, respectively. He received the Ph.D. degree in electrical engineering from the University of Texas at Dallas in 2003. In July 2005, he joined the Department of Electrical and Computer Engineering, University of Puerto Rico at Mayagüez, where he is currently an Associate Professor. Since January 2014, he has been an Oriental Scholar with the School of Computer Engineering, Shanghai University of Electric Power, Shanghai, China. His research interests include architecture and protocol design for computer and communication networks, performance analysis, network security, and wireless communications.

Jianping Wang is currently an associate professor in the Department of Computer Science at City University of Hong Kong. She received her B.Sc. and M.Sc. degrees from Nankai University in 1996 and 1999, respectively, and her Ph.D. degree from the University of Texas at Dallas in 2003. Her research interests include dependable networking, optical networking, and service-oriented wireless sensor/ad hoc networking.

Lei Guo is a professor at Northeastern University, Shenyang, China. He received the Ph.D. degree from the University of Electronic Science and Technology of China in 2006. His research interests include communication networks, optical communications, and wireless communications. He has published over 200 technical papers in these areas in international journals and conferences, such as IEEE TCOM, IEEE TWC, IEEE/OSA JLT, IEEE/OSA JOCN, IEEE GLOBECOM, and IEEE ICC. He is a member of the IEEE and the OSA, and a senior member of the CIC. He serves as an editor for five international journals, including Photonic Network Communications.