Live Migration (LM) Benchmark Research

College of Computer Science, Zhejiang University, China

Outline
- Background and Motives
- Virt-LM Benchmark Overview
- Further Issues and Possible Solutions
- Conclusion
- Our Possible Work under the Cloud WG

Background and Motives

Significance of Live Migration
- Concept:
  - Migration: move a VM between different physical machines
  - Live: without disconnecting the client or application (invisible)
- Relation to cloud computing and data centers:
  - Cloud infrastructures and data centers have to use their huge hardware resources efficiently.
  - Virtualization technology provides two approaches:
    - Server consolidation
    - Live migration
- Roles in a data center:
  - Flexibly remap hardware among VMs
  - Balance workload
  - Save energy
  - Enhance service availability and fault tolerance

Motives of the LM Benchmark
- Scale and frequency lead to a significant LM cost (TC):
  - S (Scale): How many servers?
    - Google: an estimated 200,000 to 500,000 servers across 36 data centers in 2008
    - MS: added 10,000 servers per month in 2008
    - Facebook: more than 30,000 servers in its data center in 2008
  - F (Frequency): How often does it happen?
    - Load balancing
    - Online maintenance and proactive fault tolerance
    - Power management
  - C (Cost of live migration):
    - Hardware and network bandwidth: save and transfer VM state
    - Workload performance: share hardware
    - Service availability: downtime

Motives of the LM Benchmark (cont.)
- An LM benchmark is needed.
  - An LM benchmark helps make the right decisions to reduce cost
    - Design better LM strategies
    - Choose a better platform
  - Evaluation of a data center should include its LM performance
    - VMware released VMmark 2.0 for multi-server performance in December 2010
- Existing evaluation methodologies have their limitations.
  - VMmark 2.x
    - Dedicated to VMware's platforms
    - A macro benchmark: no specific metrics for LM performance
  - Existing research on LM ([Vee09 Hines], [HPDC09 Liu], [Cluster09 Jin], [IWVT08 Liu], [NSDI05 Clark], …)
    - All dedicated to designing LM strategies
    - No unified metrics or workloads, so results are not comparable to each other
    - Some critical issues are not mentioned
- A formal, qualified LM benchmark is still lacking.

Virt-LM Benchmark Overview

Goal and Criteria
- Goal of the Virt-LM benchmark:
  - Compare LM performance among different hardware and software platforms, especially in data center scenarios
- Design criteria:
  - Metrics: sufficient, observable, concise
  - Workload: typical, scalable
  - Stability: produces repeatable results
  - Scoring methodology: impartial
  - Compatibility
  - Usability
[Figure: the same workloads are applied to each platform, and each platform yields its own metric results]

System Under Test
- System Under Test (SUT):
  - Evaluation target
  - Hardware and software platform
  - Including its VMM and the LM strategies it uses
[Figure: the workloads are applied to each SUT, and each SUT yields its own metric results]

Metrics
- Metrics and measurement:
  - Downtime
    - Def: how long the VM is suspended
    - Measure: ping
  - Total migration time
    - Def: how long an LM lasts
    - Measure: timing the LM command
  - Amount of migrated data
    - Def: how much data is transferred
    - Measure: data transferred on its exclusive TCP port
  - Migration overhead
    - Def: how much LM impairs the performance of the workload
    - Measure: percentage decline in the workload's score
- Metrics sufficiency:
  - Cost: migration overhead, amount of migrated data (burden on the network)
  - QoS: downtime, total migration time, migration overhead

Workloads
- Representative of real scenarios
  - Where: data centers
  - When: load balancing, power management, service enhancement and fault tolerance
[Figure: a VM running a service on its OS migrates away from a platform (HW and VMM) that hosts other VMs]

Workloads (cont.)
- During a live migration:
  - The migrated VM could run different services:
    - Mail server
    - Application server
    - File server
    - Web server
    - Database server
    - Standby server
  - Other VMs exist on the same platform:
    - Heavy during load balancing
    - Light during power management
    - Random during service enhancement and fault tolerance
  - Migration can happen at any moment (migration points)

Workload Implementation
- Internal workload types:
  - Mail server: SPECmail2008
  - App server: SPECjAppServer2004
  - File server: Dbench
  - Web server: SPECweb2005
  - Database server: Sysbench
  - Standby server: idle VM
- External workload types:
  - Heavy: more VMs to fully utilize the machine
    - Increase the number of VMs until workload performance is undermined
  - Light: a single VM on the platform
[Figure: the migrating VM runs the internal workload on its OS; the other VMs on the platform (HW and VMM) form the external workload]

Migration Points Problem
- During the run of a workload:
  - LM happens at a random time
  - Performance varies at different points
  - Example workload: 483.xalancbmk of SPEC CPU2006
- How can a workload's performance variation be fully represented?
  - Test as many migration points as possible, spread across the whole run of the workload

Migration Points Problem (cont.)
- Problem: too many points prolong the test significantly
- Solution:
  - More sample results in each run
  - Only a few runs
[Figure: migration points in the first, second, and third run]
- Implementation:
  - Divide a workload's runtime into many time sectors
  - Each time sector is longer than the total migration time
  - Migrate at the start point of each sector

Scoring Method
- Goal: compute an overall score
  - For each metric i, compute a final score Si
    - Normalize each result (Pij) using the reference system (Rij)
    - Sum up the results of all workloads
    - Si of the reference system is always 1000; a lower score indicates higher performance
- Open problem: merging the 4 metrics' Si
  - Different properties, different variation
  - Simply adding them up is not appropriate
  - Current implementation in Virt-LM: the final result has 4 scores

Other Criteria
- Usability
  - Easy to configure
    - VM images provided
    - Workloads pre-installed
  - Easy to run
    - Automatically managed after launch
- Compatibility
  - Successful on Xen and KVM
- Scalable workload: fully utilize the hardware
  - Heavy enough macro workload
    - Live migration lasts a long time
  - (Multiple live migration)
    - More than one VM is migrated concurrently

Benchmark Components
- Logical components:
  - System Under Test
  - Migration target platform
  - VM image storage
  - Management agent
- Benchmark components:
  - Workload VM images
    - Distributed on the VM image storage
  - Running scripts
    - Installed on the management agent

Internal Running Process
- For every class of workload:
  - Initialize the environment
  - Run the workload
  - Migrate the VM at different migration points
  - Fetch the metric results
- Collect all results and compute an overall score
- The management agent automatically controls the whole process

Experiments on Xen and KVM
- Experiment setup:
  - SUT-XEN
    - VMM: Xen 3.3 on Linux 2.6.27
    - Hardware: Dell OptiPlex 755, 2.4 GHz Intel Core 2 Quad Q6600, 2 GB memory, SATA disk, 100 Mbit network
  - SUT-KVM
    - VMM: KVM-84 on Linux 2.6.27
    - Hardware: same as SUT-XEN
  - VM
    - Linux 2.6.27, 512 MB memory, one core
  - Workload
    - Internal: SPECjvm2008, CPU/memory-intensive workloads
    - External: light (single VM)
    - Migration points: spread across the whole run

Experiments on Xen and KVM: Analysis
- SUT-KVM compresses the data intensively
  - Less migrated data and less total time
  - More overhead
- SUT-XEN strictly controls the downtime
  - Less downtime
  - More migrated data, due to more rounds of pre-copy to decrease downtime
- Conclusion:
  - SUT-XEN has less downtime and overhead, but consumes more network bandwidth

Further Issues and Possible Solutions

1. Workload Complexity
- The total test takes a long time
  - Total time = Runtime * N, where N is the number of workload types
- The workloads have too many combinations: N = I * E * P (* M)
  - (I) Internal workload types: mail server, app server, file server, web server, DB server, standby server
  - (E) External workload types: heavy, light
  - (P) Number of migration points: considerable, due to the long run time of each workload
  - (M) Multiple migration

Possible Solutions
- Speed up for migration points:
  - (Virt-LM's current implementation) More samples in a run
- Use time-insensitive workloads
  - Micro operations: CPU, memory, I/O, …
  - Different memory read/write intensity
  - Advantages:
    - Eliminates the "migration points" dimension
    - Internal workloads are reduced
    - Runtime of each workload is shortened
  - Disadvantage:
    - Differs from real scenarios
- Hybrid
  - Test time-insensitive micro workloads
  - Analyze and predict the results of typical workloads
- Redefine an average workload

2. Multiple/Concurrent Live Migration
- Problem: define overall metrics
  - Representative of the platform's maximum performance
  - Other concerns:
    - When average results decrease noticeably
    - When the system cannot cope
- Possible solutions:
  - Maximum sum of metrics
  - Define different thresholds
[Figure: thresholds on the number of concurrent migrations for VMs on one platform (HW and VMM): average decreases noticeably, sum decreases noticeably, maximum sum, system cannot cope]

3. Other Issues
- Overall score computation
  - Virt-LM produces 4 scores as the final result
- Definition of external workloads
  - Current implementation is simple
- Repeatability
  - More experiments are needed to examine it
  - Migration points are not precisely arranged
- Compatibility
  - Should be compatible with other VMMs besides Xen and KVM
- Usability
  - Easier to configure and run

Conclusion

Current Work
- Investigated recent work on LM
- Summarized the critical problems
  - Migration points
  - Workload complexity
  - Scoring methods
  - Multiple live migration
- Presented some possible solutions
- Implemented a benchmark prototype: Virt-LM
- More details in "Virt-LM: A Benchmark for Live Migration of Virtual Machine" (ICPE 2011)

Future Work
- Improve and complete Virt-LM
- Implement and test other solutions
  - Workload complexity
  - Multiple live migration
  - Overall score computation
  - Others
- Test and compare their effectiveness and choose the best one

Our Possible Work under the Cloud WG

Possible Work
- Relation to the cloud benchmark
  - Enough migration cost in the workload
    - Although the cost may not be a metric, we have to ensure the workload can cause enough cost.
  - How fast can a cloud reallocate resources?
    - If reallocation is implemented with live migration technology, this depends on two factors:
      1. How many migrations are needed: determined by resource management and reallocation strategies
      2. How fast each migration is: live migration efficiency and cost
- Possible future work under the cloud benchmark
  - We may work on how to ensure the workload produces enough live migration cost

Possible Work (cont.)
- We hope to cooperate with other members, perhaps by joining a sub-project related to live migration.
- We hope to contribute to the design of the Cloud Benchmark.

Team Members
- Prof. Dr. Qinming He, hqm@zju.edu.cn
- Kejiang Ye, yekejiang@zju.edu.cn (Representative of the SPEC Research Group)
- Assoc. Prof. Dr. Deshi Ye, yedeshi@zju.edu.cn
- Jianhai Chen, Chenjh919@zju.edu.cn
- Dawei Huang, tossboyhdw@zju.edu.cn
- …

Appendix: Team's Recent Work

Virtualization Performance
- Virtualization in cloud computing systems
  - IEEE Cloud'2011, IEEE/ACM GreenCom'2010
- Performance evaluation and benchmarking of VMs
  - ACM/SPEC ICPE'2011, IWVT'2008 (ISCA Workshop), EUC'2008
- Performance optimization of VMs
  - ACM HPDC'2010, IEEE HPCC'2010, IEEE ISPA'2009
- Performance modeling of VMs
  - IEEE HPCC'2010, IFIP NPC'2010
- Performance testing toolkit for VMs
  - IEEE ChinaGrid'2010

Publications
[1] Live Migration of Multiple Virtual Machines with Resource Reservation in Cloud Computing Environments (IEEE Cloud'2011, accepted)
[2] Virt-LM: A Benchmark for Live Migration of Virtual Machine (ACM/SPEC ICPE'2011)
[3] Virtual Machine Based Energy-Efficient Data Center Architecture for Cloud Computing: A Performance Perspective (IEEE/ACM GreenCom'2010)
[4] Analyzing and Modeling the Performance in Xen-based Virtual Cluster Environment (IEEE HPCC'2010)
[5] Two Optimization Mechanisms to Improve the Isolation Property of Server Consolidation in Virtualized Multi-core Server (IEEE HPCC'2010)
[6] Evaluate the Performance and Scalability of Image Deployment in Virtual Data Center (IFIP NPC'2010)
[7] vTestkit: A Performance Benchmarking Framework for Virtualization Environments (IEEE ChinaGrid'2010)
[8] Improving Host Swapping Using Adaptive Prefetching and Paging Notifier (ACM HPDC'2010)
[9] Load Balancing in Server Consolidation (IEEE ISPA'2009)
[10] A Framework to Evaluate and Predict Performances in Virtual Machines Environment (IEEE EUC'2008)
[11] Performance Measuring and Comparing of Virtual Machine Monitors (IWVT'2008, ISCA Workshop)

Thank you!
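
Sketch: per-metric scoring
The Scoring Method slide says each result Pij is normalized by the reference system's Rij, the results of all workloads are combined, and the reference system always scores 1000, with lower being better. The exact formula did not survive in this text, so the Python sketch below assumes one form that satisfies those statements: the mean of Pij / Rij scaled by 1000. The function name and the sample numbers are hypothetical.

```python
from statistics import mean

REFERENCE_SCORE = 1000  # the slides state the reference system always scores 1000


def metric_score(results, reference):
    """Score one metric i from per-workload results P_ij and references R_ij.

    Assumed form (not confirmed by the slides): average the normalized
    results P_ij / R_ij over all workloads j, then scale by 1000.
    """
    if results.keys() != reference.keys():
        raise ValueError("results and reference must cover the same workloads")
    ratios = [results[w] / reference[w] for w in results]
    return REFERENCE_SCORE * mean(ratios)


# Hypothetical downtime measurements in milliseconds, one per workload.
reference_downtime = {"web": 200.0, "mail": 400.0, "idle": 100.0}
sut_downtime = {"web": 100.0, "mail": 400.0, "idle": 75.0}

reference_result = metric_score(reference_downtime, reference_downtime)
sut_result = metric_score(sut_downtime, reference_downtime)
```

Under this assumption a system that halves every measurement relative to the reference would score 500. The four metrics would each get their own score, matching the slide's note that Virt-LM reports 4 scores rather than one merged value.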

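
Sketch: migration points and test-case count
The time-sector scheme from the Migration Points Problem slides and the N = I * E * P count from the Workload Complexity slide can be illustrated together. This is a minimal Python sketch, not Virt-LM's code: the function name, the `margin` parameter, and the 600 s / 40 s figures are assumptions made for illustration.

```python
from itertools import product


def migration_points(runtime_s, total_migration_time_s, margin=1.5):
    """Return the start times (in seconds) at which to trigger a migration.

    The run is divided into equal time sectors, each longer than the total
    migration time, and a migration starts at the beginning of every sector.
    `margin` (> 1) is an assumed knob for how much longer a sector is than
    one migration; the slides only require "longer".
    """
    if margin <= 1:
        raise ValueError("margin must be > 1 so a sector outlasts a migration")
    sector_s = total_migration_time_s * margin
    count = int(runtime_s // sector_s)  # only whole sectors are used
    return [k * sector_s for k in range(count)]


INTERNAL = ["mail", "app", "file", "web", "database", "standby"]  # I
EXTERNAL = ["heavy", "light"]  # E

# Hypothetical numbers: a 600 s workload run and a 40 s total migration time.
points = migration_points(runtime_s=600, total_migration_time_s=40)

# N = I * E * P test cases if every migration point needed its own run.
all_cases = list(product(INTERNAL, EXTERNAL, points))

# If a single run could sample every point (the slides use "a few runs"),
# one run per (internal, external) pair would suffice.
runs_needed = len(INTERNAL) * len(EXTERNAL)
```

With these numbers the sector length is 60 s, so one run yields 10 migration points, and sampling them within a run covers the 120 combinations in 12 runs.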