Live migration of Virtual Machines

Nour Stefan, SCPD

• Introduction

• Related work

• Design

• Writable Working Sets

• Implementation Issues

• Evaluation

• Future work

• Conclusions

Introduction

• OS virtualization

– Data centers

– Cluster computing

• Live OS migration

– Avoid problem of “residual dependencies”

– In-memory state can be transferred in a consistent and efficient way

• Kernel-internal state

• Application-level state

– Separation of concerns between the users and the operator of a data center or cluster

– Separation of hardware and software considerations, and consolidating clustered hardware into a single coherent management domain

• High-performance migration support for Xen

Related work

• Collective project

– For slow connections and longer time spans

– Stops OS execution during the transfer

• Zap

• NomadBIOS

– Pre-copy migration

– Does not adapt to the writable working set

Design

• Migrating memory

• Balancing Downtime and Total migration time

– Push phase

– Stop-and-copy phase

– Pull phase

• Local resources

– Connections to local devices (disks, network interfaces)

– Single switched LAN

– Generate an unsolicited ARP reply from the migrated host, advertising that the IP address has moved to a new location

– Network-Attached Storage
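The unsolicited ARP reply mentioned above can be sketched as a raw Ethernet frame. This is only an illustration of the packet layout, not the code used in Xen: the function name, the MAC address and the IP address are hypothetical, and setting the target hardware address to broadcast is one common convention rather than something stated on the slides. The sketch builds the bytes and does not send them.

```python
import socket
import struct


def build_unsolicited_arp_reply(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying an unsolicited ARP reply.

    mac and ip describe the migrated VM (hypothetical example values).
    """
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + mac_bytes + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, operation 2 (reply).
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)

    # Sender and target protocol address are both the VM's own IP,
    # so peers update their ARP caches with the new MAC/port mapping.
    arp_body = mac_bytes + ip_bytes + broadcast + ip_bytes
    return eth_header + arp_header + arp_body


frame = build_unsolicited_arp_reply("02:00:00:aa:bb:cc", "10.0.0.5")
print(len(frame), frame.hex())
```

Because both the sender and target protocol addresses carry the VM's own IP, hosts on the switched LAN that already cache this IP refresh their entry without having asked for it.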

Writable Working Sets

Measuring Writable Working Sets

Implementation Issues

• Managed migration

– Performed largely outside the migratee

– Migration daemons running in the management VM of the source and destination (new VM on destination)

– Rounds of copying (each round sends the pages dirtied during the previous round)

– Dirty bitmap copied from Xen at start of each round

– Shadow page tables (read-only page-table entries => write causes a page fault trapped by Xen)
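The rounds of copying listed above can be illustrated with a small simulation. It is a simplified model under stated assumptions, not the Xen implementation: the dirtying rate is constant, bandwidth is fixed, and the function name, the round limit and the stop threshold are all hypothetical.

```python
def simulate_precopy(total_pages, dirty_rate, bandwidth,
                     max_rounds=30, stop_threshold=50):
    """Simulate iterative pre-copy followed by a final stop-and-copy.

    dirty_rate and bandwidth are in pages per second (assumed constant).
    Returns (pages_sent_per_round, total_time_s, downtime_s).
    """
    to_send = total_pages          # round 1 sends every page
    rounds = []
    total_time = 0.0
    for _ in range(max_rounds):
        duration = to_send / bandwidth
        total_time += duration
        rounds.append(to_send)
        # Pages dirtied while this round was in flight must be resent.
        to_send = min(total_pages, int(dirty_rate * duration))
        if to_send <= stop_threshold:
            break
    # Stop-and-copy: the VM is paused while the remainder is sent.
    downtime = to_send / bandwidth
    total_time += downtime
    return rounds, total_time, downtime


rounds, total_time, downtime = simulate_precopy(100_000, 1_000, 10_000)
print(rounds, round(total_time, 3), downtime)
```

When the dirtying rate is well below the available bandwidth, each round is shorter than the last and the final pause is small. When the dirtying rate reaches or exceeds the bandwidth, the rounds stop shrinking and the model falls back to a long stop-and-copy, which is the trade-off between downtime and total migration time.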

• Self migration

– Implemented within the migratee OS

– Migration stub on destination machine

– Consistent OS checkpointing

• Two-stage stop-and-copy phase

– Disables all OS activity except for migration => final scan of dirty bitmap => shadow buffer

• Transfer shadow buffer

• Dynamic Rate-Limiting

• Rapid Page Dirtying
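The slides do not give details of dynamic rate-limiting, so the following is a sketch of one plausible scheme only: each round's bandwidth limit is set to the dirtying rate observed in the previous round plus a fixed increment, and pre-copy ends once the proposed limit exceeds a configured maximum. The function name, the 4 KB page size, and the 50 and 500 Mbit/s constants are illustrative assumptions.

```python
def next_round_bandwidth(pages_dirtied, round_seconds,
                         page_bits=4096 * 8,
                         increment_mbit=50.0,
                         max_mbit=500.0):
    """Pick the bandwidth limit for the next pre-copy round.

    Returns (limit_mbit, should_stop). All constants are assumptions
    chosen for illustration, not values taken from the slides.
    """
    # Dirtying rate seen in the previous round, in Mbit/s.
    dirty_rate_mbit = pages_dirtied * page_bits / round_seconds / 1e6
    # Stay slightly ahead of the dirtying rate so the rounds can shrink.
    proposed = dirty_rate_mbit + increment_mbit
    # Past the cap, further pre-copy rounds no longer gain ground:
    # signal the caller to switch to stop-and-copy.
    should_stop = proposed > max_mbit
    return min(proposed, max_mbit), should_stop


print(next_round_bandwidth(pages_dirtied=2_000, round_seconds=1.0))
print(next_round_bandwidth(pages_dirtied=1_000_000, round_seconds=1.0))
```

Starting low and raising the limit only as far as the dirtying rate requires keeps migration traffic from competing with the running service for longer than necessary.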

Evaluation

• Test setup

– Dual Intel Xeon 2GHz CPU and 2GB memory

– Broadcom TG3 network interfaces

Future work

• Cluster management

• Wide Area Network redirection

• Migrating Block Devices

Conclusions

• By integrating live OS migration into the Xen virtual machine monitor we enable rapid movement of interactive workloads within clusters and data centers.

• Our dynamic network-bandwidth adaptation allows migration to proceed with minimal impact on running services, while reducing total downtime to below discernible thresholds.

• Our comprehensive evaluation shows that realistic server workloads such as SPECweb99 can be migrated with just 210ms downtime, while a Quake3 game server is migrated with an imperceptible 60ms outage.
