Live Migration of Virtual Machines
Presented by:
Edward Armstrong
University of Guelph
Overview
Problem Description
Solutions
Strengths and Weaknesses
Results
Migrating an OS - Complexity
Trivial
• Same machine.
• Hibernating a laptop.
Moderate
• Same hardware, different machine.
• Changing cluster nodes.
Complex
• Different hardware.
Migrating an OS – Timescale
Reboot
• Requires only long term storage to be migrated.
• Loses all process state information.
• Driver reloading solves hardware problems.
Hibernation
• Typically performed on the same machine.
• Requires long term storage.
• Loses external connection information, i.e. network status.
Suspend
• Hibernation in short term storage (RAM).
• Typically used to maintain a low power state.
Live
• Processes are not implicitly frozen.
• Differences in hardware create problems.
• Solved by using virtual machines.
Considerations
Both machines must be active at the same time.
Migration of active live services.
Total migration time.
Resource contention.
Migrating memory
Push
Pull
Pause
Memory - Pure stop and copy.
Pros
◦ Simplicity.
◦ Consistency.
Cons
◦ Downtime proportional to memory size.
◦ Unacceptable for live services.
[Diagram: Source Machine and Target Machine; phases Push, Pull, Pause]
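The "downtime proportional to memory" point can be made concrete with a rough estimate. This is a minimal sketch, assuming the VM is paused for exactly as long as it takes to copy all of its memory over a link of fixed bandwidth. The function name and the 125,000,000 bytes/s link speed are illustrative assumptions, not figures from the paper.

```python
def stop_and_copy_downtime(memory_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Seconds the VM stays paused while all of its memory is copied.

    Assumption: protocol overhead and CPU state are ignored.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return memory_bytes / bandwidth_bytes_per_s


GIB = 1024 ** 3
LINK = 125_000_000  # hypothetical link speed in bytes/s (about 1 Gbit/s)

# Doubling the memory size doubles the estimated downtime.
for size_gib in (1, 2, 4):
    seconds = stop_and_copy_downtime(size_gib * GIB, LINK)
    print(f"{size_gib} GiB -> {seconds:.1f} s paused")
```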
Memory - On demand migration.
Pros
◦ Shorter downtime.
◦ Consistency.
Cons
◦ Longer migration time.
[Diagram: Target Machine sends "Page fault request" to Source Machine; Source Machine answers with "Send page"; phases Push, Pull, Pause]
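The page fault request and send page exchange can be sketched as a toy model. The class and method names below are hypothetical, and pages are plain byte strings rather than real memory. The target starts with no pages and pulls each one from the source the first time it is touched.

```python
from typing import Dict


class SourceMemory:
    """Memory left behind on the source machine after the VM has moved."""

    def __init__(self, pages: Dict[int, bytes]) -> None:
        self._pages = dict(pages)

    def send_page(self, page_no: int) -> bytes:
        # Answer a page fault request from the target.
        return self._pages[page_no]


class TargetMemory:
    """Memory on the target; pages arrive only when first accessed."""

    def __init__(self, source: SourceMemory) -> None:
        self._source = source
        self._resident: Dict[int, bytes] = {}
        self.faults = 0

    def read(self, page_no: int) -> bytes:
        if page_no not in self._resident:
            # Page fault: pull the missing page across the network.
            self.faults += 1
            self._resident[page_no] = self._source.send_page(page_no)
        return self._resident[page_no]

    def resident_pages(self) -> int:
        return len(self._resident)


source = SourceMemory({n: bytes([n]) * 4 for n in range(8)})
target = TargetMemory(source)
target.read(3)
target.read(3)  # second access is served locally, no new fault
print(target.faults, target.resident_pages())
```

In this model the VM resumes on the target immediately, but pages that are never touched are never transferred, which is one way to read the "longer migration time" drawback.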
Memory - Pre copy migration.
Pros
◦ Copy low fault pages quickly.
◦ Works well for live processes.
Cons
◦ Large number of faults for busy memory.
[Diagram: Source Machine and Target Machine; labels "Iterative push" and "Live page fault"; phases Push, Pull, Pause]
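The iterative push can be illustrated with a small simulation. It is a sketch under simplifying assumptions: every page written during a round is resent in the next round, and the function name, the `stop_threshold` parameter and the example dirty sets are all hypothetical.

```python
from typing import List, Set, Tuple


def pre_copy(num_pages: int,
             dirtied_per_round: List[Set[int]],
             max_rounds: int = 4,
             stop_threshold: int = 2) -> Tuple[List[int], int]:
    """Simulate pre-copy rounds.

    dirtied_per_round[i] holds the pages the running VM wrote while
    round i was being transferred. Returns the number of pages sent in
    each round and the number left for the final stop-and-copy pause.
    """
    to_send = set(range(num_pages))  # round 0 pushes every page
    sent_per_round = []
    for round_no in range(max_rounds):
        sent_per_round.append(len(to_send))
        # Pages written during this round are stale on the target.
        if round_no < len(dirtied_per_round):
            to_send = set(dirtied_per_round[round_no])
        else:
            to_send = set()
        if len(to_send) <= stop_threshold:
            break  # small enough to copy while the VM is paused
    return sent_per_round, len(to_send)


rounds, remaining = pre_copy(100, [{1, 2, 3, 4, 5}, {1, 2, 3}, {1}])
print(rounds, remaining)
```

When the same pages are dirtied every round, the loop never converges and stops only at `max_rounds`, which corresponds to the "large number of faults for busy memory" drawback.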
Network and disk resources.
Unique to an OS instance.
Ordering of resources is nondeterministic.
Need to maintain open network connections.
Resolving network connections with an ARP* response.
[Diagram, three frames: Source Machine and Target Machine on a LAN; address 192.168.0.102 is held by one machine while the other has no address; an unsolicited ARP reply redirects packets.]
*ARP: Address Resolution Protocol
Network resources.
Pros
◦ Handled by external devices.
◦ Similar scheme can be used to migrate disk services (not covered in paper).
Cons
◦ Small amount of packet loss.
◦ Requires a LAN with unsolicited ARP responses enabled.
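For reference, an unsolicited ARP reply is an ordinary ARP reply that nobody asked for, broadcast so that other devices on the LAN update their address mappings. The sketch below only builds the raw Ethernet frame as bytes and does not send it. The MAC address is a made-up, locally administered value, the IP address is the one from the diagram, and using the broadcast address as the target hardware address is one common convention, assumed here.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_REPLY = 2


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def unsolicited_arp_reply(mac: str, ip: str) -> bytes:
    """Build a broadcast ARP reply announcing that `ip` is now at `mac`."""
    hw = mac_to_bytes(mac)
    addr = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType.
    ethernet = BROADCAST_MAC + hw + struct.pack("!H", ETHERTYPE_ARP)
    # ARP header: Ethernet (1), IPv4 (0x0800), address lengths 6 and 4, reply.
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_REPLY)
    # Sender and target protocol address are both the migrated IP.
    arp += hw + addr + BROADCAST_MAC + addr
    return ethernet + arp


frame = unsolicited_arp_reply("02:00:00:00:00:01", "192.168.0.102")
print(len(frame), frame.hex())
```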
Design Overview
[Diagram: Source Machine and Target Machine, with stages PreMigration, Reservation, Pre-Copy, Stop and Copy, Commit, Activate]
PreMigration
• Select a new target machine.
Reservation
• Confirm available resources on target.
• Failure means VM continues to run on source.
Pre-Copy
• Transfer memory.
• Retransmit memory used during transfer.
Stop and Copy
• Halt source to redirect network traffic and transfer CPU state.
Commit
• Verify transfer complete.
• Disable source.
Activate
• Activate target machine.
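The six stages can be sketched as a simple sequence in which a failed stage leaves the VM running on the source. The slides state this fallback only for Reservation; applying it to every stage is an assumption made here, and the `migrate` function and its `checks` argument are hypothetical names.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Stage(Enum):
    PRE_MIGRATION = 0   # select a new target machine
    RESERVATION = 1     # confirm available resources on target
    PRE_COPY = 2        # transfer memory, retransmit dirtied memory
    STOP_AND_COPY = 3   # halt source, redirect network, transfer CPU state
    COMMIT = 4          # verify transfer complete, disable source
    ACTIVATE = 5        # activate target machine


def migrate(checks: Dict[Stage, Callable[[], bool]]) -> Tuple[str, List[Stage]]:
    """Run the stages in order.

    `checks` maps a stage to a callable that returns False on failure;
    stages without an entry are treated as succeeding.
    Returns the host running the VM and the stages that completed.
    """
    completed: List[Stage] = []
    for stage in Stage:  # Enum iteration follows definition order
        check = checks.get(stage)
        if check is not None and not check():
            # Assumption: any failure leaves the VM on the source.
            return "source", completed
        completed.append(stage)
    return "target", completed


print(migrate({}))
print(migrate({Stage.RESERVATION: lambda: False}))
```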
Summary of Design
At all times there is at least one consistent image available.
Minimized down time.
Not a fast process overall.
Requires tuning.
Requires certain hardware.
Performance – Dirty Pages.
8-second granularity, used to decide which pages are good candidates for pre-copy.
Performance
• 1-4 pre-copy iterations.