The Best of Both Worlds with On-Demand Virtualization
Thawan Kooburat and Michael M. Swift

On-Demand Virtualization

On-demand virtualization allows systems to benefit from virtualization without paying for its overhead all the time.
• Executing natively: No virtualization overhead when it is not being used.
• Virtualize on-the-fly: Convert the entire machine to a VM only when the need arises.
• No service disruption: OS and application state and network connections remain intact after conversion.

Motivation

Cost of Virtualization
Virtualization is still not used in many settings, such as datacenters running performance-sensitive applications and desktop environments, because of:
• Lack of functionality: Devices such as GPUs, Wi-Fi and ACPI devices are not emulated or supported by the VMM.
• I/O overhead: I/O requires costly trap-and-emulate handling by the host OS or VMM.
• Memory overhead: TLB misses are more expensive due to the extra level of page tables [1].

Uses of On-Demand Virtualization
On-demand virtualization enables many use cases, such as:
• Resource consolidation: Execute natively during peak hours, then virtualize and consolidate resources to save power.
• Debugging: Virtualize and use tools such as deterministic replay [2] to analyze issues.
• Backup: Checkpointing can be used to create backups.

Implementation

Our prototype is implemented on Linux 2.6.35 using TuxOnIce [3] for hibernation. We use KVM as our virtual machine monitor (VMM). The only costs are (i) reserving disk space for the VMM and (ii) using logical devices. The current prototype performs a one-way conversion from native to virtual execution while retaining SSH/SCP connections. The process takes about 90 seconds, and the majority of that time is spent in the hibernate/resume process and the machine reboot.

[Figure: Converting a physical machine to a VM. The disk holds a VMM partition and a guest partition. Steps: (1) Hibernate the native kernel, (2) Boot the VMM kernel, (3) Start VM, (4) Resume the guest kernel.]

[Figure: Hibernation steps compared with virtualize steps. Labels: Hibernate; Machine Restart; Boot up machine; Boot up VMM Kernel; Start VM from Guest partition; Prepare device and load drivers; Load hibernate image; Scan and reconfigure devices; Resume user processes. Kernels shown: Boot Kernel, Guest Boot Kernel, Guest Kernel.]

Key Challenges

Problem: Capturing OS and Process State
Solution: Hibernation
We rely on the hibernation mechanism to transfer OS state from a native machine to a virtual machine: we suspend the native OS to disk and resume it inside a virtual machine. However, hibernation assumes that the OS will resume on a machine with the same hardware profile.

Problem: Discovering Devices inside the VM
Solution: Hotplugging
We rely on hotplugging support found in many devices to make the transition from physical to virtual hardware. The system virtually unplugs the physical devices and plugs in new virtual devices. We modify the kernel to rescan the PCI bus during the resume process, so that it discovers the virtual devices and attaches them to the OS. However, this mechanism works only for devices that support hotplugging. For platform devices such as timers and interrupt routers, which do not have hotplug support, the hibernation boot code passes configuration information to the resuming kernel so that it can reconfigure these devices.

[Figure: Device handling during conversion (legend: Current, Future). CPU*: only one CPU is online; CPU hotplugging. Memory: requires the VM to have the same amount of RAM; memory hotplugging. Display: switch to VGA display mode. Other devices.]

* The VMM may not be able to emulate hardware features found in a physical machine, such as an IOMMU. We therefore need to disable these features in advance, because the kernel and applications may otherwise rely on them.

Problem: Preserving Device Bindings
Solution: Logical Devices
Logical devices act as an interposition layer between the kernel and device drivers. They are normally used to provide aggregation or high availability, and they allow existing kernel structures to transfer their state from one device to another. We use the network bonding driver to preserve network connections and the device mapper to do the same for block devices. This preserves device bindings even across different models and types of devices.

[Figure: Logical devices. The OS binds to bond0 and /dev/mapper/vdisk. Underneath, the physical machine uses e1000e and SATA, and the VM uses e1000 and IDE.]

References and Related Work

Microvisor [4] presents the first step toward on-demand virtualization. However, it does not virtualize the entire system, so its benefit is limited to online maintenance. VMware Converter [5] can clone a physical machine into a VM, but it does not preserve the running state of the physical machine. OS live migration [6] adds whole-system migration to an OS without requiring virtualization. However, each class of device must provide an import/export interface to transfer device state. Our logical device approach does not require device driver modification.

[1] R. Bhargava, B. Serebrin, F. Spadini, et al. Accelerating two-dimensional page walks for virtualized systems. ASPLOS, 2008.
[2] J. Chow, T. Garfinkel, and P.M. Chen. Decoupling dynamic program analysis from execution in virtual environments. USENIX, 2008.
[3] Linux software suspend. http://www.tuxonice.net
[4] D.E. Lowell, Y. Saito, and E.J. Samberg. Devirtualizable virtual machines enabling general, single-node, online maintenance. ASPLOS, 2004.
[5] VMware vCenter Converter. http://www.vmware.com/products/converter
[6] M. Kozuch, M. Kaminsky, and M.P. Ryan. Migration without Virtualization. HotOS, 2009.
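
Illustrative sketch: logical devices as an interposition layer

The logical-device idea described under "Solution: Logical Devices" can be illustrated with a toy Python model. It shows how a connection can stay bound to bond0 while the backing device changes from e1000e (physical machine) to e1000 (VM). This is only a sketch under stated assumptions: the class names Device, LogicalDevice and Connection and the method replace_backing are hypothetical, and they do not correspond to the interfaces of the Linux bonding driver or the device mapper.

```python
"""Toy model of a logical device as an interposition layer (illustrative only)."""


class Device:
    """Hypothetical backing device, identified by its driver name."""

    def __init__(self, driver):
        self.driver = driver
        self.sent = []  # payloads that reached this device

    def send(self, payload):
        self.sent.append(payload)


class LogicalDevice:
    """Stable front end that forwards to whichever backing device is active."""

    def __init__(self, name, backing):
        self.name = name
        self._active = backing

    @property
    def active_driver(self):
        return self._active.driver

    def send(self, payload):
        self._active.send(payload)

    def replace_backing(self, new_backing):
        """Swap the backing device; users of this object are unaffected."""
        old, self._active = self._active, new_backing
        return old


class Connection:
    """Stands in for kernel state (such as a socket) bound to a device."""

    def __init__(self, device):
        self.device = device

    def write(self, payload):
        self.device.send(payload)


physical_nic = Device("e1000e")  # driver on the physical machine
virtual_nic = Device("e1000")    # driver inside the VM
bond0 = LogicalDevice("bond0", physical_nic)
ssh = Connection(bond0)

ssh.write(b"before conversion")
bond0.replace_backing(virtual_nic)  # models the swap during resume in the VM
ssh.write(b"after conversion")
```

Because the connection holds a reference to the logical device rather than to a specific driver, nothing above the interposition layer has to change when the hardware underneath does. The same reasoning applies to /dev/mapper/vdisk for block devices.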