Bart Miller

Outline
- Definition and goals
- Paravirtualization
- System Architecture
- The Virtual Machine Interface
  - Memory Management
  - CPU
  - Device I/O
    - Network, Disk
- Xen Timeline

Definition and Goals
- An x86 Virtual Machine Monitor (aka Hypervisor)
- Developed in 2003
- Approximately 60k lines of code
- Goals:
  - 100 VMs per system
  - Support full multi-application OSes
  - No modifications to guest applications
  - Negligible sacrifice in performance
  - Fully isolate guests

Paravirtualization
- Full virtualization on x86 (ca. 2003) is complex and inefficient
- Create a unique interface to the hardware
  - Let the Guest OS access the hardware directly when appropriate
  - Prevent the Guest OS from accessing functionality that could affect other guests or the VMM
- Must modify the Guest OS
  - For Linux, 2995 lines (1.36% of code base)
  - For Windows XP, 4620 lines (0.04% of code base)

System Architecture

Memory Management
- Problem
  - x86 has a hardware-managed TLB
  - Assumes a single OS; supports neither tagging nor software management
  - Context switch requires a TLB flush
- Solution
  - Guest OSes manage the hardware page tables
  - Direct read access; updates batched and validated by Xen
  - Xen resides in a 64 MB section at the top of every address space

CPU
- Protection
  - x86 has 4 privilege levels, known as rings: Ring 0 is the highest privilege and Ring 3 the lowest
  - For Xen, the VMM executes in Ring 0, the Guest OS executes in Ring 1, and user programs execute in Ring 3
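The Memory Management solution above (guest-managed page tables, with updates batched and validated by Xen) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyHypervisor`, `ToyGuest`, `apply_updates`, `map_page`, and `flush` are hypothetical names, not Xen's interface, and rejecting a whole batch on the first bad entry is a simplification chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHypervisor:
    """Hypothetical stand-in for the hypervisor's page-table validation path."""
    frame_owner: dict                                 # machine frame -> owning domain id
    page_tables: dict = field(default_factory=dict)   # domain id -> {virtual page: frame}
    hypercalls: int = 0                               # number of batches received

    def apply_updates(self, domain, batch):
        """Validate every entry of a batch, then apply the batch as a whole."""
        self.hypercalls += 1
        for vpage, frame in batch:
            # A guest may only map machine frames that it owns.
            if self.frame_owner.get(frame) != domain:
                raise PermissionError(f"domain {domain} does not own frame {frame}")
        table = self.page_tables.setdefault(domain, {})
        for vpage, frame in batch:
            table[vpage] = frame
        return len(batch)


class ToyGuest:
    """Hypothetical guest OS that queues updates instead of trapping on each one."""

    def __init__(self, domain, hv):
        self.domain = domain
        self.hv = hv
        self.pending = []   # queued (virtual page, frame) updates

    def map_page(self, vpage, frame):
        # Only record the update; nothing reaches the hypervisor yet.
        self.pending.append((vpage, frame))

    def flush(self):
        # Hand all queued updates to the hypervisor in a single call.
        batch, self.pending = self.pending, []
        return self.hv.apply_updates(self.domain, batch) if batch else 0
```

In this model a guest that queues three mappings and then calls `flush()` costs one hypervisor entry rather than three, which is the point of batching; reads of the page table need no call at all.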
- Exceptions
  - The Guest OS registers a table of exception handlers with Xen
  - All are unmodified except the Page Fault handler, since it normally requires access to a privileged register (CR2)

CPU (2)
- System Calls
  - The Guest OS can register a “fast” exception handler
  - Executes without indirection
  - Xen verifies that the handler does not specify execution in Ring 0
- Interrupts
  - Interrupts are replaced by a lightweight event system
  - Asynchronous, relies on ring buffers
- Time
  - Guest OSes have access to “real” and “virtual” timers

Device I/O
- Network
  - Virtual network interfaces (VIF) and a virtual firewall-router (VFR)
  - Transmit and receive ring I/O buffers
  - Domain 0 manages and enforces the firewall rules
  - Transmit:
    - A guest enqueues a request to the transmit ring
    - Xen validates the request against the firewall rules and forwards it to the device
  - Receive:
    - A guest enqueues a receive request to the receive ring
    - Xen determines the appropriate recipient
    - The packet buffer is exchanged for a sacrificial page frame on the receiver’s ring

Device I/O (2)
- Disk access
  - Only Domain 0 can directly access physical disks
  - All DomUs communicate through Virtual Block Devices (VBD)
  - Channels consist of ring buffers
  - Requests can be reordered by the Guest OS and Xen, unless the Guest OS issues a reorder barrier

Ring Structure

Ring Structure (2)

Xen Timeline
- 2003: Initial release of Xen
- 2005 was a significant year for virtualization
  - Intel introduces VT-x, quickly utilized by Xen
  - Narrows the performance gap between HVM and PVM
- 2006: Amazon opens up public beta of EC2
- 2007: Live migration for HVM guests
- 2008: PCI pass-through (VT-d) and ACPI S3 support
- 2011: Xen support for Dom0 and DomU is added to the Linux kernel

Questions?
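As a supplement to the Ring Structure slides, the request/response ring used by the network and disk channels can be sketched as a fixed-size circular buffer with separate producer and consumer counters for each direction. This is a toy model, not Xen code: the class `IORing`, its method names, and the `(id, payload)` tuple format are assumptions made for illustration.

```python
class IORing:
    """Toy descriptor ring shared by a guest (front end) and a back end."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        # Free-running counters; the slot index is counter % size.
        self.req_prod = 0   # advanced by the guest when it queues a request
        self.req_cons = 0   # advanced by the back end when it takes a request
        self.rsp_prod = 0   # advanced by the back end when it queues a response
        self.rsp_cons = 0   # advanced by the guest when it takes a response

    def put_request(self, req):
        """Guest side: queue a request, or return False if the ring is full."""
        # A slot is reusable only after the guest has consumed its response.
        if self.req_prod - self.rsp_cons >= self.size:
            return False
        self.slots[self.req_prod % self.size] = req
        self.req_prod += 1
        return True

    def take_requests(self):
        """Back-end side: remove and return all pending requests."""
        taken = []
        while self.req_cons < self.req_prod:
            taken.append(self.slots[self.req_cons % self.size])
            self.req_cons += 1
        return taken

    def put_response(self, rsp):
        """Back-end side: queue a response into a slot whose request was taken."""
        if self.rsp_prod >= self.req_cons:
            return False   # no consumed request is still awaiting a response
        self.slots[self.rsp_prod % self.size] = rsp
        self.rsp_prod += 1
        return True

    def get_response(self):
        """Guest side: return the next response, or None if none is pending."""
        if self.rsp_cons == self.rsp_prod:
            return None
        rsp = self.slots[self.rsp_cons % self.size]
        self.rsp_cons += 1
        return rsp
```

Because each entry carries its own identifier, the back end in this sketch is free to answer requests in a different order than they were queued, which mirrors the reordering mentioned under Disk access; a reorder barrier is not modeled here.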