Xeniac Project Report

Paul Craig Gazzillo, Arthur Meacham, Jonathan Ross

This paper describes a project we have named Xeniac, an attempt to host Mac OS X and Darwin in the Xen virtual environment. While our initial goal has turned out to be elusive, we have made some promising strides towards achieving it. The project has been a tumultuous one, and we provide something of a technical narrative to describe the challenges we faced and how we dealt with them. Finally, we detail the work that remains and possible ways to do it, and discuss our conclusions.

Related Work

Our project is a modification of an operating system so that it is able to run in the Xen virtual environment. There is a good deal of relevant work on virtualization in general and Xen in particular, as well as on porting operating systems to run in such environments. Additionally, there has been some published work regarding Darwin and especially the kernel on which it was built, Mach.

The concept of virtualization has a long history, dating back to research at IBM in the 1960s (Creasy). More recently, there has been a resurgence of interest in virtualization, with a number of research projects investigating methods of hosting multiple VMs on a single machine. One project, named Disco (Bugnion), helped repopularize the idea with a fully virtualized environment for running multiple commodity operating systems. The designers of Disco achieved full virtualization by using a strategy of direct execution of all code, except for privileged calls, which were replaced with kernel traps and handled by the VMM. The overhead associated with trapping to the VMM for every system call was significant, but Disco has still been made into a successful commercial product, VMware. Running Mac OS X in Xen means we get the performance and scalability that Xen provides.
Another project, Denali (Whitaker), introduced the concept of paravirtualization. A paravirtualized VMM does not precisely mimic the behavior of the underlying hardware and can achieve much better performance than full virtualization. Although Denali introduced this concept, it was a highly simplified system and was not intended to host large commodity operating systems. Xen (Barham), on the other hand, took the concept of Denali and created a paravirtual environment that could host modified x86 operating systems such as Linux and Windows. Recent changes to the Intel design have improved full virtualization support for unmodified kernels (Dong). However, to be properly hosted by Xen, an operating system must be modified to run at a lower privilege level, and changes must be made to memory management, I/O and other subsystems. This is what we intend to accomplish with Darwin.

The Darwin kernel is built on Mach (Rashid). It adds a BSD UNIX interface and a mechanism for linking extensions to the kernel at run-time. Despite its Mach roots, Darwin is not the microkernel it used to be. BSD interfaces are implemented at the kernel level and Apple adds the I/O Kit, an object-oriented device-driver framework that was not in the original Mach kernel. Some OS subsystems still run in user space, though, such as the file system, and the virtual memory model the Mach microkernel team created is still used by Mac OS X.

Darwin has been ported to run in different environments already. For example, Darbat (Lee) is a port of the Darwin kernel to L4, a microkernel. This approach creates separate L4 processes for XNU, the Darwin kernel, and the I/O Kit, the object-oriented device driver interface. Ultimately, both Darbat and our approach paravirtualize Darwin as a guest OS, requiring changes to the kernel, but Xen differs from L4 in that it is strictly a virtual machine monitor.

Finally, a crucial type of related work is that of other operating systems that have been ported to run on Xen.
One especially interesting example in this category was a port of the MINIX operating system to Xen by a student who documented the process as part of an academic project (Kelly).

Xen 3.0

The most recent version of Xen makes use of new hardware-supported virtualization technologies, known as Vanderpool on Intel and Pacifica on AMD (XenSource). These extensions add a new processor operating mode in which guest code runs. In this mode, privileged operations trap to the VMM, allowing it, for instance, to register a handler for protected-mode operations such as altering the page table. When all ring 0 operations are trapped to Xen under this new model, an operating system no longer needs to be paravirtualized to run, provided Xen correctly initializes the domain's memory space, for example by providing firmware emulation. Another difference is that custom drivers no longer need to be written for block devices, memory management, and networking, although speed increases can be achieved if these are customized.

Despite the clear improvements of Xen 3.x, we have encountered a number of problems. The first, seemingly innocuous, problem arose when we wrote a test boot loader program in assembly language to display a text message and exit when a disk image was “booted”. The resulting output was garbled despite our belief that the code was correct. This was our first clue that the boot environment provided by Xen was not entirely consistent with that of a hardware PC. It turned out that there are a number of places where Xen has been hacked to work in the most common cases, such as running Windows or Linux, but is not truly consistent with an x86 PC. Additionally, perhaps because Xen is an open-source project under active and heavy development, we encountered major bugs even in supposedly stable release versions of the software.
For instance, the Xen kernel distributed with the latest version of Fedora became unstable and crashed the entire system whenever we booted Intel's EFI toolkit under Xen. After a great deal of hair-pulling, we were finally able to resolve the problem by switching to an older version of Xen.

Step 1: Getting started with Xen

Our first task was to get Xen running and start virtualizing. We opted to use Red Hat's Fedora Core 6 distribution as a Xen development environment because it has built-in Xen support. Our first goal was to get Windows XP running in a VM. We had some initial disappointments: the GUI tools provided by Xen did not seem to work correctly, and it turns out that Xen is unable to boot from physical CD-ROMs because it does not fully support the format. However, with the help of Google and various developer forums, we were able to install XP from an ISO image and configure Xen by editing the configuration files. Voilà, we had XP running in a Linux window!

Step 2: Getting started with Mac OS

Once we acquired a Mac Mini, we began experimenting with the boot process and multiboot options. We first tried using Apple's Boot Camp loader. This allowed us to install Windows and OS X on the same disk, but gave us very little control over or insight into the lower-level details of the machine. Eventually, we found an open-source project called rEFIt, which provides a GRUB-like boot manager and enables the user to access an EFI shell (Klein). This was useful because Apple provides a very limited boot manager, even with Boot Camp, and the Mac loader does not allow access to the EFI shell. Using rEFIt, we were able to triple-boot OS X, Windows, and Linux on the same machine. From Linux, we could then run virtualized XP and other OSes in the Xen environment, giving us the distinct pleasure of using Windows within Linux on an EFI-based Macintosh.
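To make the hand-edited configuration from Step 1 more concrete, the following is a rough sketch of what a fully virtualized Windows XP guest definition can look like. Xen 3.x guest configuration files use Python syntax, which is why the fragment is valid Python. The guest name, image paths, and memory size are hypothetical, and the locations of hvmloader and qemu-dm vary by distribution, so treat every value here as an assumption rather than as the configuration we actually used.

```python
# Hypothetical Xen 3.x HVM guest definition (all names and paths are assumptions).
name = "winxp"                              # guest name, chosen for illustration
builder = "hvm"                             # fully virtualized guest, not paravirtualized
kernel = "/usr/lib/xen/boot/hvmloader"      # firmware loader; path varies by distribution
device_model = "/usr/lib/xen/bin/qemu-dm"   # device emulator; path varies by distribution
memory = 512                                # megabytes of RAM given to the guest

# First entry: a file-backed hard disk image. Second entry: the installer ISO,
# attached as a read-only CD-ROM, since booting a physical CD did not work for us.
disk = [
    "file:/var/lib/xen/images/winxp.img,hda,w",
    "file:/var/lib/xen/images/winxp-install.iso,hdc:cdrom,r",
]

boot = "d"   # boot from the CD-ROM for installation; "c" boots the hard disk afterwards
vnc = 1      # expose the guest display over VNC
sdl = 0      # do not open a local SDL window
```

A file like this would typically be placed under /etc/xen and started with `xm create`, after which `boot` is switched to "c" once the installation has finished.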
After reading about the Intel Macintosh architecture, it became clear that the primary hardware difference between an x86 Mac and a standard PC lay in the boot process. While PCs still rely on variations of the same BIOS firmware introduced by IBM in the 1980s, Apple has chosen to use the newer standard, called the Extensible Firmware Interface, or EFI. Some of the benefits of EFI include the ability to extend it with drivers and a graphics mode. Apple has taken advantage of the platform's extensibility to customize its version of EFI, and also uses it to load a number of closed-source drivers. On start-up, after the firmware is initialized, the EFI code loads and the machine goes into protected mode and graphics mode. Next, EFI loads any custom drivers available, after which boot.efi is called. Boot.efi is then responsible for setting up the machine and loading the XNU kernel, after which execution is passed to the kernel.

EFI Layer on BIOS

It became clear to us that the major hurdle to getting OS X running would be the boot process. Unfortunately, EFI services are not provided by Xen. Rather, Xen provides an environment that mimics that of a BIOS to allow PC operating systems to boot. Intel, however, offers a free EFI development package that includes bootable floppy disk images which load an EFI environment on BIOS-based machines. We saw this as an opportunity to load EFI in Xen, at which point we could hopefully start the boot manager and launch OS X. This became the focus of our strategy. Although we could not boot the floppy image, we were able to repackage it as a CD-ROM image, which we were able to get running in Xen. As mentioned earlier, we encountered an intermittent (but extremely frequent) bug which froze the entire system whenever we ran this image under Xen.
Once we figured out that the problem was rooted in the version of Xen we were using, we were able to use the EFI shell, mount and unmount filesystems, load and unload drivers, and take advantage of other EFI functionality.

Drivers

The EFI package provided by Intel did not have any of Apple’s proprietary drivers, of course. Most of these were either unimportant to us, such as the code for the Apple startup sound, or at least not essential for booting the OS, such as the Bluetooth and AirPort drivers. However, Intel's EFI did not provide support for the HFS+ filesystem, so we had no way to access the bootloader, kernel, or other system files without Apple’s filesystem driver, which resides in the firmware of the Mac Mini and is loaded at startup. Using the EFI shell provided by rEFIt (on the Mac hardware, not in Xen), we were able to get a listing of all drivers in memory with their hex handles and names, one of which happened to be “hfsplus”. From the handle, we were able to get a descriptor which included the start and end locations in memory to which the driver entry referred. Our only means of getting the actual memory contents was via the EFI hex dump utility, so we used that to output the memory contents in ADDRESS/HEX/TEXT table format into a text file. Next, we wrote a Java program which stripped out the address fields, text fields, and various garbage characters. The program converted the hex characters back into numeric values and wrote them to a byte stream, which we saved as the hfsplus.efi file. When we added this file to our EFI boot image in Xen, we found that we could then load it as a driver from the EFI shell. Once the filesystem driver was loaded, we were at last able to mount the OS X partition and browse the files and directories under Xen.

Bootloader Woes

Sadly, this was the end of our success using this approach.
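As an illustration of the hex-dump conversion described under Drivers, here is a minimal sketch of the same idea. It is written in Python rather than the Java we used, and it is not our original program. The row layout it assumes (a hex address followed by a colon, hex byte pairs optionally split by a dash, and a text column enclosed in asterisks) is an assumption about the shell's dump format, and the names `dump_to_bytes` and `SAMPLE` are hypothetical.

```python
import re

# A data row is assumed to start with a hex address followed by a colon.
ROW = re.compile(r"^\s*[0-9A-Fa-f]{4,16}:(.*)$")
# A byte is assumed to be printed as exactly two hex digits.
HEX_BYTE = re.compile(r"\b[0-9A-Fa-f]{2}\b")


def dump_line_to_bytes(line: str) -> bytes:
    """Convert one 'ADDRESS: HEX *TEXT*' row to raw bytes; other rows yield b''."""
    match = ROW.match(line.rstrip("\r\n"))
    if match is None:
        return b""  # banner, prompt, or blank line
    # Keep only the hex column: cut the text column at the first '*',
    # then treat the mid-row dash as an ordinary separator.
    hex_part = match.group(1).split("*", 1)[0].replace("-", " ")
    return bytes(int(token, 16) for token in HEX_BYTE.findall(hex_part))


def dump_to_bytes(text: str) -> bytes:
    """Concatenate the bytes of every data row in a captured dump."""
    return b"".join(dump_line_to_bytes(line) for line in text.splitlines())


# Tiny made-up capture: a shell prompt, one full row, one partial row.
SAMPLE = """\
Shell> dump of driver image
  00000000: 4D 5A 90 00 03 00 00 00-04 00 00 00 FF FF 00 00  *MZ..............*
  00000010: B8 00 00 00                                      *....*
"""

recovered = dump_to_bytes(SAMPLE)
```

Writing the result with `open("hfsplus.efi", "wb").write(recovered)` would produce the binary file. Anchoring on the address column keeps stray shell output from being mistaken for data, which is the role the "garbage character" stripping played in our program.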
The boot.efi bootloader file, which loads the XNU kernel and starts OS X, simply would not run when we tried to launch it from our homegrown EFI environment. When called from the shell using the Mac's native EFI, the booter worked as advertised, but it was rejected as an unsupported file type when we tried using our own shell. Unlike much of the kernel code for Darwin, the boot.efi loader is closed source, to the chagrin of open-source purists, and there is no official, publicly available documentation on its workings. When we looked at the binary image of the file, it appeared to be composed of a header and two executable files concatenated together. When separated, neither of the two embedded binaries ran by itself, and the second one returned an error that 64-bit binaries are not supported (because our machine is 32-bit). This suggests that boot.efi either has some proprietary internal protection mechanism or is some kind of universal, or fat, EFI binary. After doing some reading online, it appears that Apple has created its own proprietary format that has extra security features enabled (AwkwardTV). Needless to say, our version of EFI does not know how to execute this file.

Faced with this challenge and a rapidly approaching deadline, we were forced to reevaluate our approach. We briefly considered writing our own bootloader in EFI, and even made an attempt using modified code taken from a sample bootloader provided in the Intel development package. Unfortunately, the learning curve for EFI development, coupled with our lack of knowledge about the entry points for the Mach kernel and about the requirements for linking drivers or configuring memory, made this too daunting a task for so little remaining time.

The Darwin Approach

After experiencing this setback we decided to shift our line of attack.
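The observation that boot.efi looks like a header followed by two concatenated executables can be checked mechanically. The sketch below is not the procedure we followed and says nothing about Apple's container format. It only relies on the fact that EFI executables use the PE/COFF image format, in which an image starts with "MZ", stores the offset of its "PE\0\0" signature at byte 0x3C, and records the target machine in the two bytes after that signature. The function names and the synthetic test blob are hypothetical.

```python
import struct

# COFF machine identifiers for the two architectures relevant here.
MACHINE_NAMES = {0x014C: "i386 (32-bit)", 0x8664: "x86-64 (64-bit)"}


def find_pe_images(blob: bytes):
    """Yield (offset, machine) for every plausible PE image embedded in blob."""
    pos = blob.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(blob):
            # e_lfanew: little-endian offset of the PE signature, relative to 'MZ'.
            (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
            sig = pos + e_lfanew
            if blob[sig:sig + 4] == b"PE\0\0" and sig + 6 <= len(blob):
                (machine,) = struct.unpack_from("<H", blob, sig + 4)
                yield pos, machine
        pos = blob.find(b"MZ", pos + 1)


def _fake_pe(machine: int) -> bytes:
    """Build a minimal stand-in: DOS header stub, PE signature, 20-byte COFF header."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # signature directly follows the stub
    return bytes(dos) + b"PE\0\0" + struct.pack("<H", machine) + bytes(18)


# Synthetic container: a 16-byte placeholder header, then two images back to back.
container = bytes(16) + _fake_pe(0x014C) + _fake_pe(0x8664)
images = list(find_pe_images(container))

for offset, machine in images:
    print(f"image at offset {offset}: {MACHINE_NAMES.get(machine, hex(machine))}")
```

Run against a real file, a report of one 32-bit and one 64-bit image would be consistent with the fat-binary reading, and would also explain why the second piece was refused on a 32-bit machine.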
We noted that the more successful attempts to run OS X on bare-metal PCs did not use EFI at all, but appeared to have replaced the proprietary Apple boot environment and OS X kernel with the bootloader and kernel from the (discontinued) OpenDarwin project. Although we could not get any of these modified OS X images to boot under Xen, we decided to try getting the x86 Darwin distribution to run, first on bare metal and then in a VM. Once we got this running, we reasoned, we could try using OpenDarwin as a replacement core to boot Mac OS.

It turned out that both available distributions of Darwin, OpenDarwin’s and Apple’s, are remarkably fragile. We tried installing on four different machines and failed on every one (OSx86 Project). Eventually, we learned that Darwin does not support recent hard drives, so we bought an IDE drive, on which we were finally able to install Darwin. Even so, getting it to boot required a fair amount of trickery, because the bootloader reported partition errors. We ended up having to edit the MBR partition table with fdisk to set the correct partition type and force the partition to be primary.

Booting on a PC and booting under Xen, however, were two different propositions. When booting under Xen, the bootloader reported the same partition errors that it had on bare metal. Unlike bare metal, however, Xen has no concept of an MBR which can be edited, so we were stuck. We tried installing GRUB and LILO on the disk, then booting with those boot managers under Xen and using them to start Darwin, but Xen does not support virtualized boot managers. The last thing we tried was to start Darwin using pygrub, the Python-based GRUB emulator used by Xen as a boot manager. Unfortunately, pygrub has very limited file system support, and could not recognize Darwin installed with HFS+ or UFS. Sadly, we had run out of time, and had to admit defeat.
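The partition-table edits mentioned above operate on a small, fixed structure, which the following sketch decodes. It assumes the classic MBR layout: a 512-byte sector ending in the bytes 0x55 0xAA, with four 16-byte primary entries starting at offset 446. The type identifiers in the table are the values that fdisk commonly lists for Darwin and HFS+ partitions. Which of them the Darwin bootloader actually expects is not something we verified, and the helper names and sample sector are hypothetical.

```python
import struct
from typing import List, NamedTuple

# Partition type bytes as commonly listed by fdisk (illustrative subset).
PARTITION_TYPES = {
    0x83: "Linux",
    0xA8: "Darwin UFS",
    0xAB: "Darwin boot",
    0xAF: "HFS / HFS+",
}


class Partition(NamedTuple):
    index: int        # slot 0-3 in the primary table
    bootable: bool    # status byte 0x80 marks the active partition
    type_id: int      # partition type byte
    first_lba: int    # first sector of the partition
    sectors: int      # length in sectors


def parse_mbr(sector: bytes) -> List[Partition]:
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xAA":
        raise ValueError("missing 0x55AA boot signature")
    partitions = []
    for i in range(4):
        entry = sector[446 + 16 * i:446 + 16 * (i + 1)]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), LBA start, size
        status, type_id, first_lba, sectors = struct.unpack("<B3xB3xII", entry)
        if type_id != 0x00:  # type 0 marks an unused slot
            partitions.append(Partition(i, status == 0x80, type_id, first_lba, sectors))
    return partitions


# Made-up sector with one active Darwin UFS partition starting at sector 63.
sample = bytearray(512)
struct.pack_into("<B3xB3xII", sample, 446, 0x80, 0xA8, 63, 1_000_000)
sample[510:512] = b"\x55\xAA"
table = parse_mbr(bytes(sample))

for part in table:
    label = PARTITION_TYPES.get(part.type_id, "unknown")
    print(f"slot {part.index}: type 0x{part.type_id:02X} ({label}), active={part.bootable}")
```

Reading the first 512 bytes of a disk image and passing them to `parse_mbr` shows at a glance whether the type byte and active flag match what a bootloader is looking for, which is the information we were adjusting by hand in fdisk.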
Compiling and Replacing Kernel from Darwin Source

Almost all available instructions for compiling the Darwin kernel target the PPC version, including Apple’s Kernel Programming Guide (Apple Developer Connection) and the Mac OS X Internals book (Singh). While the steps are basically the same for x86, most of the published instructions do not work as written. The best instructions came from Mac OS X Internals; following Apple’s instructions did not produce a working kernel. With minor changes, compiling the kernel for x86 is very simple. The most important part of compiling the kernel is matching the source code version with the version of Apple's Xcode developer tools and, by extension, the gcc version. Xcode defaults to gcc 4.0.1 with the option of installing 3.3. For the PPC version, gcc 3.3 must be used. For the Intel version, 4.0.1 is used for compiling everything. Even though the Intel Xcode installer gives the option of installing gcc 3.3, installing it only creates the directories without the gcc binaries. In fact, there is no Intel version of Apple’s gcc 3.3 compiler available. Once compiled, the new mach_kernel can replace the old one by simply copying it to the root directory and restarting the machine.

Future Work

The main problems with running Mac OS X in Xen stem from bootloading, so this is where most of the future work needs to be done. The most direct approach would be to write our own EFI bootloader. Intel’s TianoCore (http://www.tianocore.org) is an open-source EFI development environment with a compiler, linker, and EFI emulator. Whether we can understand boot.efi depends on whether Apple decides to open-source it. Knowing more about Apple’s additions to its firmware on top of what Intel provides would give insight into what else the EFI shell in Xen needs in order to boot Mac OS X. Running the open-source Darwin distribution in Xen is probably a more. The GRUB bootloader can boot Darwin on bare metal, because it supports the HFS+ file system that Darwin and Mac OS X use.
Xen has a GRUB emulator for its guest OSes, but unfortunately, this implementation does not support HFS+. Adding HFS+ support to Xen’s GRUB emulation would be a fruitful step towards running Darwin as a guest OS. Once Darwin can be booted in Xen, Mac OS X may not be far behind. The Darwin kernel is compatible with the one distributed with OS X, so the Darwin bootloader could be used to boot Mac OS X, as it has been in a popular VMware image.

Conclusion

Beyond the sheer scope of this project, which we frankly did not anticipate, we have encountered a number of other hurdles. The first is the intentional obscurity of Apple’s x86 environment. Although much of the OS X kernel is nominally open source, perhaps as a legal requirement of using Mach and BSD code, enough of the key components are closed to cause many problems. Additionally, build instructions are missing, and makefiles only exist for individual components, not the whole Darwin project. It seems as though Apple has intentionally made it as difficult as possible to use its “open source” code without actually fully closing the source (Yager).

The second challenge is the constantly changing and poorly documented nature of Xen. We could not find an appropriate guide for the latest version of Xen that dealt with the kinds of development issues we faced. The documentation we did find, for instance on editing configuration files, was often completely out of date and invalid even though it was less than one year old. Additionally, Xen is a work in progress, and many problems we encountered are listed on the “To do eventually” page of the Xen project website.

Lack of documentation is a major issue we faced from all sides. Whenever we got stuck, which was often, we ended up scouring Google, bulletin boards, and mailing lists for any kind of information, clues, or even rumors of what the solution could be. As might be expected, this led to more dead ends than actual fixes, and was enormously time-consuming.
In a way, this was two projects. One was learning OS X and getting it running on a PC environment. The other was learning Xen and getting an operating system to operate as its guest. Both were substantial undertakings that required a good deal of hackery. Together, they made for a complex and error-prone enterprise. In the process of attempting this goal, we got our hands extremely dirty, we did some hacking that we are quite proud of, and we got a better understanding of the workings of a modern OS and a modern VMM.

References:

Creasy, R. 1981. The origin of the VM/370 time-sharing system. IBM J. Res. Develop. 25, 5, 483-490.

Bugnion, E., Devine, S., Govil, K., and Rosenblum, M. 1997. Disco: running commodity operating systems on scalable multiprocessors. ACM Trans. Comput. Syst. 15, 4 (Nov. 1997), 412-447.

Whitaker, A., Shaw, M., and Gribble, S. D. Denali: Lightweight Virtual Machines for Distributed and Networked Applications. Technical Report 02-02-01, University of Washington, 2002.

Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., and Warfield, A. 2003. Xen and the art of virtualization. In Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (Bolton Landing, NY, USA, October 19 - 22, 2003). SOSP '03. ACM Press, New York, NY, 164-177.

Dong, Y., Li, S., Mallick, A., Nakajima, J., Tian, K., Xu, X., Yang, F., Yu, W. Extending Xen with Intel Virtualization Technology, Intel Technology Journal, Volume 10, Issue 3, 2006.

Rashid, R., Julin, D., Orr, D., Sanzi, R., Baron, R., Forin, A., Golub, D., Jones, M. Mach: A System Software Kernel. COMPCON Spring '89, San Francisco, CA, March 1989.

Lee, G., Gray, C. L4/Darwin: Evolving Unix. Conference for Unix, Linux and Open Source Professionals, Melbourne, Vic, Australia, October, 2006.

Kelly, I. Final Year Project: Porting MINIX to Xen. University of Limerick, Ireland, 2006.
Boot.efi Information, AwkwardTV Wiki, http://wiki.awkwardtv.org/wiki/Boot.efi_Information

Klein, Mark, Xen on Intel Mac-Mini, http://www.scl.ameslab.gov/Projects/minixen/index.html

Building your first Kernel, Kernel Programming Guide, Apple Developer Connection, http://developer.apple.com/documentation/Darwin/Conceptual/KernelProgramming/build/chapter_18_section_3.html

Singh, Amit, Mac OS X Internals: A Systems Approach, Addison Wesley Professional, June 19, 2006

Yager, Tom, Apple Closes Down OS X, http://www.macworld.co.uk/news/index.cfm?NewsID=14663&Page=1&pagePos=8

Xen User's Manual, XenSource, http://www.xensource.com/files/xen_user_manual.pdf

Hardware Compatibility List, OSx86 Project, http://wiki.osx86project.org/wiki/index.php/HCL