Xeniac Project Report

Paul Craig Gazzillo, Arthur Meacham, Jonathan Ross
This paper describes a project we have named Xeniac, which has been an attempt
to host Mac OS X and Darwin in the Xen virtual environment. While our initial goal has
turned out to be elusive, we have made some promising strides towards achieving it. The
project has been a tumultuous one, and we provide something of a technical narrative to
describe the challenges we faced and how we dealt with them. Finally, we detail the
work that remains, suggest possible ways to approach it, and discuss our conclusions.
Related Work
Our project modifies an operating system so that it can run in the Xen virtual
environment. There is a good deal of prior work on virtualization in general and Xen in
particular, as well as on porting operating systems to run in such environments.
Additionally, there has been some published work regarding Darwin and especially the
kernel on which it was built, Mach.
The concept of virtualization has a long history, dating back to research at IBM in the
1960s (Creasy). More recently, there has been a resurgence of interest in virtualization,
with a number of research projects investigating methods of hosting multiple VMs on a
single machine. One project, named Disco (Bugnion) helped repopularize the idea with a
fully-virtualized environment for running multiple commodity operating systems. The
designers of Disco achieved full virtualization by using a strategy of direct execution of
all code, except for privileged calls which were replaced with kernel traps and handled by
the VMM. The overhead associated with trapping to the VMM for every system call was
significant, but Disco nonetheless became the basis of a successful commercial product,
VMware. Running Mac OS X in Xen instead would give us the performance and scalability
that Xen provides.
Another project, Denali (Whitaker), introduced the concept of paravirtualization. A
paravirtualized VMM does not precisely mimic the behavior of the underlying hardware
and can achieve much better performance than full virtualization. Although Denali
introduced this concept, it was a highly simplified system and was not intended to host
large commodity operating systems. Xen (Barham), on the other hand, took the concept
of Denali and created a paravirtual environment that could host modified x86 operating
systems such as Linux and Windows. Recent changes to the Intel design have made full
virtualization support better for unmodified kernels (Dong). However, to be properly
hosted by Xen, an operating system must be modified to run at a lower privilege level,
and changes must be made to memory management, I/O and other subsystems. This is
what we intend to accomplish with Darwin.
The Darwin kernel is built on Mach (Rashid). It adds a BSD UNIX interface and a
mechanism for linking extensions to the kernel at run-time. Despite its Mach roots,
Darwin is not the microkernel it used to be. BSD interfaces are implemented at the
kernel level and Apple adds the I/O Kit, an object-oriented device-driver framework that
was not in the original Mach kernel. Some OS subsystems still run in user space, though,
such as the file system, and the virtual memory model the Mach microkernel team
created is still used by Mac OS X.
Darwin has been ported to run in different environments already. For example, Darbat
(Lee) is a port of the Darwin kernel to L4, a microkernel. This approach creates separate
L4 processes for XNU, the Darwin kernel, and I/O Kit, the object-oriented device driver
interface. Both Darbat and our project paravirtualize Darwin as a guest OS, which
requires changes to the kernel, but Xen differs from L4 in that it is strictly a virtual
machine monitor.
Finally, a crucial type of related work is that of other operating systems that have been
ported to run on Xen. One especially interesting example in this category was a port of
the Minix operating system to Xen by a student who documented the process as part of an
academic project (Kelly).
Xen 3.0
The most recent version of Xen makes use of new hardware-supported
virtualization technologies, known as Vanderpool on Intel and Pacifica on AMD
(XenSource). These extensions add a new bit to the processor status register, indicating
whether or not the CPU is operating in virtualized mode. When this bit is set, all
privileged instructions trap to the VMM, allowing it, for instance, to register a handler for
protected mode operations such as altering the page table. When all ring 0 operations are
trapped to Xen under this new model, an operating system no longer needs to be
paravirtualized to run, provided Xen correctly initializes the domain's memory space, i.e.
by providing firmware emulation. Another difference is that custom drivers no longer
need to be written for block devices, memory management, and networking. However,
speed increases can be achieved if these are customized.
Despite the clear improvements of Xen 3.x, we have encountered a number of
problems. The first, seemingly innocuous, problem arose when we wrote a test boot
loader program in assembly language to display a text message and exit when a disk
image was “booted”. The resulting output was garbled despite our belief that the code
was correct. This was our first clue that the boot environment provided by Xen was not
entirely consistent with that of a hardware PC. It turned out that there are a number of
places where Xen has been hacked to work in the most common cases such as running
Windows or Linux, but is not truly consistent with an x86 PC.
Additionally, perhaps because Xen is an open-source project under active and
heavy development, we encountered major bugs even in supposedly stable, release
versions of the software. For instance, the Xen kernel distributed with the latest version
of Fedora became unstable and crashed the entire system whenever we booted Intel's EFI
toolkit under Xen. After a great deal of hair-pulling, we were finally able to resolve the
problem by switching to an older version of Xen.
Step 1: Getting started with Xen.
Our first task was to get Xen running and start virtualizing. We opted to use Red
Hat's Fedora Core 6 distribution as a Xen development environment because it has
built-in Xen support. Our first goal was to get Windows XP running in a VM. We had some
initial disappointments, since the GUI tools provided by Xen did not seem to work
correctly, and since it turned out that Xen was unable to boot from physical CD-ROMs
because it does not fully support the format. However, with the help of Google and
various developer forums, we were able to install XP from an ISO image and configure
Xen by editing the configuration files. Voilà, we had XP running in a Linux window!
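For readers unfamiliar with this step, the following is a minimal sketch of what such
a hand-edited configuration might look like. Xen 3.0.x domain configuration files use
Python assignment syntax, and the option names below follow the sample HVM
configuration distributed with Xen 3.0.x as we recall it; they vary between releases.
The domain name, image paths, and memory size are hypothetical and are not taken from
our actual setup.

```python
# Hypothetical xm domain configuration for a fully virtualized (HVM) guest.
# All names, sizes, and paths here are illustrative assumptions.
name = "winxp"                                # hypothetical domain name
builder = "hvm"                               # hardware-assisted full virtualization
kernel = "/usr/lib/xen/boot/hvmloader"        # HVM loader; path varies by distribution
device_model = "/usr/lib/xen/bin/qemu-dm"     # device emulator; path varies by distribution
memory = 512                                  # guest RAM in megabytes

disk = [
    "file:/var/lib/xen/images/winxp.img,hda,w",        # hypothetical guest disk image
    "file:/var/lib/xen/images/winxp.iso,hdc:cdrom,r",  # hypothetical installer ISO
]

boot = "d"   # try the CD-ROM first while installing; switch to "c" afterwards
vnc = 1      # expose the guest display over VNC
sdl = 0      # do not open an SDL window
```

Attaching the installer as an ISO file rather than a physical drive sidesteps the
CD-ROM problem described above.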
Step 2: Getting started with Mac OS
Once we acquired a Mac Mini, we began experimenting with the boot process and
multiboot options. We first tried using Apple's Boot Camp loader. This allowed us to
install Windows and OS X on the same disk, but gave us very little control over or insight
into the lower level details of the machine. Eventually, we found an open-source project
called rEFIt, which provides a GRUB-like boot manager, and which enables the user to
access an EFI shell (Klein). This was useful because Apple provides a very limited boot
manager, even with Boot Camp, and the Mac loader does not allow access to the EFI
shell. Using this, we were able to triple-boot OS X, Windows, and Linux on the same
machine. From Linux, we could then run virtualized XP and other OSes in the Xen
environment, giving us the distinct pleasure of using Windows within Linux on an
EFI-based Macintosh.
After we read about the Intel Macintosh architecture, it became clear that the
primary hardware difference between an x86 Mac and a standard PC lay in the boot process.
While PCs still rely on variations of the same BIOS firmware introduced by IBM in the
1980s, Apple has chosen to use the newer standard, called the Extensible Firmware
Interface, or EFI. Some of the benefits of EFI include the ability to extend it with drivers
and a graphics mode. Apple has taken advantage of the platform's extensibility to
customize its version of EFI, and also uses it to load a number of closed-source drivers.
On start-up, after the firmware is initialized, the EFI code loads and the machine goes
into protected mode and graphics mode. Next, EFI loads any custom drivers available,
after which boot.efi is called. Boot.efi is then responsible for setting up the machine and
loading the XNU kernel, after which execution is passed to the kernel.
EFI Layer on BIOS
It became clear to us that the major hurdle to getting OS X running would be the boot
process. Unfortunately, EFI services are not provided by Xen. Rather, Xen provides an
environment that mimics that of BIOS to allow for PC operating systems to boot up.
Intel, however, offers a free EFI development package that includes bootable floppy disk
images which load an EFI environment on BIOS-based machines. We saw this as an
opportunity to load EFI in Xen, at which point we could hopefully start the boot manager
and launch OS X. This became the focus of our strategy.
Although we could not boot the floppy image, we were able to repackage it as a
CD-ROM image which we were able to get running in Xen. As mentioned earlier, we
encountered a transient (but extremely frequent) bug which froze the entire system
whenever we ran this image under Xen. Once we figured out that the problem was
rooted in the version of Xen we were using, we were able to use the EFI shell, mount and
unmount filesystems, load and unload drivers, and take advantage of other EFI
functionalities.
Drivers
The EFI package provided by Intel did not have any of Apple’s proprietary
drivers, of course. Most of these were either not especially important to us, such as the
code for the Apple startup sound, or at least not essential for booting the OS, such as the
Bluetooth and AirPort drivers. However, EFI did not provide support for the
HFS+ filesystem, so we had no way to access the bootloader, kernel, or other system files
without Apple’s filesystem driver, which resides in the firmware of the Mac Mini and is
loaded at startup.
Using the EFI shell provided by rEFIt (on the Mac hardware, not in Xen), we
were able to get a listing of all drivers in memory with their hex handles and names, one
of which happened to be “hfsplus”. From the handle, we were able to get a descriptor
which included the start and end locations in memory to which the driver entry referred.
Our only means of getting the actual memory contents was via the EFI hex dump utility,
so we used that to output the memory contents in ADDRESS/HEX/TEXT table format
into a text file. Next, we wrote a Java program which stripped out the address fields, text
fields, and various garbage characters. The program converted the hex characters back
into numeric values and wrote them to a byte stream which we wrote to the hfsplus.efi
file. When we added this file to our EFI boot image in Xen, we found that we could then
load it as a driver from the EFI shell. Once the filesystem driver was loaded, we were at
last able to mount the OS X partition and browse the files and directories under Xen.
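The conversion step can be sketched as follows. Our original program was written in
Java; this is a minimal Python sketch of the same idea, not the code we ran. It assumes
each dump line has the form "ADDRESS: HEX BYTES *TEXT*", with the text column starting
at the first '*' and an optional '-' between the eighth and ninth byte. That layout
and the function name parse_hex_dump are assumptions.

```python
import re

# A token counts as a byte only if it is exactly two hex digits.
HEX_BYTE = re.compile(r"^[0-9A-Fa-f]{2}$")


def parse_hex_dump(text):
    """Convert ADDRESS/HEX/TEXT dump lines back into raw bytes."""
    out = bytearray()
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and other noise without an address
        # Drop the address field (everything before the first ':').
        _, rest = line.split(":", 1)
        # Drop the text column, assumed to start at the first '*'.
        hex_field = rest.split("*", 1)[0]
        # Treat a '-' separator between byte groups as whitespace.
        for token in hex_field.replace("-", " ").split():
            if not HEX_BYTE.match(token):
                break  # stop at the first token that is not a byte
            out.append(int(token, 16))
    return bytes(out)
```

Writing the result of parse_hex_dump to a file opened in binary mode (for example,
one named hfsplus.efi) would reproduce the driver image, provided the dump covers
exactly the memory range reported for the driver.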
Bootloader Woes
Sadly, this was the end of our success using this approach. The boot.efi
bootloader file, which loads the XNU kernel and starts OS X, simply would not run when
we tried to launch it from our homegrown EFI environment. When called from the shell
using the Mac's native EFI, the booter worked as advertised, but it was rejected as an
unsupported file type when we tried using our own shell. Unlike much of the kernel code
for Darwin, the boot.efi loader is closed source, to the chagrin of open-source purists, and
there is no official, publicly available documentation on its workings. When looking at
the binary image of the file, it appeared that it was composed of a header and two
executable files concatenated together. When separated, neither of the two embedded
binaries ran by itself, and the second one returned an error that 64-bit binaries are
not supported (because our machine is 32-bit). This suggests that boot.efi either has
some proprietary internal protection mechanism or is some kind of universal, or fat, EFI
binary. From what we read online, it appears that Apple has created its own
proprietary format that has extra security features enabled (Awkward TV). Needless to
say, our version of EFI does not know how to execute this file.
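One way to examine such a file is to scan it for embedded executables. EFI
executables use the PE/COFF format: each image starts with the bytes "MZ", the 32-bit
value at offset 0x3C points to a "PE\0\0" signature, and the 16-bit machine field that
follows identifies the target CPU (0x014C for i386, 0x8664 for x86-64). The sketch
below relies only on those facts and makes no assumption about Apple's wrapper header.
The function name find_pe_images is hypothetical, and this is not the procedure we
actually used.

```python
import struct

# PE/COFF machine identifiers for the two architectures of interest.
MACHINE_NAMES = {0x014C: "i386 (32-bit)", 0x8664: "x86-64 (64-bit)"}


def find_pe_images(blob):
    """Return a list of (offset, machine) for each PE image found in blob."""
    found = []
    pos = blob.find(b"MZ")
    while pos != -1:
        # The DOS header is at least 0x40 bytes; e_lfanew lives at offset 0x3C.
        if pos + 0x40 <= len(blob):
            (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
            pe = pos + e_lfanew
            # A real image has the "PE\0\0" signature where e_lfanew points.
            if pe + 6 <= len(blob) and blob[pe:pe + 4] == b"PE\x00\x00":
                (machine,) = struct.unpack_from("<H", blob, pe + 4)
                found.append((pos, machine))
        pos = blob.find(b"MZ", pos + 1)
    return found
```

Looking up each reported machine value in MACHINE_NAMES would show whether a file
bundles a 32-bit and a 64-bit image, which would be consistent with the fat-binary
explanation above.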
Faced with this challenge and a rapidly approaching deadline, we were forced to
reevaluate our approach. We briefly considered writing our own bootloader in EFI, and
even made an attempt using modified code taken from a sample bootloader provided in
the Intel development package. Unfortunately, the learning curve for EFI development,
coupled with our lack of knowledge about the entry points for the Mach kernel and about
the requirements for linking drivers or configuring memory, made this too daunting a
task for so little remaining time.
The Darwin Approach
After experiencing this setback we decided to shift our line of attack. We noted
that the more-successful attempts to run OS X on bare-metal PCs did not use EFI at all,
but appeared to have replaced the proprietary Apple boot environment and OS X kernel
with the bootloader and kernel from the (discontinued) OpenDarwin project. Although
we could not get any of these modified OS X images to boot under Xen, we decided to
try getting the x86 Darwin distribution to run, first on bare metal and then in a VM. Once
we got this running, we reasoned, we could try using OpenDarwin as a replacement core
to boot Mac OS.
It actually turned out that both available distributions of Darwin, OpenDarwin’s
and Apple’s, are remarkably fragile. We tried installing on four different machines and
failed on every one (OSx86 Project). Eventually, we learned that Darwin does not
support recent hard drives, so we bought an IDE drive, on which we were finally able to
install Darwin. Even so, getting it to boot required a fair amount of trickery – the
bootloader reported partition errors. We ended up having to edit the MBR table with
fdisk to set the correct partition type and force the partition to be primary.
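To make concrete what we were editing, the sketch below decodes the partition table
of a 512-byte MBR. The layout is standard: four 16-byte entries start at offset 446,
with the boot flag in byte 0 and the type in byte 4 of each entry, and the signature
0x55 0xAA sits at offset 510. The type IDs listed are those fdisk reports for Darwin
and HFS volumes. The function name read_mbr_partitions is hypothetical, and the sketch
only reads the table; it does not modify it.

```python
import struct

# Partition type IDs that fdisk lists for Apple/Darwin volumes.
DARWIN_TYPES = {0xA8: "Darwin UFS", 0xAB: "Darwin boot", 0xAF: "HFS/HFS+"}


def read_mbr_partitions(sector):
    """Decode the non-empty primary partition entries of a 512-byte MBR."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        raw = sector[446 + 16 * i:446 + 16 * (i + 1)]
        boot_flag, ptype = raw[0], raw[4]
        # Bytes 8-15 hold the starting LBA and the sector count, little-endian.
        start_lba, num_sectors = struct.unpack_from("<II", raw, 8)
        if ptype == 0:
            continue  # type 0 marks an unused slot
        entries.append({
            "index": i + 1,
            "active": boot_flag == 0x80,
            "type": ptype,
            "type_name": DARWIN_TYPES.get(ptype, "other"),
            "start_lba": start_lba,
            "sectors": num_sectors,
        })
    return entries
```

Passing the first 512 bytes of a disk or disk image to read_mbr_partitions shows at a
glance whether the Darwin partition is a primary entry with one of the expected types.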
Even so, booting on a PC and booting under Xen were two different propositions.
When booting under Xen, the bootloader reported the same partition errors that it had on
the bare metal. Unlike bare metal, however, Xen has no concept of an MBR which can
be edited, so we were stuck. We tried installing GRUB and LILO on the disk, then booting
with those boot managers under Xen and using them to start Darwin, but Xen does not
support virtualized boot managers. The last thing we tried was to start Darwin using
pygrub, the Python-based GRUB emulator used by Xen as a boot manager. Unfortunately,
pygrub has very limited file system support, and could not recognize Darwin installed
with HFS+ or UFS. Sadly, we had run out of time, and had to admit defeat.
Compiling and Replacing Kernel from Darwin Source
The instructions for compiling the Darwin kernel are almost all for the PPC version of
the kernel, including Apple’s Kernel Programming Guide (Apple Developer Connection)
and the Mac OS X Internals book (Singh), and while the steps are basically the same on
x86, most of them do not work as written. The best instructions came from the Mac OS X
Internals book; Apple’s instructions did not produce a working kernel for us. With minor
changes, compiling the kernel for x86 is very simple. The most important part of
compiling the kernel is matching the source code version with the version of Apple’s
Xcode developer tools and, by extension, the gcc version. Xcode defaults to gcc 4.0.1
with the option of installing 3.3. For the PPC version, gcc 3.3 must be used. For the
Intel version, 4.0.1 is used for compiling everything. Even though the Intel Xcode
installer gives the option of installing gcc 3.3, installing it only creates the
directories without the gcc binaries. In fact, there is no Intel version of Apple’s
gcc 3.3 compiler available. Once compiled, the new mach_kernel can replace the old by
simply copying it to the root directory and restarting the machine.
Future Work
The main problems with running Mac OS X in Xen stem from bootloading, so this is
where most of the future work needs to be. The most direct approach would be to write
our own EFI bootloader. Intel’s TianoCore (http://www.tianocore.org) is an open-source
EFI development environment with a compiler, linker, and EFI emulator. How well we can
understand boot.efi, however, depends on whether Apple decides to open-source it.
Knowing more about Apple’s additions to its firmware, beyond what Intel provides, would
give insight into what else the EFI shell in Xen needs in order to boot Mac OS X.
Running the open-source Darwin distribution in Xen is probably a more feasible approach. The GRUB
bootloader can boot Darwin on bare metal, because it supports the HFS+ file system that
Darwin and Mac OS X use. Xen has a GRUB emulator for its guest OSes, but
unfortunately, this implementation does not support HFS+. Putting the work into
supporting HFS+ in Xen’s GRUB emulation would be a fruitful step toward running Darwin
as a guest OS. Once Darwin can be booted in Xen, Mac OS X may not be far behind. The
Darwin kernel is compatible with the one distributed with OS X, so the Darwin bootloader
could be used to boot Mac OS X, as it has been in a popular VMware image.
Conclusion
Beyond the sheer scope of this project, which we frankly did not anticipate, we have
encountered a number of other hurdles. The first is the intentional obscurity of Apple’s
x86 environment. Although much of the OS X kernel is nominally open source, perhaps
as a legal requirement of using Mach and BSD code, enough of the key components are
closed to cause many problems. Additionally, build instructions are missing, and
makefiles only exist for individual components, not the whole Darwin project. It seems
as though Apple has intentionally made it as difficult as possible to use its “open source”
code without actually fully closing the source (Yager).
The second challenge is the constantly changing and poorly documented nature of Xen.
We could not find an appropriate guide for the latest version of Xen that dealt with the
kinds of development issues we faced. Documentation we did find, for instance, for
editing configuration files, was often completely out of date and invalid even though it
was less than one year old. Additionally, Xen is a work in progress, and many problems
we encountered are listed in the “To do eventually” page of the Xen project website.
Lack of documentation is a major issue we faced from all sides. Whenever we got stuck,
which was often, we ended up scouring Google, bulletin boards, and mailing lists for any
kind of information, clues, or even rumors of what the solution could be. As might be
expected, this led to more dead-ends than actual fixes, and was enormously time-consuming.
In a way, this was two projects. One was learning OS X and getting it running on a PC
environment. The other was learning Xen and getting an operating system to operate as
its guest. Both were substantial undertakings that required a good deal of hackery.
Together, they made for a complex and error-prone enterprise. In the process of
attempting this goal, we got our hands extremely dirty, we did some hacking that we are
quite proud of, and we got a better understanding of the workings of a modern OS and a
modern VMM.
References:
Creasy, R. 1981. The origin of the VM/370 time-sharing system. IBM J. Res. Develop
25,5, 483-490.
Bugnion, E., Devine, S., Govil, K., and Rosenblum, M. 1997. Disco: running commodity
operating systems on scalable multiprocessors. ACM Trans. Comput. Syst. 15, 4 (Nov.
1997), 412-447.
A. Whitaker, M. Shaw, and S. D. Gribble. Denali: Lightweight Virtual Machines for
Distributed and Networked Applications. Technical Report 02-02-01, University of
Washington, 2002.
Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt,
I., and Warfield, A. 2003. Xen and the art of virtualization. In Proceedings of the
Nineteenth ACM Symposium on Operating Systems Principles (Bolton Landing, NY,
USA, October 19 - 22, 2003). SOSP '03. ACM Press, New York, NY, 164-177.
Dong, Y., Li, S., Mallick, A., Nakajima, J., Tian, K., Xu, X., Yang, F., Yu, W. Extending
Xen with Intel Virtualization Technology, Intel Technology Journal, Volume 10, Issue 3,
2006.
Rashid, R., Julin, D., Orr, D., Sanzi, R., Baron, R., Forin, A., Golub, D., Jones, M. Mach:
A System Software Kernel. COMPCON Spring '89, San Francisco, CA, March 1989.
Lee, G., Gray, C. L4/Darwin: Evolving Unix. Conference for Unix, Linux and Open
Source Professionals, Melbourne, Vic, Australia, October, 2006.
Kelly, I. Final Year Project: Porting MINIX to Xen. University of Limerick, Ireland,
2006.
Boot.efi Information, AwkwardTV Wiki,
http://wiki.awkwardtv.org/wiki/Boot.efi_Information
Klein, Mark, Xen on Intel Mac-Mini, http://www.scl.ameslab.gov/Projects/minixen/index.html
Building your first Kernel, Kernel Programming Guide, Apple Developer Connection,
http://developer.apple.com/documentation/Darwin/Conceptual/KernelProgramming/build
/chapter_18_section_3.html
Singh, Amit, Mac OS X Internals: A Systems Approach, Addison Wesley Professional,
June 19, 2006
Yager, Tom, Apple Closes Down OS X,
http://www.macworld.co.uk/news/index.cfm?NewsID=14663&Page=1&pagePos=8
Xen User's Manual, XenSource, http://www.xensource.com/files/xen_user_manual.pdf
Hardware Compatibility List, OSx86 Project,
http://wiki.osx86project.org/wiki/index.php/HCL