Comparison of Amoeba Vs Mach Operating Systems

Yousuf Surmust
Research Project: CS550 Fall 2001
Illinois Institute of Technology, Chicago, US.
Email: yousuf@surmust.com
Website: http://www.surmust.com

Abstract:
In this paper, I discuss two popular operating systems, Amoeba and Mach. The future of supercomputing lies in massively parallel computers containing thousands of powerful CPUs. To perform well, these parallel supercomputers will require operating systems radically different from current ones. Amoeba and Mach are compared and contrasted in various areas, such as process management, memory management, and communication.

Keywords: Operating systems, Distributed systems, Computer Networks, Microkernel, Amoeba, Mach.

TABLE OF CONTENTS:
ABSTRACT
INTRODUCTION
  HISTORY OF AMOEBA
  HISTORY OF MACH
BASICS OF AMOEBA AND MACH OPERATING SYSTEMS
  AMOEBA
  MACH
DESIGN GOALS
  AMOEBA
  MACH
ARCHITECTURE
  AMOEBA
  MACH
CURRENT HARDWARE
COMPARISON OF MICROKERNELS
COMPARISON OF PROCESS MANAGEMENT
COMPARISON OF MEMORY MANAGEMENT
COMPARISON OF COMMUNICATION
CONCLUSION
ACKNOWLEDGEMENTS
REFERENCES

Introduction:
Roughly speaking, we can divide the history of modern computing into three eras:
1970s: Timesharing (1 computer with many users)
1980s: Personal computing (1 computer per user)
1990s: Parallel computing (many computers per user)

Until about 1980, computers were huge, expensive, and located in computer centers. Most organizations had a single large machine. In the 1980s, prices came down to the point where each user could have his or her own personal computer or workstation.
These machines were often networked together, so that users could do remote logins on other people’s computers or share files in various ways. Nowadays some systems have many processors per user, either in the form of a parallel computer or a large collection of CPUs shared by a small user community. Such systems are usually called parallel or distributed computer systems.

History of Amoeba:
A group under the direction of Prof. Andrew S. Tanenbaum at the Vrije Universiteit (VU) in Amsterdam (The Netherlands) has been doing research since 1980 in the area of distributed computer systems. This research, partly done in cooperation with the Centrum voor Wiskunde en Informatica (CWI, the Center for Mathematics and Computer Science), also in Amsterdam, has resulted in the development of a new distributed operating system, called Amoeba, designed for an environment consisting of a large number of computers. Amoeba is now being developed jointly at the two institutions. The chief goal of all this work is to build a distributed system that is transparent to the users. Amoeba is available free of charge to universities, government, and other educational institutions.

History of Mach:
Mach traces its ancestry to the Accent operating system developed at Carnegie Mellon University (CMU). Although Accent pioneered a number of novel operating system concepts, its utility was limited by its inability to execute UNIX applications and by its strong ties to a single hardware architecture, which made it difficult to port. Mach’s communication system and philosophy are derived from Accent, but many other significant portions of the system (for example, the virtual memory system, and task and thread management) were developed from scratch. An important goal of the Mach effort was support for multiprocessors.

Mach’s development followed an evolutionary path from BSD UNIX systems.
Mach code was initially developed inside the 4.2BSD kernel, with BSD kernel components being replaced by Mach components as the Mach components were completed. The BSD components were updated to 4.3BSD when that became available. By 1986, the virtual memory and communication subsystems were running on the DEC VAX computer family, including multiprocessor versions of the VAX. Versions for the IBM RT/PC and for SUN 3 workstations followed shortly. 1987 saw the completion of the Encore Multimax and Sequent Balance multiprocessor versions, including task and thread support, as well as the first official releases of the system, Release 0 and Release 1.

Through Release 2, Mach provides compatibility with the corresponding BSD systems by including much of BSD’s code in the kernel. The new features and capabilities of Mach make the kernels in these releases larger than the corresponding BSD kernels. Mach 3 moves the BSD code outside of the kernel, leaving a much smaller microkernel. This system implements only basic Mach features in the kernel; all UNIX-specific code has been evicted to run in user-mode servers. Excluding UNIX-specific code from the kernel allows replacement of BSD with another operating system, or the simultaneous execution of multiple operating system interfaces on top of the microkernel. In addition to BSD, user-mode implementations have been developed for DOS, the Macintosh operating system, and OSF/1. This approach has similarities to the virtual-machine concept, but the virtual machine is defined by software (the Mach kernel interface), rather than by hardware. As of Release 3.0, Mach became available on a wide variety of systems, including single-processor SUN, Intel, IBM, and DEC machines, and multiprocessor DEC, Sequent, and Encore systems.

Mach was propelled into the forefront of industry attention when the Open Software Foundation (OSF) announced in 1989 that it would use Mach 2.5 as the basis for its new operating system, OSF/1.
The initial release of OSF/1 occurred a year later, and it now competes with UNIX System V, Release 4, the operating system of choice among UNIX International (UI) members. OSF members include key technological companies such as IBM, DEC, and HP. Mach 2.5 is also the basis for the operating system on the NeXT workstation, the brainchild of Steve Jobs, of Apple Computer fame. OSF is evaluating Mach 3 as the basis for a future operating-system release, and research on Mach continues at CMU, OSF, and elsewhere.

Basics of Amoeba and Mach operating systems:

Amoeba:
Amoeba is a distributed operating system designed to connect together a large number of machines in a transparent way. Its goal is to make the entire system look to the users like a single computer. The software consists of two parts: a microkernel and server processes. An Amoeba system consists of several hardware components, including a pool of processors where most of the work is done, terminals that handle the user interface, and specialized servers. All these machines normally run the same microkernel.

The microkernel has four primary functions:
1. Manage processes and threads.
2. Provide low-level memory management support.
3. Support communication.
4. Handle low-level I/O.

Amoeba has processes, just as most operating systems do. A process can have multiple threads of control, all of which share the process’s address space and resources. A thread is the active entity within a process. Each thread has its own program counter and its own stack, and executes sequentially.

One of the most distinctive features of Amoeba is that it is based on the idea of objects, each of which is named and protected by a 128-bit capability. When a process creates an object, the server managing the object returns a capability for that object. The capability contains bits telling which of the operations on the object the holder of the capability may perform. A typical capability is shown in Figure 1.
Figure 1: A typical capability

The Port field identifies the server. The Object field tells which object is being referred to, since a server normally will manage thousands of objects. The Rights field specifies which operations are allowed (e.g., a capability for a file may be read-only). Because capabilities are managed in user space by the processes themselves, and can be given away by their owners, the Check field is needed to protect them cryptographically, so that users cannot tamper with the rights. As a consequence, different users may have capabilities for the same object, but with different rights.

Mach:
The Mach operating system is designed to incorporate the many recent innovations in operating-system research to produce a fully functional, technically advanced system. Mach is based on five major concepts: processes, threads, ports, messages, and memory objects. A process, as in other systems, is a container for holding threads and other resources that are managed together. A thread is a lightweight process-within-a-process, as in Amoeba. A port is a mailbox that is used for communication. A message is a typed data structure that one thread can send to another thread’s port so the receiving thread can read it. Finally, a memory object is a coherent region of memory, all of whose words have certain shared properties and which can be manipulated as a whole.

Mach is an example of an object-oriented system, in which the data and the operations that manipulate that data are encapsulated into an abstract object. Only the operations of the object are able to act on the entities defined in it. The details of how these operations are implemented are hidden, as are the internal data structures. Thus a programmer can use an object only by invoking its defined, exported operations.
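As a rough illustration of the port and message concepts described above, the following Python sketch models a port as a mailbox and a message as a typed data structure that one thread sends and another thread receives. It is not the Mach kernel interface: the names Port, Message, type_tag, send, receive, and demo are hypothetical and chosen only for this example, and a thread-safe in-process queue stands in for the kernel-managed message storage.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Message:
    """A typed message: a type tag travels together with the payload."""
    type_tag: str   # hypothetical tag, e.g. "int" or "str"
    payload: Any


class Port:
    """A mailbox: senders enqueue messages, a receiver dequeues them."""

    def __init__(self) -> None:
        # The queue stands in for kernel-managed message storage.
        self._mailbox = queue.Queue()

    def send(self, msg: Message) -> None:
        """One-way send: returns as soon as the message is queued."""
        self._mailbox.put(msg)

    def receive(self, timeout: float = 1.0) -> Message:
        """Block until a message arrives (or the timeout expires)."""
        return self._mailbox.get(timeout=timeout)


def demo() -> List[Message]:
    """Send two messages from one thread to a receiver thread's port."""
    port = Port()
    received: List[Message] = []

    def receiver() -> None:
        for _ in range(2):
            received.append(port.receive())

    worker = threading.Thread(target=receiver)
    worker.start()
    port.send(Message("int", 42))
    port.send(Message("str", "hello"))
    worker.join()
    return received
```

The receiver can use the port only through its exported operations (send and receive); the queue inside it is hidden, which mirrors the encapsulation described in the text.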
The object-oriented approach supported by Mach allows objects to reside anywhere in a network of Mach systems, transparent to the user. The port mechanism makes all of this possible.

Design Goals:

Amoeba:
Three central design goals were set for the Amoeba distributed operating system:
Network transparency: All resource accesses were to be network transparent. In particular, there was to be a seamless system-wide file system, and processes were to execute at a processor of the system’s choosing, without the user’s knowledge.
Object-based resource management: The system was designed to be object based. Each resource is regarded as an object, and all objects, irrespective of their type, are accessed by a uniform naming scheme. Objects are managed by servers and can be accessed only by sending messages to those servers. Even when an object resides locally, it is accessed by a request to a server.
User-level servers: The system software was to be constructed as far as possible as a collection of servers executing at user level, on top of a standard microkernel that was to run on all computers in the system, regardless of their role.
An issue that follows from the last two goals, and to which the Amoeba designers paid particular attention, is that of protection. The Amoeba microkernel supports a uniform model for accessing resources using capabilities.

Mach:
Multiprocessor operation: Mach was designed to execute on a shared-memory multiprocessor, so that both kernel-mode threads and user-mode threads could be executed by any processor. It is designed to run on computer systems ranging from one to thousands of processors.
Operating system emulation: To support the binary-level emulation of UNIX and other operating systems, Mach allows for the transparent redirection of operating system calls to emulation library calls and thence to user-level operating system servers.
Flexible virtual memory implementation: Mach also provides the ability to layer emulations of other operating systems, and these can even run concurrently.
Portability: Mach was designed to be portable to a variety of hardware platforms. For this reason, machine-dependent code has been isolated as far as possible. In particular, the virtual memory code has been divided between machine-independent and machine-dependent parts.
Support for diverse architectures: Mach supports diverse architectures, including multiprocessors with varying degrees of shared memory access: Uniform Memory Access (UMA), Non-Uniform Memory Access (NUMA), and No Remote Memory Access (NORMA).
Simplified kernel structure with a small number of abstractions: These abstractions are in turn sufficiently general to allow other operating systems to be implemented on top of Mach.
Network transparency: Mach supports distributed operation, providing network transparency to clients and an object-oriented organization both internally and externally.
Integrated memory management and inter-process communication: Mach integrates memory management and IPC to provide both efficient communication of large amounts of data and communication-based memory management.
Heterogeneous system support: Support for heterogeneous systems makes Mach widely available and interoperable among computer systems from multiple vendors.
Compatibility with UNIX: Mach is better able to satisfy the needs of most users than other operating systems because it offers full compatibility with UNIX 4.3BSD.
Ability to work with varying inter-computer network speeds: Mach can function with varying inter-computer network speeds, from wide-area networks to high-speed local-area networks and tightly coupled multiprocessors.

Architecture:

Amoeba:
The Amoeba architecture consists of four principal components, described below.
Processor pool: A group of CPUs that can be dynamically allocated as needed, used, and then returned to the pool.
Workstations: Workstations are assigned one per user. On a workstation, the user can carry out editing and other tasks that require fast interactive response. All workstations are diskless and are used as intelligent terminals. Normally, an X terminal is the best choice for this.
Specialized servers: A number of different kinds of servers are available, such as directory servers, file servers, database servers, boot servers, and various other servers with specialized functions. Each server is dedicated to performing a specific function.
Gateways: Gateways are used to link Amoeba systems at different sites.

Figure 2: The Amoeba architecture

The core idea of the Amoeba architecture is that all the Amoeba machines run the same kernel, which primarily provides multithreaded processes, communication services, I/O, and little else. The kernel is kept as small as possible, both to enhance its reliability and to allow as much as possible of the operating system to run as user processes, which provides flexibility.

Mach:
The Mach architecture consists of a large multiprocessor, several small multiprocessors, and a number of workstations. A LAN was added to the architecture later. Mach has a concept of the home machine and never attempts to spread the load across machines. Workstations run the same kernel (binary image if appropriate) as high-end multiprocessors. Development and testing of multiprocessor applications can be done on a personal workstation. Workstations provide a user interface to mainframe uniprocessors and multiprocessors. Workstations may themselves be multiprocessors.

Figure 3: Mach Architecture

Current hardware:
Amoeba currently runs on the following platforms: SPARC (Sun4c and Sun4m), 386/486, 68030, Sun 3/50, and Sun 3/60.
At the Vrije Universiteit, Amoeba runs on a collection of 80 single-board SPARC computers connected by an Ethernet, forming a powerful processor pool.

Mach currently runs on:
MicroVAX I and II
VAX 11/750, 11/780, 11/785, 820x, 8300, 8600, 8650
VAX 11/784 (four 11/780s with 8 megabytes of shared memory)
VAX 11/789 (four 11/785s)
IBM RT/PC
SUN 3
Encore MultiMax

Comparison of Microkernels:
Amoeba is a complete distributed operating system constructed as a collection of user-level servers supported by the microkernel, while Mach is primarily a microkernel design geared towards the emulation of existing operating systems in a distributed system. Amoeba has a minimal kernel, while the Mach kernel provides enough functionality to support the widest possible range of applications. Mach has more than five times as many system calls as Amoeba.

Comparison of Process Management:
In parallel supercomputers, processes play an important role, since there are many of them and they must be allocated to processors, often dynamically. In many applications, process synchronization is a key issue. Finally, for fine-grained computation, it is frequently useful to have multiple threads of control in a single process. In this section we describe how these issues are dealt with in Amoeba and Mach.

Support for multiple threads per process: Both systems support processes with multiple threads per process. In both cases, the threads are managed and scheduled by the kernel, although user-level threads packages can be built on top of them.

Object model and capabilities: Amoeba is based on the object model, and has capabilities for processes, segments, and other kernel and user objects, providing an integrated naming and protection scheme for all objects in the entire system. Mach has capabilities only for ports. These are managed by the kernel in capability lists, one per process, and port capabilities are passed between processes in a controlled way.

Scheduling: Amoeba gives processes the choice of run-to-completion vs.
preemptive scheduling for its threads. Mach allows processes to determine the priorities and scheduling policies of their threads in software.

Multiprocessors: Both kernels run on multiprocessors, but they differ in how they use the CPUs.

UNIX emulation: Amoeba provides UNIX emulation via a library. Mach provides binary emulation of UNIX, MS-DOS, and other operating systems by catching system call traps and reflecting them to user-space emulators.

Thread synchronization: In Amoeba, thread synchronization is done with mutexes and semaphores. In Mach, it is done with mutexes and condition variables. The mutex primitives are lock, trylock, and unlock. The operations on condition variables are signal, wait, and broadcast.

Thread migration concept: Amoeba does not distribute the threads of a single process over the CPUs; they all run on the same processor. Instead, it is processes, not threads, that are spread over the CPUs. Mach, in contrast, allows fine-grained control of which threads are assigned to which CPUs, using the processor set concept. This mechanism allows true parallelism among the threads of a single process.

Comparison of Memory Management:
Memory management is one of the fundamental concerns of all operating systems, including those for parallel supercomputers. Depending on the CPU chip used, virtual memory may also be available, in which case it, too, must be managed. Memory management on parallel computers is not appreciably different from memory management on single-processor computers, but it is still important.

Address space: In Amoeba, a process can have any number of variable-length segments mapped into its virtual address space wherever it wants. Once mapped in, a segment can later be mapped out. Segments are controlled by capabilities, and can be read and written by any process holding the capabilities, including remote processes (for example, for debugging). Mach provides its user processes with a linear address space from 0 to some maximum address.
Within this address space, processes can define regions, which are ranges of addresses, and can map memory objects onto regions.

Paging: Amoeba does not support demand paging, so all of a process’s segments must be in memory when it is running. Having a process entirely in memory all the time makes communication faster, since no checks have to be made when an outgoing message spans virtual page boundaries. The absence of paging also makes the Amoeba system simpler and the kernel smaller and more manageable. In Mach, pages can be moved in and out of memory as space requires. A memory object need not be fully in memory to be used. When an absent page is touched, its external pager is told to find it and bring it in. This mechanism supports full demand paging.

Object-based system and page-based system: Amoeba supports an object-based system that allows variable-sized software objects to be shared using the kernel’s reliable broadcast mechanism. Mach has a network message server that supports a page-based system.

Distributed shared memory: Both systems support distributed shared memory, but they do it in different ways. Amoeba supports shared objects replicated on all machines. Objects can be of any size and support operations. Reads are done locally, and writes use reliable broadcast. Mach uses page-based distributed shared memory.

Comparison of Communication:
Communication is probably the true test of a parallel operating system. In a single-processor operating system, interprocess communication is internal, and is thus managed entirely locally. In a parallel supercomputer, interprocess communication means sending messages between machines over some kind of network. These messages must be sent reliably and with a minimum of overhead. Most of the communication protocols developed for conventional networking, such as TCP, OSI, and X.25, are too heavyweight for use in a parallel system.
In particular, the critical parameter here is not the bandwidth achievable for long transmissions, but the delay required for short messages.

A scheme called remote procedure call (RPC) potentially provides a simple, high-performance interprocess communication mechanism that retains most of the semantics of a local procedure call. Very briefly, the calling process calls a library procedure, the stub, that has the same interface and parameters as the procedure on the remote machine. The stub packs the parameters into a message and sends it to a complementary procedure on the remote machine, which unpacks the parameters and makes the actual call to the remote procedure. The reply goes the other way and unblocks the caller when it gets back. In this way, neither the calling nor the called procedure is aware that they are on different machines. This technique is widespread in distributed and parallel systems.

Forms of communication: Amoeba supports three forms of communication:
Unreliable one-way message passing
Reliable RPC
Reliable totally ordered group communication
Mach supports only one form of communication: reliable one-way message passing.

Message typing: Messages in Amoeba have a fixed part and a variable part and are untyped. Messages in Mach have only a variable part and are typed.

Message transmission: Amoeba uses a custom protocol called FLIP (Fast Local Internet Protocol) for actual message transmission. This protocol supports both RPC and group communication and is below them in the protocol hierarchy. In OSI terms, FLIP is a network layer protocol, whereas RPC is more of a connectionless transport or session protocol. Mach allows messages to be transmitted from one process to another using the copy-on-write mechanism. Amoeba does not have this. However, when sending messages between machines, copy-on-write is of no use; copying is always needed.

Ports: In Amoeba, messages are addressed to service addresses.
The receiving thread must do a RECEIVE call that provides the buffer directly in user space. In Mach, messages are sent to ports. These are data structures managed by the kernel that provide message storage. Mach supports port sets, although they are only for receiving, not sending. Mach’s ports are named by capabilities managed by the kernel and referred to by their indices in the kernel’s capability list. Only Mach has SEND-ONCE capabilities. Only Amoeba allows multiple replicated servers to listen to the same service address, so that requests are distributed stochastically over the servers.

Conclusion:
In this paper I have surveyed and compared two operating systems, Amoeba and Mach. I have found that the two microkernels are quite similar in various ways. Since the two systems were designed and implemented independently, it is surprising that the differences between them are not very big. Both Amoeba and Mach were tested extensively by their designers, who revised each system several times based on experience gained from earlier versions.

Here are some important points:
The Amoeba microkernel manages processes and memory, and handles communication.
At the lowest level, a process is started by generating a process descriptor and doing an RPC with the kernel thread responsible for starting new processes on the target machine. Higher-level services help with locating a suitable machine to run on.
Memory in Amoeba is based on the concept of segments, which can be mapped into and out of a process’s address space at arbitrary addresses.
Communication in Amoeba comes in three forms: unreliable one-way message passing, reliable RPC, and reliable totally ordered group communication.
Mach uses lightweight processes, in the form of multiple threads of execution within one task (or address space), to support multiprocessing and parallel computation.
Mach supports only one form of communication: reliable one-way message passing.
Mach has capabilities only for ports. These are managed by the kernel in capability lists, one per process, and port capabilities are passed between processes in a controlled way.
By providing low-level, or primitive, system calls from which more complex functions may be built, Mach reduces the size of the kernel while permitting operating-system emulation at the user level.

Acknowledgements:
I would like to thank Prof. Marius Soneru for his help with this research paper and Mr. Nadeem Surmust for his helpful comments.

References:
Web addresses:
http://www.cs.vu.nl/pub/amoeba/
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/mach/public/www/mach.html
http://www.scs.carleton.ca/~csgs/resources/mach_papers.html

Kaashoek, M.F., Tanenbaum, A.S., and Verstoep, K. 1993. Group Communication in Amoeba and its Applications. Distributed Systems Engineering J. 1, (July), pp. 48-58.
Mullender, S.J., Rossum, G. van, Tanenbaum, A.S., Renesse, R. van, and Staveren, H. van 1990. Amoeba: A Distributed Operating System for the 1990s. IEEE Computer Magazine, 23, 5 (May), pp. 44-53.
Tanenbaum, A.S., Kaashoek, M.F., Renesse, R. van, and Bal, H. 1991. The Amoeba Distributed Operating System - A Status Report. Computer Communications, 14, 4 (July-Aug.), pp. 324-335.
Tanenbaum, A.S., Renesse, R. van, Staveren, H. van, Sharp, G.J., Mullender, S.J., Jansen, J., and Rossum, G. van 1990. Experiences with the Amoeba Distributed Operating System. Commun. of the ACM 33, 12 (Dec.), pp. 46-63.
Tanenbaum, A.S., and Kaashoek, M.F. 1994. The Amoeba Microkernel. IEEE Computer.
Baron, R., Rashid, R., Siegel, E., Tevanian, A., and Young, M. 1985. Mach-1: An Operating Environment for Large-Scale Multiprocessor Applications. IEEE Software, 2, 4 (July), pp. 65-67.
Tanenbaum, A.S. 1995. A Comparison of Three Microkernels. J. Supercomput. 9, 1/2, pp. 7-22.
Black, D.L., Golub, D.B., Julin, D.P., Rashid, R.F., Draves, R.P., Dean, R.W., Forin, A., Barrera, J., Tokuda, H., Malan, G., and Bohman, D. 1992. Microkernel Operating System Architecture and Mach. In Proc. USENIX Workshop on Microkernels and Other Kernel Architectures, USENIX Association, pp. 11-30.