Systems Area Qualifier Written Exam (2008 Fall)

You should choose 6 out of 9 questions in this written exam. Good luck!

Question 1. Distributed system design

Consider an RPC subsystem.

a) Give a breakdown of the costs involved in performing an RPC in an RPC subsystem.

b) As a system designer, what avenues are available to you for shaving each of the component costs you identified in part (a)? In discussing such avenues, you have to clearly state what assumptions you are making about the execution environment (OS and hardware) to shave the costs, and the pros and cons of your design choices.

c) Given today’s multi-core platforms, come up with at least one new way of reducing RPC cost that you have not seen in prior RPC papers. Describe and defend it.

Question 2. OS structures

a) "Microkernel-based OSs will always be less efficient than monolithic kernels." Based on your understanding of OS structures, is this statement True or False? Explain your answer with technical detail about some specific microkernel design.

b) To what extent has the idea of configurable operating system kernels (like SPIN) influenced commercial operating systems? Compare that with the influence of microkernels (e.g., L3) or ‘thin’ OSs (e.g., Exokernel). (Note: For the sake of this question, any flavor of Unix and Microsoft Windows fall under commercial OS.) You have to clearly explain how they have or have not been influential and back such statements with facts and reasons.

Question 3. Event Processing in Distributed Systems

Researchers in distributed systems have posited that it may not be necessary to precisely know the causal relationships between events in different processes, but instead, it is sufficient to understand which set of events is concurrent (i.e., events that could not have influenced each other and therefore could not be the ones to cause bugs or give rise to race conditions).
Are algorithms that determine whether or not events are concurrent less complex, more complex, or equally complex to implement compared with algorithms that determine causality? Elaborate your answer. Hint: a good way to start thinking about this point may be to draw time diagrams for sets of representative events.

Question 4. Parallel Systems

Multicore platforms have become unavoidable. An issue with future platforms with hundreds of cores is how to program/organize them in order to attain high levels of performance. Approaches advanced in the literature include i) forming multiple ‘cells’ comprised of smaller numbers of cores, where each cell has independent failure properties and is isolated from other cells in terms of performance, ii) adding hardware features like token busses for inter- and/or intra-cell coordination, and others. In this question, you are asked to design a programmable collective communication/computation construct that can be used for cross-cell program coordination, the assumption being that within a single cell we use standard coordination (e.g., synchronization) primitives but across different cells we use message-like synchronization using your construct.

a) Describe the design of your programmable coordination construct.

b) Illustrate its use with a simple example of a hypervisor-level service (i.e., the HV enforces and manages cells). Hint: consider scheduling.

c) Speculate on useful hardware support for your construct.

Question 5. Virtualization

A key problem with system virtualization is I/O. Some recent proposals have resurrected IBM’s idea of channels and channel processors to ‘fix’ some of the issues with I/O. This question explores that solution.

a) Using a standard Xen system, explain why and to what extent I/O is a performance issue. Make sure you discuss both full and para-virtualization solutions to I/O.

b) Given I/O channels, what capabilities do they have to have in order to address the I/O problems you have identified in (a)?
Use this description to define your notion of I/O channel.

c) Speculate on the hardware support needed to improve the viability of this approach for standard IA-based architectures. (Hints: Intel’s VT architecture is one start on this. IBM’s z system has lots of support for this.)

Question 6. Real Time Systems

Real-time systems have time constraints that impose deadlines on task completion; consequently, the allocation of system resources must take the task deadlines into account.

a) Explain the differences between hard real-time and soft real-time constraints.

b) Give an example of a CPU scheduler that can achieve hard real-time guarantees. You are required to elaborate your example by answering the following questions:
i) Describe how the scheduler works.
ii) Give a detailed explanation of the assumptions made by the scheduler.
iii) Outline the performance of the scheduler in terms of worst-case achievable utilization and discuss its overhead.

c) Explain the difficulties that may arise when trying to apply the scheduler you described in (b) to a distributed system with heterogeneous network connections.

Question 7. Replication and Fault-Tolerance in Distributed Systems

Distributed state sharing is an important capability in distributed file systems with replication or distributed shared memory systems. Consider the following problem, which arises when we are concerned about when and how often certain state is accessed by various nodes of the distributed system. Assume that in addition to reading and writing common state, the users are also allowed to query when and on what nodes certain objects were read or written. The returned information may be a per-node count or more detailed information on resource utilization, including timestamps. This information may be used to detect anomalous access to objects, to recover some state information after a node failure, and so forth. We want any node to be able to access this information even if some other nodes fail.
You are asked to develop an algorithm for making such information available to certain users under the following conditions:

(a) Node failures are not considered, but consistent results (correct number of reads and writes) must be provided.

(b) Nodes may experience crash failures, and results returned may be stale or may only reflect operations that executed at non-faulty nodes.

(c) Nodes can experience Byzantine failures. In this case, returned results must be correct with respect to all reads and writes that happen at non-faulty nodes.

You may have to limit the number of failures in cases (b) and (c). If you do, specify the maximum number of failures that can be tolerated. If a solution cannot be developed in the presence of a certain kind of failure, you need to explain why that is the case.

Question 8. Distributed File Systems

Network file systems such as the Andrew File System (AFS) and the World Wide Web both provide users access to remote files, but the two systems have very different user interfaces. In network file systems, the user only needs to mount the remote file system onto the local machine. Then he or she can access remote files just as if they were local. In WWW, a "browser" sends an "address" (a URL) of the file to a Web Server and displays the result to the user.

(a) Discuss the major differences between the two systems, and elaborate your answers. Hint: At a minimum, you need to discuss how the two systems differ in the granularity of file accesses, the semantics of caching remote files on local disks, and the handling of concurrent reads and writes of a file.

(b) Since the two systems provide different services to the end users, their implementations are naturally different. Based on your understanding of the main features of the AFS file system implementation, suggest an algorithm for implementing the WWW and discuss why your algorithm is better than the current WWW.
Hint: You do not need to describe how a browser implements the display of the Web documents, but you should suggest a framework for maintaining a cache of recently accessed documents to decrease network traffic and for making sure that the documents in the cache are up-to-date.

(c) If AFS were universally available, would that simplify the task of implementing the WWW? Elaborate your answer.

Question 9. Specialization

(Read the entire question before starting to answer.)

Program specialization is similar to partial evaluation. A generic program can be specialized when there are _invariants_ that make some sequence of instructions superfluous, since they always produce the same results at the end. The idea of program specialization is to replace the sequence of instructions with the result, thus improving program performance. The main difficulty in OS specialization is that there are _quasi-invariants_ that remain true almost all of the time, but that could be invalidated in rare situations.

(a) In their SOSP'95 paper, Pu et al. describe the specialization of the HP-UX file system. They specialize the read system call using the exclusive sequential access quasi-invariant, which is considered the most common case in Unix file systems. Give an example of a quasi-invariant in another OS module that may improve system performance through specialization. Hint: I/O subsystems are easier candidates.

(b) To guard against the rare cases when quasi-invariants may be invalidated, we need to insert _guards_ into the system when applying program specialization. For example, in the specialization of the HP-UX read system call, they inserted guards into the open system call, which may invalidate the exclusive access quasi-invariant when a second process opens the same file. A less obvious guard is inserted into the dup system call, which may produce the same result.
For the example quasi-invariant you gave in sub-question (a), give two examples of places where you need to guard against quasi-invariant invalidation. If you believe only one guard is necessary, present an argument that you don't need any other guards.
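
For reference, the guard mechanism described in Question 9 for the read example can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: it is not the HP-UX code or the method of Pu et al., and every name in it (ToyFile, Descriptor, toy_open, toy_dup, toy_read, guard, path_log) is hypothetical. It models a file whose read path is swapped between a specialized and a generic version, with guards in open and dup re-checking the exclusive access quasi-invariant.

```python
# Toy model of specialization with guards. All names are hypothetical.

path_log = []  # records which read path served each call


def generic_read(desc, n):
    """General path: stands in for code that handles sharing, locking, etc."""
    path_log.append("generic")
    chunk = desc.file.data[desc.offset:desc.offset + n]
    desc.offset += len(chunk)
    return chunk


def specialized_read(desc, n):
    """Fast path: assumed valid only while one reference to the file exists."""
    path_log.append("specialized")
    chunk = desc.file.data[desc.offset:desc.offset + n]
    desc.offset += len(chunk)
    return chunk


class ToyFile:
    def __init__(self, data):
        self.data = data
        self.refs = 0                   # references to this file
        self.read_impl = generic_read   # currently installed read path


class Descriptor:
    def __init__(self, file):
        self.file = file
        self.offset = 0


def guard(file):
    """Re-check the quasi-invariant and install the matching read path."""
    file.read_impl = specialized_read if file.refs == 1 else generic_read


def toy_open(file):
    file.refs += 1
    guard(file)            # guard in open: a second opener ends exclusivity
    return Descriptor(file)


def toy_dup(desc):
    desc.file.refs += 1
    guard(desc.file)       # guard in dup: another reference ends exclusivity
    return desc            # in this toy, the duplicate shares the offset


def toy_read(desc, n):
    return desc.file.read_impl(desc, n)


f = ToyFile(b"abcdefgh")
d1 = toy_open(f)
first = toy_read(d1, 3)    # served while the quasi-invariant holds
d2 = toy_dup(d1)
second = toy_read(d2, 3)   # served after the guard in dup has fired
```

In this toy both read paths compute the same result; the point is only where the guards sit and when the specialized path is uninstalled.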