STUDY GUIDE – FINAL EXAM

Introduction/Review
Be able to define system call, mode switch, and context switch. Know the purpose of each.
Be able to reproduce the process state transition diagram and describe the characteristics of each state and the kinds of events that cause state transitions.
Know the difference between deadlock and starvation.

Kernel Architectures & Virtual Machine Monitors
What is the motivation for developing microkernel and other extensible operating system architectures? (What problem do they address?)
Be able to briefly describe the main mechanism used in SPIN (extensions) and Exokernel (library operating systems) and explain how each mechanism addresses the problem mentioned above.
Be able to define virtual machine and virtual machine monitor.
Know the difference between full virtualization and para-virtualization.
What are the advantages of virtual machine technology?

Distributed Systems – Introduction and Architectures
What is middleware? How does it contribute to transparency (single system image) in distributed systems?
What are some techniques for adding transparency to a distributed system?
Discuss scalability in distributed systems and be familiar with various scaling techniques.
Be able to compare cluster computer systems and grid computer systems.
Be able to compare centralized (client/server) and decentralized (peer-to-peer) architectures for distributed systems.
What is the main difference between structured and unstructured P2P systems?
Identify Distributed Hash Tables and the Chord algorithm; understand their purpose.

Processes, Threads, Virtual Machines in Distributed Systems
Threads/lightweight processes: what are they, and why are they useful?
What are the benefits of a multithreaded server?
What is the role of virtualization in distributed systems?

Communication in Distributed Systems
What is the difference between reliable and unreliable communication?
What is the difference between a synchronous and an asynchronous primitive?
Understand the structure and operation of Remote Procedure Calls and why they may be preferred to messages.

Synchronization, Mutual Exclusion, Transactions, Data-race Detection
Definitions: mutual exclusion, critical section, data race, etc.
Know the characteristics of the producer/consumer and readers/writers problems.
Know how the P and V operations on semaphores work. Be able to state the algorithm that was presented in class, and know how to apply it in situations such as the homework problems or class examples.
What is Lamport’s happened-before relation? Know the three components of the definition.
Understand the difference between causally related events and concurrent events.
How does “happens-before” differ (in purpose or application) from total ordering?
What is the biggest shortcoming associated with Lamport’s virtual clocks? What is the advantage of vector clocks over Lamport clocks?
For each of the four algorithms for ensuring mutual exclusion in a distributed environment, be able to explain:
  a) how the algorithm works
  b) how a process knows when it can enter the critical section
  c) the main problems associated with the algorithm
  d) how fault tolerant the algorithm is (justify your answer)
  e) how efficient the algorithm is, in terms of message overhead
Be able to define “transaction”, explain the four ACID properties (especially atomicity and isolation), and describe what it means for a transaction to “commit”.
What is the main problem addressed by RaceTrack and Eraser? Briefly, how does each of these algorithms approach the problem?
Why might a distributed system need an election algorithm?

Distributed System Principles: Naming, Replication & Consistency, Fault Tolerance
Why is an identifier more suitable than an address as a name for an object in a distributed system?
What is location-independent naming? How does the Chord algorithm support it?
What are two arguments in favor of data replication? What problems does data replication introduce?
Fault tolerance: definition, use of redundancy to mask failures.

File Systems
What is an i-node?
Why did FFS introduce cylinder groups? (Consider performance and reliability.)
Give an argument in favor of large block sizes in file systems and an argument against large block sizes. How did FFS solve this issue?
What are the major issues that must be addressed in a distributed file system?
What is network transparency, and how can it be achieved?
How does the architecture of a cluster file system, such as Google File System, differ from a traditional client/server system? How does it differ from a peer-to-peer system?
What is the difference between a stateless and a stateful file system? Be able to give an advantage and a disadvantage of each approach.
Understand the meaning of UNIX semantics, session semantics, immutable semantics, and transaction semantics.
Be able to compare server-initiated and client-initiated cache consistency.
Distributed file systems implement various measures to deal with consistency. Why is file locking still desirable?
In general, what are the advantages and disadvantages of client-side caching in a distributed file system? Compare these to the advantages and disadvantages of server-side replication.
What is the special emphasis of the Coda file system?
The Coda file system sometimes uses callbacks to communicate with clients. What is their purpose?
How does Coda check for inconsistency in file replicas?

Memory Management
Understand the basic principles of memory management: motivation for virtual memory, implementation of virtual memory, problems with virtual memory, and the solutions.
Understand the function of a page table and a TLB.
What problem is addressed in the superpage paper?
Discuss the two main ways an operating system can implement superpages (relocation, reservation) and the advantages and disadvantages of each.
Be able to state an argument for the use of multiple page sizes in a system, and an argument against the use of multiple page sizes.
What approach is adopted in the paper we read?
Understand the problems of memory fragmentation, contiguity management, and other issues introduced by the use of superpages.

MapReduce
Be able to describe the main purpose of this programming methodology.
What kinds of problems are suitable for this kind of processing?
How does parallelism improve performance?
What do the map and reduce phases do? (Give examples.)
Know how the processing is distributed across a Google cluster; understand how the MapReduce distribution of work using the Master/Worker organization differs from the way file accesses are processed in the Master/Chunkserver organization.
What kind of fault tolerance measures does the MapReduce algorithm support? (No need to know all details, just be able to give a general description.)

Dynamo
The Dynamo storage management system is designed to handle applications based on a primary key/value lookup system.
What are Dynamo’s performance requirements? (Incremental scalability, high availability – 99.9% for writes – and fault tolerance.) How does the design of the system enforce the requirements?
Why is a DBMS inappropriate for these applications?
Understand the conflict between strong consistency and high availability.
What is “always writeable” data storage and how does Dynamo support it? (Hint: think of things like replication, eventual consistency versus strong consistency, and resolution of inconsistencies during reads, not writes.)
What are the key design principles/considerations for Dynamo?
What is the architecture of a Dynamo storage management system?
What is “eventual” consistency?
What technique do Dynamo and Coda use to recognize replica inconsistencies/conflicting versions?
What is consistent hashing?
Sloppy quorum and gossip-based protocol: for what purpose are they used in Dynamo?
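
Review aid for the consistent hashing question above: the following is a minimal sketch, not Dynamo's or Chord's actual implementation. It assumes MD5 as the ring hash and omits virtual nodes and replication. The names HashRing, ring_position, add_node, remove_node, and lookup are hypothetical and chosen only for illustration. The idea it shows is that each key belongs to the first node found clockwise from the key's position on the ring, so adding or removing a node only reassigns the keys in one arc of the ring.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a point on the ring (assumption: MD5, a 128-bit space)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Illustrative consistent-hash ring without virtual nodes or replication."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of all nodes
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owner[pos]

    def lookup(self, key: str) -> str:
        """Return the first node at or clockwise after the key's position."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        i = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first node when the key is past the last position.
        return self._owner[self._positions[i % len(self._positions)]]


def moved_keys(before: dict, after: dict) -> list:
    """List the keys whose owning node differs between two assignments."""
    return [k for k in before if before[k] != after[k]]
```

A way to check your understanding: build a ring of three nodes, record lookup() for a few hundred keys, add a fourth node, and confirm that every key reported by moved_keys() now belongs to the new node. With a plain "hash(key) mod N" scheme, most keys would move instead.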