Distributed Operating Systems and Process Scheduling
Brett O’Neill
CSE 8343 – Group A6

Overview
- What is a Distributed Operating System (DOS)?
- Advantages of a DOS
- Process scheduling/allocation in centralized systems
- Process scheduling/allocation in distributed systems
- DOS Scheduling Algorithms
- Questions?
- References

What is a Distributed Operating System (DOS)?
- A DOS is a collection of heterogeneous computers connected via a network.
- The functions of a conventional operating system are distributed throughout the network.
- To users of a DOS, it is as if the computer has a single processor.
- The multiple processors do not share a common memory or clock. Instead, each processor has its own local memory and communicates with the other processors through communication lines.
- The goal of a DOS is to provide a common, consistent view of the following, while keeping the details transparent to users:
  - File systems
  - Name space
  - Time
  - Security
  - Access to resources

Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk

Advantages of a DOS
- Resource Sharing – a user at one site can use the resources available at another.
- Computation Speedup – if a particular computation can be partitioned into a number of subcomputations, the DOS can distribute them among various sites to run concurrently.
- Reliability – if one site fails, the remaining sites can continue operating.
- Communication – users at different sites have the opportunity to exchange information.
- Data Migration – data can be transferred between sites, either by transferring entire files or only the portions of files that are needed.
- Computation Migration – computations can be transferred across the system. For example, if a large file resides on a site other than the initiator's, computations on the file can be done where the file resides using remote procedure calls.
- Process Migration – it is often advantageous to execute processes at different sites for the following reasons:
  - Load balancing – even out the workload among CPUs
  - Computation speedup – total process time can be reduced
  - Hardware preference – a process may be more suitable for execution on one processor than another
  - Software preference – a process may require software only available at one site

Process scheduling/allocation in centralized systems
- In a centralized, single-processor system, there is never more than one running process. If there are more processes, the rest have to wait until the CPU is free and can be rescheduled.
- As processes enter the system, they are placed in the job queue, which consists of all processes in the system.
- The processes waiting to execute are kept in the ready queue.
- Processes waiting for a particular I/O device are placed in the device queue.
- A process migrates through the various queues throughout its lifetime. The operating system selects from these queues in some fashion.
- The operating system has two schedulers for this task:
  - Long-term scheduler (job scheduler) – selects processes from a spool and loads them into memory for execution
  - Short-term scheduler (CPU scheduler) – selects from among the processes ready to execute, and allocates the CPU to one of them
- The long-term scheduler must select a careful mix of I/O-bound and CPU-bound processes.
- There are several scheduling algorithms to determine which process in the ready queue should be allocated the CPU next:
  - First-Come, First-Served
  - Shortest-Job-First
  - Priority Scheduling
  - Round-Robin Scheduling
  - Multilevel Queue Scheduling
  - Multilevel Feedback-Queue Scheduling

Process scheduling/allocation in distributed systems
- Similar to centralized systems, but several additional factors must be taken into account when deciding how to allocate processes:

1. Scheduling levels – local scheduling deals with allocating processes to a processor, much like in a centralized system. Global scheduling deals with choosing the location at which a process will be executed.
2. Load distribution goals – load balancing strives to maintain an equal load throughout the distributed system. Load sharing strives to prevent any particular location from becoming too busy.
3. Scheduling efficiency goals –
   - Optimal scheduling algorithm – the state of all competing processes and all related information must be available to the scheduler. These solutions are NP-hard for more than two processors.
   - Sub-optimal solutions –
     - Approximation – attempt to find a reasonably efficient solution as quickly as possible by limiting the search space.
     - Heuristics – employ “rules of thumb”:
       - Dependent processes should be located in close proximity
       - Independent processes that change shared files should be located in close proximity
       - Divisible processes with few or no precedence relationships should be distributed
       - If the load is heavy at a particular location, do not schedule other processes there
4. Processor binding time – in static binding, the process is assigned to a processor before its execution, in a deterministic fashion. In dynamic binding, the process is assigned at execution time.
5. Scheduler responsibility – the scheduler can reside on a single processor at a centralized server or can be physically distributed among various processors.

DOS Scheduling Algorithms

1. Usage Points
- A usage table is kept on a centralized server. The goal is to allocate usage among processors fairly.
- When a host takes a process to execute, it earns credit: one point is subtracted from its entry in the usage table.
- When a host requests a resource that is not local, it incurs debt: one point is added to its entry in the usage table.

2. Graph Algorithms
- The scheduling problem is represented as an undirected graph.
- Processes are represented as P nodes; processors are represented as L nodes (L for location).
- A P node is linked to an L node if P is running on L.
- A P node is linked to another P node if the processes are communicating.
- The weight on a link represents a communication cost or an execution cost.
- Minimum-cut – When a vertex is cut, its associated links are removed. A minimum cut-set represents the lowest possible cost for cutting a particular set of vertices. After the cut, each resulting set of vertices corresponds to a different set of processors. A minimum-cost cut-set represents the assignment of processes to locations with minimum communication costs.
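As a rough illustration of the minimum-cut idea, the sketch below assigns three processes to two locations by trying every assignment and keeping the cheapest one. The process names, location names, and all cost values are hypothetical and not taken from the slides. The exhaustive search is a stand-in chosen for readability; a realistic scheduler would more likely use a max-flow algorithm to find the cut.

```python
from itertools import product

# Hypothetical example data (not from the slides).
# exec_cost[p][loc]: cost of running process p at location loc.
exec_cost = {
    "P1": {"L1": 5, "L2": 10},
    "P2": {"L1": 2, "L2": 4},
    "P3": {"L1": 9, "L2": 3},
}
# comm_cost[(p, q)]: cost paid only when p and q run at different locations,
# i.e. when the P-P link between them is cut.
comm_cost = {("P1", "P2"): 6, ("P2", "P3"): 1}


def assignment_cost(assignment):
    """Total execution cost plus the communication cost of every cut P-P link."""
    total = sum(exec_cost[p][loc] for p, loc in assignment.items())
    total += sum(
        weight
        for (p, q), weight in comm_cost.items()
        if assignment[p] != assignment[q]
    )
    return total


def min_cost_assignment(locations=("L1", "L2")):
    """Try every assignment of processes to locations; return the cheapest."""
    procs = sorted(exec_cost)
    best = None
    for choice in product(locations, repeat=len(procs)):
        assignment = dict(zip(procs, choice))
        cost = assignment_cost(assignment)
        if best is None or cost < best[1]:
            best = (assignment, cost)
    return best


best_assignment, best_cost = min_cost_assignment()
print(best_assignment, best_cost)
```

With these made-up numbers, keeping the heavily communicating pair P1 and P2 together and cutting only the cheap P2-P3 link gives the lowest total. The search space grows exponentially with the number of processes, which is why this approach is only suitable for toy examples.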
Graph Algorithms (cont.)
- Maximum-flow – If the weight associated with each edge represents the cost of communication, then minimum cost of communication can result in maximum flow of information. Finding the maximum flow is therefore equivalent to finding the minimum cut (by the max-flow min-cut theorem, the value of a maximum flow equals the capacity of a minimum cut). As a result, total execution and communication costs can be minimized and concurrency can increase.

3. Probes
- Messages can be sent out to members of the distributed system to locate an appropriate processor for each process.
- Complete, global information is not necessary in this case. The scheduler can use localized information to determine where to execute the process.

4. Scheduling Queues
- Quite similar to centralized operating systems.
- There are two types of queues:
  - Local queue – the standard queue that exists at each location to maintain a list of processes to run locally
  - Global queue – a queue that maintains a list of processes that can run at different locations
- A good example is MACH:
  - Scheduling is at the thread level.
  - Each location has a set of local queues and a set of global queues with different priorities.
  - Users can use “scheduling hints”, which have 3 different priority levels.
  - Scheduling hints allow the user to bypass the scheduling queue.

5. Stochastic Learning
- Heuristics are used to identify the best possible action based on previous actions, learning from experience:
  - All possible scheduling actions are associated with a probability.
  - All probabilities are initialized to certain values when the DOS is set up.
  - After a process is sent to a location, the destination sends information back to the source.
  - If the destination is overloaded, it sends penalty points back to the source. If it is underloaded, it sends back bonus points.
  - These points are used to adjust the probabilities accordingly.
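The stochastic learning feedback loop can be sketched roughly as follows. The slides do not give an update rule, so the additive adjustment, the `rate` and `floor` parameters, the location names, and the function names are all assumptions made for illustration.

```python
import random


def choose_destination(probs, rng=random):
    """Pick a location at random, weighted by its current probability."""
    locations = list(probs)
    weights = [probs[loc] for loc in locations]
    return rng.choices(locations, weights=weights, k=1)[0]


def apply_feedback(probs, dest, points, rate=0.1, floor=0.01):
    """Return new probabilities after feedback from location `dest`.

    points > 0 are bonus points (destination was underloaded);
    points < 0 are penalty points (destination was overloaded).
    The additive rule and the `rate`/`floor` values are assumed, not sourced.
    """
    updated = dict(probs)
    # Shift the chosen location's weight, never letting it fall below `floor`.
    updated[dest] = max(floor, updated[dest] + rate * points)
    # Renormalize so the probabilities sum to 1 again.
    total = sum(updated.values())
    return {loc: p / total for loc, p in updated.items()}


# Hypothetical three-location system, initialized to equal probabilities.
probs = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
after_penalty = apply_feedback(probs, "A", points=-2)  # A reported overload
after_bonus = apply_feedback(probs, "B", points=1)     # B reported underload
print(after_penalty)
print(after_bonus)
```

After a penalty, location A becomes less likely to be chosen and the other locations become more likely; a bonus has the opposite effect. The floor keeps a penalized location from dropping to zero, so it can still be selected occasionally and recover once its load falls.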
6. Automaton Vector
- An extension of stochastic learning.
- Each entry in the automaton vector represents the workload of a node.
- Each entry is combined with the probabilities associated with that node to give a weighted load.
- This data is used to determine where each process should be executed.
- The automaton vector can collect and calculate data on the fly.

Questions?

References
- Silberschatz, Galvin, Gagne, Applied Operating System Concepts
- Prashant Shenoy, University of Massachusetts
- Mark Sebern, Milwaukee School of Engineering
- Jonathon Hodgson, St. Joseph’s University