Context switch:
- Switching the CPU to another process requires saving the state of the old process and loading the saved state of the new process.
- Context-switch time is pure overhead, because the system does no useful work while switching.
- Its speed varies from machine to machine, typically ranging from 1 to 1000 microseconds.

Operations on processes:
- A process may create several new processes, via a create-process system call, during the course of execution.
- The creating process is the parent; the new processes are its children, which may in turn create other processes, forming a tree of processes.
- A process needs certain resources (CPU time, memory, files, I/O devices) to accomplish its task.
- A subprocess may obtain its resources directly from the OS, or it may be constrained to a subset of the resources of its parent.
- The parent may have to partition its resources among its children, or share some resources (memory, files) among them.
- When a process creates a new process, there are two possibilities:
+ The parent executes concurrently with its children.
+ The parent waits until some or all of its children have terminated.

Process termination:
- A process terminates when it finishes executing its last statement and asks the OS to delete it by using the exit system call.
- The process may return data to its parent (through the wait system call).
- All its resources (including physical and virtual memory, open files, and I/O buffers) are deallocated by the OS.
- Termination can occur in additional circumstances: a process can cause the termination of another process via an appropriate system call (abort), which can be invoked only by the parent.
- The parent needs to know the identities of its children. (When one process creates a new process, the identity of the newly created process is passed to the parent.)
- A parent may terminate one of its children because:
+ The child has exceeded its usage of the resources allocated to it.
+ The task assigned to the child is no longer required.
+ The parent is exiting.
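The parent/child pattern above can be sketched in Python (a minimal illustration using the standard subprocess module; the child's one-line script is a made-up example):

```python
import subprocess
import sys

# Parent creates a child process (cf. the create-process system call).
child = subprocess.Popen(
    [sys.executable, "-c", "print('child: doing work')"]
)

# Parent blocks until the child terminates (cf. the wait system call);
# the child's exit status is returned to the parent.
status = child.wait()
print(f"parent: child exited with status {status}")
```

Here the parent chooses the second possibility (waiting for its child); omitting the wait() call would let parent and child run concurrently instead.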
Dispatcher: the module that actually gives control of the CPU to the process selected by the short-term scheduler.
- This function involves loading the registers of the process, switching to user mode, and jumping to the proper location in the user program to restart it.
- It should be as fast as possible.

Scheduling criteria:
- CPU scheduling deals with the problem of deciding which of the processes in the ready queue (RQ) is to be allocated the CPU.
- There are many different CPU scheduling algorithms.
- In choosing which algorithm to use in a particular situation, many criteria have been suggested for comparing CPU scheduling algorithms:
+ CPU utilization: we want to keep the CPU as busy as possible. In a real system, CPU utilization should range from 40% to 90%.
+ Throughput: one measure of work is the number of processes completed per time unit (say second, minute, or hour).
+ Turnaround time: the interval from the time of submission to the time of completion, i.e. how long it takes to execute a process. Turnaround time is the sum of the periods spent waiting to get into memory, waiting in the RQ, executing on the CPU, and doing I/O.
+ Waiting time: the CPU scheduling algorithm does not affect the amount of time a job executes or does I/O; it affects only the amount of time a job spends waiting in the RQ. We may therefore consider just the waiting time of each job.
+ Response time: in an interactive system, turnaround time may not be the best criterion. Another measure is the time from the submission of a request until the first response is produced (the response time), i.e. the time it takes to start responding, not the time it takes to output that response.
- It is desirable to maximize CPU utilization and throughput, and to minimize turnaround time, waiting time, and response time.

Scheduling algorithms:
CPU scheduling deals with the problem of deciding which of the processes in the RQ is to be allocated the CPU.
There are many CPU scheduling algorithms:

1. First-come, first-served scheduling (FCFS)
- The simplest CPU scheduling algorithm.
- With this scheme, the process that requests the CPU first is allocated the CPU first.
- The implementation is managed with a FIFO queue.
- When a process enters the RQ, its PCB is linked onto the tail of the queue.
- When the CPU is free, it is allocated to the process at the head of the queue.
- The performance of FCFS is often quite poor, i.e. the average waiting time under FCFS is often quite long.

Ex: consider the following jobs (processes), all arriving at time 0:

Process   Burst time
P1        24
P2        3
P3        3

If the jobs arrive in the order P1, P2, P3, the Gantt chart is:

| P1 | P2 | P3 |
0    24   27   30

- The waiting time is 0 for P1, 24 for P2, and 27 for P3.
  Average waiting time = (0 + 24 + 27) / 3 = 17
- The turnaround time is 24 for P1, 27 for P2, and 30 for P3.
  Average turnaround time = (24 + 27 + 30) / 3 = 27
- If the jobs arrive in the order P2, P3, P1, the result is:

| P2 | P3 | P1 |
0    3    6    30

  Average waiting time = (6 + 0 + 3) / 3 = 3 ms
  Average turnaround time = (30 + 3 + 6) / 3 = 13 ms
- Thus the average waiting time under FCFS is generally not minimal and may vary substantially; the same holds for the average turnaround time.

2. Shortest-job-first scheduling (SJF)
- Associate with each process the length of its next CPU burst.
- When the CPU is available, it is assigned to the process that has the smallest next CPU burst.
- If two processes have next CPU bursts of the same length, FCFS is used to break the tie.

Ex: consider the following set of jobs, all arriving at time 0:

Process   Burst time
P1        6
P2        8
P3        7
P4        3

We would schedule these jobs as follows:

| P4 | P1 | P3 | P2 |
0    3    9    16   24

* Waiting time: 3 for P1, 16 for P2, 9 for P3, 0 for P4.
Average waiting time = (3 + 16 + 9 + 0) / 4 = 7, whereas with FCFS it would be 10.25.
* Turnaround time: 9 for P1, 24 for P2, 16 for P3, 3 for P4.
Average turnaround time = (9 + 24 + 16 + 3) / 4 = 13, whereas with FCFS it would be 16.25.
- SJF is provably optimal, in that it gives the minimum average waiting time for a given set of jobs.
- The real difficulty with SJF is knowing the length of the next CPU request.
- For long-term (job) scheduling in a batch system, we can use the time limit specified by the user at submission as the length.
- This motivates the user to estimate the process time limit accurately, since a lower value may mean faster response, while too low a value will cause a "time limit exceeded" error and require resubmission.
- SJF scheduling is therefore used frequently in long-term (job) scheduling.
- Although SJF is optimal, it cannot be implemented exactly at the level of short-term CPU scheduling: there is no way to know the length of the next CPU burst.
- The SJF algorithm may be either preemptive or nonpreemptive.
- The choice arises when a new process arrives at the RQ while a previous process is still executing.
- The new process may have a shorter next CPU burst than what is left of the currently executing process.
- A preemptive SJF algorithm will preempt the currently executing process, whereas a nonpreemptive SJF algorithm will allow the currently running process to finish its CPU burst.
- Preemptive SJF scheduling is sometimes called shortest-remaining-time-first scheduling.

Ex: consider the following set of processes:

Process   Arrival time   Burst time
P1        0              8
P2        1              4
P3        2              9
P4        3              5

If the processes arrive at the RQ at the times shown, the Gantt chart is:

| P1 | P2 | P4 | P1 | P3 |
0    1    5    10   17   26

Average waiting time = ((10 - 1) + (1 - 1) + (17 - 2) + (5 - 3)) / 4 = 26 / 4 = 6.5
With nonpreemptive SJF, the average waiting time would be 7.75.

3. Priority scheduling:
- SJF is a special case of the general priority scheduling algorithm.
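As a check on the preemptive SJF (shortest-remaining-time-first) example above, a minimal one-time-unit-at-a-step simulation sketch in Python (the function name and job encoding are illustrative, not part of the notes):

```python
# Job table from the example: name -> (arrival time, burst time).
jobs = {"P1": (0, 8), "P2": (1, 4), "P3": (2, 9), "P4": (3, 5)}

def srtf_waiting(jobs):
    """Simulate shortest-remaining-time-first, one time unit per step."""
    remaining = {name: burst for name, (arrival, burst) in jobs.items()}
    finish, clock = {}, 0
    while remaining:
        # Jobs that have arrived and still need CPU time.
        ready = {n: r for n, r in remaining.items() if jobs[n][0] <= clock}
        if not ready:
            clock += 1
            continue
        job = min(ready, key=ready.get)   # least remaining time runs
        remaining[job] -= 1
        clock += 1
        if remaining[job] == 0:
            finish[job] = clock
            del remaining[job]
    # waiting time = turnaround - burst = finish - arrival - burst
    return {n: finish[n] - jobs[n][0] - jobs[n][1] for n in jobs}

waits = srtf_waiting(jobs)
print(waits, sum(waits.values()) / 4)   # average waiting time is 6.5
```

A new arrival with a shorter remaining time wins the min() comparison on the next tick, which is exactly the preemption the text describes.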
- A priority is associated with each process, and the CPU is allocated to the process with the highest priority.
- Equal-priority processes are scheduled in FCFS order.
- There is no general agreement on whether 0 is the highest or the lowest priority: some systems use low numbers to represent low priority; others use low numbers for high priority.

Ex: consider the following set of processes, assumed to have arrived at time 0 in the order P1, P2, ..., P5, with low numbers representing high priority:

Process   Burst time   Priority
P1        10           3
P2        1            1
P3        2            3
P4        1            4
P5        5            2

The Gantt chart:

| P2 | P5 | P1 | P3 | P4 |
0    1    6    16   18   19

Average waiting time = (6 + 0 + 16 + 18 + 1) / 5 = 41 / 5 = 8.2

- Priorities can be defined internally or externally.
- Internal priorities use measurable quantities such as time limits, memory requirements, the number of open files, and the ratio of average I/O burst to average CPU burst.
- External priorities are set by criteria outside the OS, such as the type and amount of funds paid for computer use, the department sponsoring the work, and political factors.
- Priority scheduling can be either preemptive or nonpreemptive.
- When a process arrives at the RQ, its priority is compared with the priority of the currently running process.
- A preemptive priority algorithm will preempt the CPU if the newly arrived process has a higher priority than the running one.
- A nonpreemptive priority algorithm will simply put the new process at the head of the RQ.
- A major problem with priority scheduling is indefinite blocking, or starvation.
- A process that is ready to run but lacks the CPU can be considered blocked, waiting for the CPU.
- A priority algorithm can leave some low-priority processes waiting indefinitely for the CPU.
- The solution to this problem is aging: gradually increasing the priority of jobs that wait in the system for a long time.
Ex: if priorities range from 0 (high) to 128 (low), we could raise the priority of a waiting job by one step every 15 minutes; even a job at the lowest priority would reach the highest priority within 32 hours.

4. Round-robin scheduling (RR):
- Designed especially for time-sharing systems.
- It is similar to FCFS scheduling, but preemption is added to switch between processes.
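The priority example above, together with a simple aging rule, can be sketched as follows (the aging_step parameter and the one-step-per-completed-burst aging policy are illustrative assumptions, not from the notes; with aging disabled the sketch reproduces the 8.2 average from the example):

```python
# Jobs from the example: (name, burst time, priority); all arrive at
# time 0, and a lower number means higher priority.
jobs = [("P1", 10, 3), ("P2", 1, 1), ("P3", 2, 3), ("P4", 1, 4), ("P5", 5, 2)]

def priority_waiting(jobs, aging_step):
    """Non-preemptive priority scheduling. After each completed burst,
    every still-waiting job's priority number is reduced (aged) by
    aging_step, so low-priority jobs cannot starve indefinitely."""
    pending = [[prio, idx, name, burst]
               for idx, (name, burst, prio) in enumerate(jobs)]
    clock, waiting = 0, {}
    while pending:
        pending.sort()                 # lowest number first; FCFS on ties
        _, _, name, burst = pending.pop(0)
        waiting[name] = clock          # time spent waiting before running
        clock += burst
        for entry in pending:          # age everything still waiting
            entry[0] = max(0, entry[0] - aging_step)
    return waiting

waits = priority_waiting(jobs, aging_step=0)   # aging disabled
print(sum(waits.values()) / 5)                 # 8.2, as in the example
```

Passing a positive aging_step makes long-waiting jobs climb toward priority 0, which is the starvation remedy the text describes.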
- A small unit of time, called a time quantum or time slice, is defined.
- A time quantum is generally from 10 to 100 milliseconds.
- The RQ is treated as a circular queue.
- The CPU scheduler goes around the RQ, allocating the CPU to each process for a time interval of up to one quantum in length.
- To implement RR, the RQ is kept as a FIFO queue of processes.
- The average waiting time under the RR policy is often quite long.

Ex: consider the following set of jobs, all arriving at time 0, with a time quantum of 4:

Job   Burst time
P1    24
P2    3
P3    3

| P1 | P2 | P3 | P1 | P1 | P1 | P1 | P1 |
0    4    7    10   14   18   22   26   30

Average waiting time = ((26 - 4 × 5) + 4 + 7) / 3 = 17 / 3 ≈ 5.66
Average turnaround time = (30 + 7 + 10) / 3 = 47 / 3 ≈ 15.67

- The performance of RR depends heavily on the time quantum.
- If the time quantum is very large (effectively infinite), RR behaves like FCFS.
- If the time quantum is very small (say 1 microsecond), RR is called processor sharing: it appears as if each of n processes has its own processor running at 1/n the speed of the real processor.

5. Multilevel queue scheduling:
- Another class of scheduling algorithms has been created for situations in which jobs are easily classified into different groups.
- A common division is made between foreground (interactive) processes and background (batch) processes; these have different response-time requirements and so might need different scheduling algorithms.

[Figure: multilevel queue scheduling — separate queues ordered from highest to lowest priority: system processes, interactive processes, interactive editing processes, batch processes.]

- The multilevel queue algorithm partitions the RQ into separate queues.
- Jobs are permanently assigned to one queue, based upon some property of the job (such as memory size or job type).
- Each queue has its own scheduling algorithm.
- In addition, there must be scheduling between the queues, commonly implemented as fixed-priority preemptive scheduling.
Ex: the foreground queue may have absolute priority over the background queue (see the figure above).
- Note: each queue has absolute priority over lower-priority queues.
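Returning to round-robin for a moment, the quantum-4 example above can be reproduced with a short Python sketch (context-switch cost ignored; the function name is illustrative):

```python
from collections import deque

def round_robin(bursts, quantum):
    """Return the completion time of each job (all arrive at time 0)."""
    queue = deque(bursts)                 # FIFO ready queue of job names
    remaining = dict(bursts)
    clock, finish = 0, {}
    while queue:
        name = queue.popleft()
        run = min(quantum, remaining[name])   # at most one quantum
        clock += run
        remaining[name] -= run
        if remaining[name] == 0:
            finish[name] = clock
        else:
            queue.append(name)            # preempted: back of the queue
    return finish

bursts = {"P1": 24, "P2": 3, "P3": 3}
finish = round_robin(bursts, quantum=4)
waits = {n: finish[n] - bursts[n] for n in bursts}  # waiting = finish - burst
print(sum(waits.values()) / 3)    # average waiting time, 17/3
print(sum(finish.values()) / 3)   # average turnaround time, 47/3
```

Rerunning round_robin with a quantum of 24 or more yields the FCFS schedule for this workload, matching the observation about very large quanta.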
- No process in the batch queue could run unless the queues for system processes, interactive processes, and editing processes were all empty.
- If an editing process entered the RQ while a batch process was running, the batch process would be preempted.
- An alternative possibility is to time-slice between the queues.
- Each queue gets a certain portion of the CPU time, which it can then schedule among the various processes in its queue.
- Ex: the foreground queue may get 80% of the CPU time, using RR among its processes, while the background queue gets 20%, using FCFS among its processes.
- The typical time needed for a context switch is from 10 to 100 microseconds.
- To show the effect of context-switch time on the performance of RR, assume we have one job of 10 time units. If the quantum is 12 time units, the job finishes in a single quantum, with no overhead. If the quantum is 6 time units, the job requires two quanta, resulting in one context switch. If the quantum is 1 time unit, 9 context switches will occur, slowing the execution of the job.
- Turnaround time also depends on the time quantum.
Ex: given 3 jobs of 10 time units each and a quantum of 1 unit, the average turnaround time = (28 + 29 + 30) / 3 = 29. If the time quantum is 10, the average turnaround time = (10 + 20 + 30) / 3 = 20. If context-switch time is added in, things are even worse for a smaller quantum, since more context switches will be required.

Algorithm evaluation:
- How do we select a CPU scheduling algorithm for a particular system?
- The first problem is defining the criteria to be used in selecting an algorithm.
- A criterion is usually defined in terms of CPU utilization, response time, or throughput.
- We must also define the relative importance of these measures; the criteria might combine several of them, for example:
+ Maximize CPU utilization under the constraint that the maximum response time is 1 second.
+ Maximize throughput such that turnaround time is (on average) linearly proportional to total execution time.
- There are a number of different evaluation methods:
1. Analytic evaluation:
- Uses the algorithm and the system workload to produce a formula or number that evaluates the performance of the algorithm for that workload.

Deterministic modeling (one type of analytic evaluation):
- This method takes a particular predetermined workload and determines the performance of each algorithm for that workload.

Ex: assume we have the following workload, with all jobs arriving at time 0 in the given order:

Job   Burst time
1     10
2     29
3     3
4     7
5     12

Consider FCFS, SJF, and RR (quantum = 10). For this set of jobs, which algorithm gives the minimum average waiting time?

a) With FCFS, the jobs are executed as follows:

| 1 | 2 | 3 | 4 | 5 |
0   10  39  42  49  61

Waiting times: 0 for job 1, 10 for job 2, 39 for job 3, 42 for job 4, 49 for job 5 (total 140).
Average waiting time = 140 / 5 = 28

b) With SJF (nonpreemptive), we execute the jobs as:

| 3 | 4 | 1 | 5 | 2 |
0   3   10  20  32  61

Waiting times: 10 for job 1, 32 for job 2, 0 for job 3, 3 for job 4, 20 for job 5 (total 65).
Average waiting time = 65 / 5 = 13

c) With RR (quantum = 10), job 1 completes within its first quantum; job 2 is preempted after 10 time units and put at the back of the queue:

| 1 | 2 | 3 | 4 | 5 | 2 | 5 | 2 |
0   10  20  23  30  40  50  52  61

Waiting times: 0 for job 1, 32 for job 2, 20 for job 3, 23 for job 4, 40 for job 5 (total 115).
Average waiting time = 115 / 5 = 23

We see that SJF has less than half the average waiting time of FCFS, while RR gives an intermediate value.

- Deterministic modeling is simple and fast.
- It gives exact numbers, allowing the algorithms to be compared.
- However, it requires exact numbers as input, and its answers apply only to those cases.
- We can use deterministic modeling to select a scheduling algorithm when we are running the same program over and over again and can measure its processing requirements exactly.
- In general, deterministic modeling is too specific to be very useful in most cases: it requires too much exact knowledge and provides results of limited applicability.

2. Queuing models:
- The jobs that are run on many systems vary from day to day.
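The deterministic-modeling comparison above can be reproduced with a short Python script (a sketch; the function names are illustrative):

```python
from collections import deque

bursts = [10, 29, 3, 7, 12]   # the five-job workload, all arriving at 0

def avg_wait_in_order(order):
    """Average waiting time when jobs run to completion in this order."""
    clock, total = 0, 0
    for b in order:
        total += clock           # this job waited until now
        clock += b
    return total / len(order)

def avg_wait_rr(bursts, quantum):
    """Average waiting time under round-robin (waiting = finish - burst)."""
    queue = deque(enumerate(bursts))
    remaining = list(bursts)
    clock, finish = 0, [0] * len(bursts)
    while queue:
        i, _ = queue.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] == 0:
            finish[i] = clock
        else:
            queue.append((i, remaining[i]))
    return sum(f - b for f, b in zip(finish, bursts)) / len(bursts)

fcfs = avg_wait_in_order(bursts)          # submission order: 28.0
sjf = avg_wait_in_order(sorted(bursts))   # shortest first: 13.0
rr = avg_wait_rr(bursts, 10)              # quantum 10: 23.0
print(fcfs, sjf, rr)
```

This is exactly the deterministic-modeling exercise: exact inputs in, exact averages out, valid only for this one workload.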
- There is no static set of jobs (and times) to use for deterministic modeling.
- What can be determined is the distribution of CPU and I/O bursts.
- These distributions may be measured and then approximated or estimated.
- The computer system is described as a network of servers.
- Each server has a queue of waiting jobs.
- The CPU is a server with its ready queue, as is the I/O system with its device queues.
- Knowing arrival rates and service rates, we can compute utilization, average queue length, average waiting time, and so on. (This area of study is called queuing-network analysis.)
- Let n be the average queue length, w the average waiting time in the queue, and λ the average arrival rate of new jobs in the queue. If the system is in a steady state (the number of jobs leaving the queue equals the number of jobs arriving), then:

  n = λ × w  (Little's formula)

  We can use this formula to compute any one of the three variables if we know the other two.
- Queuing analysis can be quite useful in comparing scheduling algorithms, but it also has its limitations.

3. Simulations:
- Used to get a more accurate evaluation of scheduling algorithms.
- Simulations involve programming a model of the computer system.
- Software data structures represent the major components of the system.
- The simulator has a variable representing a clock; as its value is increased, the simulator modifies the system state to reflect the activities of the devices, the jobs, and the scheduler.
- The data to drive the simulation can be generated in several ways. Most common is a random-number generator used to generate jobs, CPU bursts, arrivals, and departures according to probability distributions (uniform, exponential, or Poisson), which may be defined mathematically or empirically.
- Simulations can be very expensive, requiring hours of computer time. The design, coding, and debugging of the simulator can be a major task.

4. Implementation:
- The only completely accurate way of evaluating a scheduling algorithm is to code it, put it in the OS, and see how it works.
- Difficulty: the cost of this approach (coding the algorithm, modifying the OS and the data structures it requires, and the reaction of users to a constantly changing OS).
- Another difficulty with any algorithm evaluation is that the environment in which the algorithm is used will change.
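As a numeric footnote to the queuing-model section, Little's formula n = λ × w can be exercised with made-up numbers (the rates below are illustrative assumptions, not from the notes):

```python
# If jobs arrive at λ = 7 per second and each waits w = 2 seconds on
# average, the steady-state queue holds n = λ × w jobs.
arrival_rate = 7.0                        # λ, jobs per second
avg_wait = 2.0                            # w, seconds spent in the queue
avg_queue_len = arrival_rate * avg_wait   # n = λ × w = 14 jobs

# Knowing any two of (n, λ, w) gives the third:
w_recovered = avg_queue_len / arrival_rate
print(avg_queue_len, w_recovered)
```

The rearrangement in the last step is how the formula is used in practice: measure two of the three quantities and solve for the one that is hard to observe directly.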