Chapter 12
System Management
Understanding Operating Systems, Fourth Edition

Objectives
You should be able to describe:
• The tradeoffs to be considered when attempting to improve overall system performance
• The roles of system measurement tools such as positive and negative feedback loops
• Two system monitoring techniques
• The importance of sound accounting practices by system administrators
• The fundamentals of patch management

Evaluating an Operating System
• To evaluate an operating system, you need to understand:
  – Its design goals and history
  – How it communicates with users
  – How resources are managed
  – Tradeoffs made to achieve goals
• An operating system’s strengths and weaknesses need to be weighed in relation to:
  – Users
  – Hardware
  – Purpose

Cooperation Among Components
• Performance of any one resource depends on performance of other system resources
• Any system improvement can be made only after extensive analysis of:
  – Needs of the system’s resources, requirements, managers, and users
• System changes often result in trading one set of problems for another
• Consider performance of entire system and not just individual components

Role of Memory Management
• Before making memory-related changes, consider actual system operating environment
• There’s a tradeoff between memory use and CPU overhead
  – As memory management algorithms grow more complex, CPU overhead increases and overall performance can suffer
  – Some operating systems perform remarkably better with additional memory

Role of Processor Management
• Multiprogramming requires synchronization among Memory Manager, Processor Manager, and I/O devices
  – Tradeoff: Better use of CPU versus increased overhead, slower response time, and decreased throughput
• System saturation point could be reached if CPU is fully utilized but allowed to accept additional jobs
  – Results in higher overhead and less time to run programs
• Under heavy loads, CPU time required to manage I/O queues could dramatically increase time required to run jobs
• With long queues forming at channels, control units, and I/O devices, CPU could be idle waiting for processes to finish I/O

Role of Device Management
• Ways to improve I/O device utilization include buffering, blocking, and rescheduling I/O requests to optimize access times
  – Tradeoffs: Increased CPU overhead and additional memory space used
• Blocking reduces number of physical I/O requests, but increases overhead
• Buffering helps CPU match slower speed of I/O devices but requires memory space
  – Tradeoff: Reduced multiprogramming versus better use of I/O devices
• Rescheduling requests helps optimize I/O times
  – It is an overhead function
  – Speed of both CPU and I/O device must be weighed against time to execute reordering algorithm

Table 12.1: System with three CPUs and four disk drives of different speeds. Assuming system requires 1,000 instructions to reorder I/O requests, advantages of reordering vary depending on combination of CPU and disk.

A system consisting of CPU 1 and Disk Drive A has to access Track 1, Track 9, Track 1, and then Track 9, and the arm is already located at Track 1.
Without reordering, data access requires 35 + 35 + 35 = 105 ms.
Figure 12.1: Combination of CPU 1 and Disk Drive A without reordering

After reordering, the arm can perform both accesses on Track 1 before traveling, in 35 ms, to Track 9. With reordering, data access requires 35 + 30 = 65 ms: 35 ms of arm travel plus 30 ms for CPU 1 to run the reordering algorithm.
Figure 12.2: Combination of CPU 1 and Disk Drive A with reordering

• Reordering requests isn’t always warranted
  – Consider CPU 1 and much faster Disk Drive C
    • Without reordering, access time: 5 + 5 + 5 = 15 ms
    • With reordering, access time: 5 + 30 = 35 ms
• Reordering algorithm is either always on or always off
  – Can’t be changed by systems operator without reconfiguration
  – Initial setting must be determined by evaluating how the system performs on average

Role of File Management
• Secondary storage allocation schemes help user organize and access files on system
  – Different schemes offer different flexibility, but tradeoff for increased file flexibility is increased CPU overhead
    • Example: Accessing all records of a file stored noncontiguously could be time-consuming and require compaction, which takes CPU time
• Volume’s directory location affects retrieval time
• File management is closely related to device on which files are stored
• If file’s directory is loaded into memory, access speed affects only initial retrieval and not subsequent retrievals

Table 12.2: A system with four disk drives of different speeds and a CPU speed of 1.2 ms

Role of Network Management
• The Network Manager
  – Routinely synchronizes the load among remote processors
  – Tries to select most efficient communication paths over multiple data communication lines
  – Allows network administrator to monitor use of individual computers and shared hardware
  – Ensures compliance with software license agreements
  – Simplifies updating data files and programs on networked computers

Measuring System Performance
• Total system performance can be defined as efficiency with which a computer system meets its goals
• System efficiency is not easily measured
  – Affected by three major components: user programs, operating system programs, and hardware
• System performance can be very subjective and difficult to quantify

Measurement Tools
• Most designers and analysts rely on the following measures of system performance:
  – Throughput
  – Capacity
  – Response time
  – Turnaround time
  – Resource utilization
  – Availability
  – Reliability
• Throughput: Composite measure that indicates productivity of system as a whole
  – Usually measured under steady-state conditions
  – Reflects quantities such as “the number of jobs processed per day” or “the number of online transactions handled per hour”
  – Can also be a measure of volume of work handled by one system unit
  – Can be monitored by either hardware or software
• Capacity: Maximum throughput level
  – At capacity, resource becomes saturated and processes in system aren’t being passed along
    • Thrashing is a result
  – Main memory has been over-committed and level of multiprogramming has reached a peak point
  – Can be monitored by either hardware or software
  – Bottlenecks can be detected by monitoring queues forming at each resource
• Response time: Interval required to process user’s request
  – From when user presses key to send message until system indicates receipt of message
  – Depends on:
    • Workload handled by system at time of request
    • Type of job or request being submitted
  – Should include both average values and variance
• Turnaround time: Response time for batch jobs
  – Time from submission of job until output is returned to user
  – Same dependencies and measurement requirements as response time
• Resource utilization: Measure of how much each unit is contributing to overall operation
  – Usually given as percentage of time a resource is actually in use
    • Example: Is CPU busy 60 percent of time?
  – Helps analyst to determine:
    • If there is a balance among system units
    • If system is I/O-bound or CPU-bound
• Availability: Indicates likelihood that resource will be ready when user needs it
  – Influenced by two factors:
    • Mean time between failures (MTBF): Average time unit is operational before it breaks down
    • Mean time to repair (MTTR): Average time needed to fix failed unit and put it back in service

$$A = \frac{MTBF}{MTBF + MTTR}$$

Table 12.3: Availability of certain platforms based on 24 hours, 365 days/year use

• Reliability: Measures probability that unit will not fail during given time period
  – Function of MTBF

$$R(t) = e^{-(1/MTBF)(t)}$$

• Measures of performance can’t be taken in isolation from workload being handled by system
• Overall system performance varies with time
  – Important to define the actual working environment before making generalizations

Feedback Loops
• Feedback loop: A mechanism to monitor system’s resource utilization so adjustments can be made
  – Prevents processor from spending more time doing overhead than executing jobs
• Types of feedback loops:
  – Negative feedback loop
  – Positive feedback loop
• Negative feedback loop: Causes the arrival rate of processes to decrease when system becomes too congested
  – Helps stabilize system
  – Keeps queue lengths close to estimated mean values
• Positive feedback loop: Causes arrival rate to increase when system becomes underutilized
  – Used in paged virtual memory systems
  – More difficult to implement than negative loops

Figure 12.3: Negative feedback loop
Figure 12.4: Positive feedback loop

Monitoring
• Implemented using hardware or software
  – Hardware monitors are more expensive
    • Have minimum impact on system because they’re outside of it and attached electronically
    • Examples: counters, clocks, and comparators
  – Software monitors are relatively inexpensive
    • Can distort results of analysis because they become part of system
    • Must be developed for each specific system
    • Difficult to move from system to system
• In early systems, performance measurements monitored only CPU speed
• Today’s measurements include other hardware units, the operating system, compilers, and other system software
• Measurements are made in a variety of ways
  – Using real programs, usually production programs
    • Run with different configurations of CPUs, operating systems, and other components
    • Results are called benchmarks
  – Using simulation models
• Benchmarks demonstrate specific advantages of a new CPU, operating system, compiler, or piece of hardware
  – Useful when comparing systems that have gone through extensive changes
  – Results highly dependent upon:
    • System’s workload
    • System’s design and implementation
    • Specific requirements of applications loaded on system

Table 12.4: Benchmarking results

Accounting
• Pays bills and keeps system financially operable
• In a single-user environment, it’s easy to calculate cost of system
• In a multiuser environment, computer costs are usually distributed among users
  – Based on how much each one uses system’s resources
• For distribution, operating system must be able to:
  – Set up user accounts
  – Assign passwords
  – Identify which resources are available to each user
  – Define quotas for available resources, such as disk space or maximum CPU time allowed per job
• Pricing policies vary from system to system
• Pricing policies include some or all of the following:
  – Total amount of time spent between job submission and completion
  – CPU time, main memory usage
  – Secondary storage used during program execution
  – Secondary storage used during billing period
  – Use of system software, number of I/O operations
  – Time spent waiting for I/O completion
  – Number of input records read, output records printed, page faults
• Pricing policies are often used as a way to achieve specific operational goals
• Pricing incentives can be used to encourage users to access more plentiful and cheap resources
• Method of billing information depends on environment
• Maintaining billing records online:
  – Status of each user can be checked before the user’s job is allowed to enter the READY queue
  – Results in increased overhead

Patch Management
• Systematic updating of the operating system and other system software
• Patch: Piece of programming code that replaces or changes code that makes up software
• Primary reasons for software patches:
  – Need for vigilant security precautions against attacks
  – Need to assure system compliance with government regulations
  – Need to keep systems running at peak efficiency
• Among the eight most-used technologies

Table 12.5: 2004 E-Crime Watch survey results of security and law enforcement executives

Patching Fundamentals
• Essential steps in the patching process:
  1. Identify the required patch
  2. Verify patch’s source and integrity
  3. Test patch in a safe environment
  4. Deploy patch throughout system
  5. Audit system to gauge success of patch deployment
• Never patch operating system without a recent data backup in hand
• Patch Availability: Identify the criticality of patch
  – If critical, plan to apply patch as soon as possible
  – If not critical, possible to delay installation until a regular patch cycle begins
• Patch Integrity: Validate source and integrity
  – Check digital signature or validation tool that comes with software
  – Validate digital signature used by vendor to send new software on a regular basis
• Patch Testing: Test new patch on a sample system or an isolated machine
  – Test to see:
    • If system reboots after patch is installed
    • If patched software performs its assigned tasks
  – Tested system should resemble complexity of target network
  – Test contingency plans to uninstall patch and recover old software if something goes wrong
• Patch Deployment: Installation of patch
  – On single-user computer, patch deployment is a simple task
    • Install software and reboot computer
  – On multiplatform system with many users, task is exceptionally complicated
    • Should have an accurate inventory of all hardware and software
      – Can be gleaned from network mapping software
    • Can launch deployment in stages
• Audit the Finished System: Confirm resulting system meets expectations
  – Verify all computers are patched correctly and perform fundamental tasks as expected
  – Verify no users had unauthorized versions of software on computers, which would be ineligible for the patch
  – Verify no users are left with unpatched software on computers
  – Should include:
    • Documentation of changes made to system
    • Success or failure of each stage of process
  – Keep a log of all system changes for future reference
  – Get feedback from users to verify deployment’s success

Software Options
• Patches can be managed in two ways:
  – Manually, one at a time
  – Automatically using software
• Deployment software falls into two groups:
  – Agent-based software
    • Agent must be installed on all target systems before patches can be deployed
  – Agentless software
    • Attractive for administrators of large, complex networks

Timing the Patch Cycle
• Critical patches must be applied immediately
• Less-critical patches can be scheduled at convenience of systems group
• Routine patches can be:
  – Applied monthly or quarterly
  – Timed to coincide with vendor’s service pack release
• Advantage of routine patch cycles:
  – Allow for thorough review of patch and testing cycles before deployment

Summary
• The operating system is the orchestrated cooperation of every piece of hardware and software
• When one part of the system is favored, it’s often at the expense of others
• System managers must make sure they are using appropriate measurement tools and techniques to verify effectiveness of the system
• System managers must evaluate degree of improvement
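
Worked example: reordering tradeoff
The reordering arithmetic under Role of Device Management can be checked with a short sketch. It assumes a fixed cost per arm movement between different tracks and no cost for a request on the current track, which is how the slide totals appear to be built. The 30 ms reordering cost for CPU 1 is inferred from the slide sums (35 + 30 and 5 + 30), and the function name, parameter names, and the use of a plain sort as the reordering step are illustrative assumptions, not part of the chapter.

```python
def access_time_ms(tracks, start_track, per_move_ms, reorder_cost_ms=0.0):
    """Total time to service the requests in the order given.

    Assumption: each move to a different track costs per_move_ms,
    and a request on the current track costs nothing extra.
    """
    total = reorder_cost_ms
    current = start_track
    for track in tracks:
        if track != current:
            total += per_move_ms
            current = track
    return total


# Request stream and arm position from the slides.
requests = [1, 9, 1, 9]
start = 1

# A plain sort stands in for the reordering algorithm (an assumption).
reordered = sorted(requests)

# Inferred from the slide arithmetic: CPU 1 spends 30 ms reordering.
CPU1_REORDER_MS = 30.0

# Per-access times for Disk Drive A and Disk Drive C, as quoted in the slides.
for name, move_ms in (("Disk Drive A", 35.0), ("Disk Drive C", 5.0)):
    without = access_time_ms(requests, start, move_ms)
    with_reorder = access_time_ms(reordered, start, move_ms, CPU1_REORDER_MS)
    verdict = "worth it" if with_reorder < without else "not worth it"
    print(f"{name}: {without:.0f} ms without, {with_reorder:.0f} ms with reordering ({verdict})")
```

With the slower Disk Drive A, the saved arm movements outweigh the reordering cost; with the faster Disk Drive C, the same 30 ms of CPU time costs more than the movements it avoids.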
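
Worked example: availability and reliability
The availability and reliability measures defined under Measurement Tools translate directly into code. The sketch below uses the two formulas as given there; the function names and the sample figures (an MTBF of 4,000 hours, an MTTR of 2 hours, and a 10-hour period) are hypothetical values chosen for illustration only.

```python
import math


def availability(mtbf, mttr):
    """A = MTBF / (MTBF + MTTR); both arguments in the same time unit."""
    return mtbf / (mtbf + mttr)


def reliability(t, mtbf):
    """R(t) = e^(-(1/MTBF)(t)): probability of no failure during time t."""
    return math.exp(-(1.0 / mtbf) * t)


# Hypothetical unit: fails on average every 4,000 hours, takes 2 hours to repair.
mtbf_hours = 4000.0
mttr_hours = 2.0

print(f"Availability: {availability(mtbf_hours, mttr_hours):.4f}")
print(f"Reliability over 10 hours: {reliability(10.0, mtbf_hours):.4f}")
```

Under these assumed figures the unit is available about 99.95 percent of the time, and the chance that it runs a 10-hour period without failing is about 99.75 percent. Both values depend only on MTBF and MTTR, which is why accurate failure and repair records matter.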