Chapter 12 System Management Understanding Operating Systems, Fourth Edition

Chapter 12
System Management
Understanding Operating Systems,
Fourth Edition
Objectives
You should be able to describe:
• The tradeoffs to be considered when attempting to
improve overall system performance
• The roles of system measurement tools such as
positive and negative feedback loops
• Two system monitoring techniques
• The importance of sound accounting practices by
system administrators
• The fundamentals of patch management
Evaluating an Operating System
• To evaluate an operating system, you need to
understand:
– Its design goals and history
– How it communicates with users
– How resources are managed
– Tradeoffs made to achieve goals
• An operating system’s strengths and weaknesses
need to be weighed in relation to:
– Users
– Hardware
– Purpose
Cooperation Among Components
• Performance of any one resource depends on
performance of other system resources
• Any system improvement can be made only after
extensive analysis of:
– Needs of the system’s resources, requirements,
managers, and users
• System changes often result in trading one set of
problems for another
• Consider performance of entire system and not just
individual components
Role of Memory Management
• Before making memory related changes, consider
actual system operating environment
• There’s a tradeoff between memory use and CPU
overhead
– As memory management algorithms grow more
complex, CPU overhead increases and overall
performance can suffer
– Some operating systems perform remarkably better
with additional memory
Role of Processor Management
• Multiprogramming requires synchronization among
Memory Manager, Processor Manager, and I/O
devices
– Tradeoff: Better use of CPU versus increased
overhead, slower response time, and decreased
throughput
Role of Processor Management
(continued)
• System saturation point could be reached if CPU is
fully utilized but allowed to accept additional jobs
– Results in higher overhead and less time to run
programs
• Under heavy loads, CPU time required to manage
I/O queues could dramatically increase time
required to run jobs
• With long queues forming at channels, control
units, and I/O devices, CPU could be idle waiting
for processes to finish I/O
Role of Device Management
• Ways to improve I/O device utilization include
buffering, blocking, and rescheduling I/O requests
to optimize access times
– Tradeoffs: Increased CPU overhead and additional
memory space used
• Blocking reduces number of physical I/O requests,
but increases overhead
• Buffering helps CPU match slower speed of I/O
devices but requires memory space
– Tradeoff: Reduced multiprogramming versus better
use of I/O devices
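The effect of blocking on physical I/O can be estimated with a short calculation. The sketch below is illustrative only: the function name, the record count, and the blocking factors are assumptions, not values from the text.

```python
import math

def physical_io_requests(num_records: int, blocking_factor: int) -> int:
    """Number of physical reads needed when records are grouped into blocks."""
    if blocking_factor < 1:
        raise ValueError("blocking_factor must be at least 1")
    return math.ceil(num_records / blocking_factor)

# Hypothetical file of 1,000 records
unblocked = physical_io_requests(1000, 1)   # one record per physical read
blocked = physical_io_requests(1000, 10)    # ten records per block
print(unblocked, blocked)
```

A larger blocking factor lowers the request count, but each block then needs a larger buffer, which is the memory cost noted above.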
Role of Device Management
(continued)
• Rescheduling requests helps optimize I/O times
– It is an overhead function
– Speed of both CPU and I/O device must be weighed
against time to execute reordering algorithm
Role of Device Management
(continued)
Table 12.1: System with three CPUs and four disk drives of
different speeds. Assuming system requires 1,000 instructions
to reorder I/O requests, advantages of reordering vary
depending on combination of CPU and disk.
Role of Device Management
(continued)
A system consisting of CPU 1 and Disk Drive A has to
access Track 1, Track 9, Track 1, and then Track 9, and
the arm is already located at Track 1.
Without reordering, data access requires three arm movements:
35 + 35 + 35 = 105 ms
Figure 12.1: Combination of CPU 1 and Disk Drive A
without reordering
Role of Device Management
(continued)
After reordering, the arm can perform both accesses on Track 1
before traveling, in 35 ms, to Track 9
With reordering, data access requires 35 ms of arm travel plus
30 ms for CPU 1 to run the reordering algorithm: 35 + 30 = 65 ms
Figure 12.2: Combination of CPU 1 and Disk Drive A
with reordering
Role of Device Management
(continued)
• Reordering requests isn’t always warranted
– Consider CPU 1 and much faster Disk Drive C
• Without reordering, access time: 5 + 5 + 5 = 15 ms
• With reordering, access time: 5 + 30 = 35 ms
• Reordering algorithm is either always on or always
off
– Can’t be changed by systems operator without
reconfiguration
– Initial setting must be determined by evaluating
the system’s average behavior
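The reordering arithmetic for Disk Drive A and Disk Drive C can be reproduced with a small sketch. It assumes, as the slide arithmetic does, that every arm movement costs one fixed access time regardless of distance, and that CPU 1 needs 30 ms to run the reordering algorithm. The function names are hypothetical, and a simple nearest-first sort stands in for whatever reordering algorithm a real system would use.

```python
def arm_moves(tracks, start):
    """Count how many times the arm must travel to a different track."""
    moves, position = 0, start
    for track in tracks:
        if track != position:
            moves += 1
            position = track
    return moves

def access_time_ms(tracks, start, disk_ms, reorder_ms=None):
    """Total time in ms; pass reorder_ms to model reordering, or None to skip it."""
    if reorder_ms is None:
        return arm_moves(tracks, start) * disk_ms
    ordered = sorted(tracks, key=lambda t: abs(t - start))  # nearest track first
    return reorder_ms + arm_moves(ordered, start) * disk_ms

requests = [1, 9, 1, 9]  # arm starts at Track 1
print(access_time_ms(requests, 1, disk_ms=35))                 # Drive A, no reordering
print(access_time_ms(requests, 1, disk_ms=35, reorder_ms=30))  # Drive A, reordering
print(access_time_ms(requests, 1, disk_ms=5))                  # Drive C, no reordering
print(access_time_ms(requests, 1, disk_ms=5, reorder_ms=30))   # Drive C, reordering
```

With the slow drive, reordering saves time (65 ms versus 105 ms). With the fast drive, the fixed 30 ms of CPU work outweighs the saved arm travel (35 ms versus 15 ms).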
Role of File Management
• Secondary storage allocation schemes help user
organize and access files on system
– Different schemes offer different flexibility, but
tradeoff for increased file flexibility is increased CPU
overhead
• Example: Accessing all records of a file stored
noncontiguously could be time-consuming and require
compaction, which takes CPU time
• Volume’s directory location affects retrieval time
• File management is closely related to device on
which files are stored
Role of File Management (continued)
If file’s directory is loaded into memory, access speed affects
only initial retrieval and not subsequent retrievals
Table 12.2: A system with four disk drives of different speeds
and a CPU speed of 1.2 ms
Role of Network Management
• The Network Manager
– Routinely synchronizes the load among remote
processors
– Tries to select most efficient communication paths
over multiple data communication lines
– Allows network administrator to monitor use of
individual computers and shared hardware
– Ensures compliance with software license
agreements
– Simplifies updating data files and programs on
networked computers
Measuring System Performance
• Total system performance can be defined as
efficiency with which a computer system meets its
goals
• System efficiency is not easily measured
– Affected by three major components: user programs,
operating system programs, and hardware
• System performance can be very subjective and
difficult to quantify
Measurement Tools
• Most designers and analysts rely on following
measures of system performance:
– Throughput
– Capacity
– Response time
– Turnaround time
– Resource utilization
– Availability
– Reliability
Measurement Tools (continued)
• Throughput: Composite measure that indicates
productivity of system as a whole
– Usually measured under steady-state conditions
– Reflects quantities such as “the number of jobs
processed per day” or “the number of online
transactions handled per hour”
– Can also be a measure of volume of work handled
by one system unit
– Can be monitored by either hardware or software
Measurement Tools (continued)
• Capacity: Maximum throughput level
– At capacity, a resource becomes saturated and
processes in system aren’t being passed along
• Thrashing is a result
– Main memory has been over-committed and level of
multiprogramming has reached a peak point
– Can be monitored by either hardware or software
– Bottlenecks can be detected by monitoring queues
forming at each resource
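Detecting a bottleneck from queue lengths can be sketched in a few lines. The resource names and queue lengths below are made-up sample data, and treating the longest queue as the bottleneck is a simplifying assumption.

```python
def find_bottleneck(queue_lengths: dict) -> str:
    """Return the resource with the longest waiting queue."""
    if not queue_lengths:
        raise ValueError("no queues to inspect")
    return max(queue_lengths, key=queue_lengths.get)

# Hypothetical snapshot of waiting requests per resource
snapshot = {"cpu": 2, "disk": 9, "printer": 1}
print(find_bottleneck(snapshot))
```

A real monitor would sample the queues repeatedly, because a single snapshot can catch a momentary burst rather than a sustained bottleneck.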
Measurement Tools (continued)
• Response time: Interval required to process user’s
request
– From when user presses key to send message until
system indicates receipt of message
– Depends on:
• Workload handled by system at time of request
• Type of job or request being submitted
– Should include both average values and variance
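Reporting both the average and the variance of response time might look like the sketch below. The sample values are invented for illustration, and population variance is used on the assumption that the samples cover the whole measurement period.

```python
import statistics

def summarize_response_times(samples):
    """Return (mean, population variance) of response times in seconds."""
    return statistics.mean(samples), statistics.pvariance(samples)

# Hypothetical response times, in seconds
samples = [1.0, 2.0, 3.0, 6.0]
mean, variance = summarize_response_times(samples)
print(mean, variance)
```

Two systems with the same mean can feel very different to users if one has a much larger variance, which is why both figures are reported.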
Measurement Tools (continued)
• Turnaround time: Response time for batch jobs
– Time from submission of job until output is returned
to user
– Same dependencies and measurement
requirements as response time
Measurement Tools (continued)
• Resource utilization: Measure of how much each
unit is contributing to overall operation
– Usually given as percentage of time a resource is
actually in use
• Example: Is CPU busy 60 percent of time?
– Helps analyst to determine:
• If there is a balance among system units
• If system is I/O-bound or CPU-bound
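Resource utilization as a percentage of busy time can be computed as follows. The measurements are hypothetical, and the classification rule (compare CPU and I/O utilization directly) is a deliberate simplification for illustration.

```python
def utilization_pct(busy_time: float, total_time: float) -> float:
    """Percentage of the observation interval during which the resource was in use."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    return 100 * busy_time / total_time

def classify(cpu_pct: float, io_pct: float) -> str:
    """Rough label: whichever unit is busier is treated as the limiting one."""
    if cpu_pct > io_pct:
        return "CPU-bound"
    if io_pct > cpu_pct:
        return "I/O-bound"
    return "balanced"

# Hypothetical measurements over a 60-second interval
cpu = utilization_pct(36, 60)   # CPU busy 36 s out of 60 s
disk = utilization_pct(57, 60)  # disk busy 57 s out of 60 s
print(cpu, disk, classify(cpu, disk))
```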
Measurement Tools (continued)
• Availability: Indicates likelihood that resource will
be ready when user needs it
– Influenced by two factors:
• Mean time between failures (MTBF): Average time
unit is operational before it breaks down
• Mean time to repair (MTTR): Average time needed to
fix failed unit and put it back in service
$$\text{Availability}(A) = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$$
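As a quick numeric check of the availability formula, here is a minimal sketch. The MTBF and MTTR figures are hypothetical and are not taken from Table 12.3.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical unit: fails every 4,000 hours on average, takes 2 hours to repair
a = availability(4000, 2)
print(f"{a:.4f}")  # prints 0.9995
```

Availability can be raised either by lengthening MTBF or by shortening MTTR.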
Measurement Tools (continued)
Table 12.3: Availability of certain platforms based on
24 hours, 365 days/year use
Measurement Tools (continued)
• Reliability: Measures probability that unit will not
fail during given time period
– Function of MTBF
$$R(t) = e^{-(1/\text{MTBF})(t)}$$
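The reliability function can be evaluated directly. In this sketch the MTBF value and the time periods are hypothetical. The only requirement is that t and MTBF use the same time unit.

```python
import math

def reliability(t: float, mtbf: float) -> float:
    """R(t) = e^(-(1/MTBF) * t): probability of no failure during a period of length t."""
    return math.exp(-t / mtbf)

# Hypothetical unit with an MTBF of 4,000 hours
for hours in (0, 10, 4000):
    print(hours, reliability(hours, 4000))
```

R(0) is exactly 1, and reliability falls to about 37 percent (1/e) when the period equals the MTBF.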
Measurement Tools (continued)
• Measures of performance can’t be taken in
isolation from workload being handled by system
• Overall system performance varies with time
– Important to define the actual working environment
before making generalizations
Feedback Loops
• Feedback loop: A mechanism to monitor system’s
resource utilization so adjustments can be made
– Prevents processor from spending more time doing
overhead than executing jobs
• Types of feedback loops:
– Negative feedback loop
– Positive feedback loop
Feedback Loops (continued)
• Negative feedback loop: Causes the arrival rate
of processes to decrease when system becomes
too congested
– Helps stabilize system
– Keeps queue lengths close to estimated mean
values
• Positive feedback loop: Causes arrival rate to
increase when system becomes underutilized
– Used in paged virtual memory systems
– More difficult to implement than negative loops
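Both kinds of loop can be sketched as a single adjustment rule. The thresholds, the step size, and the function name are assumptions chosen for illustration. A real system would base them on measured queue lengths and utilization.

```python
def adjust_arrival_rate(rate: int, utilization: float,
                        high: float = 0.9, low: float = 0.5, step: int = 1) -> int:
    """Return the new arrival rate (jobs admitted per interval)."""
    if utilization > high:
        # Negative feedback: system is congested, so admit fewer jobs
        return max(0, rate - step)
    if utilization < low:
        # Positive feedback: system is underutilized, so admit more jobs
        return rate + step
    return rate  # within the target band, leave the rate alone

rate = 10
for observed in (0.95, 0.97, 0.70, 0.30):  # hypothetical utilization readings
    rate = adjust_arrival_rate(rate, observed)
    print(observed, rate)
```

The positive branch is the riskier one: raising the arrival rate too eagerly can push the system past saturation, which is one reason such loops are harder to implement.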
Feedback Loops (continued)
Figure 12.3: Negative feedback loop
Feedback Loops (continued)
Figure 12.4: Positive feedback loop
Monitoring
• Implemented using hardware or software
– Hardware monitors are more expensive
• Have minimum impact on system because they’re
outside of it and attached electronically
– Examples: counters, clocks, and comparators
– Software monitors are relatively inexpensive
• Can distort results of analysis because they become
part of system
• Must be developed for each specific system
• Difficult to move from system to system
Monitoring (continued)
• In early systems, performance measurements
monitored only CPU speed
• Today’s measurements include other hardware
units, the OS, compilers, and other system software
• Measurements are made in a variety of ways
– Using real programs, usually production programs
• Run with different configurations of CPUs, operating
systems, and other components
• Results are called benchmarks
– Using simulation models
Monitoring (continued)
• Benchmarks demonstrate specific advantages of
a new CPU, operating system, compiler, or piece of
hardware
– Useful when comparing systems that have gone
through extensive changes
– Results highly dependent upon:
• System’s workload
• System’s design and implementation
• Specific requirements of applications loaded on
system
Monitoring (continued)
Table 12.4: Benchmarking results
Monitoring (continued)
Table 12.4 (continued): Benchmarking results
Accounting
• Pays bills and keeps system financially operable
• In a single-user environment, it is easy to calculate
cost of system
• In a multiuser environment, computer costs are
usually distributed among users
– Based on how much each one uses system’s
resources
Accounting (continued)
• For distribution, operating system must be able to:
– Set up user accounts
– Assign passwords
– Identify which resources are available to each user
– Define quotas for available resources, such as disk
space or maximum CPU time allowed per job
• Pricing policies vary from system to system
Accounting (continued)
• Pricing policies include some or all of the following:
– Total amount of time spent between job submission
and completion
– CPU time, main memory usage
– Secondary storage used during program execution
– Secondary storage used during billing period
– Use of system software, number of I/O operations
– Time spent waiting for I/O completion
– Number of input records read, output records
printed, page faults
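A pricing policy built from several of these measures can be sketched as a weighted sum. The rates and usage figures below are entirely hypothetical. Amounts are kept in whole cents to avoid floating-point rounding.

```python
def compute_charge_cents(usage: dict, rates_cents: dict) -> int:
    """Sum of (units used x rate per unit) over every billed resource."""
    return sum(units * rates_cents[resource] for resource, units in usage.items())

# Hypothetical rate card, in cents per unit
rates = {"cpu_seconds": 2, "io_operations": 1, "pages_printed": 5}
# Hypothetical usage for one job
job = {"cpu_seconds": 120, "io_operations": 500, "pages_printed": 10}
print(compute_charge_cents(job, rates))  # prints 790
```

Changing one rate is how a pricing policy can steer users: lowering the rate on a plentiful resource makes it the cheaper choice.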
Accounting (continued)
• Pricing policies often used as a way to achieve
specific operational goals
• Pricing incentives can be used to encourage users
to access more plentiful and cheap resources
• Method of maintaining billing information depends
on environment
• Maintaining billing records online:
– Status of each user can be checked before the
user’s job is allowed to enter the READY queue
– Results in increased overhead
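The online status check can be sketched as a gate in front of the READY queue. The account fields and the quota rule here are assumptions for illustration, not a description of any particular operating system.

```python
def may_enter_ready_queue(account: dict, job_cpu_estimate: int) -> bool:
    """Admit the job only if the account is active and the job fits the CPU quota."""
    if not account["active"]:
        return False
    return account["cpu_used"] + job_cpu_estimate <= account["cpu_quota"]

# Hypothetical online billing record (CPU time in seconds)
account = {"active": True, "cpu_used": 900, "cpu_quota": 1000}
print(may_enter_ready_queue(account, 50))   # fits within the quota
print(may_enter_ready_queue(account, 200))  # would exceed the quota
```

Running this check on every job submission is the extra overhead that online billing records introduce.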
Patch Management
• Systematic updating of the operating system and
other system software
• Patch: Piece of programming code that replaces or
changes code that makes up software
• Primary reasons for software patches:
– Need for vigilant security precautions against attacks
– Need to assure system compliance with government
regulations
– Need to keep systems running at peak efficiency
• Patch management ranks among the top eight most-used technologies
Patch Management (continued)
Table 12.5: 2004 E-Crime Watch survey results of
security and law enforcement executives
Patching Fundamentals
• Essential steps to take before patch installation:
1. Identify the required patch
2. Verify patch’s source and integrity
3. Test patch in a safe environment
4. Deploy patch throughout system
5. Audit system to gauge success of patch
deployment
• Never patch operating system without a recent
data backup in hand
Patching Fundamentals (continued)
• Patch Availability: Identify the criticality of patch
– If critical, plan to apply patch as soon as possible
– If not critical, possible to delay installation until a
regular patch cycle begins
• Patch Integrity: Validate source and integrity
– Check the digital signature, or use the validation
tool that comes with the software
– Validate digital signature used by vendor to send
new software on a regular basis
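One simple integrity check is to compare a downloaded patch against a checksum published by the vendor. The sketch below uses a SHA-256 checksum, which is a simpler technique than verifying a digital signature. It assumes the vendor publishes such a checksum through a trusted channel, and the patch bytes shown are placeholder data.

```python
import hashlib
import hmac

def checksum_matches(patch_bytes: bytes, published_sha256_hex: str) -> bool:
    """True if the patch's SHA-256 digest equals the vendor's published value."""
    actual = hashlib.sha256(patch_bytes).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(actual, published_sha256_hex.lower())

# Placeholder data standing in for a downloaded patch file
patch = b"example patch contents"
published = hashlib.sha256(patch).hexdigest()  # in practice, copied from the vendor

print(checksum_matches(patch, published))
print(checksum_matches(patch + b"tampered", published))
```

A matching checksum shows that the file was not altered in transit. It says nothing about who produced the file, which is what a digital signature adds.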
Patching Fundamentals (continued)
• Patch Testing: Test new patch on a sample
system or an isolated machine
– Test to see:
• If system reboots after patch is installed
• If patched software performs its assigned tasks
– Test system should resemble the complexity of the
target network
– Test contingency plans to uninstall patch and
recover old software if something goes wrong
Patching Fundamentals (continued)
• Patch Deployment: Installation of patch
– On single-user computer, patch deployment is a
simple task
• Install software and reboot computer
– On multiplatform system with many users, task is
exceptionally complicated
• Should have an accurate inventory of all hardware
and software
– Can be gleaned from network mapping software
• Can launch deployment in stages
Patching Fundamentals (continued)
• Audit the Finished System: Confirm resulting
system meets expectations
– Verify that all computers are patched correctly and
perform fundamental tasks as expected
– Verify that no users had unauthorized versions of
software on their computers, which would make them
ineligible for the patch
– Verify that no users are left with unpatched software
on their computers
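Part of the audit can be automated by comparing an inventory of installed versions against the expected patched version. The host names and version strings below are made up, and the inventory format is an assumption.

```python
def unpatched_hosts(inventory: dict, expected_version: str) -> list:
    """Return hosts whose installed version differs from the expected one, sorted by name."""
    return sorted(host for host, version in inventory.items()
                  if version != expected_version)

# Hypothetical inventory gathered after deployment
inventory = {
    "ws-01": "4.2.1",
    "ws-02": "4.2.0",   # patch not applied
    "srv-01": "4.2.1",
}
print(unpatched_hosts(inventory, "4.2.1"))
```

An empty result only confirms the version numbers. Checking that each machine still performs its fundamental tasks requires separate functional tests.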
Patching Fundamentals (continued)
• Audit the Finished System (continued)
– Should include:
• Documentation of changes made to system
• Success or failure of each stage of process
– Keep a log of all system changes for future
reference
– Get feedback from users to verify deployment’s
success
Software Options
• Patches can be managed in two ways:
– Manually, one at a time
– Automatically using software
• Deployment software falls into two groups:
– Agent-based software
• Agent must be installed on all target systems before
patches can be deployed
– Agentless software
• Attractive for administrators of large, complex networks
Timing the Patch Cycle
• Critical patches must be applied immediately
• Less-critical patches can be scheduled at
convenience of systems group
• Routine patches can be:
– Applied monthly or quarterly
– Timed to coincide with vendor’s service pack release
• Advantage of routine patch cycles:
– Allow for thorough review of patch and testing cycles
before deployment
Summary
• The operating system is the orchestrated
cooperation of every piece of hardware and
software
• When one part of the system is favored, it’s often at
the expense of others
• System managers must make sure they are using
appropriate measurement tools and techniques to
verify effectiveness of the system
• System managers must evaluate degree of
improvement