Outline
• Announcement
• Distributed scheduling – continued
• Quiz at the end of today’s class
Announcement
• Schedule for the rest of the semester
– 4/10: Recovery
– 4/15: Fault tolerance
– 4/17: Class evaluation; Protection and security
– 4/22: Protection and security – continued; Quiz #3
– 4/24: Existing distributed systems and review
• Final exam
– 5:30-7:30PM, April 29, 2003
– Cumulative
May 29, 2016
COP 5611 - Operating Systems
2
Motivations
Motivations – cont.
Distributed Scheduling
• A distributed scheduler is a resource
management component of a distributed
operating system that focuses on judiciously
and transparently redistributing the load of
the system among the computers to maximize
the overall performance
Components of a Load Distributing Algorithm
• Four components
– Transfer policy
• Determines when a node needs to send tasks to other
nodes or can receive tasks from other nodes
– Selection policy
• Determines which task(s) to transfer
– Location policy
• Finds suitable nodes for load sharing
– Information policy
• Decides when, from where, and what state information
about other nodes is collected
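The four policies above can be sketched as four small functions. This is a minimal illustration, not a prescribed design: the queue-length threshold, the `Node` class, and every function name are assumptions introduced here.

```python
from dataclasses import dataclass, field

THRESHOLD = 3  # hypothetical queue-length threshold


@dataclass
class Node:
    name: str
    queue: list = field(default_factory=list)  # tasks waiting for the CPU


def transfer_policy(node, threshold=THRESHOLD):
    """Classify a node as 'sender', 'receiver', or 'ok' by queue length."""
    if len(node.queue) > threshold:
        return "sender"
    if len(node.queue) < threshold:
        return "receiver"
    return "ok"


def selection_policy(node):
    """Pick the task to transfer: here, the most recently arrived one."""
    return node.queue[-1] if node.queue else None


def location_policy(sender, candidates, threshold=THRESHOLD):
    """Return the first node that can take one more task without exceeding the threshold."""
    for other in candidates:
        if other is not sender and len(other.queue) + 1 <= threshold:
            return other
    return None


def information_policy(nodes):
    """Collect load information on demand: a snapshot of every queue length."""
    return {n.name: len(n.queue) for n in nodes}
```

With a node `a` holding four tasks and a node `b` holding one, `transfer_policy` classifies `a` as a sender and `b` as a receiver, and `location_policy(a, [a, b])` returns `b`.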
Stability
• The queuing-theoretic perspective
– The CPU queues grow without bound if arrival rate is
greater than the rate at which the system can perform work
– A load distributing algorithm is effective under a given set
of conditions if it improves the performance relative to that
of a system not using load distribution
• Algorithmic stability
– An algorithm is unstable if it can perform fruitless actions
indefinitely with finite probability
• Processor thrashing
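The queuing-theoretic condition can be made concrete with a small calculation. The sketch below assumes a single M/M/1 queue, which the slides do not specify; the function names are hypothetical.

```python
def utilization(arrival_rate, service_rate):
    """Fraction of capacity in use; the queue is stable only if this is below 1."""
    return arrival_rate / service_rate


def mean_tasks_in_system(arrival_rate, service_rate):
    """Mean number of tasks in an M/M/1 system, or infinity if it is unstable."""
    rho = utilization(arrival_rate, service_rate)
    if rho >= 1:
        return float("inf")  # the queue grows without bound
    return rho / (1 - rho)
```

Any work spent on load distribution itself (polling, transferring tasks) reduces the effective service rate, so an algorithm with high overhead can push a heavily loaded system toward instability.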
Sender-Initiated Algorithms
Receiver-Initiated Algorithms
Empirical Comparison of Sender-Initiated and Receiver-Initiated Algorithms
Symmetrically Initiated Algorithms
• Sender-initiated component
– A sender broadcasts a TooHigh message, sets a
TooHigh timeout alarm, and listens for an Accept
– A receiver that receives a TooHigh message cancels
its TooLow timeout, sends an Accept message to the
sender, and increases its load value
– On receiving an Accept message, if the site is still a
sender, it chooses the best task to transfer and transfers it
– If no Accept has been received before the timeout, the
sender broadcasts a ChangeAverage message to increase the
average load estimates at the other nodes
Symmetrically Initiated Algorithms – cont.
• Receiver-initiated component
– A receiver broadcasts a TooLow message, sets a TooLow
timeout alarm, and starts listening for a TooHigh
message
– If a TooHigh message is received, it cancels its
TooLow timeout, sends an Accept message to the
sender, and increases its load value
– If no TooHigh message is received before the
timeout, the receiver broadcasts a ChangeAverage
message to decrease the average load estimates at
the other nodes
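One sender-initiated round of the protocol above can be sketched as follows. This is a simplified, synchronous illustration: real nodes exchange messages and use timers, whereas here the broadcast is a loop and the timeout is simply "no receiver answered". The `tolerance` parameter, the step of 1 in ChangeAverage, and all names are assumptions.

```python
class Node:
    def __init__(self, name, load, average, tolerance=1):
        self.name = name
        self.load = load
        self.average = average      # local estimate of the system-wide average load
        self.tolerance = tolerance  # hypothetical half-width of the acceptable range

    def role(self):
        if self.load > self.average + self.tolerance:
            return "sender"
        if self.load < self.average - self.tolerance:
            return "receiver"
        return "ok"


def balance_once(sender, others):
    """Broadcast TooHigh and return the name of the accepting node, or None.

    If nobody accepts, ChangeAverage raises every node's average estimate.
    """
    if sender.role() != "sender":
        return None
    for node in others:              # broadcast TooHigh
        if node.role() == "receiver":
            node.load += 1           # receiver sends Accept and increases its load value
            sender.load -= 1         # sender transfers one task
            return node.name
    for node in [sender, *others]:   # timeout expired: broadcast ChangeAverage
        node.average += 1
    return None
```

The receiver-initiated component is the mirror image: a node below the acceptable range announces TooLow and, if no TooHigh arrives, lowers the average estimate instead.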
Comparison
Adaptive Algorithms
• A stable symmetrically initiated algorithm
– Each node keeps a senders list, a receivers
list, and an OK list
• Nodes in the system are classified as
Sender/overloaded, Receiver/underloaded, or OK
using the information gathered through polling
A Stable Symmetrically Initiated Algorithm – cont.
• Sender-initiated component
– The sender polls the node at the head of its receivers list
– The polled node moves the sender to the head of its
senders list and sends a message indicating whether it is a
receiver, sender, or OK node
– The sender moves the polled node to the head of the
matching list based on the reply
– If the polled node is a receiver, the sender transfers a task to it
– The polling process stops if the sender's receivers list becomes
empty, or the number of polls reaches PollLimit
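The sender-initiated polling loop can be sketched as below. The value of `POLL_LIMIT`, the `Node` class, and the method names are assumptions; the polled node replies directly through a method call instead of a network message.

```python
from collections import deque

POLL_LIMIT = 3  # hypothetical value for PollLimit


class Node:
    def __init__(self, name, state):
        self.name = name
        self.state = state  # "sender", "receiver", or "ok"
        self.senders = deque()
        self.receivers = deque()
        self.ok = deque()

    def _move_to_head(self, other, target):
        """Remove `other` from whichever list holds it and put it at the head of `target`."""
        for lst in (self.senders, self.receivers, self.ok):
            if other in lst:
                lst.remove(other)
        target.appendleft(other)

    def handle_poll(self, sender):
        """Polled node: record the poller as a sender and reply with own state."""
        self._move_to_head(sender, self.senders)
        return self.state

    def sender_poll(self):
        """Poll receivers head-first; return the node that accepts a task, or None."""
        polls = 0
        while self.receivers and polls < POLL_LIMIT:
            target = self.receivers[0]
            reply = target.handle_poll(self)
            polls += 1
            lists = {"sender": self.senders, "receiver": self.receivers, "ok": self.ok}
            self._move_to_head(target, lists[reply])
            if reply == "receiver":
                return target  # a task would be transferred to this node
        return None
```

Each poll removes a non-receiver from the receivers list, so under sustained high load the list empties and polling stops, which is what keeps the algorithm from polling fruitlessly.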
A Stable Symmetrically Initiated Algorithm – cont.
• Receiver-initiated component
– The nodes are polled in the following order
• Head to tail of its senders list
• Tail to head in the OK list
• Tail to head in the receivers list
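The polling order for the receiver-initiated component can be expressed as one function. The function name is hypothetical, and each list is assumed to be ordered from head (index 0) to tail.

```python
def receiver_poll_order(senders, ok, receivers):
    """Order in which a receiver polls: senders head to tail, then the OK and
    receivers lists tail to head."""
    return list(senders) + list(reversed(ok)) + list(reversed(receivers))
```

A plausible reading of this order is that the head of each list holds the most recent information: the head of the senders list is the node most likely to still be a sender, while the tails of the other two lists hold the oldest entries, which are the most likely to have changed state since they were recorded.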
A Stable Sender-Initiated Algorithm
• This algorithm uses the sender-initiated component
of the stable symmetrically initiated algorithm
– Each node is augmented by an array called the
statevector
• It records how the node is classified (sender, receiver, or OK)
at each of the other nodes in the system
• It is updated based on the information exchanged at the polling stage
– The receiver-initiated component is replaced by the
following protocol
• When a node becomes a receiver, it informs all the nodes
that are misinformed
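A minimal sketch of the statevector bookkeeping follows. The dictionary representation, the initial assumption that every peer lists this node as a receiver, and the method names are all illustrative choices.

```python
class Node:
    def __init__(self, name, peers):
        self.name = name
        # statevector[p]: the list this node believes it occupies at peer p
        self.statevector = {p: "receiver" for p in peers}

    def record_poll(self, peer, my_state):
        """Called during polling: `peer` now files this node under `my_state`."""
        self.statevector[peer] = my_state

    def misinformed_peers(self):
        """Peers that do not currently list this node as a receiver."""
        return [p for p, s in self.statevector.items() if s != "receiver"]

    def become_receiver(self):
        """Notify only the misinformed peers and return who was notified."""
        notified = self.misinformed_peers()
        for p in notified:
            self.statevector[p] = "receiver"
        return notified
```

Because only misinformed peers are contacted, a node that becomes a receiver sends at most one message per peer and none to peers that already list it correctly.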
Comparison
Performance Under Heterogeneous Workloads
Selecting a Suitable Load Sharing Algorithm
• The best algorithm depends on the system
under consideration
– For example, if the system never attains high
loads, sender-initiated algorithms will give
improved performance
– Stable scheduling algorithms should be used for
systems that can reach high loads
– For systems with heterogeneous workloads,
adaptive stable algorithms are preferable
Other Requirements of Load Distributing
• Scalability
– The algorithm should work well in large
distributed systems
• Location transparency
• Determinism
• Preemption
• Heterogeneity
Case Studies
• The V-System
• The Sprite system
• Condor system
• The Stealth distributed scheduler
Task Placement vs. Task Migration
• Task placement vs. task migration
– Task placement refers to the transfer of a task
that is yet to begin execution to a new location
and starting its execution there
– Task migration refers to the transfer of a task that
has already begun execution to a new location
and continuing its execution there
Task Migration
• State transfer
– The task’s state includes the content of registers,
the task stack, the task’s status, virtual memory
address space, file descriptors, any temporary
files and buffered messages
– In addition, the state may include the current working
directory, signal masks and handlers, resource usage
statistics, and references to children and parent processes
• Unfreeze
– The task is installed at the new machine and is
put in the ready queue
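The two steps above can be sketched with a small in-memory model. The `TaskState` fields are a subset of the state listed on the slide; the `Machine` class, the explicit freeze step, and the use of a plain dictionary as the "wire" format are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class TaskState:
    registers: dict
    stack: list
    status: str
    cwd: str = "/"
    open_files: list = field(default_factory=list)


class Machine:
    def __init__(self, name):
        self.name = name
        self.ready_queue = []


def migrate(task, source, destination):
    """Freeze the task, copy its state to the destination, and unfreeze it there."""
    source.ready_queue.remove(task)  # freeze: the source stops scheduling the task
    task.status = "frozen"
    wire = asdict(task)              # state transfer: a deep copy stands in for the network
    moved = TaskState(**wire)
    moved.status = "ready"           # unfreeze: install in the destination's ready queue
    destination.ready_queue.append(moved)
    return moved
```

The time between the freeze and the unfreeze is the period in which the task makes no progress, so the size of the transferred state directly affects the cost of migration.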
Issues in Task Migration
• State transfer
– The cost to support remote execution, including
delays due to freezing the tasks, obtaining and
transferring the state, and unfreezing the task
– Residual dependencies
• Transferring pages in the virtual memory space
• Redirection of messages
• Location-dependent system calls
• Residual dependencies are undesirable
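The trade-off between state-transfer cost and residual dependencies can be illustrated with a simple cost model. The model is an assumption, not a measurement: it treats the delay as freeze time plus transfer time plus unfreeze time, and the hypothetical `eager_fraction` parameter is the share of the state copied before the task resumes.

```python
def migration_delay(freeze_s, state_bytes, bandwidth_bytes_per_s, unfreeze_s,
                    eager_fraction=1.0):
    """Return (delay in seconds, bytes left behind on the source machine)."""
    sent_now = state_bytes * eager_fraction
    residual = state_bytes - sent_now  # state the task still depends on at the source
    delay = freeze_s + sent_now / bandwidth_bytes_per_s + unfreeze_s
    return delay, residual
```

Copying less state up front shortens the delay but leaves more state on the source machine, which is exactly the kind of residual dependency the slide calls undesirable: the task keeps depending on the source after it has moved.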
State Transfer Mechanisms
Location Transparency
• Location transparency is essential to support
task migration
– Task migration should hide the locations of tasks
– Remote execution of tasks should not require any
special provisions in programs
– These require that names be independent of their
locations
• Addresses are maintained as hints
• An object can be accessed through pointers
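Treating addresses as hints can be sketched as follows. The `NameService` and `Client` classes are hypothetical: a client caches the last known host of a task, uses it if it is still correct, and otherwise falls back to an authoritative lookup.

```python
class NameService:
    """Authoritative mapping from location-independent names to hosts."""

    def __init__(self):
        self.location = {}

    def register(self, name, host):
        self.location[name] = host

    def lookup(self, name):
        return self.location[name]


class Client:
    def __init__(self, name_service, hosts):
        self.ns = name_service
        self.hosts = hosts  # host name -> set of task names currently running there
        self.hints = {}     # cached addresses, possibly stale

    def locate(self, name):
        hint = self.hints.get(name)
        if hint is not None and name in self.hosts[hint]:
            return hint                  # the hint was still valid
        host = self.ns.lookup(name)      # hint missing or stale: ask the name service
        self.hints[name] = host
        return host
```

Because the program only ever uses the name, a migration changes what `locate` returns without requiring any change in the program itself.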
Task Migration Performance
• Cost of process migration in Sprite
Task Migration Performance – cont.
• Cost of process migration in Charlotte
Summary
• Load distributing algorithms try to improve the
overall system performance by transferring
load from heavily loaded nodes to lightly
loaded or idle nodes
– Different load distributing algorithms have been
developed
– To be effective, these algorithms must be able to
collect the necessary information efficiently and
minimize the overhead of, and the delays due to,
task transfers