Outline
• Announcement
• Midterm Review
• Distributed File Systems – continued
– If we have time
Announcements
• Please turn in your homework #3 at the
beginning of class
• The midterm will be on March 20
– This coming Thursday
– It will be an open-book, open-note exam
Operating System
• An operating system is a layer of software on
a bare machine that performs two basic
functions
– Resource management
• To manage resources so that they are used in an
efficient and fair manner
– User friendliness
Distributed Systems
• A distributed system is a collection of
independent computers that appears to its
users as a single coherent system
– Independent means that the computers share
neither memory nor a clock
– The computers communicate with each other by
exchanging messages over a communication
network
Distributed Systems – cont.
• Advantages
– The computing power of a group of cheap
workstations can be enormous
• Decisive price/performance advantage over traditional
time-sharing systems
– Resource sharing
– Enhanced performance
– Improved reliability and availability
– Modular expandability
Distributed System Architecture
• Distributed systems are often classified based
on the hardware
– Multiprocessor systems
– Homogeneous multi-computer systems
– Heterogeneous multi-computer systems
Distributed Operating Systems
• Hardware for distributed systems is
important, but the software largely
determines what a distributed system looks
like to a user
• Distributed operating systems are much like
the traditional operating systems
– Resource management
– User friendliness
– The key concept is transparency
Distributed Operating Systems – cont.
• In a truly distributed operating system, the
user views the system as a virtual
uniprocessor system even though physically
it consists of multiple computers
– In other words, the use of multiple computers
and accessing remote data and resources should
be invisible to the user
Overview of Different Kinds of Distributed Systems
System | Description | Main Goal
DOS | Tightly-coupled operating system for multiprocessors and homogeneous multicomputers | Hide and manage hardware resources
NOS | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients
Middleware | Additional layer atop NOS implementing general-purpose services | Provide distribution transparency
Multicomputer Operating Systems
• General structure of a multicomputer operating system
Network Operating System
Middleware and Openness
• In an open middleware-based distributed system, the protocols
used by each middleware layer should be the same, as well as
the interfaces they offer to applications.
Comparison Between Systems
Item | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS | Middleware-based OS
Degree of transparency | Very High | High | Low | High
Same OS on all nodes | Yes | Yes | No | No
Number of copies of OS | 1 | N | N | N
Basis for communication | Shared memory | Messages | Files | Model specific
Resource management | Global, central | Global, distributed | Per node | Per node
Scalability | No | Moderately | Yes | Varies
Openness | Closed | Closed | Open | Open
Issues in Distributed Operating Systems
• Absence of global knowledge
– In a distributed system, due to the unavailability
of a global memory and a global clock and due to
unpredictable message delays, it is practically
impossible to for a computer to collect up-to-date
information about the global state of the
distributed system
– Therefore a fundamental problem is to develop
efficient techniques to implement decentralized,
system-wide control
– Another problem is how to order all the events
Issues in Distributed Operating Systems – cont.
• Naming
– Plays an important role in achieving location
transparency
– A name service maps a logical name into a
physical address by making use of a table lookup,
an algorithm, or a combination of both
– In distributed systems, the tables may be
replicated and stored at many places
• Consider naming in a distributed file system
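As a rough Python sketch of the table-lookup idea (the NameService class and its dict-backed table are illustrative, not any particular system's interface):

# Minimal sketch of a table-lookup name service; real systems add
# replication protocols, caching, and invalidation.
class NameService:
    def __init__(self, table=None):
        # Maps logical names to physical addresses (host, port).
        self.table = dict(table or {})

    def register(self, name, address):
        self.table[name] = address

    def resolve(self, name):
        # A distributed name service would consult a local replica
        # first and fall back to remote replicas on a miss.
        return self.table[name]

replica = NameService({"/home/alice/notes.txt": ("fileserver2", 2049)})
print(replica.resolve("/home/alice/notes.txt"))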
Issues in Distributed Operating Systems – cont.
• Scalability
– Systems generally grow with time, especially
distributed systems
– Scalability requires that the growth should not
result in system unavailability or degraded
performance
– This puts additional constraints on design
approaches
Issues in Distributed Operating Systems – cont.
• Compatibility
– Refers to the interoperability among the
resources in a system
– Three different levels
• Binary level
– All processors execute the same binary instruction repertoire
– Virtual binary level
• Execution level
– Same source code can be compiled and executed properly
• Protocol level
– A common set of protocols
Issues in Distributed Operating Systems – cont.
• Process synchronization
– The synchronization of processes in distributed
systems is difficult because of the unavailability
of shared memory
• Processes running on different computers need to
be synchronized when they try to concurrently
access a shared resource
• This is the mutual exclusion problem as in classical
operating systems
Issues in Distributed Operating Systems – cont.
• Resource management
– Resource management needs to make both local
and remote resources available to users in an
effective manner
– Data migration
• Distributed file system
• Distributed shared memory
– Computation migration
• Remote procedure call
– Distributed scheduling
Issues in Distributed Operating Systems – cont.
• Structuring
– A distributed operating system places additional
constraints on the structure of the underlying
operating system
– The collective kernel structure
• An operating system is structured as a collection of
processes that are largely independent of each other
– Object-oriented operating system
• The operating system’s services are implemented as
objects
Clients and Servers
• General interaction between a client and a server.
Layered Protocols
• Layers, interfaces, and protocols in the OSI model.
Network Layer
• The primary task of the network layer is routing
• The most widely used network protocol is the
connectionless IP (Internet Protocol)
– Each IP packet is routed to its destination
independent of all others
• A connection-oriented protocol is gaining
popularity
– Virtual channel in ATM networks
Transport Layer
• This layer is the last part of a basic network protocol
stack
– In other words, this layer can be used by application
developers
• An important aspect of this layer is to provide
end-to-end communication
– The Internet transport protocol is called TCP (Transmission
Control Protocol)
– The Internet protocol suite also supports a connectionless
transport protocol called UDP (User Datagram Protocol)
Sockets
• Socket primitives for TCP/IP:
Primitive | Meaning
Socket | Create a new communication endpoint
Bind | Attach a local address to a socket
Listen | Announce willingness to accept connections
Accept | Block caller until a connection request arrives
Connect | Actively attempt to establish a connection
Send | Send some data over the connection
Receive | Receive some data over the connection
Close | Release the connection
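A minimal sketch of these primitives using Python's standard socket module (the port number is arbitrary):

import socket

def server():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
    s.bind(("localhost", 5611))                            # bind
    s.listen(1)                                            # listen
    conn, addr = s.accept()                                # accept (blocks)
    data = conn.recv(1024)                                 # receive
    conn.send(data)                                        # send (echo back)
    conn.close()                                           # close
    s.close()

def client():
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
    c.connect(("localhost", 5611))                         # connect
    c.send(b"hello")                                       # send
    print(c.recv(1024))                                    # receive
    c.close()                                              # close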
Sockets – cont.
• Connection-oriented communication pattern using sockets.
Socket Programming
• Review
– IP
– TCP
– UDP
– Port
• Server Design Issues
– Iterative vs. concurrent server
– Stateless vs. stateful server
– Multithreaded server
A Multithreaded Server
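One possible shape of such a server, sketched with Python's standard socketserver module, which dedicates a worker thread to each connection:

import socketserver

class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Each connection is served by its own worker thread.
        for line in self.rfile:
            self.wfile.write(line)

if __name__ == "__main__":
    with socketserver.ThreadingTCPServer(("localhost", 5611), EchoHandler) as srv:
        srv.serve_forever()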
The Message Passing Model
• The message passing model provides two
basic communication primitives
– Send and receive
– Send has two logical parameters, a message and
its destination
– Receive has two logical parameters, the source
and a buffer for storing the message
Semantics of Send and Receive Primitives
• There are several design issues regarding send and
receive primitives
– Buffered or unbuffered
– Blocking vs. non-blocking primitives
• With blocking primitives, send does not return control
until the message has been sent (or received), and receive
does not return control until a message is copied into the buffer
• With non-blocking primitives, send returns control as soon as
the message is copied out, and receive merely signals its
intention to receive a message and provides a buffer for it
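A small illustration of the two receive semantics, using a local queue.Queue as a stand-in for a buffered message channel:

import queue

channel = queue.Queue()             # buffered channel
channel.put("m1")                   # send copies into the buffer

msg = channel.get(block=True)       # blocking receive: waits for a message

try:
    msg = channel.get(block=False)  # non-blocking receive
except queue.Empty:
    msg = None                      # no message available yet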
Semantics of Send and Receive Primitives – cont.
• Synchronous vs. asynchronous primitives
– With synchronous primitives, a SEND primitive
is blocked until a corresponding RECEIVE
primitive is executed
– With asynchronous primitives, a SEND primitive
does not block if there is no corresponding
execution of a RECEIVE primitive
• The messages are buffered
Remote Procedure Call
• RPC is designed to hide all the details from
programmers
– It overcomes the difficulties of the
message-passing model
• It extends the conventional local procedure
calls to calling procedures on remote computers
Steps of a Remote Procedure Call
Remote Procedure Call – cont.
• Design issues
– Structure
• Mostly based on stub procedures
– Binding
• Through a binding server
• The client specifies the machine and service required
– Parameter and result passing
• Representation issues
• By value and by reference
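A minimal stub-based round trip, sketched with Python's standard xmlrpc modules (host and port are arbitrary):

from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):            # the remote procedure
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add)
# server.serve_forever()  # run this in the server process

# Client side (in another process): ServerProxy is the client stub.
# from xmlrpc.client import ServerProxy
# proxy = ServerProxy("http://localhost:8000")
# print(proxy.add(2, 3))  # looks like a local call; executes remotely

Note that xmlrpc marshals parameters by value; call-by-reference requires extra machinery such as remote object references.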
Remote Object Invocation
• Extend RPC principles to objects
– The key feature of an object is that it encapsulates
data (called state) and the operations on those data
(called methods)
– Methods are made available through an interface
– The separation between interfaces and the objects
implementing these interfaces allows us to place an
interface at one machine, while the object itself
resides on another machine
Distributed Objects
• Common organization of a remote object with
client-side proxy.
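A schematic client-side proxy in Python (send_request is a hypothetical transport function that marshals the invocation):

class Proxy:
    def __init__(self, object_id, send_request):
        self._object_id = object_id
        self._send = send_request

    def __getattr__(self, method):
        def invoke(*args):
            # Marshal the invocation, ship it to the server holding
            # the object, and return the unmarshalled result.
            return self._send(self._object_id, method, args)
        return invoke

# stack = Proxy("stack-42", send_request)
# stack.push(7)   # looks local; actually invoked on the remote object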
Inherent Limitations of a Distributed System
• Absence of a global clock
– In a centralized system, time is unambiguous
– In a distributed system, there is no system-wide
common clock
• In other words, the notion of global time does not
exist
– Impact of the absence of global time
• Difficult to reason about temporal order of events
• Makes it harder to collect up-to-date information on
the state of the entire system
Inherent Limitations of a Distributed System
• Absence of shared memory
– An up-to-date state of the entire system is not
available to any individual process
• This information, however, is necessary for reasoning
about the system's behavior, for debugging, and for
recovering from failures
Lamport’s Logical Clocks
• Logical clocks
– For a wide class of algorithms, what matters is the
internal consistency of clocks, not whether they
are close to the real time
– For these algorithms, the clocks are often called
logical clocks
• Lamport proposed a scheme to order events
in a distributed system using logical clocks
Lamport’s Logical Clocks – cont.
• Definitions
– Happened before relation
• The happened-before relation (→) captures the causal
dependencies between events
• It is defined as follows
– a → b, if a and b are events in the same process and a
occurred before b
– a → b, if a is the event of sending a message m in a process
and b is the event of receipt of the same message m by
another process
– If a → b and b → c, then a → c, i.e., “→” is transitive
Lamport’s Logical Clocks – cont.
• Definitions – continued
– Causally related events
• Event a causally affects event b if a → b
– Concurrent events
• Two distinct events a and b are said to be concurrent
(denoted by a || b) if a ↛ b and b ↛ a
• For any two events, either a → b, b → a, or a || b
Lamport’s Logical Clocks – cont.
• Implementation rules
– [IR1] Clock Ci is incremented between any two
successive events in process Pi
Ci := Ci + d ( d > 0)
– [IR2] If event a is the sending of message m by
process Pi, then message m is assigned a
timestamp tm = Ci(a). On receiving the same
message m by process Pj, Cj is set to
Cj := max(Cj, tm + d)
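These two rules fit in a few lines of Python; this sketch fixes d = 1:

class LamportClock:
    def __init__(self):
        self.c = 0

    def tick(self):              # IR1: local event, d = 1
        self.c += 1
        return self.c

    def send(self):              # IR2: timestamp an outgoing message
        return self.tick()

    def receive(self, tm):       # IR2: Cj := max(Cj, tm + d), d = 1
        self.c = max(self.c, tm + 1)
        return self.c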
Total Ordering Using Lamport’s Clocks
• If a is any event at process Pi and b is any
event at process Pj, then a ⇒ b if and only if either
Ci(a) < Cj(b), or
Ci(a) = Cj(b) and Pi ≺ Pj
– where ≺ is any arbitrary relation that totally
orders the processes to break ties
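For example, with events tagged (timestamp, process id), the tie-breaking comparison is simply:

def totally_before(a, b):
    # a, b are (lamport_timestamp, process_id) pairs
    (ca, pa), (cb, pb) = a, b
    return ca < cb or (ca == cb and pa < pb)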
A Limitation of Lamport’s Clocks
• In Lamport’s system of logical clocks
– If a → b, then C(a) < C(b)
– The converse is not necessarily true if the events
have occurred in different processes
Vector Clocks
• Implementation rules
– [IR1] Clock Ci is incremented between any two
successive events in process Pi
Ci[i] := Ci[i] + d ( d > 0)
– [IR2] If event a is the sending of message m by
process Pi, then message m is assigned a
timestamp tm = Ci(a). On receiving the same
message m by process Pj, Cj is set to
Cj[k] := max(Cj[k], tm[k])
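A minimal Python sketch with d = 1; the receive handler also applies IR1 to the receive event (a common convention), and the comparison functions anticipate the assertion below:

class VectorClock:
    def __init__(self, n, i):
        self.i = i                       # this process's index
        self.v = [0] * n

    def tick(self):                      # IR1: local or send event
        self.v[self.i] += 1
        return list(self.v)              # timestamp tm for a message

    def receive(self, tm):               # IR2: component-wise max
        self.v = [max(x, y) for x, y in zip(self.v, tm)]
        self.v[self.i] += 1              # IR1 applied to the receive event

def happened_before(ta, tb):
    # a -> b iff ta < tb: ta <= tb component-wise and ta != tb
    return all(x <= y for x, y in zip(ta, tb)) and ta != tb

def concurrent(ta, tb):
    return not happened_before(ta, tb) and not happened_before(tb, ta)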
Vector Clocks – cont.
• Assertion
– At any instant, ∀i, j : Ci[i] ≥ Cj[i]
• Events a and b are causally related if ta < tb or
tb < ta; otherwise, these events are concurrent
• In a system of vector clocks, a → b iff ta < tb
Causal Ordering of Messages
• The causal ordering of messages tries to
maintain the same causal relationship that
holds among “message send” events with the
corresponding “message receive” events
– In other words, if Send(M1) → Send(M2), then
Receive(M1) → Receive(M2)
– This is different from causal ordering of events
Causal Ordering of Messages – cont.
• The basic idea is simple
– Deliver a message only when no causality
constraints are violated
– Otherwise, the message is not delivered
immediately but is buffered until all the
preceding messages are delivered
Birman-Schiper-Stephenson Protocol
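The heart of the protocol is its delivery test; a sketch, assuming each broadcast message carries the sender's vector of per-site message counts:

# Process j delivers a message from process i carrying vector
# timestamp tm only when it is the next message expected from i and
# every message the sender had already delivered has arrived at j.
def can_deliver(tm, i, local):
    # local = receiver's vector of delivered-message counts per site
    if tm[i] != local[i] + 1:            # next expected from sender i?
        return False
    return all(tm[k] <= local[k]         # no missing causal predecessors
               for k in range(len(tm)) if k != i)

# Messages failing the test are buffered and retried after each delivery.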
Schiper-Eggli-Sandoz Protocol
Local State
• Local state
– For a site Si, its local state at a given time is
defined by the local context of the distributed
application, denoted by LSi.
• More notations
– mij denotes a message sent by Si to Sj
– send(mij) and rec(mij) denote the corresponding
sending and receiving events
Definitions – cont.
• Strongly consistent global state
– A global state is strongly consistent
if it is consistent and transitless
Chandy-Lamport’s Global State Recording Algorithm
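A sketch of the marker rules (record_local_state and send_marker are assumed runtime primitives):

class SnapshotProcess:
    def __init__(self, incoming, outgoing):
        self.incoming, self.outgoing = incoming, outgoing
        self.recorded = False
        self.recording = {}
        self.channel_state = {}

    def take_snapshot(self, first_channel=None):
        # First marker (or local initiation): record local state, mark
        # the arriving channel empty, start recording all other incoming
        # channels, and send a marker on every outgoing channel.
        self.recorded = True
        self.local_snapshot = self.record_local_state()
        self.channel_state = {c: [] for c in self.incoming}
        self.recording = {c: c != first_channel for c in self.incoming}
        for c in self.outgoing:
            self.send_marker(c)

    def on_marker(self, channel):
        if not self.recorded:
            self.take_snapshot(first_channel=channel)
        else:
            self.recording[channel] = False   # channel state is now final

    def on_message(self, channel, msg):
        if self.recorded and self.recording.get(channel, False):
            self.channel_state[channel].append(msg)  # in-transit message

    # Stand-ins for the process runtime's primitives:
    def record_local_state(self):
        return {}

    def send_marker(self, channel):
        pass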
Cuts of a Distributed Computation
• A cut is a graphical representation of a global
state
– A consistent cut is a graphical representation of a
consistent global state
• Definition
– A cut of a distributed computation is a set C = {c1,
c2, ..., cn}, where ci is a cut event at site Si in
the history of the distributed computation
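With vector clocks, consistency of a cut can be checked directly; a sketch:

# A cut is consistent iff no cut event happened before another, i.e.,
# no site's cut reflects more of site i's history than ci itself does.
def consistent_cut(cut_timestamps):
    # cut_timestamps[i] = vector timestamp of the cut event at site i
    n = len(cut_timestamps)
    return all(cut_timestamps[i][i] >= cut_timestamps[j][i]
               for i in range(n) for j in range(n))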
The Critical Section Problem
• When processes (centralized or distributed)
interact through shared resources, the
integrity of the resources may be violated if
the accesses are not coordinated
– The resources may not record all the changes
– A process may obtain inconsistent values
– The final state of the shared resource may be
inconsistent
Mutual Exclusion
• One solution to the problem is to allow at most
one process at a time to access the
shared resources
– This solution is known as mutual exclusion
– A critical section is a code segment in a process
in which shared resources are accessed
• A process can have more than one critical section
• There are problems which involve shared resources
where mutual exclusion is not the optimal solution
The Structure of Processes
• Structure of process Pi
repeat
entry section
critical section
exit section
remainder section
until false;
Requirements of Mutual Exclusion Algorithms
• Freedom from deadlocks
– Two or more sites should not endlessly wait for messages
• Freedom from starvation
– No site should wait indefinitely to execute its critical section
• Fairness
– Requests are executed in the order of their logical-clock timestamps
• Fault tolerance
– The algorithm continues to work when some failures occur
Performance Measure for Distributed Mutual Exclusion
• The number of messages per CS invocation
• Synchronization delay
– The time required after a site leaves the CS and
before the next site enters the CS
– System throughput = 1/(sd + E), where sd is the
synchronization delay and E the average CS
execution time (e.g., sd = 2 ms and E = 8 ms give a
throughput of 100 CS executions per second)
• Response time
– The time interval a request waits for its CS
execution to be over after its request messages have
been sent out
A Centralized Algorithm
• It is a simple solution
– One site, called the control site, is responsible for
granting permission to the CS execution
– To request the CS, a site sends a REQUEST
message to the control site
• When a site is done with CS execution, it sends a
RELEASE message to the control site
– The control site queues up the requests for the CS
and grants them permission one at a time
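A sketch of the control site's bookkeeping (send_grant stands in for a real GRANT message):

from collections import deque

class ControlSite:
    def __init__(self):
        self.queue = deque()
        self.busy = False

    def handle_request(self, site):
        if self.busy:
            self.queue.append(site)      # wait until the CS frees up
        else:
            self.busy = True
            self.send_grant(site)

    def handle_release(self):
        if self.queue:
            self.send_grant(self.queue.popleft())
        else:
            self.busy = False

    def send_grant(self, site):
        print(f"GRANT -> {site}")        # stand-in for a real message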
Distributed Solutions
• Non-token-based algorithms
– Use timestamps to order requests and resolve
conflicts between simultaneous requests
– Lamport’s algorithm and Ricart-Agrawala
Algorithm
• Token-based algorithms
– A unique token is shared among the sites
– A site is allowed to enter the CS if it possesses the
token, and it continues to hold the token until its CS
execution is over; then it passes the token to the
next site
Lamport’s Distributed Mutual Exclusion Algorithm
• This algorithm is based on the total ordering using
Lamport’s clocks
– Each process keeps a Lamport logical clock
• Each process is associated with a unique id that can be used
to break the ties
– In the algorithm, each process keeps a queue,
request_queuei, which contains mutual exclusion
requests ordered by their timestamp and associated id
– The request set Ri of each process consists of all the processes
– The communication channel is assumed to be FIFO
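A compressed sketch of site i's logic (send and broadcast are stand-in messaging primitives):

class LamportMutex:
    def __init__(self, i, sites):
        self.i, self.sites = i, sites
        self.clock = 0
        self.queue = []          # request_queue_i, sorted by (ts, id)
        self.latest = {}         # latest timestamp received per site

    def request_cs(self):
        self.clock += 1
        self.queue.append((self.clock, self.i))
        self.queue.sort()
        self.broadcast(("REQUEST", (self.clock, self.i)))

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts[0]) + 1
        self.queue.append(ts)
        self.queue.sort()
        self.latest[sender] = ts[0]
        self.send(sender, ("REPLY", self.clock))

    def on_reply(self, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.latest[sender] = ts

    def on_release(self, sender):
        self.queue = [r for r in self.queue if r[1] != sender]

    def can_enter_cs(self):
        # Own request heads the queue, and a later-stamped message has
        # arrived from every other site (FIFO channels guarantee no
        # earlier request is still in flight).
        head = self.queue and self.queue[0][1] == self.i
        return head and all(self.latest.get(s, 0) > self.queue[0][0]
                            for s in self.sites if s != self.i)

    def release_cs(self):
        self.queue = [r for r in self.queue if r[1] != self.i]
        self.broadcast(("RELEASE", self.i))

    def send(self, site, msg): pass      # stand-in messaging primitives
    def broadcast(self, msg): pass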
Ricart-Agrawala Algorithm
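The algorithm defers replies to lower-priority requests instead of using RELEASE messages, needing 2(N − 1) messages per CS invocation; a sketch with stand-in messaging primitives:

class RicartAgrawala:
    def __init__(self, i, sites):
        self.i, self.sites = i, sites
        self.clock = 0
        self.requesting = False
        self.my_ts = None
        self.replies = set()
        self.deferred = []               # replies withheld until we exit

    def request_cs(self):
        self.clock += 1
        self.requesting, self.my_ts = True, (self.clock, self.i)
        self.replies.clear()
        self.broadcast(("REQUEST", self.my_ts))

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts[0]) + 1
        if self.requesting and self.my_ts < ts:
            self.deferred.append(sender)  # our request has priority
        else:
            self.send(sender, "REPLY")

    def on_reply(self, sender):
        self.replies.add(sender)
        # Enter the CS once all N - 1 replies have arrived.

    def release_cs(self):
        self.requesting = False
        for s in self.deferred:
            self.send(s, "REPLY")
        self.deferred.clear()

    def send(self, site, msg): pass      # stand-in messaging primitives
    def broadcast(self, msg): pass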
A Simple Token Ring Algorithm
• When the ring is initialized, one process is given
the token
• The token circulates around the ring
– It is passed from process k to process k + 1 (modulo the ring size)
– When a process acquires the token from its
neighbor, it checks to see if it is waiting to enter its
critical section
• If so, it enters its CS
– When exiting from its CS, it passes the token to the next process
• Otherwise, it passes the token to the next process
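A toy single-process simulation of the ring's behavior:

def simulate_ring(wants_cs, rounds=1):
    """wants_cs[k] is True if process k currently wants its CS."""
    n = len(wants_cs)
    token = 0
    for _ in range(rounds * n):
        if wants_cs[token]:
            print(f"process {token} enters and exits its CS")
            wants_cs[token] = False
        token = (token + 1) % n          # pass the token around the ring

simulate_ring([False, True, False, True])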
Suzuki-Kasami’s Algorithm
• Data structures
– Each site maintains a vector containing the largest
request sequence number received so far from each site
– The token consists of a queue of requesting sites
and an array of integers recording, for each site, the
sequence number of the request that the site
executed most recently
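A sketch of this bookkeeping (message plumbing omitted):

from collections import deque

class Token:
    def __init__(self, n):
        self.queue = deque()     # requesting sites, in arrival order
        self.LN = [0] * n        # last request number executed per site

class SuzukiKasami:
    def __init__(self, i, n):
        self.i, self.n = i, n
        self.RN = [0] * n        # highest request number seen per site

    def request_cs(self):
        self.RN[self.i] += 1
        return ("REQUEST", self.i, self.RN[self.i])   # broadcast this

    def on_request(self, j, sn):
        self.RN[j] = max(self.RN[j], sn)
        # An idle token holder sends the token to j when the request
        # is outstanding: RN[j] == token.LN[j] + 1.

    def on_exit_cs(self, token):
        # Executed by the token holder when it leaves the CS.
        token.LN[self.i] = self.RN[self.i]    # my request is now served
        for k in range(self.n):
            if k not in token.queue and self.RN[k] == token.LN[k] + 1:
                token.queue.append(k)         # k still waits for the CS
        # Pass the token to token.queue.popleft() if the queue is nonempty.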
Distributed Deadlock Detection
• In distributed systems, the system state can
be represented by a wait-for graph (WFG)
– In WFG, nodes are processes and there is a
directed edge from node P1 to node P2 if P1 is
blocked and is waiting for P2 to release some
resource
– The system is deadlocked if there is a directed
cycle or knot in its WFG
– The problem is how to maintain the WFG and to
detect cycles/knots in the graph
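For the single-request model, deadlock detection reduces to cycle detection; a sketch:

# Deadlock test on a wait-for graph via depth-first search for a cycle
# (sufficient for single-resource requests; knots matter for OR models).
def has_cycle(wfg):
    """wfg: dict mapping each process to the processes it waits for."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {p: WHITE for p in wfg}

    def dfs(p):
        color[p] = GRAY
        for q in wfg.get(p, ()):
            if color.get(q, WHITE) == GRAY:      # back edge -> cycle
                return True
            if color.get(q, WHITE) == WHITE and dfs(q):
                return True
        color[p] = BLACK
        return False

    return any(color[p] == WHITE and dfs(p) for p in list(wfg))

print(has_cycle({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))  # True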
Distributed Deadlock Detection – cont.
• Centralized detection algorithms
• Distributed deadlock algorithms
– Path-pushing
– Edge-chasing
– Diffusion computation
– Global state detection
– You need to know the basic ideas but not the
details of those algorithms
Agreement Protocols
• In distributed systems, sites are often required to
reach mutual agreement
– In distributed database systems, data managers must
agree on whether to commit or to abort a transaction
– Reaching an agreement requires that the sites have
knowledge about values at other sites
• Agreement when the system is free from failures
• Agreement when the system is prone to failure
Agreement Problems
• There are three well known agreement problems
– Byzantine agreement problem
– Consensus problem
– Interactive consistency problem
Lamport-Shostak-Pease Algorithm
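A compact simulation sketch of the recursive Oral Messages algorithm OM(m); the traitor behavior here is just one illustrative choice:

from collections import Counter

def majority(values, default="RETREAT"):
    top = Counter(values).most_common(2)
    if len(top) > 1 and top[0][1] == top[1][1]:
        return default                    # tie -> agreed default value
    return top[0][0]

def om(commander, lieutenants, value, m, traitors):
    """Return the value each lieutenant decides for this commander."""
    sent = {}
    for k, l in enumerate(lieutenants):
        if commander in traitors:
            # A traitorous commander sends inconsistent orders.
            sent[l] = "ATTACK" if k % 2 == 0 else "RETREAT"
        else:
            sent[l] = value
    if m == 0:
        return sent
    # Each lieutenant relays its received value to the others via OM(m-1).
    relayed = {l: om(l, [x for x in lieutenants if x != l],
                     sent[l], m - 1, traitors)
               for l in lieutenants}
    return {l: majority([sent[l]] +
                        [relayed[o][l] for o in lieutenants if o != l])
            for l in lieutenants}

# n = 4, m = 1 tolerates one traitor: the loyal lieutenants agree.
print(om("C", ["L1", "L2", "L3"], "ATTACK", 1, traitors={"L2"}))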