Uploaded by Sarthak Gaur

AOS Notes 1

advertisement
Theoretical Foundations of Distributed Operating
Systems
Neeraj Mittal
CS/SE 6378 (Advanced Operating Systems)
The University of Texas at Dallas
June 9, 2020
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
1 / 62
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
2 / 62
Overview
Advanced Operating Systems
• Operating systems for
• Distributed systems
• Multicore systems
• Database systems
• Real-time systems
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
4 / 62
Overview
Distributed Systems
Why Distributed Systems?
• Economies of scale
• create a system with desired computational power using a large number
of cheap machines
• Modular growth
• can add more machines to the system as and when needed
• Fault tolerance
• system can continue to operate albeit with reduced performance
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
6 / 62
Overview
Distributed Systems
Two Important Characteristics
• Absence of Global Clock
• there is no common notion of time
• Absence of Shared Memory
• no process has up-to-date knowledge about the system
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
7 / 62
Overview
Distributed Systems
Absence of Global Clock
• Different processes may have different notions of time
I invented AIDS cure!
I invented AIDS cure!
Who did it first?
Patent Officer
Mars
Earth
Problem: How do we order events on different processes?
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
8 / 62
Overview
Distributed Systems
Absence of Shared Memory
• A process does not know current state of other processes
What is Harry doing
at present?
Harry
Bob
Problem: How do we obtain a coherent view of the system?
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
9 / 62
Overview
Distributed Systems
When is it possible to order two events?
• Three cases:
1
Events executed on the same process:
• if e and f are events on the same process and e occurred before f , then
e happened-before f
2
Communication events of the same message:
• if e is the send event of a message and f is the receive event of the
same message, then e happened-before f
3
Events related by transitivity:
• if event e happened-before event g and event g happened-before event
f , then e happened-before f
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
10 / 62
Ordering Events
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
11 / 62
Ordering Events
Happened-Before Relation
• Happened-before relation is denoted by →
• Illustration:
a
b
c
P1
d
P2
e
f
P3
g
h
i
• Events on the same process:
examples: a → b, a → c, d → f
• Events of the same message:
examples: b → e, f → i
• Transitivity:
examples: a → e, a → i, e → i
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
12 / 62
Ordering Events
Concurrent Events
• Events not related by happened-before relation
• Concurrency relation is denoted by k
• Illustration:
a
b
c
P1
d
P2
e
f
P3
g
h
i
• Examples: a k d, d k h, c k e
• Concurrency relation is not transitive:
example: a k d and d k c but a ∦ c
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
13 / 62
Abstract Clocks
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
14 / 62
Abstract Clocks
Different Kinds of Clocks
• Logical Clocks
• used to totally order all events
• Vector Clocks
• used to track happened-before relation
• Matrix Clocks
• used to track what other processes know about other processes
• Direct Dependency Clocks
• used to track direct causal dependencies
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
15 / 62
Abstract Clocks
Logical Clock
• Implements the notion of virtual time
• Can be used to totally order all events
• Assigns timestamp to each event in a way that is consistent with the
happened-before relation:
e → f =⇒ C (e) < C (f )
C (e): timestamp for event e
C (f ): timestamp for event f
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
16 / 62
Abstract Clocks
Implementing Logical Clock
• Each process has a local scalar clock, initialized to zero
• Ci denotes the local clock of process Pi
• Action depends on the type of the event
• Protocol for process Pi :
• On executing an interval event:
Ci := Ci + 1
• On sending a message m:
Ci := Ci + 1
piggyback Ci on m
• On receiving a message m:
let tm be the timestamp piggybacked on m
Ci := max{Ci , tm } + 1
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
17 / 62
Abstract Clocks
Implementing Logical Clock: An Illustration
a
P1
1
P2
P3
1
a: internal event
Neeraj Mittal (UT Dallas)
Theoretical Foundations
C (a) = 1
June 9, 2020
18 / 62
Abstract Clocks
Implementing Logical Clock: An Illustration
a
P1
1
b
P2
1
c
P3
1
b and c: internal events
Neeraj Mittal (UT Dallas)
C (b) = 1 and C (c) = 1
Theoretical Foundations
June 9, 2020
18 / 62
Abstract Clocks
Implementing Logical Clock: An Illustration
P1
a
d
1
2
2
b
P2
1
c
P3
1
d: send event
Neeraj Mittal (UT Dallas)
Theoretical Foundations
C (d) = 2
June 9, 2020
18 / 62
Abstract Clocks
Implementing Logical Clock: An Illustration
P1
a
d
1
2
2
b
e
1
3
P2
c
P3
1
e: receive event
Neeraj Mittal (UT Dallas)
Theoretical Foundations
C (e) = 3
June 9, 2020
18 / 62
Abstract Clocks
Implementing Logical Clock: An Illustration
P1
a
d
1
2
3
2
b
e
1
3
4
P2
4
4
c
P3
1
Neeraj Mittal (UT Dallas)
2
Theoretical Foundations
5
June 9, 2020
18 / 62
Abstract Clocks
Limitation of Logical Clock
• Logical clock cannot be used to determine whether two events are
concurrent
e k f does not imply C (e) = C (f )
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
19 / 62
Abstract Clocks
Vector Clock
• Captures the happened-before relation
• Assigns timestamp to each event such that:
e → f ⇐⇒ C (e) < C (f )
C (e): timestamp for event e
C (f ): timestamp for event f
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
20 / 62
Abstract Clocks
Comparing Two Vectors
• Vectors are compared component-wise:
• Equality:
V = V′
h∀i : V [i] = V ′ [i]i
iff
• Less Than:
V < V′
iff
h∀i : V [i] ≤ V ′ [i]i ∧ h∃i : V [i] < V ′ [i]i
and
2
• Example:
1
2
0
Neeraj Mittal (UT Dallas)
<
2
3
1
1
1
<
2
Theoretical Foundations
3
4
but
1
2
1
6<
2
1
3
June 9, 2020
21 / 62
Abstract Clocks
Implementing Vector Clock
• Each process has a local vector clock
• Ci denotes the local clock of process Pi
• Action depends on the type of the event
• Protocol for process Pi :
• On executing an interval event:
Ci [i] := Ci [i] + 1
• On sending a message m:
Ci [i] := Ci [i] + 1
piggyback Ci on m
• On receiving a message m:
let tm be the timestamp piggybacked on m
∀k Ci [k] := max{Ci [k], tm [k]}
Ci [i] := Ci [i] + 1
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
22 / 62
Abstract Clocks
Implementing Vector Clock: An Illustration
a
P1
h1i
0
0
P2
P3
h0i
0
1
a: internal event
Neeraj Mittal (UT Dallas)
C (a) =
Theoretical Foundations
1
0
0
June 9, 2020
23 / 62
Abstract Clocks
Implementing Vector Clock: An Illustration
a
P1
h1i
0
0
b
P2
h0i
1
0
c
P3
h0i
0
1
b and c: internal events
Neeraj Mittal (UT Dallas)
C (b) =
Theoretical Foundations
1
0
0
and C (c) =
0
0
1
June 9, 2020
23 / 62
Abstract Clocks
Implementing Vector Clock: An Illustration
a
d
h1i
h2i
P1
0
0
0
0
b
h2i
0
0
P2
h0i
1
0
c
P3
h0i
0
1
d: send event
Neeraj Mittal (UT Dallas)
C (d) =
Theoretical Foundations
2
0
0
June 9, 2020
23 / 62
Abstract Clocks
Implementing Vector Clock: An Illustration
a
d
h1i
h2i
P1
0
0
0
0
b
h2i
0
0
e
P2
h2i
h0i
2
0
1
0
c
P3
h0i
0
1
e: receive event
Neeraj Mittal (UT Dallas)
C (e) =
Theoretical Foundations
2
2
0
June 9, 2020
23 / 62
Abstract Clocks
Implementing Vector Clock: An Illustration
a
d
h1i
h2i
P1
0
0
0
0
b
h2i
h3i
h4i
0
0
0
0
e
0
0
P2
h2i
h0i
2
0
1
0
c
h2i
3
0
h2i
3
0
P3
h0i
0
1
Neeraj Mittal (UT Dallas)
h0i
0
2
Theoretical Foundations
h2i
3
3
June 9, 2020
23 / 62
Abstract Clocks
Properties of Vector Clock
• How many comparisons are needed to determine whether an event e
happened-before another event f ?
• As many as N integers may need to be compared in the worst case,
where N is the number of processes
• Can we do better?
• Suppose we know the processes on which events e and f occurred (say,
Pi and Pj respectively)
e→f
if and only if
W
(i = j) ∧ (C (e)[i] < C (f )[i])
(i 6= j) ∧ (C (e)[i] ≤ C (f )[i])
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
24 / 62
Ordering of Messages
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
25 / 62
Ordering of Messages
Ordering of Messages
• For many applications, messages should be delivered in certain order
to be interpreted meaningfully
• Example:
m1: Have you seen the movie “Shrek”?
Bob
m2: Yes I have and I liked it
Alice
Tom
• m2 cannot be interpreted until m1 has been received
• Tom receives m2 before m1 : an undesriable behavior
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
26 / 62
Ordering of Messages
Useful Notations
• For a message m:
•
•
•
•
src(m): source process of m
dst(m): destination process of m
snd(m): send event of m
rcv (m): receive event of m
snd (m)
src (m)
e
Pi
m
Pj
f
dst (m)
Neeraj Mittal (UT Dallas)
rcv (m)
Theoretical Foundations
June 9, 2020
27 / 62
Ordering of Messages
Causal Delivery of Messages
• A message w causally precedes a message m if snd(w ) → snd(m)
• An execution of a distributed system is said to be causally ordered if
the following holds for every message m:
every message that causally precedes m and is destined
for the same process as m is delivered before m
Mathematically, for every message w :
(snd(w ) → snd(m)) ∧ (dst(w ) = dst(m))
=⇒
rcv (w ) → rcv (m)
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
28 / 62
Ordering of Messages
A Causally Ordered Delivery Protocol
• Proposed by Birman, Schiper and Stephenson (BSS)
• Assumption:
• communication is broadcast based: a process sends a message to every
other process
• Each process maintains a vector with one entry for each process:
• let Vi denote the vector for process Pi
• the j th entry of Vi refers to the number of messages that have been
broadcast by process Pj that Pi knows of
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
29 / 62
Ordering of Messages
The BSS Protocol
• Protocol for process Pi :
• On broadcasting a message m:
piggyback Vi on m
Vi [i] := Vi [i] + 1
• On arrival of a message m from process Pj :
let Vm be the vector piggybacked on m
deliver m once Vi ≥ Vm
• On delivery of a message m sent by process Pj :
Vi [j] := Vi [j] + 1
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
30 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h0i
0
0
P3
h0i
0
0
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
31 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h1i
0
0
h0i
h0i
0
0
0
0
b
h0i
0
0
P3
h0i
0
0
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
31 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h0i
0
0
h1i
0
0
h0i
h0i
0
0
0
0
b
h1i
0
0
P3
h0i
0
0
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
31 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h0i
0
0
h1i
0
0
h0i
h0i
0
0
0
0
b
0
0
c
h1i h1i
0
0
h1i
1
0
h1i
0
0
d
P3
h0i
0
0
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
31 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h0i
0
0
h1i
0
0
h0i
h0i
0
0
0
0
b
0
0
c
h1i h1i
0
0
h1i
1
0
h1i
h1i
1
0
0
0
d
P3
h0i
0
0
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
31 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h0i
0
0
h1i
0
0
h0i
h0i
0
0
0
0
b
0
0
c
h1i h1i
0
0
h1i
1
0
h1i
h1i
1
0
0
0
d
f
P3
h0i
0
0
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
31 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h0i
0
0
h1i
0
0
h0i
h0i
0
0
0
0
b
0
0
c
h1i h1i
0
0
h1i
1
0
h1i
h1i
1
0
0
0
d
f
P3
h0i
h1i
0
0
Neeraj Mittal (UT Dallas)
0
0
Theoretical Foundations
June 9, 2020
31 / 62
Ordering of Messages
The BSS Protocol: An Illustration
e
a
P1
h0i
0
0
P2
h0i
0
0
h1i
0
0
h0i
h0i
0
0
0
0
b
c
h1i
0
0
h1i
1
0
h1i
h1i h1i
0
0
0
0
1
0
f
g
P3
h0i
h1i h1i
0
0
Neeraj Mittal (UT Dallas)
0
0
Theoretical Foundations
1
0
June 9, 2020
31 / 62
Ordering of Messages
Another Causally Ordered Delivery Protocol
• Proposed by Raynal, Schiper and Toueg (RST)
• Assumption:
• communication is unicast based: a process sends a message to only one
other process
• Each process Pi maintains a matrix Mi of size N × N
• N is the number of processes in the system
• The entry (j, k) in the matrix Mi captures the following knowledge:
• Suppose Mi [j, k] = x
• Interpretation: As far as process Pi knows, process Pj has sent x
messages to process Pk
• Initially, all entries in all matrices are set to 0
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
32 / 62
Ordering of Messages
The RST Protocol
• Protocol for process Pi :
• On sending message m to process Pj :
piggyback matrix Mi on message m
increment Mi [i, j] by 1
• On receiving message m from process Pj carrying matrix Mm :
buffer the message until Mi [∗, i] ≥ Mm [∗, i]
• On delivery of message m sent by process Pj carrying matrix Mm :
merge Mi and Mm
∀x, y : Mi [x, y ] := max(Mi [x, y ], Mm [x, y ])
increment Mi [j, i] by 1
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
33 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
i
h0
i
h0
i
P1
0 0
0 0 0
0 0 0
P2
0 0
0 0 0
0 0 0
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
m1
h0
0 0
0 0 0
0 0 0
i
h0
i
P2
0 0
0 0 0
0 0 0
P3
P1 sends m1 to P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
i
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
i
h0
i
P2
0 0
0 0 0
0 0 0
P3
i
P1 sends m1 to P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
i
h0
i
P2
0 0
0 0 0
0 0 0
P3
i
P1 sends m2 to P2
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
i
h0
i
P2
0 0
0 0 0
0 0 0
P3
i
h0
i
0 1
0 0 0
0 0 0
i
P1 sends m2 to P2
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
i
h0
i
P2
0 0
0 0 0
0 0 0
P3
i
h0
i
0 1
0 0 0
0 0 0
i
m2 arrives at P2
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
i
h0
i
P2
0 0
0 0 0
0 0 0
P3
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
i
P2 delivers m2
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
P2
i
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
i
m3
h0
0 0
0 0 0
0 0 0
P3
i
P2 sends m3 to P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
P2
i
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
h0
1 1
0 0 1
0 0 0
i
m3
h0
0 0
0 0 0
0 0 0
P3
h0
i
1 1
0 0 0
0 0 0
i
i
P2 sends m3 to P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
P2
i
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
h0
1 1
0 0 1
0 0 0
i
m3
h0
0 0
0 0 0
0 0 0
P3
h0
i
1 1
0 0 0
0 0 0
i
i
m3 arrives at P3 and is buffered
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
P2
i
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
h0
1 1
0 0 1
0 0 0
i
m3
h0
0 0
0 0 0
0 0 0
P3
h0
i
1 1
0 0 0
0 0 0
i
i
m1 arrives at P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
P2
i
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
h0
1 1
0 0 1
0 0 0
i
m3
h0
0 0
0 0 0
0 0 0
P3
h0
i
1 1
0 0 0
0 0 0
i
h0
0 1
0 0 0
0 0 0
i
i
P3 delivers m1
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
P2
i
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
h0
1 1
0 0 1
0 0 0
i
m3
h0
0 0
0 0 0
0 0 0
P3
h0
i
1 1
0 0 0
0 0 0
i
h0
0 1
0 0 0
0 0 0
i
i
h0
1 1
0 0 1
0 0 0
i
P3 delivers m2
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
34 / 62
Ordering of Messages
The RST Protocol: An Illustration
h0
0 0
0 0 0
0 0 0
P1
i
h0
0 1
0 0 0
0 0 0
h0
1 1
0 0 0
0 0 0
i
m2
m1
h0
0 0
0 0 0
0 0 0
h0
0 0
0 0 0
0 0 0
P2
i
i
h0
i
0 1
0 0 0
0 0 0
i
h0
1 1
0 0 0
0 0 0
h0
1 1
0 0 1
0 0 0
i
m3
h0
0 0
0 0 0
0 0 0
P3
h0
1 1
0 0 0
0 0 0
h0
0 1
0 0 0
0 0 0
i
Neeraj Mittal (UT Dallas)
i
Theoretical Foundations
i
i
h0
1 1
0 0 1
0 0 0
June 9, 2020
i
34 / 62
State of a Distributed System
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
35 / 62
State of a Distributed System
State of a Distributed System
• State of a distributed system is a collection of states of all its
processes and channels:
• a process state is given by the values of all variables on the process
• a channel state is given by the set of messages in transit in the channel
• can be determined by examining states of the two processes it connects
• State of a process is called local state or local snapshot
• State of the system is called global state or global snapshot
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
36 / 62
State of a Distributed System
Events and Local States
• A process changes its state by executing an event
• Example:
x =0
P1
x =2
x =3
a
y =1
P2
b
y =2
c
y =3
d
• Process P1 changes its state from x = 0 to x = 2 on executing event a
• Process P2 changes its state from y = 2 to y = 3 on executing event d
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
37 / 62
State of a Distributed System
When is a Global State Meaningful?
• Example:
x =0
P1
p
y =1
P2
s
x =2
a
x =3
q
b
y =2
c
t
r
y =3
d
u
• Is it possible for x to be 0 and y to be 3 at the same time?
• Does {p, u} form a meaningful global state?
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
38 / 62
State of a Distributed System
Revisiting Happened-Before Relation
• Earlier, happened-before relation was defined on events
• The relation can be extended to local states as follows:
• let s be a local state of process Pi
• let t be a local state of process Pj
• s → t if there exist events e and f such that:
1 Pi executed e after s,
2 Pj executed f before t, and
3 e→f
• If s → t, then s and t cannot co-exist at the same time
Also, s must occur before t
• If s k t, then s and t can co-exist simultaneously
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
39 / 62
State of a Distributed System
A Consistent Global State
• For a global state G , let G [i] refer to the local state of process Pi in G
• A global state G is meaningful or consistent if
∀i, j : i 6= j : (G [i] 9 G [j]) ∧ (G [j] 9 G [i])
≡
∀i, j : i 6= j : G [i] k G [j]
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
40 / 62
State of a Distributed System
Recording a Consistent Global Snapshot
• Many algorithms have been proposed
• Focus on two approaches
• Freezing based approach
• Flushing based approach
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
41 / 62
State of a Distributed System
Freezing based Approach
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
42 / 62
State of a Distributed System
Freezing based Approach
Freezing based Approach
• A distinguished process is responsible for collecting the system
snapshot
• Has two non-overlapping phases
• Each phase involves the following
• The snapshot process sends a request message to all processes and
waits to receive a reply from each one of them
• A process on receiving a message from the snapshot process sends a
reply message back possibly containing its local state
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
43 / 62
State of a Distributed System
Freezing based Approach
Phase I: Freeze Phase
• Process is frozen and cannot execute certain events
• Request message is referred to as Freeze message
• Reply message is referred to as Frozen message
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
44 / 62
State of a Distributed System
Freezing based Approach
Phase II: Unfreeze Phase
• Process is unfrozen and resumes normal execution
• Request message is referred to as Unfreeze message
• Reply message is referred to as Unfrozen message
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
45 / 62
State of a Distributed System
Freezing based Approach
Two Phase Snapshot Algorithms
• Multiple options possible
1
Which event to freeze? (send event or receive event)
2
When to collect process states? (freeze phase or unfreeze phase)
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
46 / 62
State of a Distributed System
Freezing based Approach
Two Phase Snapshot Algorithms
• Four possible options
• Two yield correct algorithms (why?)
1
Freeze send events and collect process states in phase I
2
Freeze receive events and collect process states in phase II
• Two yield incorrect algorithms (why?)
3
Freeze send events and collect process states in phase II
4
Freeze receive events and collect process states in phase I
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
47 / 62
State of a Distributed System
Freezing based Approach
Algorithm 1: An Illustration
Snapshot
Process
P1
P2
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
48 / 62
State of a Distributed System
Freezing based Approach
Algorithm 1: An Illustration
Snapshot
Process
P1
P2
P3
The snapshot process sends Freeze message to all processes.
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
48 / 62
State of a Distributed System
Freezing based Approach
Algorithm 1: An Illustration
Snapshot
Process
Phase I
P1
P2
P3
A process, on receiving the Freeze message, freezes its send events,
records its local state and sends Frozen message back to the snapshot
process (along with its state).
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
48 / 62
State of a Distributed System
Freezing based Approach
Algorithm 1: An Illustration
Snapshot
Process
Phase I
P1
P2
P3
After receiving Frozen message from all processes, the snapshot process
sends Unfreeze message to all processes.
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
48 / 62
State of a Distributed System
Freezing based Approach
Algorithm 1: An Illustration
Snapshot
Process
Phase I
Phase II
P1
P2
P3
A process, on receiving the Unfreeze message, unfreezes its send events
and sends Unfrozen message back to the snapshot process.
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
48 / 62
State of a Distributed System
Freezing based Approach
Algorithm 2: An Illustration
Snapshot
Process
P1
P2
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
49 / 62
State of a Distributed System
Freezing based Approach
Algorithm 2: An Illustration
Snapshot
Process
P1
P2
P3
The snapshot process sends Freeze message to all processes.
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
49 / 62
State of a Distributed System
Freezing based Approach
Algorithm 2: An Illustration
Snapshot
Process
Phase I
P1
P2
P3
A process, on receiving the Freeze message, freezes its receive events
and sends Frozen message back to the snapshot process.
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
49 / 62
State of a Distributed System
Freezing based Approach
Algorithm 2: An Illustration
Snapshot
Process
Phase I
P1
P2
P3
After receiving Frozen message from all processes, the snapshot process
sends Unfreeze message to all processes.
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
49 / 62
State of a Distributed System
Freezing based Approach
Algorithm 2: An Illustration
Snapshot
Process
Phase I
Phase II
P1
P2
P3
A process, on receiving the Unfreeze message, records it local state,
unfreezes its receive events, and sends Unfrozen message back to the
snapshot process (along with its local state).
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
49 / 62
State of a Distributed System
Flushing based Approach
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
50 / 62
State of a Distributed System
Flushing based Approach
Flushing based Approach
• Proposed by Chandy and Lamport (CL)
• Assumptions and Features:
• channels satisfy first-in-first-out (FIFO) property
• channels are not required to be bidirectional
• communication topology may not be fully connected
• Messages exchanged by the underlying computation (whose snapshot
is being recorded) are called application messages
• Messages exchanged by the snapshot algorithm are called control
messages
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
51 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol
• Initially: all processes are colored blue
• Eventually: all processes become red
• On changing color from blue to red:
record local snapshot
send a marker message along all outgoing channels
• On receiving marker message along incoming channel C :
if color is blue then
change color from blue to red
endif
record state of channel C as application messages received along C
since turning red
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
52 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol (Continued)
• Any process can initiate the snapshot protocol by spontaneously
changing its color from blue to red
• there can be multiple initiators of the snapshot protocol
• Global Snapshot: local snapshots of processes just after they turn red
• In-Transit Messages: blue application messages received by processes
after they have turned red
• Why is the global snapshot consistent?
• Assume an application message has the same color as its sender
• Can a blue process receive a red application message?
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
53 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration
• Three processes: P1 , P2 and P3
• Five channels: C1,2 , C1,3 , C2,1 , C2,3 and C3,1
P1
C1,3
C1,2
C3,1
C2,1
P3
P2
C2,3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
54 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration (Continued)
P1
m5
m2
m4
m3
P2
m1
P3
1
All processes are blue
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
55 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration (Continued)
P1
m5
m2
m4
m3
P2
m1
P3
1
P1 turns red
2
P1 sends a marker message along C1,2 and C1,3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
55 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration (Continued)
P1
m5
m2
m4
m3
P2
m1
P3
1
P2 receives the marker message along C1,2 and turns red
2
P2 sends a marker message along C2,1 and C2,3
3
P2 records the state of C1,2 as ∅
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
55 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration (Continued)
P1
m5
m2
m4
m3
P2
m1
P3
1
P3 receives the marker message along C1,3 and turns red
2
P3 sends a marker message along C3,1
3
P3 records the state of C1,3 as ∅
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
55 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration (Continued)
P1
m5
m2
m4
m3
P2
m1
P3
1
P1 receives the marker message along C2,1
2
P1 records the state of C2,1 as {m4 , m5 }
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
55 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration (Continued)
P1
m5
m2
m4
m3
P2
m1
P3
1
P3 receives the marker message along C2,3
2
P3 records the state of C2,3 as {m1 }
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
55 / 62
State of a Distributed System
Flushing based Approach
The CL Protocol: An Illustration (Continued)
P1
m5
m2
m4
m3
P2
m1
P3
1
P1 receives the marker message along C3,1
2
P1 records the state of C3,1 as {m2 , m3 }
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
55 / 62
Monitoring a Distributed System
Outline
1
Overview
Distributed Systems
2
Ordering Events
3
Abstract Clocks
4
Ordering of Messages
5
State of a Distributed System
Freezing based Approach
Flushing based Approach
6
Monitoring a Distributed System
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
56 / 62
Monitoring a Distributed System
A Subclass of Distributed Computation
• Many distributed computations obey the following paradigm:
• A process is either in active state or passive state
• A process can send an application message only when it is active
• An active process can become passive at any time
• A passive process, on receiving an application message, becomes active
• Intuitively:
• If a process is active, then it is doing some work
• If process is passive, then it is idle
• An active process uses an application message to send a part of its
work to another process
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
57 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
P2
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
m1
P2
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
m1
P2
1
0
0
1
0
1
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
m1
P2
1
0
0
1
0
1
m2
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
1
0
0
1
m1
P2
1
0
0
1
0
1
m2
P3
Neeraj Mittal (UT Dallas)
11
00
00
11
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
1
0
0
1
m1
P2
1
0
0
1
0
1
m2
m3
P3
Neeraj Mittal (UT Dallas)
11
00
00
11
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
1
0
0
1
m1
m4
P2
1
0
0
1
0
1
m2
m3
P3
Neeraj Mittal (UT Dallas)
11
00
00
11
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
1
0
0
1
11
00
00
11
m1
m4
P2
1
0
0
1
0
1
m2
m3
P3
Neeraj Mittal (UT Dallas)
11
00
00
11
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
1
0
0
1
11
00
00
11
m1
m4
P2
1
0
0
1
0
1
1
0
0
1
m2
m3
P3
Neeraj Mittal (UT Dallas)
11
00
00
11
1
0
0
1
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Distributed Computation: An Illustration
P1
1
0
0
1
1
0
0
1
11
00
00
11
1
0
0
1
m1
m4
P2
1
0
0
1
0
1
1
0
0
1
m2
m3
P3
Neeraj Mittal (UT Dallas)
11
00
00
11
1
0
0
1
Theoretical Foundations
June 9, 2020
58 / 62
Monitoring a Distributed System
Termination Detection
• To detect if the computation has finished doing all the work
• all processes have become passive, and
• all channels have become empty
• Different types of computations:
• diffusing: only one process is active in the beginning
• non-diffusing: any subset of processes can be active in the beginning
• no process knows which processes are active and which processes are
passive
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
59 / 62
Monitoring a Distributed System
Huang’s Protocol
• Assumption: computation is diffusing
• the initially active process is called the coordinator
• coordinator is responsible for detecting termination
• Initially:
• coordinator has a weight of 1
• all other processes have a weight of 0
• Invariants:
• total amount of weight in the system is 1
• a non-coordinator process has a non-zero weight if and only if it is
active
• a channel has a non-zero weight if and only if it is non-empty
• weight of a channel is the sum of the weight of all messages in it
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
60 / 62
Monitoring a Distributed System
Huang’s Protocol (Continued)
• Actions:
• On sending an application message:
• send half of the weight along with the message
• On receiving an application message:
• add the weight of the message to the current weight
• On becoming passive:
• send the current weight to the coordinator
• Coordinator announces termination once:
• it has become passive and
• it has collected all the weight
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
61 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
0
P2
0
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
m1( 21 )
0
P2
0
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
m1( 21 )
0
P2
1
002
11
11
00
00
11
0
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
m1( 21 )
0
P2
1
002
11
11
00
00
11
1
4
m2( 14 )
0
P3
Neeraj Mittal (UT Dallas)
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
00
11
11
00
00
11
m1( 21 )
0
P2
1
002
11
11
00
00
11
1
4
m2( 14 )
0
P3
Neeraj Mittal (UT Dallas)
1
4
11
00
00
11
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
00
11
11
00
00
11
m1( 21 )
0
P2
1
002
11
11
00
00
11
1
4
m2( 14 )
0
P3
Neeraj Mittal (UT Dallas)
1
4
11
00
00
11
m3( 81 )
1
8
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
00
11
11
00
00
11
m1( 21 )
0
P2
1
002
11
11
00
00
11
1
m4( 16
)
1
4
3
8
m2( 14 )
0
P3
Neeraj Mittal (UT Dallas)
1
4
11
00
00
11
m3( 18 )
1
8
1
16
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
9
0016
11
11
00
00
11
00
11
11
00
00
11
m1( 21 )
0
P2
1
002
11
11
00
00
11
1
m4( 16
)
1
4
3
8
m2( 14 )
0
P3
Neeraj Mittal (UT Dallas)
1
4
11
00
00
11
m3( 18 )
1
8
1
16
Theoretical Foundations
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
9
0016
11
11
00
00
11
00
11
11
00
00
11
m1( 21 )
15
16
1
m4( 16
)
3
8
0
P2
1
2
00
11
00
11
00
11
1
4
3
8
11
00
00
11
0
1
16
m2( 14 )
0
P3
Neeraj Mittal (UT Dallas)
1
4
11
00
00
11
m3( 18 )
1
8
1
16 00
11
00
11
Theoretical Foundations
0
June 9, 2020
62 / 62
Monitoring a Distributed System
Huang’s Protocol: An Illustration
1
00
11
00
00
11
P11
1
1
2
9
0016
11
11
00
00
11
00
11
11
00
00
11
m1( 21 )
15
16 0
1
0
1
0
1
1
1
m4( 16
)
3
8
0
P2
1
2
00
11
00
11
00
11
1
4
3
8
11
00
00
11
0
1
16
m2( 14 )
0
P3
Neeraj Mittal (UT Dallas)
1
4
11
00
00
11
m3( 18 )
1
8
1
16 00
11
00
11
Theoretical Foundations
0
June 9, 2020
62 / 62
Download