Distributed System • Definition: Lecture 4: Distributed Real-time Systems ... a system of multiple autonomous processing elements, cooperating in a common purpose or to achieve a common goal. Mikael Asplund Real-Time Systems Laboratory Department of Computer and Information Science Linköpings universitet Sweden Lecture 4: Distributed Real-time Systems Mikael Asplund Lecture 4: Distributed Real-time Systems Mikael Asplund Advantages 2 Challenges • Performance • Transparency • Distribution • Communication • Reliability • Performance • Scalability • Heterogeneity • Sharing of resources • Openness • Communication • Reliability • Security Lecture 4: Distributed Real-time Systems Mikael Asplund 3 Lecture 4: Distributed Real-time Systems Mikael Asplund 4 Definition The second is the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium 133 atom at a temperature of 0 Kelvin. Time and Ordering Lecture 4: Distributed Real-time Systems Mikael Asplund 5 Lecture 4: Distributed Real-time Systems Mikael Asplund 6 Modeling time • A set of instants T that is isomorphic to the set Terminology • A duration is the interval between two of real numbers, that is: instances – a field • An event is an action at an instant of time – ordered – events are thus partially ordered! – Dedekind complete • Alternatives: – discrete time – dense time (rational numbers) • Ignore relativity! Lecture 4: Distributed Real-time Systems Mikael Asplund Lecture 4: Distributed Real-time Systems Mikael Asplund 7 Standards • International atomic time (TAI) 8 Clocks • A device for measuring time – proper time of earths geoid • The granularity of the clock is the smallest – Chronoscopic (no disconuities) duration that can be measured – Starts from January 1st 1958 • A timestamp of an event e is the state of the clock immediately at the instant of the event, • Universal time Coordinated (UTC) denoted clock(event). – Astronomical time – Based on UTC (with leap seconds) Lecture 4: Distributed Real-time Systems Mikael Asplund Lecture 4: Distributed Real-time Systems Mikael Asplund 9 Global vs. local clocks 10 Synchronous vs. asynchronous • A distributed system S is a set of sequential processes – p1, p2, …, pn • System is synchronous: whenever pi makes one step, pj makes n (n ≥ 1) steps, and there is a bound on message delays • System is asynchronous if no such bounds exists Lecture 4: Distributed Real-time Systems Mikael Asplund 11 Lecture 4: Distributed Real-time Systems Mikael Asplund 12 Ordering of events • Events: Happened-before ordering • a → b, if a and b are events at the same process and a happened before b – Local execution steps • a → b, if a is the event of sending a message m and b is the event of the same message being received – Send message – Receive message • Transitive, if a → b and b → c then a → c p0 q0 p1 p2 q1 p3 q2 p4 q3 p5 p0 q4 Lecture 4: Distributed Real-time Systems Mikael Asplund q0 13 p1 p2 q1 p3 q2 p4 q3 p5 q4 Lecture 4: Distributed Real-time Systems Mikael Asplund Lamport's logical clocks Happened-before ordering • Usually called causal order • Let each process keep a counter C • Cannot be realised using physical clocks • For every internal event: increment C – Why? • For each message that is sent: piggyback a • Solution: use logical clocks timestamp t=C – Each process has a logical clock C • For every message that is received: – Requirment: if a → b, then C(a) < C(b) Lecture 4: Distributed Real-time Systems Mikael Asplund 14 Let C = max(t,C) 15 Lecture 4: Distributed Real-time Systems Mikael Asplund 16 Dependability & Distribution • Making systems fault-tolerant typically uses redundancy – Redundancy in space leads to distribution Fault-tolerant Distributed Systems – But distributed systems are not necessarily faulttolerant! Lecture 4: Distributed Real-time Systems Mikael Asplund 17 Lecture 4: Distributed Real-time Systems Mikael Asplund 18 Achieving availability Fault models revisited • Node failures – Crash – Omission • Active replication – Byzantine (arbitrary) – Group membership • Passive replication • Channel failures – Primary – backup – Crash (and potential partitions) • What do I need to implement it? – Message loss – Message ordering – Agreement among replicas – Erroneous/arbitrary messages Lecture 4: Distributed Real-time Systems Mikael Asplund 19 Lecture 4: Distributed Real-time Systems Mikael Asplund 20 A useful broadcast ”Chicken and egg” problem • Reliable broadcast – Agreement: All non-crashed processes agree on • Replication is useful in presence of failures if messages delivered there is a consistent common state among • I.e for any message m, if a correct process replicas delivers m, then every correct process delivers m. – What happens when a replica fails? – Integrity: No spurious messages • To get consistency, processes need to • I.e. no erroneous, duplicated or created messages communicate their state via broadcast – Validity: All messages broadcast by non-crashed processes are delivered • But broadcast algorithms are distributed algorithms that run on every node – also affected by failures… Lecture 4: Distributed Real-time Systems Mikael Asplund 21 How to implement? • The first step is to separate the underlying network Lecture 4: Distributed Real-time Systems Mikael Asplund 22 Common channel assumptions • Communication channel assumptions (transport) and the broadcast mechanism • Distinguish between receipt and delivery of a message – No link failures lead to partition – Send does not duplicate or change messages – Receive does not ”invent” messages Lecture 4: Distributed Real-time Systems Mikael Asplund 23 Lecture 4: Distributed Real-time Systems Mikael Asplund 24 Reliable broadcast Failures • What happens if p fails • Within every process p – Directly after a receipt – While relaying – Execute broadcast(m) of message m by: – Before sending the message • adding sender(m) and a unique ID as a header to the message m – After sending to some, but not all neighbours • send(m) to all neighbours including itself • Prove correctness of algorithm by proving the necessary properties in: – When receive(m): – Validity • if previously not executed deliver(m) then – Integrity • if sender(m) ≠ p then send(m) to all neighbours – Agreement • deliver(m) – Order Lecture 4: Distributed Real-time Systems Mikael Asplund 25 Brake by wire Lecture 4: Distributed Real-time Systems Mikael Asplund 26 The consensus problem • Processes p1,…, pn take part in a decision – Each pi proposes a value vi – All correct processes decide on a common value v that is equal to one of the proposed values • Desired properties – Termination: Every correct process eventually decides – Agreement: No two correct processes decide differently – Validity: If a process decides v then the value v was proposed by some process Lecture 4: Distributed Real-time Systems Mikael Asplund 27 Lecture 4: Distributed Real-time Systems Mikael Asplund 28 Assume synchrony Basic impossibility result • If a node does not respond within time t, it will [Fischer, Lynch and Paterson 1985] not respond at time t+d • Partial synchrony – Bounds exist but are not known • There is no deterministic algorithm solving the consensus problem in an asynchronous distributed system with a single crash failure. • Powerful abstraction: – Unreliable failure detectors • Why? Lecture 4: Distributed Real-time Systems Mikael Asplund 29 Lecture 4: Distributed Real-time Systems Mikael Asplund 30 Total order broadcast Byzantine generals • Total order broadcast – Reliable broadcast + total order property • Total order property – Let m and m’ be any two messages. – Let p be a (correct) process that delivers m without having delivered m’ – Then no (correct) process delivers m’ before m • Total order property by consensus in – A synchronous network, with – Reliable broadcast Lecture 4: Distributed Real-time Systems Mikael Asplund 31 Lecture 4: Distributed Real-time Systems Mikael Asplund Byzantine generals • Theorem: There is an upper bound t for the number of Byzantine 32 Scenario 1 • G and L1 are correct, L2 is faulty failures compared to the size of the network N – N ≥ 3t+1 • Gives a t+1 round algorithm for solving consensus in a synchronous network Lecture 4: Distributed Real-time Systems Mikael Asplund 33 Lecture 4: Distributed Real-time Systems Mikael Asplund Scenario 2 • G and L2 are correct, L1 is faulty Lecture 4: Distributed Real-time Systems Mikael Asplund 34 Scenario 3 • L1 and L2 are correct, G is faulty 35 Lecture 4: Distributed Real-time Systems Mikael Asplund 36 2-round algorithm • … does not work with t=1, N=3! • Seen from L1, scenario 1 and 3 are identical, so if L1 decides 1 in scenario 1 it will decide 1 Clock synchronization in scenario 3 • Similarly for L2, if it decides 0 in scenario 2 it decides 0 in scenario 3 • Lecture Distributed Systems L1 4:and L2Real-time do not agree in scenario 3 ! Mikael Asplund 37 Lecture 4: Distributed Real-time Systems Mikael Asplund Clock drift • Assume the existence of a reference clock r Precision • Offset between clocks c_i and c_j: • Clock drift for clock c: drift= 38 ij offset =∣c i e−c j e∣ c e1 −c e2 r e1 −r e2 • Precision of a group of clocks • Drift rate: P = max offset ij i , j ∈[1,n] =∣drift−1∣ • Bounding precision -> internal synchronization Lecture 4: Distributed Real-time Systems Mikael Asplund 39 Lecture 4: Distributed Real-time Systems Mikael Asplund Accuracy 40 Faults • Excessive drift • The accuracy of a clock is the offset to the reference clock • Clock reading errors • Bounding accuracy -> external synchronization • Byzantine faults (dual-faced clocks) • External synchronization implies internal synchronization Lecture 4: Distributed Real-time Systems Mikael Asplund 41 Lecture 4: Distributed Real-time Systems Mikael Asplund 42 How often Time server • ρ: Drift rate • R: Time between synchronizations mr • P: Precision after synchronization • Choose R so that PR⋅2 P required mt p Lecture 4: Distributed Real-time Systems Mikael Asplund 43 Time server,S Lecture 4: Distributed Real-time Systems Mikael Asplund Network Time Protocol 44 Averaging • Mean of clocks excluding t fastest and t slowest 1 2 3 2 3 3 New clock value Note: Arrows denote synchronization control, numbers denote strata. Lecture 4: Distributed Real-time Systems Mikael Asplund 45 Lecture 4: Distributed Real-time Systems Mikael Asplund 46