Distributed Computer System TDDB47 Real Time Systems Lecture 6: Distributed systems

TDDB47 Real Time Systems
Lecture 6: Distributed systems
Calin Curescu
Real-Time Systems Laboratory
Department of Computer and Information Science
Linköping University, Sweden
The lecture notes are partly based on lecture notes by Simin Nadjm-Tehrani, Jörgen Hansson, Anders Törne.
They also loosely follow Burns and Wellings' book “Real-Time Systems and Programming Languages”. These
lecture notes should only be used for internal teaching purposes at Linköping University.
Distributed Computer System
•
Definition
… a system of multiple autonomous processing elements,
cooperating in a common purpose or to achieve a common goal.
– Excludes computer networks with no common purpose, e.g., the
Internet
– although Internet computers that jointly compute genome
information do qualify
•
Tightly coupled: access to common memory
– Synchronization possible by the use of shared variables
•
Loosely coupled: no common memory
– Synchronization by the use of message passing
•
Homogeneous vs. heterogeneous system
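The tightly versus loosely coupled distinction can be illustrated on a single machine. The following is a minimal sketch, not part of the original slides: threads stand in for processing elements, and the function names `tightly_coupled` and `loosely_coupled` are hypothetical. The first variant synchronises through a shared variable protected by a lock; the second shares no state and communicates only through messages placed in a queue.

```python
import queue
import threading


def tightly_coupled(workers, increments):
    """Workers synchronise through a shared variable guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def work():
        nonlocal counter
        for _ in range(increments):
            with lock:  # mutual exclusion on the shared variable
                counter += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


def loosely_coupled(workers, increments):
    """Workers share no variables; each sends its result as a message."""
    mailbox = queue.Queue()

    def work():
        local_count = 0  # private to this worker
        for _ in range(increments):
            local_count += 1
        mailbox.put(local_count)  # the only interaction is the message

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The collector receives exactly one message per worker.
    return sum(mailbox.get() for _ in range(workers))
```

Both variants compute the same total; the difference lies only in how the workers coordinate.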
Reasons for distribution
•
Exploitation of parallelism
– Improved performance
• Blue Gene/L – 33000 CPUs – 70.72 teraflops
• Heavy duty computation – e.g. weather forecast
•
Exploitation of redundancy
– Increased availability and reliability!
– More faults in the system!
• Banking, communications
•
Dispersion of computing power to locations where it is used
• Engine control, brake system, gearbox control, airbag …
•
Addition or enhancement of processors and communication links
– Scalability, load balancing
• Web server farms
Issues
•
Language support
– Support for partitioning, configuration, allocation, reconfiguration
– Distribution transparency?
• RPC, Real-time CORBA
•
Dependability & Reliability
– Possibility for more reliability
– Deal with partial failures
•
Distributed control algorithms
– Distributed process synchronisation
– Communication system support
•
Scheduling
– Ensure end-to-end deadlines
– Scheduling algorithms that are optimal for single-processor
systems are no longer optimal
Dependability & Distribution
•
Making systems fault-tolerant typically uses redundancy
– Redundancy in space leads to distribution
– But distributed systems are not necessarily fault-tolerant!
Justifying safety
•
Brake-by-wire
– Redundancy: Having distributed sensors and actuators
makes brake control more fault-tolerant
•
Distributed decision
– Has more information
– May impose sub-optimal action with respect to local
decision
– What if one node is acting differently than the others?
•
Local decision
– May take the best action on local conditions
– What if there is a reading error?
Justifying availability
•
Active replication
– Group membership
•
Passive replication
– Primary – backup
Distributed systems & FT
•
Introduce new complications
– No global clock
– Richer failure models
• Node failures
• Communication failures
•
Provide replication and group mechanisms
– Transparency in treatment of faults
– Like N-Version Programming
• Even better
Failure models
•
Node failures
– Crash
– Omission
– Byzantine (arbitrary)
•
Channel failures
– Crash (and potential partitions)
– Message loss
– Erroneous/arbitrary messages
”Chicken and egg” problem
•
Replication is useful in presence of failures if there is a
consistent common state among replicas
– What happens when a replica fails?
• Active ?
• Passive ?
•
To get consistency, processes need to communicate their
state via broadcast
•
But broadcast algorithms are distributed algorithms that run
on every node
– also affected by failures…
A useful broadcast
•
Reliable broadcast
– All non-crashed processes agree on messages delivered
• I.e., for any message m, if a correct process delivers m,
then every correct process delivers m.
• Agreement property
– No spurious messages
• I.e. no erroneous, duplicated or created messages
• Integrity property
– All messages broadcast by non-crashed processes are
delivered
• Validity property
How to implement?
•
The first step is to separate the underlying network
(transport) and the broadcast mechanism
•
Distinguish between receipt and delivery of a message
Common channel assumptions
•
Communication channel assumptions
– No link failure leads to a partition
– Send does not duplicate or change messages
– Receive does not ”invent” messages
Reliable broadcast
•
Within every process p
– Execute broadcast(m) of message m by:
• adding sender(m) and a unique ID as a header to the
message m
• send(m) to all neighbours including itself
– When receive(m):
• if previously not executed deliver(m) then
• if sender(m) ≠ p then send(m) to all neighbours
• deliver(m)
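The reliable broadcast steps (tag the message with sender and unique ID, send to all neighbours, relay on first receipt, then deliver) can be sketched as a small simulation. This is an illustrative sketch under the channel assumptions listed above and with no process failures; the `Network` and `Process` classes, the in-memory queue standing in for the transport layer, and the example topology are all assumptions made for the example.

```python
from collections import deque


class Process:
    def __init__(self, pid, network):
        self.pid = pid
        self.network = network
        self.delivered = []  # message bodies, in delivery order
        self.seen = set()    # (sender, msg_id) headers already delivered
        self.next_id = 0

    def broadcast(self, body):
        # Header: sender(m) plus a unique ID, as in the broadcast step.
        msg = (self.pid, self.next_id, body)
        self.next_id += 1
        self.network.send(self.pid, msg, include_self=True)

    def receive(self, msg):
        sender, msg_id, body = msg
        if (sender, msg_id) in self.seen:
            return  # deliver(m) was already executed: ignore the copy
        self.seen.add((sender, msg_id))
        if sender != self.pid:
            self.network.send(self.pid, msg, include_self=False)  # relay
        self.delivered.append(body)  # deliver(m)


class Network:
    """Hypothetical transport: a FIFO queue of (destination, message)."""

    def __init__(self, links):
        self.links = links  # pid -> list of neighbour pids
        self.processes = {pid: Process(pid, self) for pid in links}
        self.in_transit = deque()

    def send(self, src, msg, include_self):
        targets = list(self.links[src]) + ([src] if include_self else [])
        for dst in targets:
            self.in_transit.append((dst, msg))

    def run(self):
        while self.in_transit:
            dst, msg = self.in_transit.popleft()
            self.processes[dst].receive(msg)


# A line topology 0 - 1 - 2 - 3: process 3 is reached only via relaying.
net = Network({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]})
net.processes[0].broadcast("m")
net.run()
```

Separating `receive` (the transport hands over a copy) from appending to `delivered` (the broadcast layer passes the message up exactly once) mirrors the receipt/delivery distinction made earlier.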
Failures
•
What happens if p fails
– Directly after a receipt
– While relaying
– Before sending the message
– After sending to some, but not all neighbours
•
Prove correctness of algorithm by proving the necessary
properties:
– Validity
– Integrity
– Agreement
– Order
The consensus problem
•
Processes p1,…, pn take part in a decision
– Each pi proposes a value vi
– All correct processes decide on a common value v that
is equal to one of the proposed values
•
Desired properties
– Every correct process eventually decides
• Termination property
– No two correct processes decide differently
• Agreement property
– If a process decides v then the value v was proposed by
some process
• Validity property
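The three consensus properties can be stated as checks on the outcome of a finished run. The sketch below is only an illustration: the function name `check_consensus` and the dictionary-based encoding of proposals and decisions are assumptions. Termination is an "eventually" property, so on a finite run it can only be checked as "every correct process has decided by the end of the run".

```python
def check_consensus(proposals, decisions, correct):
    """Check a finished run against the consensus properties.

    proposals: dict pid -> proposed value
    decisions: dict pid -> decided value (pid absent if it never decided)
    correct:   set of pids that did not fail during the run
    """
    # Termination: every correct process has decided.
    termination = all(pid in decisions for pid in correct)

    # Agreement: no two correct processes decided differently.
    decided_by_correct = {decisions[pid] for pid in correct if pid in decisions}
    agreement = len(decided_by_correct) <= 1

    # Validity: every decided value was proposed by some process.
    proposed = set(proposals.values())
    validity = all(value in proposed for value in decisions.values())

    return {
        "termination": termination,
        "agreement": agreement,
        "validity": validity,
    }
```

For example, with proposals `{1: "a", 2: "b", 3: "a"}`, process 3 crashed, and processes 1 and 2 both deciding `"a"`, all three checks hold.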
Basic impossibility result
[Fischer, Lynch and Paterson 1985]
• There is no deterministic algorithm solving the
consensus problem in an asynchronous
distributed system with a single crash failure.
Assume Synchrony:
•
Distributed computations proceed in rounds initiated by
pulses
•
Pulses implemented using local physical clocks,
synchronised assuming bounded message delays
• Why?
Byzantine generals
•
A difficult problem solved in 1980 by Pease, Shostak and
Lamport
•
Consensus in the presence of arbitrary (node) failures
– Each process may fail in an arbitrary way (may be
malicious)
•
Theorem: Consensus is achievable only if the number of
Byzantine failures t is bounded relative to the size of the network N
– N ≥ 3t+1
•
Gives a t+1 round algorithm for solving consensus in a
synchronous network
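The bound N ≥ 3t+1 can be turned around to ask how many Byzantine failures a network of a given size tolerates, and how many rounds the algorithm then needs. A minimal sketch; the helper names are hypothetical.

```python
def max_byzantine_faults(n):
    """Largest t that satisfies n >= 3t + 1."""
    return (n - 1) // 3


def rounds_needed(t):
    """Rounds used by the t+1 round synchronous algorithm."""
    return t + 1


for n in (3, 4, 7, 10):
    t = max_byzantine_faults(n)
    print(f"N={n}: tolerates t={t}, needs {rounds_needed(t)} round(s)")
```

With three processes the tolerated number of Byzantine failures is zero, so at least four processes are needed to survive a single arbitrary failure.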
Scenario 1
•
G and L1 are correct, L2 is faulty
Scenario 2
•
G and L2 are correct, L1 is faulty
Scenario 3
•
L1 and L2 are correct, G is faulty
2-round algorithm
•
… does not work with t=1, N=3!
•
Seen from L1, scenarios 1 and 3 are identical, so if L1
decides 1 in scenario 1 it will decide 1 in scenario 3
•
Similarly for L2, if it decides 0 in scenario 2 it decides 0 in
scenario 3
•
L1 and L2 do not agree in scenario 3 !
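The indistinguishability argument can be checked by brute force. The sketch below rests on assumptions, since the scenario figures are not reproduced here: in round 1 the general G sends a value to each lieutenant, in round 2 each lieutenant tells the other what it claims G sent, and a lieutenant's view is the pair (value from G, value relayed by the peer). It also assumes the usual requirement that correct lieutenants follow a correct general. The code enumerates every deterministic decision rule for L1 and L2 and finds none that works in all three scenarios.

```python
from itertools import product


def views(g_to_l1, g_to_l2, l1_relays, l2_relays):
    """View of each lieutenant after two rounds: (from G, from peer)."""
    return {"L1": (g_to_l1, l2_relays), "L2": (g_to_l2, l1_relays)}


# Scenario 1: G correct (orders 1); faulty L2 claims G said 0.
scenario1 = views(1, 1, l1_relays=1, l2_relays=0)
# Scenario 2: G correct (orders 0); faulty L1 claims G said 1.
scenario2 = views(0, 0, l1_relays=1, l2_relays=0)
# Scenario 3: faulty G sends 1 to L1 and 0 to L2; both relay honestly.
scenario3 = views(1, 0, l1_relays=1, l2_relays=0)

ALL_VIEWS = list(product((0, 1), repeat=2))
# Every deterministic rule: a table mapping each possible view to a decision.
ALL_RULES = [dict(zip(ALL_VIEWS, outs))
             for outs in product((0, 1), repeat=len(ALL_VIEWS))]


def works(rule_l1, rule_l2):
    follows_g_in_1 = rule_l1[scenario1["L1"]] == 1  # correct G ordered 1
    follows_g_in_2 = rule_l2[scenario2["L2"]] == 0  # correct G ordered 0
    agree_in_3 = rule_l1[scenario3["L1"]] == rule_l2[scenario3["L2"]]
    return follows_g_in_1 and follows_g_in_2 and agree_in_3


surviving = [(r1, r2) for r1 in ALL_RULES for r2 in ALL_RULES if works(r1, r2)]
```

Because L1 has the same view in scenarios 1 and 3, and L2 has the same view in scenarios 2 and 3, no pair of rules passes all three checks and `surviving` ends up empty.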
Distributed Scheduling
•
Characteristics of a synchronous distributed system
– Upper bound on communication delays
– Local clocks available and drift is bounded
– Each node makes progress at a minimum rate
•
Dynamic processor allocation
– Anomalies: Response time might increase if
• WCET is decreased;
• priority is increased; or
• number of nodes is increased.
Allocation problem
•
P1, P2: WCET=25, Period=50, P3: WCET=80, Period=100
•
P1 -> CPU1; P2-> CPU2; P3 -> CPU1 or CPU2
– Not feasible
•
P1 & P2 -> CPU1; P3 -> CPU2
– Feasible schedule
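The two allocations can be compared by computing the utilisation of each CPU. This is a sketch under stated assumptions: deadlines equal periods, and per-CPU utilisation of at most 1 is used as the feasibility test (necessary in general, and sufficient for preemptive EDF with deadlines equal to periods). The names `TASKS`, `utilisation` and `feasible` are illustrative.

```python
from fractions import Fraction

# (WCET, period) taken from the allocation example.
TASKS = {"P1": (25, 50), "P2": (25, 50), "P3": (80, 100)}


def utilisation(names):
    """Sum of WCET/period over the processes placed on one CPU."""
    return sum((Fraction(*TASKS[name]) for name in names), Fraction(0))


def feasible(allocation):
    """allocation: one list of process names per CPU."""
    return all(utilisation(cpu) <= 1 for cpu in allocation)


spread = [["P1", "P3"], ["P2"]]  # P1 -> CPU1, P2 -> CPU2, P3 -> CPU1
packed = [["P1", "P2"], ["P3"]]  # P1 & P2 -> CPU1, P3 -> CPU2

for name, alloc in (("spread", spread), ("packed", packed)):
    loads = [float(utilisation(cpu)) for cpu in alloc]
    print(name, loads, "feasible" if feasible(alloc) else "not feasible")
```

Placing P3 next to either P1 or P2 gives that CPU a utilisation of 0.5 + 0.8 = 1.3, whereas putting P1 and P2 together gives exactly 1.0 and leaves P3 alone at 0.8.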
Allocation
•
Static allocation of processes is more reliable than dynamic
reallocation of processes
– Low utilisation
– Schedulability analysis for each processor
•
Remote blocking
– Difficult problem
– Replicate data to other nodes in order to ensure local
access
•
Practical approach:
– Static allocation for safety-critical (periodic and
sporadic) processes; let aperiodic processes migrate.
Bidding Algorithm
•
Aperiodic processes arrive at some node
– Admission is performed to see if process can be
guaranteed locally
• If yes, then admit process
• If no, initiate bidding
•
Bidding: request information about current state
(processing surplus) at other nodes.
– Migrate process to the node with highest surplus
– Admission of process is performed at the new node
• If no, initiate bidding... (if the deadline is still feasible)
•
Improvement to bidding: focused addressing
Scheduling of Communication
•
Different problem than CPU scheduling
– Non-preemptive by nature
– Distributed protocol is required
– Additional deadlines at buffers of communication nodes
•
TDMA - Time Division Multiple Access
– Time Triggered Architecture (TTA)
•
Priority-based CAN
– Event based
•
Token Passing
• Send is only allowed if node is holding a token
• Token hold time is bounded
•
In principle any communication system offering bounded
message-passing delays should do
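What makes TDMA and token passing attractive for real-time communication is that the wait before a node may send is bounded. The sketch below illustrates this under simplifying assumptions: a fixed cyclic slot table with equal slot lengths for TDMA, and negligible token-passing overhead for the token bound. The function names and the slot table are hypothetical.

```python
def tdma_next_slot(schedule, slot_len, node, now):
    """Start time of the next slot owned by `node` at or after `now`.

    schedule: list of node names, one per slot, repeated cyclically.
    """
    if node not in schedule:
        raise ValueError(f"{node} owns no slot")
    cycle = len(schedule) * slot_len
    cycle_start = (now // cycle) * cycle
    # Looking one full cycle beyond the current one is always enough.
    for k in range(2 * len(schedule)):
        start = cycle_start + k * slot_len
        if start >= now and schedule[k % len(schedule)] == node:
            return start
    raise AssertionError("unreachable: node is in the schedule")


def token_worst_case_wait(n_nodes, hold_time):
    """Upper bound on the wait for the token: every other node holds it
    for the full bounded hold time (token-passing overhead ignored)."""
    return (n_nodes - 1) * hold_time


schedule = ["A", "B", "C"]  # pre-defined slot owners
print(tdma_next_slot(schedule, 10, "A", 1))
print(token_worst_case_wait(4, 5))
```

In the TDMA case a node that has just missed the start of its slot waits almost one full cycle, so the cycle length bounds the access delay.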
TTP
•
The time triggered architecture [Kopetz et al.]
•
TDMA
– Allocates pre-defined slots within which pre-defined
nodes can send their pre-defined messages
– Periodical architecture
•
Temporal firewall
•
Replication & failure detection
Reading
•
Chapter 14 of Burns & Wellings
•
Chapter 8 of Hermann Kopetz's book, Real-Time Systems:
Design Principles for Distributed Embedded Applications
•
Article by Ramamritham, Stankovic, and Zhao, IEEE
Transactions on Computers, Volume 38(8), August 1989