TDDD07 Real-time Systems
Lecture 4: Distributed Systems
Simin Nadjm-Tehrani
Real-time Systems Laboratory
Department of Computer and Information Science
Linköping University
Undergraduate course on Real-time Systems, Linköping University, Autumn 2014, 40 pages

Overview: Next three lectures
From one CPU to networked CPUs:
• First, from one CPU to multiple CPUs
  – Allocating VMs on multiple CPUs: Cloud
• Next, fully distributed systems
  – fundamental issues with timing and order of events
• Next, hard real-time communication
  – Guaranteed message delivery within a deadline, bandwidth as a resource
• Finally: QoS guarantees instead of timing guarantees, focus on soft RT

Reading material
• Note specially that from now on we have articles and e-books from LiU e-brary
• These are posted for each lecture, linked from the course web page!

Distributed applications
• Banking systems
• On-line access & electronic services
• Peer-to-Peer networks
• Distributed control
  – Cars, Airplanes
• Sensor and ad hoc networks
  – Buildings, Env. monitoring
• Mobile/Cloud computing

Common in all these?
Distributed model of computing:
• Multiple processes
• Disjoint address spaces
• Inter-process communication
• Collective goal

Reasons for distribution
• Locality
  – Engine control, brake system, gearbox control, airbag,…
  – Clients and servers
• Organisation of functions/code
  – An extension of modularisation, avoiding single points of failure
• Load sharing
  – Web services, search, parallelisation of heavy computations
Though multicore scheduling not part of this course!

This lecture
(1) Can we guarantee scheduling of tasks arriving from distributed nodes over a set of CPUs in the cloud?
(2) What are the fundamental time-related issues in distributed systems?
  – Time, clocks, and ordering of events
  – And why faults cannot be ignored...

Datacentre Scheduling
• Tasks are encapsulated in virtual machines (VM)
• Arrive from different nodes and their resource need varies over time
• Goals of scheduling: Allocate VMs to physical machines (PMs) such that
  – No PM is overloaded so that performance of tasks is not degraded
  – No PM is severely underloaded so that energy is not wasted
• As load changes, decide which VM to migrate!
[Xiao et al. 2013]

Overview
• Skewness: Notion that describes uneven utilisation of resources - minimise skewness over a set of PMs
• To deal with fluctuations it helps to predict the forthcoming load on each PM
• Identify the candidates for overload
  – Hotspot solver
• Identify any PM that runs at lower utilisation than an energy-efficient one
  – Coldspot solver

Adaptation architecture
[Figure: components labelled Predictor, Hotspot solver, Coldspot solver, Migration list and Usher controller; physical machines PM1 … PMk, each labelled with Hypervisor, Prober, LNM and applications App1 … Appn]

Two mechanisms
• The hotspot solver detects if any PM’s sum of resource usage is above the hot threshold
  – It will migrate away some VM from that PM
• The coldspot solver detects if the PMs are on average running at a utilisation below the green computing threshold
  – It will migrate away the VMs from some cold PM to prepare it for shut down

Estimating resource usage
• Predictor: Based on observation O(t) and previous estimate E(t-1) they suggest the Fast Up Slow Down (FUSD) estimation model:
  $$E(t) = \alpha \cdot E(t-1) + (1 - \alpha) \cdot O(t)$$
  where α is a parameter (experimentally) chosen as -0.2 when O(t) is rising and 0.7 when O(t) is falling

Window of observations
• Adding a window W of observations and taking the 90th percentile

Skewness
• The skewness for a server p (here a PM) as a function of the individual resources r_i used within it (here sum of VM utilisations for a resource r_i) and the average resource utilisation r̄
[Equation image missing from the source]
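The FUSD estimator and the skewness measure described above can be sketched in a few lines. The skewness formula is not legible in this export; as far as we can tell, the cited paper (Xiao et al. 2013) defines it for a server p with n resources as

$$\mathrm{skewness}(p) = \sqrt{\sum_{i=1}^{n} \left(\frac{r_i}{\bar{r}} - 1\right)^2}$$

and the code below assumes that definition. The function names `fusd_estimate` and `skewness`, the sample utilisations, and the reading of "rising" as O(t) > E(t-1) are assumptions made for this illustration, not something the slides state.

```python
import math

# Alpha values as quoted on the slide (attributed there to experiments).
ALPHA_UP = -0.2    # used when the observed load is rising
ALPHA_DOWN = 0.7   # used when the observed load is falling


def fusd_estimate(prev_estimate, observation):
    """One FUSD step: E(t) = alpha * E(t-1) + (1 - alpha) * O(t)."""
    # Assumption: "rising" means the observation exceeds the previous estimate.
    alpha = ALPHA_UP if observation > prev_estimate else ALPHA_DOWN
    return alpha * prev_estimate + (1 - alpha) * observation


def skewness(utilisations):
    """Skewness of one PM from its per-resource utilisations r_i (each in 0..1)."""
    mean = sum(utilisations) / len(utilisations)
    if mean == 0:
        return 0.0  # idle PM: treated as perfectly even (a choice made for this sketch)
    return math.sqrt(sum((r / mean - 1) ** 2 for r in utilisations))


if __name__ == "__main__":
    estimate = 0.30
    for observed in [0.30, 0.60, 0.60, 0.20, 0.20]:
        estimate = fusd_estimate(estimate, observed)
        print(f"O(t)={observed:.2f}  E(t)={estimate:.3f}")
    print(skewness([0.5, 0.5, 0.5]))  # CPU, memory, network used evenly
    print(skewness([0.9, 0.1, 0.2]))  # uneven use gives a larger value
```

With a negative α on the way up, the estimate overshoots a rising observation (fast up), while α = 0.7 makes it decay only gradually when the load drops (slow down), which keeps the prediction conservative.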
• A hotspot is a server that has high temperature

Load adaptation algorithm
• Order the PMs, choose a PM with highest temperature first
• Choose one VM on that PM that would reduce PM’s temperature (if migrated away)
• If many such VMs, choose the one that reduces skewness most
• Find a PM that can accept this VM and not become a hot spot
• If many such PMs, choose the one that reduces its skewness most after accepting the VM
• If no such PM can be found proceed to the next VM on the hot PM (and eventually to next PM)

Green computing adaptation
• A PM is a coldspot if the sum of the PM’s utilisations for different resources is below a given cold threshold
• Choose a green computing (GC) threshold as an indication that a PM is underloaded
• The algorithm is invoked when average PM resource utilisation is below the GC threshold
• Start with the PM that has the lowest utilisation for memory below the cold threshold
• Try to migrate all its VMs to another PM that will not be hot/cold afterwards (will stay below a warm threshold) to avoid future hotspots

Does adaptation work?
• Mass scale data as input: web access, online learning and store data traces from China

Summary: Sharing the load
• A first attempt to study load adaptation vs. energy optimisation
• Clearly shows how scheduling is related to load control
• Note that there are no performance guarantees (live migration is expected to somewhat increase response time)
• Check the scalability arguments!

This lecture
(1) Can we guarantee scheduling of tasks arriving from distributed nodes over a set of CPUs in the cloud?
(2) What are the fundamental time-related issues in distributed systems?
  – Time, clocks, and ordering of events
  – And why faults cannot be ignored...

Recall from lecture 1…
Event sequence analysis: From the report on the 2003 electricity blackout in the USA-Canada:
A valuable lesson from the ... blackout is the importance of having time-synchronized system data recorders. The Task Force’s investigators labored over thousands of data items to determine the sequence of events much like putting together small pieces of a very large puzzle. That process would have been significantly faster and easier if there had been wider use of synchronized data recording services.

Other application examples
From Kopetz (1997):
• Consider a nuclear reactor equipped with many sensors that monitor different RT entities (e.g. values of pressures, flows in various pipes). If a pipe ruptures, a number of RT entities will deviate outside their normal operating ranges. When an RT entity enters its alarm region an alarm event is signalled to the operator.

Different views
The (global) system view:
• First, the pressure in the ruptured pipe changes abruptly
• Then the flow changes, causing many other RT entities to react
• These in turn generate their own alarms
The operator view:
• Operator sees a set of (correlated) alarms, called an “alarm shower”
• Operator wants to identify the primary event

Event detection
• The computer system must assist the operator to detect the primary event that triggers the alarm shower

Causality
• If event e occurs after event e’ then e cannot be the cause of e’
• If event e occurs before event e’ then e can be the cause of e’ (but need not be)
• Knowledge of exact temporal order of the events is helpful in identifying the primary event
• Temporal order is necessary but not sufficient for causal order

Time vs. Events

Other contexts: Time matters
Banking and finance:
• The rate of interest is applied to funds
  – In a server at a given point in time
  – to a balance that reflects transactions by several clients prior to that point
• The gain/loss on sales of stocks is dependent on dynamic values of stocks at a given time (the time of sale and purchase)

Safety-critical systems
• Inaccurate local clocks can be a problem if the result of computations at different nodes depends on time
  – Calculation of trajectories: if a missile was at a given position at a time point before a computation, where will it be after the computation?
  – If the brake signal is issued separately in different wheels, will the car stop and when?
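To make the trajectory question concrete, the sketch below estimates how far a position prediction drifts when the node doing the computation reads "now" from an inaccurate clock. It assumes straight-line motion at constant speed and a fixed clock error; the function names and the numbers in the usage note are hypothetical and only meant to show the order of magnitude.

```python
def predict_position(x0_m, v_mps, t0_s, t_now_s):
    """Dead reckoning: position at t_now given position x0 at time t0 (constant velocity)."""
    return x0_m + v_mps * (t_now_s - t0_s)


def position_error(v_mps, clock_error_s):
    """Prediction error caused by a local clock that is off by clock_error_s seconds."""
    true_now_s = 1.0  # one second after the measurement; the difference does not depend on it
    with_true_clock = predict_position(0.0, v_mps, 0.0, true_now_s)
    with_local_clock = predict_position(0.0, v_mps, 0.0, true_now_s + clock_error_s)
    return with_local_clock - with_true_clock


if __name__ == "__main__":
    # Hypothetical numbers: an object at 300 m/s and a clock that is 10 ms fast.
    print(position_error(300.0, 0.010))
```

Under these assumptions the error is simply speed times clock error, so a 10 ms clock error at 300 m/s misplaces the object by about 3 m.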

Brake-by-wire

Questions
• Can we temporally order all events in a distributed system?
  – Only if we can timestamp them with a value from a global (universal) clock
• Can we draw any conclusions if we do not have a global clock?
  – What about a set of local clocks?
  – What if no clocks at all?
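The first question can be illustrated with a small sketch: two events are ordered by the timestamps their nodes would assign, once with perfectly synchronised clocks and once with one local clock running ahead. The `Event` class, the node names, the offsets and the event names (borrowed from the pipe rupture example) are all assumptions made for this illustration; the "true time" field stands for an ideal global clock that a real system cannot observe directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    node: str
    true_time: float  # seconds on an ideal global clock (hypothetical)


def order_by_timestamp(events, clock_offset):
    """Return event names sorted by the timestamp each node's local clock assigns.

    clock_offset maps node -> (local clock reading - global time), in seconds.
    """
    def stamp(event):
        return event.true_time + clock_offset[event.node]

    return [event.name for event in sorted(events, key=stamp)]


events = [
    Event("pressure_drop", "A", 10.000),  # primary event
    Event("flow_alarm", "B", 10.020),     # consequence, 20 ms later
]

# Synchronised clocks: timestamps reproduce the true temporal order.
global_order = order_by_timestamp(events, {"A": 0.0, "B": 0.0})

# Node A's clock runs 50 ms ahead: the consequence appears to come first.
skewed_order = order_by_timestamp(events, {"A": 0.050, "B": 0.0})

if __name__ == "__main__":
    print(global_order)
    print(skewed_order)
```

In the skewed case the clock error (50 ms) is larger than the gap between the events (20 ms), so sorting by local timestamps would point the operator at the wrong primary event.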