TDDD07 Real-time Systems Lecture 6: From Hard to Soft RT Communication, overloads & QoS Simin Nadjm-Tehrani Real-time Systems Laboratory Department of Computer and Information Science Linköping university Undergraduate course on Real-time Systems Linköping University 52 pages Autunn 2015 Overview: Next three lectures • From one CPU to networked CPUs • This lecture (and part of next): – Is hard real-time possible with distributed nodes? – Fundamental issues with timing in distributed systems • Next: Hard real-time communication – Guaranteed message delivery within a deadline • Finally: QoS guarantees instead of timing guarantees, moving on to soft RT Undergraduate course on Real-time Systems Linköping University 2 of 52 Autumn 2015 Worst Case Response According to [Tindell & Burns 94]: Message response time = Ji: Jitter (from event to placement in queue)+ wi: Queuing time (response time of first bit)+ Ci: Transmission time for whole message Ri = Ji +wi + Ci – tbit wi = tbit + Bi + Ii Bi + Ii : Blocking and Interference time (as RMS) Undergraduate course on Real-time Systems Linköping University 3 of 52 Autumn 2015 Interference and blocking • Ii: waiting due to higher priority processes, bounded if messages are sent periodically • Bi: waiting due to lower priority messages, only one can start before i • Ji: jitter, has to be assumed bounded (by assumptions on the node scheduling policy!) Undergraduate course on Real-time Systems Linköping University 4 of 52 Autumn 2015 Solving recurrent equations • Blocking is fixed: max Cj of all lower priority messages • wi = Bi + khp(i) ( w + J i k + tbit )/ Tk Ck • w0i = Bi • wn+1i = Bi + khp(i) ( win + Jk + tbit )/ Tk Ck • After fixed point is reached: Bi + wn+1i + Ci Di ? Undergraduate course on Real-time Systems Linköping University 5 of 52 Autumn 2015 Solving recurrent equations • Blocking is fixed: max Cj of all lower priority messages • wi = Bi + khp(i) ( w + J i k + tbit )/ Tk Ck • w0i = Bi • wn+1i = Bi + khp(i) ( win + Jk + tbit )/ Tk Ck • After fixed point is reached: Bi + wn+1i + Ci Di ? Undergraduate course on Real-time Systems Linköping University 6 of 52 Autumn 2015 Example • From [Davis et al. 2007]: – To show how a w-term for each message was computed based on original method from 1994 – Assume Ji= 0 Message priority Period = Ti Deadline TX time = Ci = Di A 1 (high) 2.5ms 2.5ms 1ms B 2 (med) 3.5ms 3.25ms 1ms C 3 (low) 3.25ms 1ms Undergraduate course on Real-time Systems Linköping University 3.5ms 7 of 52 Autumn 2015 Exercise • Compute response times for the three messages. Undergraduate course on Real-time Systems Linköping University 8 of 52 Autumn 2015 The original analysis • From 1994 • … was Optimistic! • Here is a case where analysis shows schedulability but in fact deadlines can be missed! [Davis, Burns, Bril, Lukkien 2007] Undergraduate course on Real-time Systems Linköping University 9 of 52 Autumn 2015 CAN: The original analysis • From 1994 • … was Optimistic! • Here is a case where analysis shows schedulability but in fact deadlines can be missed! [Davis, Burns, Bril, Lukkien 2007] Undergraduate course on Real-time Systems Linköping University 10 of 52 Autumn 2015 Solving recurrent equations • Blocking is fixed: max Cj of all lower priority messages • wi = Bi + khp(i) ( w + J i k + tbit )/ Tk Ck • w0i = Bi • wn+1i = Bi + khp(i) ( win + Jk + tbit )/ Tk Ck • After fixed point is reached: Bi + wn+1i + Ci Di ? Undergraduate course on Real-time Systems Linköping University 11 of 52 Autumn 2015 The correct analysis • Takes account of the fact that different instances of the same message may appear in a busy period and • All such instances should be shown to meet their deadlines! [Reading: Sec. 3.1 & 3.2, Davis et al.07] Undergraduate course on Real-time Systems Linköping University 12 of 52 Autumn 2015 Revised computation • Rm(q)= Jm + wm(q) – qTm + Cm • q=i, w(q) computes busy period for ith instance of message m • To know range of q, i.e. how many instances of message m are relevant, we need to find the longest busy period for message m, denoted tm Undergraduate course on Real-time Systems Linköping University 13 of 52 Autumn 2015 Exercise • Redo the old exercise with the Davis et al. variant of the busy period, where q stands for the qth instance of the same message (q {0,..., Qm -1}) wn+1m (q) = Bm + q.Cm + khp(m) ( wnm + Jk + tbit )/ Tk Ck • Qm = ( tm + Jm )/ Tm Undergraduate course on Real-time Systems Linköping University 14 of 52 Autumn 2015 Example revisited • Now with the new formula where busy period term is according to [Davis et al. 2007]: Message priority period deadline TX time A 1 2.5ms 2.5ms 1ms B 2 3.5ms 3.25ms 1ms C 3 3.5ms 3.25ms 1ms Undergraduate course on Real-time Systems Linköping University 15 of 52 Autumn 2015 Solution • To know how many instances of message m are relevant we need to find the longest busy period for m, denoted tm • t0C = CC= 1 • t1C = t0C/TC CC + t0C/TB CB+ t0C/TA CA = 1+1+1= 3 • t2C = t1C/TC CC + t1C/TB CB+ t1C/TA CA = 1+1+2= 4 • t3C = t2C/TC CC + t2C/TB CB+ t2C/TA CA = 2+2+2= 6 • t4C = t3c/TC CC + t3C/TB CB+ t3C/TA CA = 2+2+3= 7 • t5C = t4C/TC CC + t4C/TB CB+ t4C/TA CA = 2+2+3= 7 • tC = 7 Which means 2 instances of message C are relevant! QC= 2 and q: 0..QC-1 Undergraduate course on Real-time Systems Linköping University 16 of 52 Autumn 2015 Computing the queuing time • w0C (0) = BC+0.CC= 0 • w1C (0) = (w0C(0)+ tbit)/TB CB + (w0C (0)+ tbit) /TA CA = 1+1 =2 • w2C (0) = 1+1= 2 wC (0) = 2 RC (0) = wC (0) – qTC + CC = 3 • w0C (1) = wC (0) + CC = 2+1= 3 • w1C (1) = CC + (w0C (1)+ tbit)/TB CB + (w0C (1)+ tbit)/TA CA = 1+1+2 = 4 • w2C (1) = CC + (w1C (1)+ tbit)/TB CB + (w1C(1)+ tbit)/TA CA = 1+2+2=5 • w3C (1) = CC + (w2C (1)+ tbit)/TB CB + (w2C (1)+ tbit)/TA CA = 1+2+3 = 6 • w4C (1) = CC + (w3C(1) + tbit )/TB CB + (w3C (1)+ tbit) /TA CA = 1+2+3 = 6 wC (1) = 6 RC (1) = wC (1) – qTC+ CC = 3.5 Undergraduate course on Real-time Systems Linköping University 17 of 52 Autumn 2015 Maximum response time RC =max{q:0..QC-1} RC(q) = 3.5 So the second instance is why the set is not schedulable! • Recall deadline for message C= 3.25 • Next instance does not increase Rc! Undergraduate course on Real-time Systems Linköping University 18 of 52 Autumn 2015 CAN Error detection • If a message is corrupted in transmission the Cyclic Redundancy Check (CRC) field will be wrong • The first receiver that notes this sends 000000 • Note that corruption at source and corruption in transit cannot be distinguished • This works as long as the node is not erroneous – Babbling idiot! Undergraduate course on Real-time Systems Linköping University 19 of 52 Autumn 2015 Two approaches • We will look at two well-known methods for bus scheduling – Event triggered (CAN) – Time triggered (TTP) • Used extensively in automotive and aerospace applications respectively Undergraduate course on Real-time Systems Linköping University 20 of 52 Autumn 2015 Time-triggered protocol (TTP) • Origin in research projects in Vienna in early 90´s [Kopetz et al.] Node 1 … Node n-1 Node n • Time division multiple access (TDMA) Undergraduate course on Real-time Systems Linköping University 21 of 52 Autumn 2015 Temporal firewall Host CNI CC Node 1 … Host Node n CC • CNI provides temporally accurate state information (via clock synchronisation) • When the data is temporally not valid, it can no longer be exchanged Undergraduate course on Real-time Systems Linköping University 22 of 52 Autumn 2015 Message scheduling • TDMA round implemented through the MEDL (message description list) – The communication system (collection of CC:s and the bus) reads a message from the CNI of sending node at the apriori known fetch instant and places in the CNI of all other nodes at the apriori known delivery instant, replacing the previous value • No constraints on (local) node CPU scheduling Undergraduate course on Real-time Systems Linköping University 23 of 52 Autumn 2015 Communication protocol • Message Description List (MEDL): allocates a pre-defined slot within which each node can send its (pre-defined) message Node 1 … Node n-1 Node n … A TDMA round Undergraduate course on Real-time Systems Linköping University 24 of 52 Autumn 2015 TTP error detection Host CNI CC BG BG BG BG Host … CC BG BGBG BG BG: Buss Guardian (stops babbling idiots) CRC: for corruption in transit Undergraduate course on Real-time Systems Linköping University 25 of 52 Autumn 2015 • The major success of the TTP bus is due to the possibility of detecting additional faults including arbitrary (Byzantine) faults • We will come back to this later… Undergraduate course on Real-time Systems Linköping University 26 of 52 Autumn 2015 Last decade developments • Formal verification (mathematical proofs) of clock synchronisation and fault tolerance mechanisms in TTP • Some X-by-wire applications opt for TTP solutions due to higher reliability and fault tolerance • New solutions to combine event-triggered and time-triggered messages appear: – Simulating CAN over TTP, or TT-CAN – FlexRay – RT/TT-Ethernet Undergraduate course on Real-time Systems Linköping University 27 of 52 Autumn 2015 Overview: Next three lectures • From one CPU to networked CPUs • This lecture (and part of next): – Is hard real-time possible with distributed nodes? – Fundamental issues with timing in distributed systems • Next: Hard real-time communication – Guaranteed message delivery within a deadline • Finally: QoS guarantees instead of timing guarantees, moving on to soft RT Undergraduate course on Real-time Systems Linköping University 28 of 52 Autumn 2015 From messages to flows When there is overload: • Need to allocate available resources – To some applications/flows (which ones?) • Need to adapt as load mix and resource dynamics changes – Same flow can get different treatments at different nodes Undergraduate course on Real-time Systems Linköping University 29 of 52 Autumn 2015 Overview • Some basic notions: QoS parameters, requirement vs. provision • We focus on allocation, not adaptation • Quality of service in networked (wired) applications – QoS mechanisms at nodes – Network-wide: Intserv, Diffserv Undergraduate course on Real-time Systems Linköping University 30 of 52 Autumn 2015 Which resources? • Application nodes (edge nodes) – CPU – Memory (buffer space) – Power • Links – Bandwidth • Forwarding nodes: buffer space Undergraduate course on Real-time Systems Linköping University 31 of 52 Autumn 2015 What is quality of service? • QoS: Capability to provide resource assurance and service differentiation in a network • Why is it important? – See what stand Netflix is taking on http://www.technologyreview.com/new s/527006/talk-of-an-internet-fast-laneis-already-hurting-some-startups/ Undergraduate course on Real-time Systems Linköping University 32 of 52 Autumn 2015 How do we characterise QoS? • Application level requirements – Image quality (resolution/sharpness), viewing size, voice quality • Enforcement (provision) level indicators – Bandwidth guarantee (measured throughput) – delay – jitter – loss ratio – reliability (lack of erroneous messages and duplications) Undergraduate course on Real-time Systems Linköping University 33 of 52 Autumn 2015 Application categories • Elastic or inelastic – Mail or video conference • Interactive or non-interactive – Voice communication or file transfer • Tolerant or non-tolerant – MPEG video-on-demand or automated control • Adaptive or non-adaptive – Audio/video streaming or electronic trading • Real-time or non-real-time – IP-telephony or A/V on demand (streaming) Undergraduate course on Real-time Systems Linköping University 34 of 52 Autumn 2015 QoS guarantees • Need description of required/provided service – service commitment: e.g. % of dropped packets, average end-to-end delay • In presence of a traffic model – Traffic profile: definition of the flow entitled to the service e.g. by arrival rates, burstiness, packet size,… Undergraduate course on Real-time Systems Linköping University 35 of 52 Autumn 2015 Philosophies • Service differentiation – When there are overloads some connections/packets/applications are preferred to others Which • Fairness to drop? – All should get something • Orthogonal to above: Adaptation – Adaptive ones may adapt to make room for non-adaptive ones Undergraduate course on Real-time Systems Linköping University 36 of 52 Autumn 2015 Overview • Some basic notions: QoS parameters, requirement vs. provision • We focus on allocation, not adaptation • Quality of service in networked (wired) applications – QoS mechanisms at nodes – Network-wide: Intserv, Diffserv Undergraduate course on Real-time Systems Linköping University 37 of 52 Autumn 2015 QoS mechanisms • Admission control – To manage the limited resources in presence of oversubscriptions – Examples: • Policing (does the application ask for the same level of resources that was assumed as a traffic profile?) • Shaping (influencing the rate of packets fed into the network to adapt to current resource picture) • Scheduling • Buffer management Undergraduate course on Real-time Systems Linköping University 38 of 52 Autumn 2015 Leaky bucket • Arrival profile can be described in terms of a pair (r, b) where r is the average bit rate, and b is an indication of burst size Shaping Undergraduate course on Real-time Systems Linköping University 39 of 52 Autumn 2015 Scheduling • Which packet should be forwarded at the network layer (to serve which QoS parameters)? • No QoS: FIFO • Fixed priority scheduling (similar to CAN when selecting from a queue) – With no guarantees on per packet delay, some can starve • Weighted Fair Queuing (WFQ) • Class based queuing Undergraduate course on Real-time Systems Linköping University 40 of 52 Autumn 2015 WFQ rough description • Instead of allocating to all packets from one flow at a time, imagine an approximation to an ideally fair scheduler: one packet from each flow in a given time interval • Allocate the outgoing bandwidth according to a weight for each flow • For flows that are described as a leaky bucket, the max delay per packet is computable Undergraduate course on Real-time Systems Linköping University 41 of 52 Autumn 2015 Class-based link sharing • Hierarchical allocation of the bandwidth according to traffic classes • Each class allocated a max share under a given interval, and the excess shared according to some sharing policy Link User type A User type B 40% 60% Real-time 30% Usenet … ftp … 5% Undergraduate course on Real-time Systems Linköping University … News … … Real-time 42 of 52 Autumn 2015 Buffer management • Scheduling is enough as long as buffers are infinite – In reality buffers (queues) get full during overloads – Shall we drop all the packets arriving after the overload starts? • Buffer management is about determining which stored packets to drop in preference to incoming ones – Can adopt differentiated drop policies Undergraduate course on Real-time Systems Linköping University 43 of 52 Autumn 2015 Overview • Some basic notions: QoS parameters, requirement vs. provision • We focus on allocation, not adaptation • Quality of service in networked (wired) applications – QoS mechanisms at nodes – Network-wide: Intserv, Diffserv Undergraduate course on Real-time Systems Linköping University 44 of 52 Autumn 2015 Across network nodes • IP datagrams delivered with best effort • IntServ was defined to deliver IP packets with differentiated treatment across multiple routers (1994) • Introduced 3 service classes: – BE: Best effort – CL: Controlled Load (acceptable service when no overload) – GS: Guaranteed Service (strict bounds on eto-e delay) Undergraduate course on Real-time Systems Linköping University 45 of 52 Autumn 2015 IntServ • Each router keeps a “soft state” for each flow (a session) currently passing through it – GS: the leaky-bucket-based requirements from a flow induce a max local delay in each router • The soft state is created with a reservation scheme RSVP, and refreshed while the session is in progress Undergraduate course on Real-time Systems Linköping University 46 of 52 Autumn 2015 Sending Tspec & receiving Rspec Destination PATH RESV Source RSVP routers Undergraduate course on Real-time Systems Linköping University 47 of 52 Autumn 2015 QoS specifications • T-spec (traffic specification) – A token bucket specification • • • • • token rate - r bucket size - b peak rate - p maximum packet size - M minimum policed unit - m • R-spec (reservation specification) – Service Rate – R • The bandwidth requirement – Slack Term – S • The delay requirement Undergraduate course on Real-time Systems Linköping University 48 of 52 Autumn 2015 Not deployed successfully! • IntServ has met resistance for several reasons, including: – Dynamic and major changes in reservation – Not all routers RSVP enabled – Set up time can be proportionately long compared to session time – Interactive sessions need to set up a path at both ends Undergraduate course on Real-time Systems Linköping University 49 of 52 Autumn 2015 DiffServ (1998) • Based on resource provisioning (for a given SLA) as opposed to reservation • Applied to traffic aggregates as opposed to single flows • Forwarding treatment as opposed to end-to-end guarantees • Edge routers labelling packets/flows in forwarding to next domain, and accepting only in-profile packets when accepting from other domains Undergraduate course on Real-time Systems Linköping University 50 of 52 Autumn 2015 Service classes • Marked with two bits: – (P) Premium class: intended for preferential treatment to which policing is applied with a small bucket size – (A) Assured class: pass through policing with a bucket size equal to the given burst – Packets with A-bit compete with best effort packets when buffers get full Undergraduate course on Real-time Systems Linköping University 51 of 52 Autumn 2015 Scalability of DiffServ • Admission control is now at edge nodes not every path on a route • No set-up time and per-flow state in each router • At the cost of no end-to-end guarantees Undergraduate course on Real-time Systems Linköping University 52 of 52 Autumn 2015