QoS and QoE - Dspcsp.com

QoS Presented by: Yaakov (J) Stein CTO The Access Company QoS 1 What am I going to talk about today ? • The telecommunications service model – SLAs • QoE and QoS • Soft QoS (DiffServ) – – – – – – packet marking (PCP, DSCP) PHBs – BE, EF, AF queuing mechanisms (strict priority, WFQ) specifying datarate, bucketing algorithms (leaky, token) traffic policing traffic shaping • Hard QoS (IntServ) – – – – service levels – BE, CLS, GS Network Engineering (planning) vs. Traffic Engineering (resource reservation) RSVP Routing protocols and RSVP-TE QoS 2 the telecommunications service model QoS 3 Why do we pay for services ? Generally good (and frequently much better than toll quality) voice service is available free of charge (Skype, Fring, Nimbuzz, …) So why does anyone pay for voice services ? Similarly, one can get free • • • • • (WiFi) Internet access email boxes file storage and sharing web hosting software services So why pay for any service ? QoS 4 Paying for QoS The simple answer is that one doesn’t pay for the service one pays for Quality of Service guarantees In our voice model price toll quality with mobility QoS BE But what does QoS mean and why are we willing to pay for it ? To explain, we need to review some history … QoS 5 Father of the telephone Everyone knows that the father of the telephone was Alexander Graham Bell (along with his assistant Mr. Watson) But Bell did not invent the telephone network Bell and Watson sold pairs of phones to customers The father of the telephone network was Theodore Vail QoS 6 Father of the telephone network Theodore Who? • • • • Cousin of Alfred Vail (Morse’s coworker) Ex-General Superintendent of US Railway Mail Service First general manager of Bell Telephone Father of the PSTN Organized telephony as a service (like the postal service!) * Why else is he important? • • • • Established principle of reinvestment in R&D Established Bell Telephones IPR division Executed merger with Western Union to form AT&T Solved major technological problems • use of copper wire • use of twisted pairs * Vailism is the philosophy that public services should be run as closed centralized monopolies for the public good QoS 7 What’s the difference ? In the Bell-Watson model the customer pays once, but is responsible for • installation – wires – wiring • operations + – power – fault repair – performance (distortion and noise) • infrastructure maintenance while the Bell company is responsible only for providing functioning telephones In the Vail model the customer pays a monthly fee but the provider assumes responsibility for everything including fault repair and performance maintenance The telephone company owns the telephone sets and even the wires in the walls ! QoS 8 Service Level Agreements In order to justify recurring payments the provider agrees to a minimum level of service in an SLA An SLA is a legal commitment between a service provider (SP) and a customer, for example: • Telco and subscriber • ISP and Internet user • VPN operator and enterprise • cloud application provider and cloud user SLAs typically include (financial) penalties for breaches If objectives or penalties are too low, SLA is useless If objectives or penalties are too high, cost will be prohibitive Badly defined SLAs may damage operations by setting incorrect goals QoS 9 SLAs and QoS parameters SLAs should capture Quality of user Experience (QoE) but this is often hard to quantify So SLAs usually actually detail measurable network parameters that influence QoE, such as : Connectivity parameters • • availability (e.g., the famous five nines) time to repair (e.g., the famous 50 ms) Noise (error) level parameters • • • • SNR BER Packet Loss Ratio defect densities Information rate parameters • performance parameters bandwidth, throughput, goodput Information latency parameters • • 1-way delay, round trip delay QoS 10 Connectivity vs. the rest Basic connectivity (availability) always influences QoE The other parameters may influence QoE depending on service/ application (voice, video, browsing, …) Some services only require basic connectivity Some also require minimum available throughput Some require delay less then some end-end (or RT) delay Some require packet loss ratio (PLR) less than some percentage Note: these parameters are not necessarily independent For example, TCP throughput drops with PLR 1000 B packets 50 ms RTT QoS 11 Some rules of thumb Mission Critical (and life critical) services require • high availability If there are any MC services then system traffic requires high availability too MC services do not necessarily require strict throughput but always indirectly require • a certain minimal average throughput • bounded delay If the MC service uses TCP then it requires • low PLR Real-time services require • sufficient throughput but not necessarily low PLR (audio and video codecs have PLC) Interactive services require • low RT delay QoS 12 QoS monitoring RECAP: SLA compliance is the SP’s justification for payment To ensure SLA compliance, the SP must : • monitor the SLA parameters • take action if parameter is dropping below compliance levels But how does the SP verify/ensure that the SLA is being met ? Monitoring is carried out using Operations, Administration, Maintenance (OAM) The customer too may use OAM to check that the SP is compliant ! Technical note: OAM is a user-plane function but may influence control and management plane operations for example • OAM may trigger protection switching, but doesn’t switch • OAM may detect provisioned links, but doesn’t provision them QoS 13 OAM – FM and PM The difference between connectivity and performance parameters leads to two types of OAM : 1. Fault Monitoring required for maintenance of connectivity (availability) – – – detection and reporting of anomalies, defects, and failures OAM runs continuously/periodically at required rate used to trigger mechanisms in the • control plane (e.g. protection switching) and • management plane (alarms) 2. Performance Monitoring required for maintenance of all other QoS parameters – – measurement of performance criteria (delay, PDV, etc.) OAM run : • before enabling a service • on-demand or • per schedule QoS 14 QoS assurance : availability The difference between connectivity and performance parameters leads to 2 types of QoS assurance – availability and performance Availability is usually specified in “nines” nines up % permitted down time 3 nines 99.9% < 7 hour 18 min / month 4 nines 99.99% < 44 minutes / month 5 nines 99.999% < 4 min 23 sec / month 6 nines 99.9999% < 26 sec / month typical service electric power service PSTN In order to ensure high availability, one employs • FM OAM • Automatic Protection Switching (APS) QoS 15 QoS assurance : performance There are two main approaches to ensuring performance QoS IntServ (guaranteed QoS) – hard QoS – – – – define traffic flows (CO approach) guarantee QoS attributes for each flow reserve resources at each router along the flow signaling protocol (e.g., RSVP) needed DiffServ (statistical QoS) – soft QoS – – – – retain CL paradigm no guaranteed QoS attributes no resource reservation mark packets (differentiated – e.g., gold, silver, bronze) • marking can be by VLAN, P-bits, IP-ToS/DSCP, or general “flow” – offer special treatment (priority) relative to other packets DiffServ is the preferred approach for Ethernet and IP IntServ is used in MPLS-TE QoS 16 QoE QoS 17 QoE and MOS ITU-T defines QoE as the acceptability of a service as perceived subjectively by the end-user A well-known QoE measure for telephony-grade voice is Mean Opinion Score (MOS) (ITU-T P.800) MOS is measured by having a number of listeners listen and score speech on a scale from 1 (bad) to 5 (excellent) and averaging over these scores (finding the mean) • Toll quality voice has MOS = 4 • Cellphone voice has MOS  3.5 • Synthetic or military voice has MOS = 2 and below QoS 18 QoE and QoS Theory - QoE for a given application is a function of QoS parameters QoE = f (service; QoS1, QoS2, … QoSn) Researchers have found various functional forms for the dependence of QoE on a particular QoS parameter form expression examples Linear QoE  QoSk perceived download time vs. PLR Logarithmic QoE  log(QoSk) perceived download time vs. datarate Exponential QoE  exp(QoSk) VoIP MOS vs. PLR Power Law QoE  QoSkp perceived streaming video quality vs. PDV see e.g., work of Markus Fiedler (BTH, Sweden) QoS 19 Absolute vs. Comparative QoE QoE measures may be absolute determined by observing the degraded message or comparative determined by comparing the degraded message to the original Comparative measures are often more accurate but can not be used unintrusively on a live network scenarios Absolute measures can be used single-ended (non-intrusively) MOS variations Absolute Category Rating (ACR) : listeners hear only the degraded speech Degradation Category Rating (DCR) : listeners hear first the original and then the degraded speech and score 1 = very annoying degradation … 5 inaudible degradation Comparative Category Rating (CCR) : listeners hear the original and the degraded speech in random order and score -3 2nd is much worse than 1st ... 3 2nd is much better than 1st even simpler : AB test – simply report which sounds better QoS 20 Subjective vs. Objective QoE Direct human QoE scoring is expensive and time-consuming ITU-T has defined objective measures that can be automated These entail algorithms that produce scores that correlate well with human QoE PSQM (ITU-T P.861) and PESQ (ITU-T P.862) are objective comparative MOS-like measures for telephone grade speech They model the human auditory perception system (Bark scale, masking, etc.) PEAQ (ITU-R BS-1387) similarly scores wideband audio These were selected in competitions to have highest correlation with human MOS ITU-T P.563 is a single-ended (absolute) objective MOS-like score It determines un-naturalness of telephone-grade speech sounds and the amount of non-speech-like noise QoS 21 PSQM processing x [n] y [n] Hanning window Hanning window xwi[n] ywi[n] FFT FFT Xi[k] Yi[k] Pxi[k] Pyi[k] frequency warping Px'i[j] frequency warping calculate local scaling factor Py'i[j] Py"i[j] Si filter with receiving characteristics of handset PFxi[j] PHxi[j] filter with receiving characteristics of handset PFyi[j] Hoth noise PHyi[j] intensity warping Lxi[j] intensity warping calculate loudness scaling factor Lyi[j] S1i Cognitive subtraction Ly'i[j] T1207950-96 Ni[j] asymmetry processing Ni silent interval weighting Nwsil QoS 22 E-model The E-model defined in ITU-T G.107 is a planning tool It predicts a “mouth-to-ear” transmission rating factor R • between 0 and 100 • higher values signify better voice quality • should be uniquely convertible to a MOS level R = f(QoS1, … QoSn) and is additive in individual QoSk degradations R starts with the basic signal to noise ratio R is reduced to account for various impairments, including • simultaneous impairments (loudness, sidetone, clipping, quantization noise) • delay impairments (delay, echo delay and loudness) • equipment impairments (codec distortion, packet loss) R is increased when there are additional advantages such as mobility (cellphone receives A=10) R = R0 – Is – Id – Ie + A QoS 23 R value meanings R values meaning Equivalent MOS 90 - 100 Very satisfied 4.3-5.0 80- 90 Satisfied 4.0-4.3 70-80 Some users dissatisfied 3.6-4.0 60-70 Many users dissatisfied 3.1-3.6 50-60 Nearly all users dissatisfied 2.6-3.1 Below 50 Not recommended 1-2.6 QoS 24 VQMON Years before P.563 ETSI specified VQmon TIPHON (Telecommunications and Internet Protocol Harmonization Over Networks) TS 101 329-5 Annex E VQmon (developed by Telchemy) is a single-ended method for estimating the E-model factors for VoIP audio based on QoS parameters (packet loss statistics, delay) Depends on codec type Takes human perception phenomena into account (e.g., recency effect) VQmon was later extended to • audio (MOS-A) • video (MOS-V) • audio-video (MOS-AV) QoS 25 Video quality ITU-R produced BT.500 for subjective assessment of TV quality Similar to MOS : • television sequences are shown to a group of viewers • subjective opinions are averaged ITU-T has produced many Recommendations for video and multimedia quality : • Subjective (P.9xx, J.140) • Objective (J.143, J.144, J.147, J.148, J.24x, J.34x) Since 1997 the Video Quality Experts Group (VQEG) has been producing standards and tutorials PEVQ (J.247) is a comparative pixel by pixel objective measure that models the human visual tract and returns a 5-point MOS score and further KPIs QoS 26 QoE for other applications G.1011 is a reference guide to existing standards for QoE and provides a taxonomy G.1010 discusses many applications, including • conversational voice, voice messaging, streaming audio • videophone, one-way video • web-browsing, bulk data transfer, email, e-commerce, • interactive games • SMS, instant messaging and gives performance targets for delay, PDV, and PLR G.1050 gives an IP network model for evaluating the performance of IP streams based on QoS parameters (delay, PDV, PLR). J.163 treats real-time services over cable modems X.140 defines QoS parameters for public data networks QoS 27 Network planning tools In addition to subjective/objective methods to quantify the QoE of a specific (live or simulated) service instance Network planners need tools to predict service quality in order to efficiently allocate resources G.1030 provides network planners with end-to-end (E-model-like) tools for applications over IP networks It includes an appendix devoted to web browsing that presents empirical perception of users to response times and proposes a MOS measure G.1070 proposes an algorithm for network planners to estimate videophone quality QoS 28 Useful background documents ISO 8402 defines quality as : the totality of features and characteristics of a product or service that bears its ability to satisfy stated or implied needs E.800 Definitions of terms related to QoS G.1000 Communications quality of service: A framework and definitions ETSI ETR 003 General aspects of QoS and Network Performance (NP) * Note – terminology in these documents is outdated QoS 29 TR-126 The Broadband Forum (BBF) has produced TR-126, which includes • a tutorial on QoE • guidelines for QoE vs. QoS for triple play applications TR-126 also discusses : • QoE dimensions: service set-up, operation, and tear-down • QoE facets: user effort, application responsiveness • information fidelity, security, and dependability/availability; • localization of QoE contributions (access, ISP, application SPs) Guidelines are given for : • video (conferencing, surveillance, streaming) • voice (wired, wireless, voice messaging, IVR) • best-effort Internet data (browsing, email, file transfer, VPN, P2P, ecommerce, …) QoS 30 TMF The TeleManagement Forum (TMF) discusses QoE SLA management TMF’s Wireless Services Measurement Handbook GB923 defines : • Key Quality Indicators (KQIs) (like QoE scores) • Key Performance Indicators (KPIs) KQIs may be determined from KPIs (the mapping may be complex) KPIs are derived from QoS parameters TMF has defined a set of KQIs including : • response time • service availability • speech/video quality • transaction rate • offered throughput An SLA consists of a set of KQI and KPI thresholds (see SLA Management Handbook GB917 and its Application Notes) QoS 31 Apdex The Apdex Alliance is a consortium of companies functions as a IEEE-ISTO (Industry Standards and Technology Organization) program Apdex develops open standardized methods to • report • benchmark and • track application performance. The Apdex (Application Performance Index) • is a number between 0 and 1 • is meant to capture user satisfaction from an application • 0 means no user would be satisfied • 1 means that all users would be satisfied QoS 32 Apdex (cont.) To compute the Apdex N users are divided into 3 categories • satisfied (S users) e.g., web page completely loads within 2 seconds • tolerating (T users) e.g., web page completely loads within 8 seconds • frustrated (F users) e.g., web page takes > 8 seconds to load The Apdex is given by Apdex = ( S + T/2 ) / N Apdex hierarchically deconstructs application transactions into sessions processes tasks turns protocols packets Sessions consist of the entire connect time Processes are interactions accomplishing a goal Tasks are individual interactions The user is mainly aware of the task response time since must wait for the task to complete before proceeding! QoS 33 Behavioral QoE All of the above subjective and objective QoE measures are service/application-specific. But new services and applications are created every day and different users use different features of a single application So it is no longer feasible to study each application in depth A new approach is behavioral QoE estimation the user’s satisfaction is estimated based on actions / reactions Example : there is a high measured correlation between a user being unsatisfied with a service level his aborting the application (or at least waiting until the service level improves) Behavioral QoE can be used instead of traditional QoE measurement or to automatically find QoE(new app, QoS) QoS 34 soft QoS QoS 35 Queuing theory Why isn’t the traffic distribution problem simple ? If we have available data rate A, simply allow traffic from all sources that sums to  A Wrong ! Even if the average rates sum to much below A there may be peak rates that exceed A In order to accommodate peaks, we insert a queue Customers/packets/whatever wait in queue to be serviced arrivals queue S Queue behavior can be counter-intuitive Problem 1 : Problem 2: two buses arrive at my bus stop every 10 minutes I take the first bus that arrives Why do I take bus A much more than bus B ? (hint : correlation) buses leave their first stop every 10 minutes and pick up passengers at intermediate stops Why do the buses bunch ? QoS 36 Scheduling disciplines When there is a single queue, service order is usually First In First Out (FIFO) AKA First Come First Served but may be Last In First Out (LIFO/stack), random order, etc. When there are multiple queues packets belonging to a particular flow are consistently mapped to a single queue there may be one or more flows in each queue Service order may be : • Round Robin : each queue visited in order • Strict Priority : take from non-empty queue of highest priority • Fair Queuing : preserve average datarate from each queue • Weighted Fair Queuing : fair queuing with priority • hybrid : mixture of several disciplines QoS 37 Erlang In early 1900s telcos needed to calculate the number of switches, lines, and operators they needed per call volume • too little and subscribers would be unhappy • too much wastes labor and money Same problem for customers in stores, cars at traffic lights, manufacturing processes, call centers, etc. Agner Krarup Erlang developed queuing theory to solve this problem for the Copenhagen telephone exchange The unit of traffic use is called the Erlang in his honor 1 Erlang is 1 channel being used 100%, or 2 channels used 50% when averaged over some time (generally an hour) There is also a functional programming language (used by Ericsson for telephony applications) named after him QoS 38 Kendall notation Let’s call A – the statistics of arrival times (customers / packets / whatever) B – the statistics of service times C – the number of servers (there may be only 1 server) K – the maximum queue length (if too many arrivals, need to drop) Then we can describe a queuing system by A/B/C/K and if K= then we call it A/B/C Important statistics types: D Deterministic distribution (often fixed intervals) M Markov process (Poisson (exponential) distribution) E(k) Erlang distribution with parameter k Example: M/M/1 is a queue with a single server and Poisson distributed arrivals and service times QoS 39 Queuing theory results Queuing theory derives important values, such as • waiting times • number of customers/packets waiting • number of customers/packets being processed We will not develop queuing theory here Some models are completely understood (M/M/1, M/M/K, M/D/1) Some formulae are true in general Little’s law L = λW L : long-term average number of customers in a queue λ : long-term average arrival rate W : average time a customer (waits) in the system This law is true for any arrival distribution, service distribution, number of servers, service order, etc. QoS 40 Recap: soft QoS Soft QoS does not provide any hard service level guarantees It merely breaks fairness by giving priority to certain users or applications or flows or individual packets When there aren’t network resources to forward all packets packets are forwarded in order of priority from highest to lowest Low(er) priority packets may be delayed or discarded(when K < ) In order to correctly prioritorize packets they need to be priority marked Marking is best accomplished by the originator but may need to be performed by an intermediate element based on port, or header fields, or even DPI QoS 41 Marking Ethernet VLAN ID (VID) indicates priority In addition, for VLAN tagged frames Priority Code Point (PCP) AKA user priority field, P-bits • • • • • 3 bits so takes values 0 … 7 monotonically increasing priority can use priority tagging (VLAN=0) if no VLAN P=0 means non-expedited traffic 802.1Q gives recommends mappings DA (6B) SA (6B) ET=8100 (2B) ET=88A8 (2B) ET (2B) P(3b) CFI(1b) CVID(12b) P(3b) DEA(1b) SVID(12b) QoS 42 Marking MPLS Only top label is relevant Label can indicate priority (L-LSP) or TC field (previously called EXP field, previously called COS field) (E-LSP) • 3 bits so takes values 0 … 7 • no recognized TC value meanings Top Label (20b) Label (20b) TC(3b) S(1b) TTL (8b) TC(3b) S(1b) TTL (8b) Bottom Label (20b) TC(3b) S(1b) TTL (8b) QoS 43 Marking IPv4 The IPv4 header has a Type of Service (ToS) field RFC 2474 redefined ToS to consist of • 6 bit DSCP (see also RFC 4594) • 2 bit ECN (least significant bits) Guidelines for use of DSCP in many documents Ver(4b) IHL(4b) ToS(1B) ... Len(2B) Source IP Address (4B) Destination IP Address (4B) QoS 44 Marking IPv6 The IPv6 header has a Traffic Class (TC) field RFC 2460 states that it is to be used like the IPv4 ToS field The Flow Label (intended to ensure flow of packets follow the same path) may be indirectly used Ver(4b) TC(1B) payload len(2B) Flow Label (20b) next(1B) hop(1B) Source IP Address (16B) Destination IP Address (16B) QoS 45 IP DiffServ methods DiffServ was developed in IETF after IntServ RFCs 2474, 2475 because • IntServ was considered too heavy-weight for most purposes • resource reservation is against IP-philosophy if not enough BW, then more democratic for all to suffer if reserve BW and don’t use, then this is simply over-provisioning DiffServ is evolutionary “coarse-grained” approach to IP QoS DiffServ – divides traffic into service classes and allocates resources on a per-class basis – uses 6 bits of ToS byte in IP header to mark packets •field is renamed Differentiated Services Code Point •no setup or router state required – DSCP defines per-hop behaviors (PHB) •tells router how to treat packet – three standard PHBs (BE, AF, EF) but you are free to create more QoS 46 DiffServ Per Hop Behaviors (PHBs) Best Effort – standard IP service – QoS depends on momentary network load WARNING: DiffServ does not provide true assurances Assured Forwarding – AF specifies class that determines queue – in addition, three drop-precedence levels (low, med, high) – AF packets from a single source should not be mis-ordered even if have different drop-precedence (i.e. single queue) Expedited Forwarding – EF packet should experience no queuing delays – EF packets should have low loss – implemented by dedicated EF router queue QoS 47 What about MPLS ? MPLS DiffServ - each FEC is defined by – destination address – service class L-LSP (Labeled-inferred LSP) – – – – behavior based on label alone support different service classes by using different labels LSP BW allocated from specific queue (class) TC field may be used for drop precedence E-LSP (EXP-inferred LSP) – behavior based on label + TC (ex-EXP) field – TC bits in MPLS shim header are similar to DSCPs, but only three bits (like P-bits) while DiffServ ended up with 6 bits – but 8 service classes is usually more than enough commonly 4 classes are offered (bronze, silver, gold, platinum) – LSP BW allocated from link QoS 48 Queuing in switches/routers Switches and routers have queues FIFO buffers output port output port switch fabric input port input port input port queue queue queue queue Queues are emptied according to scheduling discipline (strict priority, WFQ, etc.) output port If there were only one queue then traffic handling would be FCFS To enable DiffServ prioritization multiple queues are used Outgoing frames are inserted into queues according to priority marking output port on each output port QoS 49 Traffic conditioning One of the most important parts of an SLA is the Committed Information Rate (bps) This is the datarate (bandwidth) SP tries to forward WARNING: DiffServ does not provide true commitments There may also be an Extra Information Rate (bps) This is a datarate that the SP will forward if possible A customer who did not send data for a while will expect to be able to send a higher rate afterwards Enforcement of these rates is accomplished via traffic conditioning Three strategies : • policing (rate limiting, throttling) • shaping (in ATM – GCRA) • metering QoS 50 Traffic policing Traffic exceeding the committed datarate is immediately discarded (or at least marked as discard eligible) CIR time policed to CIR CIR time QoS 51 Traffic shaping Traffic exceeding the committed datarate is delayed until it can be forwarded (placed or remains in a buffer) (discarded only if rate exceeds CIR for an extended time) CIR time shaped to CIR CIR time QoS 52 Traffic metering Charge extra for traffic exceeding the committed datarate (leads customer to self-police) CIR time extra charge over CIR CIR time QoS 53 BW specification What does an SLA commitment of X bps mean ? BW usage naturally varies for many services Must the customer send < X bps at all times even if he transmitted much less than X up to now ? May the customer remain silent for 9 minutes and then send 10X bps for 1 minute ? If the measurement interval is 10 minutes then this is precisely X bps ! A BW cap is only meaningful when we specify the integration time Or, we can specify the rate and the maximum burst size (in bytes) and enforce these using a bucketing algorithm A bucketing algorithm allows bursts above X for a limited time as long as the average remains at X or below QoS 54 Leaky and token buckets Leaky bucket model Token bucket model water is poured into bucket as needed water leaks out at a constant rate if too much water poured in it overflows water is poured into bucket at a constant rate water is removed as needed if too much water poured in it overflows Interpretation Interpretation bps of traffic are added to bucket committed rate is continuously removed if packet fits into bucket it is sent unused data rate is lost tokens are added at committed rate to send traffic there have to be enough tokens unused data rate is lost QoS 55 How does a token bucket work ? Part 1 Let’s look at a token bucket policer (other cases are equivalent) The bucket is configured with height - Committed Information Rate (CIR) filling rate - Committed Burst Size (CBS) CIR If packets are sent at precisely the committed rate CBS – the bucket height stays constant – and all packets are forwarded If packets arrive at less then the committed rate Note: Some people complicate formulas by – the bucket height increases specifying CIR in bps and CBS in Bytes Such people should be committed – all packets are forwarded – excess information rate overflows and is lost If packets arrive at more then the committed rate – the bucket height decreases – when no tokens are left packets are discarded continued QoS 56 How does a token bucket work ? Part 2 If no packets have been sent for some time and then CBS worth of packets are sent – the bucket is initially full of CBS tokens – the tokens are all removed – all packets are forwarded CIR If more than CBS information rate in burst – the first CBS of packets are forwarded – the rest are discarded until new tokens arrive CBS Note: adding of tokens can be in discrete time - every T (e.g., 1 sec) the token are added continuous time – tokens are continuously added (in practice, when new packet arrives, calculate number of tokens added since the last packet) for continuous time, the maximum burst size is larger than the configured CBS QoS 57 Dual token buckets Sometimes SPs sell (and customers purchase) Extra traffic This is a rate above the committed rate that the SP will forward if it can (but doesn’t commit to forward) Extra traffic is priced much lower than committed rate traffic To handle Extra traffic, we use two (token or leaky) buckets, C and E the C bucket is of height CBS and is filled at rate CIR the E bucket is of height EBS and is filled at rate EIR EIR CIR EBS CBS C E continued QoS 58 Dual token buckets (cont.) Furthermore, we classify packets as green if passes C bucket test (green packets are forwarded) yellow if fails C bucket test, but passes E bucket test (yellow packets may be forwarded, but SLA objectives don’t apply) red if fails both bucket tests (red packets are always discarded) More precisely : if ingress traffic < number of tokens in C bucket frame is green and its length in tokens is debited from C bucket else if ingress frame length < number of tokens in E bucket frame is yellow and its length of tokens is debited from E bucket else frame is red QoS 59 More token bucket variations As if this isn’t complicate enough … MEF added two more twists – coupling and sharing Unused rate is not lost – it is coupled or shared ! coupling priority sharing sharing coupling QoS 60 hard QoS QoS 61 CO vs. CL networks To guarantee QoS (and thus QoE) we need to • find path through network that can provide the needed QoS • reserve resources along this path to guarantee the QoS • not accept flows for which there are insufficient resources (CAC) • optionally – optimize path placement (to maximize number of flows that can be accommodated) ATM and (some) MPLS networks are Connection Oriented (CO), thus • we specify (or at least know) the path the packets will take • we can reserve resources at the network elements along the path Standard IP networks are ConnectionLess (CL), thus • it is hard to ensure packets will go where we want them to • it is meaningless to reserve resources as we don’t know which network elements a packet will traverse Thus hard QoS is not easy to add to IP which is why hard QoS is not popular in pure IP networks … QoS 62 Network and Traffic Engineering Network Engineering (planning) putting the bandwidth where the traffic is – physical cable deployment (thick pipes) – over-provisioning and backup connection provisioning – does it violate provider objectives ? Traffic Engineering (TE) putting the traffic where the bandwidth is – explicit traffic routing – route optimization – can it meet user objectives? QoS 63 Traffic Engineering (TE) TE is control of network traffic to achieve specific objectives unfortunately users and providers have contradictory objectives user objectives (QoS) – – – – – – network availability packet loss end-to-end delay round-trip delay packet delay variation (PDV) error rate provider objectives (CAPEX, OPEX) – – – – – bandwidth utilization resource utilization speed of failure recovery ease of management monetary outlay QoS 64 Simple example - fish diagram A D 1G G C B ½G E all links 1G except EF ½ G F In the above example, there is sufficient BW for all traffic (were these ATM switches, 1G over ACDG, ½ over BCEFG Without TE, can’t use all the physical bandwidth! standard IP routing : minimum hop count is CDG, so all traffic flows there 1½ G over 1G link, so ½G is dropped ! with administrative cost can force all traffic to go CEFG - which is worse ! with ECMP half (750M) goes CDG and half (750M) goes CEFG – still drops ! QoS 65 Constraint-based routing IP uses distributed routing protocols, not centralized management Distributed protocols are very good at finding basic connectivity and minimizing an additive metric (e.g., hop count) but are not good at optimally utilizing network resources or obeying constraints Common constraints include : • explicit include/exclude links/routers (local constraint) • conform to link BW constraints (local inequality constraint) • meet end-end delay / PLR objectives (global inequality constraint) Routing that takes constraints into account is called constraint-based routing (CR) After CR finds an acceptable path we may need to set it up (reserve resources) using another protocol QoS 66 OSPF-TE and IS-IS-TE Constraint-based routing needs link attribute information most importantly - available BW Link-state protocols have mechanisms to flood link-up/link-down to every router in the domain We can piggyback attributes as TLVs on these messages – OSPF add to Link-State Advertisement (LSA) (RFC 3630) – IS-IS add to Link-State Packets and the routing protocol builds an extended “TE” RIB When attribute information changes need to reflood information To decrease overhead: – only flood when change passes threshold – inherent timing bounds Note: CR routing can be NP-hard Standards don’t include (proprietary) efficient algorithms QoS 67 IP IntServ RFCs 2205-2216, 2379-2382 IntServ is an overall QoS architecture (not just RSVP) initially developed for VoIP QoS IntServ is a radical departure from pure IP and requires IntServ-enabled routers IntServ – enables providing end-to-end QoS guarantees – defines flows (introduces CO to IP’s CL architecture) flows are classified into three service classes (BE,CLS,GS) – specifies admission control and policing – like all CO architectures, requires signaling protocol (RSVP) IntServ-enabled routers – reserve needed resources along the flow’s path – must retain state QoS 68 IntServ CoS levels Best Effort – standard IP service – QoS depends on momentary network load Controlled Load Service – service equivalent to unloaded network – low packet loss – most packets will experience delay close to minimum – no quantitative guarantees Guaranteed Service – bounded worst case delay (no PDV guarantee) – low packet loss (zero if node buffers correctly provisioned) – quantitative guarantees QoS 69 RSVP The primary signaling protocol for IntServ is Resource reSerVation Protocol (RSVP) RSVP RSVP protocol • runs between hosts and routers • runs over raw IP or UDP/IP • is unidirectional • does not find path (fed by routing protocols) • sessions identified by source and destination socket numbers • requests unidirectional QoS characteristics from network • causes routers along path to reserve link and node resources • network responds with success/failure • reservations are soft-state - time-out unless refreshed two main message types: PATH and RESV QoS 70 RSVP Messages receivers sender PATH – message from sender to receiver(s) – carries classification info and TSpecs RESV – response of receiver to PATH message – carries session ID and RSpec specifying QoS required – contains the actual request for resource reservation QoS 71 MPLS TE MPLS-TE protocols enable Fast ReRoute to guarantee fault recovery (connectivity assurance) Also, MPLS FECs can take QoS constraints into account MPLS-TE LSPs can be setup according to constraints – include/exclude specific LSRs (for any reason) – only include in LSP LSRs with sufficient available BW – only include in LSP LSRs that guarantee sufficiently low delay OSPF-TE or IS-IS-TE can be used to get needed network information But how can the path be set-up ? Vanilla LDP has no TE capabilities and its extension CR-LDP is now obsolete The answer is a set of extensions to RSVP called RSVP-TE Unlike RSVP, this protocol runs only between routers (LSRs) QoS 72 RSVP-TE RSVP-TE (RFC 3209 …) is a label distribution protocol • For downstream-on-demand binding • Creates and distributes bindings between RSVP flows and labels • Uses labels instead of source and destination socket numbers • Extends RSVP by adding new objects (e.g. label) and procedures • Allows strict/loose explicitly routed LSPs • Has peer discovery, label requests, binding messages (like LDP) • Transparent transport of QoS and traffic parameters in TSpecs and RSpecs • Although between routers - still soft state ! Note: RSVP-TE is frequently used to set up FRR alongside LDP QoS 73 RSVP-TE LSP setup procedure egress ingress Example setup with explicit routes A B C • A sends PATH message to B w/ explicit route BC and resource requirements • B forwards PATH message to C after changing explicit route to C • C determines required resources, reserves, locally binds label and sends RESV to B • B matches, reserves resources, remotely binds, and sends RESV to A • A matches, remotely binds label and reserves resources QoS 74 PCE To optimally utilize network resources (e.g., link BW), we need to gather all the topology and network constraints into one centralized Traffic Engineering Database (TED) enough computational power to solve the complex optimization problem to send path set-up commands to the routers to set up TE-LSP RFC 4655 defines a Path Computation Element (PCE nicknamed Godbox) and Path Computation Clients (PCCs) The PCE may be • a designated router or • the management system or • a dedicated computational platform PCE PCE is an evolutionary solution to adding computational resources QoS 75 PCEP (RFC 5440) The protocol between PCE and PCCs and between multiple PCEs (if there are) is called the Path Computation Element Protocol (PCEP) PCEP runs over TCP with registered TCP port 4189 Messages are objects (with common object header) with optional TLVs PCEP Messages : • • • • • • • Open Keepalive PCReq PCRep PCNtf PCErr Close between PCE and PCC to open a new session optional heartbeat sent if no other PCEP messages PCC → PCE to request a path computation PCE → PCC with set of computed paths or negative reply event notification from either PCE or PCC protocol error message session close QoS 76 Optimization How does the PCE compute paths ? The general path optimization problem is intractable (NP-hard) But there are many combinatorial optimization problems with known efficient algorithms that return approximate solutions For example, the (1-dimensional) bin packing problem : Given N values V1 V2 … VN between 0 and 1 how can we place them in bins of maximum size 1 using the minimum number of bins ? This problem comes up in many applications : • • • • • multiprocessor scheduling stock cutting mapping short messages into time slots of ATM cells mapping CBR flows onto WDM wavelengths Full PCE problem is even harder ! mapping flows onto orthogonal paths QoS 77 SDN An even more radical centralized solution is the Software Defined Network (SDN) Like PCE, SDN replaces distributed routing protocols with a centralized controller that communicates with SDN switches An SDN switch is a forwarding device that can be programmed to match an arbitrary set of fields in the packet and edit / forward the packet accordingly SDN switches need not obey standard network layers The most popular controller-switch protocol is OpenFlow Google uses SDN switches to optimize utilization in its inter-datacenter network QoS 78

QoS and QoE - Dspcsp.com

Related documents

Products

Support

QoS and QoE - Dspcsp.com

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib