Computer Networks: Mechanisms for Quality of Service
Ivan Marsic, Rutgers University

Chapter 5: Mechanisms for Quality-of-Service

Topic: Scheduling
• Max-Min Fair Share
• Fair Queuing (FQ)
• Weighted Fair Queuing (WFQ)

Why Scheduling?
• To prevent aggressive or misbehaving sources from taking over network resources
• To provide preferential service to packets from preferred sources

Scheduler
[Figure: Arriving packets pass through a classifier into per-class queues (Class 1 queue ... Class n queue, the waiting lines); the scheduler applies the scheduling discipline to hand packets to the transmitter (server); a packet is dropped when its queue is full.]

Scheduler Design Concerns
• Classification policy for forming the queues:
  – Priority based
  – Source-identity based
  – Flow-identity based (source-destination pair)
  – ...
• Scheduling policy for taking packets into service:
  – Round robin
  – Work-conserving vs. non-work-conserving

Fair Share of a Resource
[Figure: Three customers compete for a resource: P1 desires 1/8, P2 desires 2/3, P3 desires 1/3.]

Max-Min Fair Share (1)
1. Satisfy customers who need less than their fair share
2. Split the remainder equally among the remaining customers
Fair share: 1/3 each. P1 desires only 1/8, so it returns a surplus of 1/3 − 1/8 = 5/24.
New fair share for P2 & P3: 1/3 + ½(5/24) each.

Max-Min Fair Share (2)
1. Satisfy customers who need less than their fair share
2. Split the remainder equally among the remaining customers
Fair share: 1/3 + ½(5/24) each. P1 received 1/8. P3 needs only 1/3, so it returns the remainder of its share: 1/3 + ½(5/24) − 1/3 = ½(5/24).
The returned ½(5/24) goes to P2, whose share becomes 1/3 + 2·½(5/24) = 1/3 + 5/24.

Max-Min Fair Share (3)
Final fair distribution: P1 receives 1/8, P3 receives 1/3, and P2 receives 1/3 + 5/24 = 13/24, leaving P2 with a deficit of 2/3 − 13/24 = 1/8.

Max-Min Fair Share (Summary)
1. Satisfy customers who need less than their fair share
2. Split the remainder equally among the remaining customers
[Figure, panels (a)–(d), combining the three preceding slides: (a) demands 1/8, 2/3, 1/3 with fair share 1/3 each; (b) P1 returns surplus 1/3 − 1/8 = 5/24, and the new fair share for P2 & P3 is 1/3 + ½(5/24) each; (c) P3 returns ½(5/24), which goes to P2; (d) final distribution: P1 receives 1/8, P3 receives 1/3, and P2 receives 1/3 + 5/24 with a deficit of 1/8.]

Example 5.1: Max-Min Fair Share
[Figure: Four applications share a Wi-Fi transmitter (server); link capacity = 1 Mbps.]
• Application 1: 8 packets per second, L1 = 2048 bytes
• Application 2: 25 pkts/s, L2 = 2 KB
• Application 3: 50 pkts/s, L3 = 512 bytes
• Application 4: 40 pkts/s, L4 = 1 KB

Example 5.1: Max-Min Fair Share
Total demand = 8×2048 (Appl. 1) + 25×2048 (Appl. 2) + 50×512 (Appl. 3) + 40×1024 (Appl. 4)
             = 134,144 bytes/sec = 1,073,152 bits/sec
Available capacity of the link is C = 1 Mbps = 1,000,000 bits/sec

Sources        Demand [bps]  Balance after    Allocation #2  Balance after    Allocation #3   Final
                             1st round        [bps]          2nd round        (final) [bps]   balance
Application 1  131,072       118,928 surplus  131,072        0                131,072         0
Application 2  409,600       159,600 deficit  332,064        77,536 deficit   336,448         73,152 deficit
Application 3  204,800       45,200 surplus   204,800        0                204,800         0
Application 4  327,680       77,680 deficit   332,064        4,384 surplus    327,680         0

Weighted Max-Min Fair Share
Source weights: w1 = 0.5, w2 = 2, w3 = 1.75, and w4 = 0.75

Src  Demand [bps]  Allocation #1  Balance after    Allocation #2  Balance after    Allocation #3  Final
                   [bps]          1st round        [bps]          2nd round        (final) [bps]  balance
1    131,072       100,000        31,072 deficit   122,338        8,734 deficit    131,072        0
2    409,600       400,000        9,600 deficit    489,354        79,754 surplus   409,600        0
3    204,800       350,000        145,200 surplus  204,800        0                204,800        0
4    327,680       150,000        177,680 deficit  183,508        144,172 deficit  254,528        73,152 deficit

Implementing MMFS with Packets
• Problem: Packets are of different lengths, arrive randomly, and must be transmitted atomically, as a whole
  – Packets cannot be split into pieces to satisfy the MMFS bandwidth allocation
• Solution: Fair Queuing (FQ)

Example: Airport Check-in
[Figure: Waiting lines (queues) and classification of passengers; the customer in service is a first-class passenger with service time XF,1 = 8; economy-class passengers wait with service times XE,1 = 2, XE,2 = 5, XE,3 = 3.]
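As an illustration of the allocation procedure in Example 5.1 (this is a sketch of my own, not the book's code; function and variable names are assumptions), the weighted max-min fair share can be computed iteratively: give each unsatisfied source a share of the remaining capacity in proportion to its weight, cap any source whose share exceeds its demand, and redistribute the surplus in the next round.

```python
def max_min_fair_share(capacity, demands, weights=None):
    """Weighted max-min fair allocation of `capacity` among `demands`.

    Repeatedly split the unallocated capacity among the still-unsatisfied
    sources in proportion to their weights; any source whose share covers
    its demand is capped at the demand, and its surplus is redistributed.
    """
    n = len(demands)
    if weights is None:
        weights = [1.0] * n          # plain (unweighted) max-min fair share
    alloc = [0.0] * n
    unsatisfied = set(range(n))
    remaining = capacity
    while unsatisfied and remaining > 1e-9:
        total_w = sum(weights[i] for i in unsatisfied)
        # Tentative proportional share for each unsatisfied source this round
        share = {i: remaining * weights[i] / total_w for i in unsatisfied}
        done = {i for i in unsatisfied if share[i] >= demands[i]}
        if not done:
            # No source can be fully satisfied: each keeps its proportional share
            for i in unsatisfied:
                alloc[i] = share[i]
            break
        # Satisfy the capped sources; their surplus is redistributed next round
        for i in done:
            alloc[i] = demands[i]
            remaining -= demands[i]
        unsatisfied -= done
    return alloc

# Example 5.1: demands in bits/sec, link capacity 1 Mbps
demands = [131072, 409600, 204800, 327680]
print([round(a) for a in max_min_fair_share(1_000_000, demands)])
# -> [131072, 336448, 204800, 327680]   (matches the unweighted table)
print([round(a) for a in max_min_fair_share(1_000_000, demands,
                                            [0.5, 2, 1.75, 0.75])])
# -> [131072, 409600, 204800, 254528]   (matches the weighted table)
```

Running it reproduces both final allocation columns above, including the 73,152 bps deficit left with Application 2 (unweighted) and Application 4 (weighted).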
Bit-by-bit GPS
[Figure: Flow 1, Flow 2, and Flow 3 queues feed a transmitter (server) under bit-by-bit round-robin service.]

FQ: Run an imaginary GPS simulation to find when a packet's transmission would finish if its bits could be transmitted one by one.
[Figure (a): Bit-by-bit GPS server. In each round, one bit from the head packet of each backlogged flow is served: 1st round serves the 1st bit of pkt A2,1 and the 1st bit of pkt A3,1; 2nd round serves the 2nd bits; and so on through the 5th round, when the 2nd bit of pkt A2,2 is served.
Figure (b): FQ transmitter (real). Packet-by-packet round-robin service: packet A2,1, then packet A3,1, then packet A2,2.]

Bit-by-bit GPS: Example
[Figure: Flow 1 has packet A1,1 with L1 = 3 bits; Flow 2 has packets A2,1 and A2,2 with L2 = 3 bits; Flow 3 has packets A3,1 and A3,2 with L3 = 2 bits. The timeline shows arrival, waiting, service, and finish times against time (0 to 14) and round number (1 to 8).]

GPS Round Number vs. Time
[Figure: Round number R(t) vs. time t. The slope of R(t) changes with the number of active flows (slope 1/2, then 1/3, then 1/2, then 1/1). The finish numbers F1(t), F2(t), F3(t) are read off the curve.]

Example 5.3: GPS Round Numbers
Arrival time  Flow ID  Packet size
t = 0         Flow 1   16,384 bytes
t = 0         Flow 3   4,096 bytes
t = 3         Flow 2   16,384 bytes
t = 3         Flow 4   8,192 bytes
t = 6         Flow 3   4,096 bytes
t = 12        Flow 3   4,096 bytes
[Figures (a), (b): R(t) vs. t, with slope changing among 1/1, 1/2, 1/3, and 1/4 as flows become active or inactive; packet departures P1,1, P3,1, then P2,1, P4,1 are marked on the time axis.]

Example 5.3: GPS Round Numbers (Cont'd)
[Figure: The same arrivals (16,384 @Flow 1 and 4,096 @Flow 3 at t = 0; 16,384 @Flow 2 and 8,192 @Flow 4 at t = 3; 4,096 @Flow 3 at t = 6 and t = 12), with the full R(t) curve over t = 0 to 13 (slopes 1/1, 1/2, 1/3, 1/4) and all packet departures marked.]

Example 5.3: Fair Queuing
[Figure: Arrival, waiting, and service periods for Flow 1 (A1,1, L1 = 2 KB), Flow 2 (A2,1, L2 = 2 KB), Flow 3 (A3,1 to A3,3, L3 = 512 B), and Flow 4 (A4,1, L4 = 1 KB). Time axis in ms: 0, 4.096, 8.192, 12.288, 16.384, 20.480, 24.576, 28.672, 32.768, 36.864, 40.96, 45.056, 49.152, 53.248.]

Example 4.4: Fair Queuing (1)
[Figure: Packet arrivals over 1 second for Flow 1 (A1,1 to A1,8), Flow 2 (A2,1 to A2,4), Flow 3 (A3,1 to A3,33), and Flow 4 (A4,1 to A4,19).]

Example 4.4: Fair Queuing (2b)
[Figure: Timeline in ms around the packet currently in transmission: waiting packets A1,1 (W1,1), A3,4 to A3,6 (W3,4 to W3,6), and A4,2, A4,3 (W4,3); transmissions X4,2, X3,4, X1,1; time marks at 111.11, 125, 128.71, 142.86, 146.31, 156.25, and 166.67 ms, etc.]
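The finish numbers used in the fair-queuing examples above can be sketched as follows (an illustration of my own, not the book's code; it assumes the GPS round number R(t) at each arrival is already known, e.g., read off the R(t) curve):

```python
import heapq

def finish_number(prev_finish, round_at_arrival, size_bits, weight=1.0):
    """F(i,j) = max(F(i,j-1), R(t_arrival)) + L(i,j)/w_i, as used by (W)FQ."""
    return max(prev_finish, round_at_arrival) + size_bits / weight

class FQScheduler:
    """Packet-by-packet FQ: always transmit the queued packet with the
    smallest finish number, one whole packet at a time."""
    def __init__(self):
        self.last_finish = {}   # per-flow finish number of the latest packet
        self.queue = []         # min-heap of (finish_number, flow, pkt_id)

    def arrive(self, flow, pkt_id, size_bits, round_now, weight=1.0):
        f = finish_number(self.last_finish.get(flow, 0.0),
                          round_now, size_bits, weight)
        self.last_finish[flow] = f
        heapq.heappush(self.queue, (f, flow, pkt_id))
        return f

    def next_packet(self):
        """Pop the packet that bit-by-bit GPS would finish first."""
        return heapq.heappop(self.queue) if self.queue else None

# Bit-by-bit GPS example: 3-bit packets on Flows 1 and 2, a 2-bit packet
# on Flow 3, all arriving at round 0
sched = FQScheduler()
sched.arrive(1, "A1,1", 3, round_now=0)   # F = 3
sched.arrive(2, "A2,1", 3, round_now=0)   # F = 3
sched.arrive(3, "A3,1", 2, round_now=0)   # F = 2, served first
print(sched.next_packet())                # -> (2.0, 3, 'A3,1')
```

As in the bit-by-bit GPS example, the 2-bit packet of Flow 3 has the smallest finish number and is transmitted first; a weight w_i > 1 in the formula gives WFQ behavior.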
Topic: Policing
• Leaky Bucket Algorithm

Delay Magnitude & Variability
[Figure: Delay vs. time for traffic pattern 1 and traffic pattern 2, with the average delay marked.]

Leaky Bucket
[Figure: (a) A token generator produces tokens at rate r [tokens/sec]; the token dispenser (bucket) holds up to b tokens (b = bucket capacity); one token is dispensed for each packet, which then passes a token-operated turnstile to the network. (b) Arriving packets wait in the regulator until a token is available.]

Topic: Active Queue Management
• Random Early Detection (RED)
• Explicit Congestion Notification (ECN)

Why Active Queue Management
[Figure: Congestion windows of (a) synchronized and (b) de-synchronized TCP senders vs. time, and the resulting resource usage.]

Random Early Detection (RED)
[Figure: (a) Router buffer with the packet currently in service at the head of the queue. Arriving packets are never dropped while the queue is below ThresholdMin (the drop start location), might be dropped in the random-drop zone between ThresholdMin and ThresholdMax, and are always dropped above ThresholdMax. (b) Ptemp(drop) rises linearly from 0 at ThresholdMin to Pmax at ThresholdMax, and jumps to 1.0 when the queue is full.]

Random Early Detection (RED)
AverageQLen(t) = (1 − α) × AverageQLen(t−1) + α × MeasuredQLen(t)                             (5.4)
Ptemp(AverageQLen) = Pmax × (AverageQLen − Thresholdmin) / (Thresholdmax − Thresholdmin)      (5.5)
P(AverageQLen) = Ptemp(AverageQLen) / (1 − count × Ptemp(AverageQLen))                        (5.6)

Random Early Detection (RED)
Listing 5-1: Summary of the RED algorithm.
Set count = −1
For each newly arrived packet, do:
1. Calculate AvgQLen using equation (5.4)
2. If Thrmin ≤ AvgQLen < Thrmax
   a) increment count: count ← count + 1
   b) calculate the drop probability for this packet using equation (5.6)
   c) decide whether to drop the packet (Y/N), and drop it if decided YES
   d) if in step c) the packet is dropped, reset count ← 0
3. Else if Thrmax ≤ AvgQLen
   a) drop the packet and reset count ← 0
4. Otherwise, reset count ← −1
End of the for loop
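The RED algorithm of Listing 5-1 and equations (5.4)-(5.6) can be sketched in Python (a minimal illustration of my own, not the book's code; the parameter values, including the EWMA weight α, are assumptions):

```python
import random

class RED:
    """Sketch of the RED drop decision per Listing 5-1 and eqs. (5.4)-(5.6)."""
    def __init__(self, thr_min, thr_max, p_max, alpha=0.002):
        self.thr_min, self.thr_max, self.p_max = thr_min, thr_max, p_max
        self.alpha = alpha   # EWMA weight in eq. (5.4); value is an assumption
        self.avg = 0.0       # AverageQLen
        self.count = -1      # packets since the last drop

    def on_arrival(self, measured_qlen, rand=random.random):
        """Return True if the newly arrived packet should be dropped."""
        # Eq. (5.4): exponentially weighted moving average of the queue length
        self.avg = (1 - self.alpha) * self.avg + self.alpha * measured_qlen
        if self.avg >= self.thr_max:          # step 3: always drop
            self.count = 0
            return True
        if self.avg >= self.thr_min:          # step 2: random-drop zone
            self.count += 1
            # Eq. (5.5): temporary probability ramps linearly from 0 to Pmax
            p_temp = self.p_max * (self.avg - self.thr_min) \
                     / (self.thr_max - self.thr_min)
            # Eq. (5.6): spreads the drops evenly between forced drops
            p = 1.0 if self.count * p_temp >= 1 \
                else p_temp / (1 - self.count * p_temp)
            if rand() < p:
                self.count = 0
                return True
            return False
        self.count = -1                       # step 4: below ThresholdMin
        return False
```

With α = 1 the average tracks the measured queue length exactly, which makes the three regions of the drop curve easy to check by hand; injecting `rand` makes the random decision in step 2c deterministic for testing.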
Explicit Congestion Notification (ECN)
• Want: avoid dropping packets as a way of notifying the senders of congestion
• Need: avoid sending separate messages to the senders, because that makes the congestion worse
• Solution: piggyback notifications to the sender on DATA packets flowing toward the receiver; the receiver then piggybacks the notifications onto its ACKs
ECN details: RFC 3168

Explicit Congestion Notification (ECN)
[Figure: TCP Sender sends data via Router i and congested Router j to the TCP Receiver. The congested router marks a data packet with an ECN notification ("congestion!") for the sender; the receiver carries the ECN notification back to the sender on ACK packets.]
A router needs to notify the TCP sender of incipient congestion, so it piggybacks an ECN notification on the data packet. The TCP receiver should send the ECN notification back to the sender, piggybacked on the ACK packet.
Routers work only with IP (not TCP), so:
• Need to modify the IP header format to support [router → receiver] ECN notifications
• Need to modify the TCP header format to support [receiver → sender] ECN notifications

Explicit Congestion Notification (ECN)
It is not enough to modify the IP header format: the Sender would not know whether an ECN notification is meant for the Sender or for the Receiver. This would not matter if data and ACK packets travelled over the same path, because the notification would be relevant to both. But packets may take different paths, so the Sender must know whether the ECN is for itself or for the Receiver.
Having two bits, one for the "forward" (data) path and one for the "reverse" (ACK) path, would not solve the problem either, because routers cannot distinguish "forward" from "reverse": the distinction makes sense only at the TCP layer, which is not available on routers!
Therefore, we must modify the TCP header format to allow the Receiver to notify the Sender.

IPv4 Header
[Figure: IPv4 header fields (20 bytes): 4-bit version number, 4-bit header length, 8-bit differentiated services (DS), 16-bit datagram length (in bytes), 16-bit datagram identification, flags (unused, DF, MF), 13-bit fragment offset, 8-bit time to live (TTL), 8-bit user protocol, 16-bit header checksum, 32-bit source IP address, 32-bit destination IP address, options (if any), data.]

Modify IP Header Format for ECN
• Need two bits:
  – One for the congested router to notify the Sender of incipient congestion
  – One for the Sender to notify routers that it is ECN-capable (otherwise the router's notification would make no difference)
• Note: If routers applied ECN regardless of whether senders are ECN-capable, then non-ECN-capable senders would get a free ride (no packets dropped during congestion, just the ECN bit set!) and would have no incentive to upgrade to ECN
  – So, apply RED to packets from non-ECN-capable senders

IPv4 Packet Header + ECN Field
[Figure (a): The former 8-bit DS field is split into a 6-bit differentiated-services field and a 2-bit ECN field at bit positions 14-15 (ECT, CE).]
ECN field (ECT, CE):
  00 = Not-ECT
  01 = ECT(1)
  10 = ECT(0)
  11 = CE, congestion experienced

Detecting a Misbehaving Node
[Figure: The TCP Sender sends Data with ECT(0) or ECT(1); a congested router sets CE; a misbehaving node erases the CE codepoint; the Receiver returns ACK, ECT(0); the Sender discovers the misbehavior.]
The Sender can alternate between the ECT(0) and ECT(1) codepoints to discover whether a misbehaving node (another router or the Receiver) is erasing the CE codepoint. The misbehaving node would not be able to repeatedly reconstruct the Sender's ECT codepoint correctly.
More likely, repeated erasure of the CE codepoint would soon be discovered by the Sender. (*)
(*) But then how can the congested router notify the Receiver about the congestion? The router cannot distinguish between "Sender" and "Receiver"; to it, all IP packets are just IP packets. Note that RFC 3168 does not say that misbehaving-node detection is based on a simple reflection of the ECT() codepoints by the Receiver. Instead, it only says: "The ECN nonce allows the development of mechanisms for the sender to probabilistically verify that network elements are not erasing the CE codepoint..."

TCP Header
[Figure: TCP header fields (20 bytes): 16-bit source and destination port numbers, 32-bit sequence number, 32-bit acknowledgement number, 4-bit header length, 6 unused bits, flags (URG, ACK, PSH, RST, SYN, FIN), 16-bit advertised receive window size, 16-bit TCP checksum, 16-bit urgent data pointer, options (if any), TCP segment data (if any).]

Modify TCP Header
[Figure: Two of the reserved bits, at positions 8-9, become new flags: CWR and ECE.]
• CWR = Congestion Window Reduced: the Sender informs the Receiver that CongWin was reduced
• ECE = ECN-Echo: the Receiver informs the Sender that a CE-marked packet was received

Explicit Congestion Notification (ECN)
[Figure: (1) The Sender sends Data, ECT(1); (2) a congested router sets CE; (3) the Receiver returns ACK, ECE; (4) the Sender reacts; (5) the Sender sets CWR.]
(*) The congested router cannot notify the Receiver about the congestion using "pure" ACK packets, because they must use codepoint "00," which indicates a not-ECT-capable source.
1. An ECT codepoint is set in packets transmitted by the sender to indicate that ECN is supported by the transport entities for these packets.
2. An ECN-capable router detects impending congestion and detects that an ECT codepoint is set in the packet it is about to drop. Instead of dropping the packet, the router sets the CE codepoint in the IP header and forwards the packet.
3. The receiver receives the packet with the CE codepoint set, and sets the ECN-Echo flag in its next TCP ACK sent to the sender.
4. The sender receives the TCP ACK with ECN-Echo set, and reacts to the congestion as if a packet had been dropped.
5. The sender sets the CWR flag in the TCP header of the next packet sent to the receiver, to acknowledge its receipt of and reaction to the ECN-Echo flag.

Topic: Multiprotocol Label Switching (MPLS)
• Constraint-based routing
• Traffic engineering
• Virtual private networks (VPNs)

MPLS Operation: Supporting Tunnels
[Figure: An MPLS domain with an ingress (entry) LSR, an egress (exit) LSR, and a flow of IP datagrams (denoted by an FEC) carried through a tunnel denoted by an LSP across routers A through H. The MPLS label is inserted between the link-layer header and the IP header.]
LSR = label switching router; FEC = forwarding equivalence class; LSP = label switched path; LFIB = label forwarding information base

How Is MPLS Switching Different From IP Forwarding?
• MPLS labels are simple integer numbers
  – IP addresses are hierarchically structured (dotted-decimal notation)
• Matching MPLS labels is simpler because they are fixed length; a label can be used directly as an index into an array
  – IP network prefixes are variable length and require a search for the longest match
• An MPLS switching table needs entries only for the labels agreed with the adjacent routers
  – An IP forwarding table needs entries for all network prefixes in the world!
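The contrast between the two lookups can be sketched as follows (an illustration of my own; the LFIB entry reuses the label values 9 and 17 and the prefixes 96.1.1/24 and 96.1/16 that appear in the figures below, and all names are assumptions):

```python
import ipaddress

# Hypothetical LFIB: an MPLS label is a small fixed-length integer, so the
# lookup is a direct array index with no searching at all.
lfib = [None] * 1024
lfib[9] = (17, 5)        # in-label 9 -> swap to out-label 17, out port 5

def mpls_forward(in_label):
    out_label, out_port = lfib[in_label]   # O(1) direct index
    return out_label, out_port

# IP forwarding, by contrast, must search variable-length prefixes for the
# longest match (illustrative two-entry table):
ip_table = {ipaddress.ip_network("96.1.1.0/24"): 3,
            ipaddress.ip_network("96.1.0.0/16"): 1}

def ip_forward(addr):
    """Longest-prefix match: scan all prefixes, keep the most specific hit."""
    best_len, best_port = -1, None
    a = ipaddress.ip_address(addr)
    for net, port in ip_table.items():
        if a in net and net.prefixlen > best_len:
            best_len, best_port = net.prefixlen, port
    return best_port

print(mpls_forward(9))           # -> (17, 5)
print(ip_forward("96.1.1.13"))   # -> 3  (the /24 wins over the /16)
print(ip_forward("96.1.2.13"))   # -> 1  (only the /16 matches)
```

Real routers use tries or TCAMs rather than a linear scan, but the point of the slide stands: the IP lookup is a search over variable-length keys, while the MPLS lookup is a single fixed-length index.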
Label Switched Path
[Figure: Packets belonging to the same FEC (Forwarding Equivalence Class) enter a one-way LSP (Label Switched Path = tunnel) and traverse segment 1 (label = 5), segment 2 (label = 17), and segment 3 (label = 6) between ENTER and EXIT.]

MPLS Protocol Layering
[Figure: Three planes. The network/IP layer forms the routing (control) plane; the MPLS layer forms the forwarding (data) plane, carrying one-way LSPs (LSP 1, LSP 2, LSP 3) across the MPLS domain; the link-layer plane is the network's physical topology connecting the edge LSRs and interior LSRs (A through G).]
LSP = label switched path; LSR = label switching router

MPLS Label Format
[Figure: The MPLS label sits between the link-layer header and the network-layer header. Each 32-bit (4-byte) label entry holds a 20-bit label value, a 3-bit Exp field, a 1-bit S flag, and an 8-bit TTL.]

MPLS Label Format
[Figure: A label stack between the link-layer header and the IP header: top label (S = 0), intermediate labels (S = 0), bottom label (S = 1); each entry carries its own label value and TTL.]

Label Bindings, LIB, and LFIB
[Figure: Inside an LSR, the routing protocols populate the IP forwarding table (destination prefix → out port; not used in MPLS, except by edge LSRs). Label bindings received from peer LSRs populate the label information base (LIB); LSP path setup and label binding then produce the label forwarding information base (LFIB), whose entries hold destination prefix, in label, out label, and out port.]

LSP path setup & Label binding
[Figure: Host 96.1.1.13 lies in network 96.1.1/24 behind egress Edge LSR D; ingress Edge LSR A reaches it via LSRs B and C. (a) A data packet for 96.1.1.13 arrives at A and is held waiting. (b) A label request for prefix 96.1.1/24 propagates from A through B and C to D. (c) Label bindings are returned hop by hop: D binds label 17 for prefix 96.1.1/24 toward C, and C binds label 9 toward B; each LSR records the binding in its LIB and installs an LFIB entry (e.g., LFIB(C): in label 9 → out label 17, out port 5).]

Forwarding labeled packets
[Figure: (a) The ingress Edge LSR A looks up the destination prefix 96.1.1/24 in its LFIB (out label 9, out port 4), pushes label 9, and forwards the packet; the intermediate LSR swaps the label per its LFIB entry (in label 9 → out label 17, out port 5). (b) The egress Edge LSR D pops the label and uses its ordinary IP forwarding table (96.1.1/24 → out port 3) to deliver the plain IP packet to host 96.1.1.13.]

LSP Topologies
[Figure: Two LSPs in the MPLS layer plane, LSP-1 (through H, G, E, ...) and LSP-2 (through A, B, C, ...), mapped onto the link-layer plane, i.e., the network's physical topology with routers A through J.]

LSR Control Plane
[Figure: Network-layer routing protocols (e.g., OSPF, BGP, PIM) provide the FEC-to-next-hop mapping. Procedures for binding FECs to labels and a label-binding distribution protocol provide the FEC-to-label mapping. Together they maintain the LFIB (label forwarding information base).]

Methods for Label Distribution
[Figure: (a) On-demand downstream label distribution: LSR A sends a Request for Binding (1), and LSR B answers with a Label-to-FEC Binding (2). (b) Unsolicited downstream label distribution: LSR B sends a Label-to-FEC Binding without being asked.]

CSPF Computations
[Figure: An extended link-state routing protocol carrying link attributes (link color, reserved bandwidth) feeds both the routing table and the Traffic Engineering Database (TED), which is shared by all routers in the TE domain. User constraints (LSP attributes: link color, bandwidth, explicit route, hop limitations, administrative weight / priority) plus the TED feed the Constrained Shortest Path First (CSPF) computation, which produces an Explicit Route Object (ERO) used by RSVP-TE signaling for LSP setup. CE = customer edge router; PE = provider edge router.]
The computation of explicit routes (known as Explicit Route Objects, or EROs) is done using a distributed Traffic Engineering Database (TED).
The TED contains all IP addresses, links, and current bandwidth-reservation states.

How CSPF Works
• Instead of just a single cost for a link between two neighbors, there is also:
  – Bandwidth
  – Link attributes
  – Administrative weight

CSPF: Example
[Figure: Links labeled cost / bandwidth: A-B 10 / 100 Mbps, A-C 1 / 50 Mbps, C-B 1 / 40 Mbps, B-D 1 / 150 Mbps, C-D 7 / 100 Mbps. Need path A → D; constraint: minimum path bandwidth of 90 Mbps.]
Without taking bandwidth into account, Router A's best path to Router D is A-C-B-D, with a total cost of 3.

Build A's routing table using CSPF
Entries are (node, cost, next hop, available bandwidth).

Step  Confirmed set                                     Tentative set    Comments
0     (A, 0, -, N/A)                                    -                Initially, A is the only member of Confirmed(A), with a distance of 0, a next hop of itself, and the bandwidth set to Not Available.
1     (A, 0, -, N/A)                                    (B, 10, B, 100)  Put A's neighbors in the Tentative(A) set. Note that (C, 1, C, 50) was not added to Tentative because it does not meet the minimum bandwidth requirement.
2     (A, 0, -, N/A), (B, 10, B, 100)                   (D, 11, B, 100)  Move B to the Confirmed set and put B's neighbors in the Tentative set. Again, (C, 11, B, 40) was not added to Tentative because it does not meet the minimum bandwidth requirement.
3     (A, 0, -, N/A), (B, 10, B, 100), (D, 11, B, 100)  -                Move D to the Confirmed set. At this point the tunnel endpoint (D) is in Confirmed, so we are done. END.

Tiebreakers in CSPF
Tiebreakers between equal-cost paths, in order:
1. Take the path with the largest minimum available bandwidth.
2. If there is still a tie, take the path with the lowest hop count (the number of routers in the path).
3. If there is still a tie, take one path at random.

CSPF: Example
[Figure: Links labeled cost / bandwidth (in Mbps): A-B 2/100, B-C 7/150, C-F 4/150, A-D 2/150, D-E 1/50, E-F 4/150, D-G 1/100, G-H 1/100, H-F 4/100, A-H 4/90, A-I 4/100, I-F 4/150. Need path A → F; constraint: minimum path bandwidth of 80 Mbps.]

Tiebreakers in CSPF (repeated)
Tiebreakers between equal-cost paths, in order:
1. Take the path with the largest minimum available bandwidth.
2.
If there is still a tie, take the path with the lowest hop count (the number of routers in the path).
3. If there is still a tie, take one path at random.

Possible Paths from A to F
[Figure: the same topology as the previous CSPF example.]
Path  Routers in path  Path length (cost)  Minimum bandwidth on path [Mbps]
P1    A-B-C-F          2+7+4 = 13          min{100, 150, 150} = 100
P2    A-D-E-F          2+1+4 = 7           min{150, 50, 150} = 50
P3    A-D-G-H-F        2+1+1+4 = 8         min{150, 100, 100, 100} = 100
P4    A-H-F            4+4 = 8             min{90, 100} = 90
P5    A-H-G-D-E-F      4+1+1+1+4 = 11      min{150, 100, 100, 50, 150} = 50
P6    A-I-F            4+4 = 8             min{100, 150} = 100

Apply tiebreakers to select the best path
• Eliminate the paths whose minimum bandwidth is lower than the required minimum of 80 Mbps (P2 and P5). Remaining paths: P1, P3, P4, and P6.
• Select the path(s) with the shortest length and eliminate all longer paths (P1, with length 13). Remaining paths: P3, P4, and P6, all with length 8.
• Path P4 is eliminated because its minimum bandwidth (90 Mbps) is lower than the minimum bandwidths of paths P3 and P6 (100 Mbps each).
• Path P3 is eliminated because it has a hop count of 5, while the other remaining path (P6) has a hop count of 3.
Finally, LSR A selects path P6 to build a tunnel to LSR F. The actual CSPF computation must follow the steps in Table 5-2 and apply the tiebreakers where needed.

CSPF Computations
[Figure, repeated from earlier: link attributes and the extended link-state routing protocol feed the Traffic Engineering Database (TED); user constraints plus the TED feed the CSPF computation, which produces an Explicit Route Object (ERO) for RSVP-TE signaling of the LSP. The computation of explicit routes is done using the distributed TED, which contains all IP addresses, links, and current bandwidth-reservation states.]

MPLS / DS-TE

MPLS VPN: The Problem
[Figure: A provider network connects Customer 1 Sites 1-3 and Customer 2 Sites 1-3. Both customers use overlapping private address space: 10.1/16, 10.2/16, 10.3/16.]

MPLS VPN: The Model
[Figure: Each customer sees its own virtual network: Customer 1's virtual network connects its Sites 1-3 (10.1/16, 10.2/16, 10.3/16), and Customer 2's virtual network likewise connects its Sites 1-3.]
MPLS is used to tunnel data across a network of MPLS-enabled routers.

MPLS VPN: The Solution
[Figure: The provider edge routers keep a separate VPN routing and forwarding table (VRF) per customer, VRF 1 for Customer 1 and VRF 2 for Customer 2, and interconnect the customer sites over MPLS LSPs.]