Cisco QoS Notes

Methods of Implementing QoS
• Legacy CLI
• Modular QoS CLI (MQC)
• AutoQoS VoIP
• AutoQoS Enterprise
• QoS Policy Manager (QPM) — used for administration & monitoring of QoS provisioned across multiple interfaces network-wide (not just on individual devices, as AutoQoS does).

AutoQoS
• AutoQoS VoIP — creates a QoS policy that prioritizes Voice over IP traffic ONLY; it cannot discover and implement QoS for other traffic types.
• AutoQoS Enterprise — uses Network-Based Application Recognition (NBAR) to discover traffic types on the network and creates a QoS policy based on best practices for each flow.

Steps for Implementing QoS
1. Identify traffic on the network — use a network analyzer to identify the different protocols and applications in use and their requirements.
2. Divide traffic into classes:
   Voice: highest priority
   Mission-Critical: transactional (database)
   Best-Effort: email, web browsing, FTP, etc.
   Scavenger: P2P apps; treated as less than Best-Effort
3. Define the QoS policy:
   How much bandwidth to reserve for a particular class
   Which traffic to prioritize and give preferential treatment
   How congestion will be managed

Classification & Marking
• IP Precedence — deprecated standard for marking packets at Layer 3 for QoS, superseded by DSCP; uses the ToS byte in the IP header.
• IP ToS byte — 8-bit field within the IP header of a packet, mainly used for marking packets with IP Precedence values.

What is Classification?
The ability of a network device to identify different traffic types and divide them into classes based on business requirements. Classification occurs on a device's inbound (ingress) interface.
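A minimal MQC classification sketch tying an ACL to a class-map (the ACL number and class name are hypothetical, chosen for illustration):

```
! Hypothetical example: classify HTTP traffic with an ACL and a class-map
access-list 101 permit tcp any any eq 80
!
class-map match-all WEB-TRAFFIC
 match access-group 101
```

The class-map can then be referenced from a policy-map and applied to an ingress interface with service-policy input.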
Classification Tools
• Network-Based Application Recognition (NBAR)
• Policy-Based Routing (PBR)
• Access Control Lists (ACLs)

Marking
Methods of marking:
• Class of Service (CoS)
• Frame Relay DE bit
• MPLS Experimental (EXP) bits
• IP Precedence
• Differentiated Services Code Point (DSCP)

Best practice is to limit the number of traffic classes provisioned for QoS to about 4 or 5. Where more granularity is needed, usually no more than 11 classes are necessary; an 11-class QoS model might benefit a large enterprise that requires finer-grained classification.

Class of Service (CoS)
What is CoS?
Setting bits in the 802.1p (user priority) field within the 802.1Q header (or Cisco ISL header) of an Ethernet frame. Values 0-5 are used to classify traffic; 6 and 7 are reserved and typically not used. CoS 5 should be treated as high-priority (i.e., Voice) traffic.

Limitation of CoS
Frames forwarded over non-trunking Ethernet ports lose the 802.1Q (or ISL) header, stripping them of their priority markings. CoS markings should therefore be mapped to a mechanism that preserves the priority as traffic transits other network devices, such as mapping Layer 2 CoS values to IP Precedence or DSCP values within the Layer 3 (IP) header.

Marking with MQC
set cos <0-7> — sets the CoS bits for a traffic class within a policy-map
set ip precedence <0-7> — sets the IP Precedence for a class of traffic
set dscp <0-63> — sets the DSCP for a class of traffic

Differentiated Services (DiffServ)
DiffServ field — formerly known as the ToS byte of an IP packet.
DS Code Point (DSCP) — the six left-most bits of the DiffServ field. Packets can be divided into different classes, or Behavior Aggregates (BA), and given preferential forwarding based on the bits set.
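The bit layout of the DiffServ field can be sketched numerically (a simple illustration, not Cisco code — DSCP is the six left-most bits of the old ToS byte, IP Precedence the three left-most):

```python
# DSCP occupies the six left-most bits of the DiffServ (former ToS) byte;
# IP Precedence occupies the three left-most bits.
def dscp_from_tos(tos: int) -> int:
    return tos >> 2          # drop the 2 right-most (ECN) bits

def precedence_from_tos(tos: int) -> int:
    return tos >> 5          # keep only the 3 left-most bits

# EF (DSCP 46, binary 101110) sits in a ToS byte of 0b10111000 (184)
tos = 0b10111000
print(dscp_from_tos(tos))        # 46
print(precedence_from_tos(tos))  # 5 (backwards compatible with IP Precedence 5)
```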
Network devices such as routers, switches, and IP Phones recognize DSCP markings on received packets and can quickly determine the forwarding and queuing method to use based on them. This is known as Per-Hop Behavior (PHB). With DSCP, packets can be marked with 64 different values (0-63).

Per-Hop Behaviors: Expedited Forwarding (EF)
• DSCP value: 46 (101110)
• Backwards compatible with IP Precedence 5 (101)
• Ensures minimal departure delay for packets
• Guarantees (and enforces) a maximum limit of bandwidth
• Marks packets with the highest priority and a zero drop rate
• Ideal for Voice traffic (audio, not signaling)

Per-Hop Behaviors: Assured Forwarding (AF)
• Commonly used for mission-critical traffic
• Consists of four classes and three drop-preference levels
• Guarantees a minimum amount of bandwidth
AF classes:
AF1 = lowest priority
AF2 & AF3 = medium priority
AF4 = highest priority

AF DSCP values:
Value  AF Class  Drop Pref  Binary (DSCP)
AF11   AF1       Low        001010
AF12   AF1       Medium     001100
AF13   AF1       High       001110
AF21   AF2       Low        010010
AF22   AF2       Medium     010100
AF23   AF2       High       010110
AF31   AF3       Low        011010
AF32   AF3       Medium     011100
AF33   AF3       High       011110
AF41   AF4       Low        100010
AF42   AF4       Medium     100100
AF43   AF4       High       100110
(The first three bits carry the class, the next two the drop preference; the decimal DSCP for AFxy works out to 8x + 2y, e.g. AF31 = 8*3 + 2*1 = 26.)

What are the Drop-Preference Levels for?
Drop preference is a tie-breaker between packets of the same class during congestion. For example, if the router holds two packets of class AF1, it checks which packet has the higher drop preference set and discards that one in favor of the packet with the lower preference. Drop preference is ignored between packets of different classes: if a packet marked AF11 (low drop) and a packet marked AF43 (high drop) arrive at the router, the AF11 packet is dropped first because it belongs to the lower class, even though the AF43 packet has a higher drop preference. The higher class is always favored.

Class-Selector (CS)
For backwards compatibility with devices that only understand IP Precedence.
• Uses the three left-most bits
• Remaining three bits set to 0s
For example, if we tell the router to mark incoming packets with CS5 (101000), non-DiffServ-compliant devices that receive these packets read only the first three bits, "101", which they interpret as IP Precedence 5. The last three bits are completely ignored.

Network-Based Application Recognition (NBAR)
• NBAR protocol discovery — discovers protocols running on the network by means of deep packet inspection (DPI) rather than classifying on port numbers alone.
• NBAR port map — with the ip nbar port-map command, the router can be configured to recognize applications on port numbers other than their common defaults.
• NBAR by itself is used to classify traffic.

PDLMs
Packet Description Language Modules expand the packet-identification capabilities of NBAR discovery. PDLMs are files stored in the router's flash memory and can be loaded while the device is running; no reboot is necessary for newly added protocols to be recognized.
NBAR is not supported on Fast EtherChannel, tunnel, or crypto interfaces.

NBAR Configuration
ip nbar pdlm <file-name> — imports a PDLM file into the NBAR process
ip nbar port-map <protocol> <port> — configures the router to recognize traffic from a protocol based on the port number you specify
ip nbar protocol-discovery — inspects packets and discovers the traffic types entering or leaving the interface

Verifying NBAR Configuration
show ip nbar protocol-discovery — displays statistics for discovered applications
show ip nbar port-map — displays the current protocol/port mappings
match protocol <protocol> — matches an NBAR-recognized protocol within a class-map

QoS Pre-Classification
QoS & VPN tunnels: by default, Cisco IOS devices that use tunnel interfaces copy the ToS byte from the IP header of a packet to the ToS byte of the tunnel header before the packet is put on the VPN.
QoS Pre-Classify
Used when you want to classify tunneled traffic on fields other than the ToS byte / DSCP markings (which are the only fields copied to the tunnel header). The device applies the QoS policy against the original IP header of the packet rather than the tunnel header.
qos pre-classify — enables pre-classification on the tunnel interface.
You can confirm pre-classification is enabled on an interface by running show interface <int> and looking for "(QoS Pre-classification)" on the queueing strategy line.

QoS on the LAN
How to classify traffic on a switch?
• NBAR classification is not available on Cisco switches
• Access Control Lists (ACLs) are the only supported method for classifying traffic
• Catalyst switches use IP and Layer 2 ACLs to classify traffic

Cisco Catalyst switch commands:
mac access-list extended — creates a Layer 2 ACL. Deny actions are ignored in ACLs used for QoS classification.
mls qos trust — changes the port state to trusted on the selected switch port.
mls qos trust cos — trusts the CoS marking received (but not DSCP); maps CoS to DSCP values before switching to the output interface.
mls qos trust dscp — trusts the DSCP marking received (but not CoS); maps DSCP to CoS values before switching to the output interface.
mls qos cos <value> — sets the default CoS value for packets received on the port.
mls qos map cos-dscp <values> / mls qos map dscp-cos <values> — defines a custom CoS-to-DSCP (and vice versa) mapping.

Trust CoS markings only from a Cisco IP Phone:
mls qos trust cos
mls qos trust device cisco-phone
switchport priority extend cos 0
NOTE: the last command makes the IP Phone overwrite the CoS markings on packets received from an attached device (e.g., a laptop) with the value given.
switchport priority extend trust — instead allows the IP Phone to trust CoS markings received from the attached PC.

Verification:
show mls qos interface — displays the QoS configuration for a switch port
show mls qos maps — displays the CoS and DSCP mappings configured on the switch
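A minimal access-port sketch combining the phone-trust commands above (the interface number is hypothetical):

```
! Hypothetical port config: trust CoS from a Cisco IP Phone,
! re-mark anything from the attached PC to CoS 0
interface FastEthernet0/5
 mls qos trust cos
 mls qos trust device cisco-phone
 switchport priority extend cos 0
```

With trust device configured, the port only honors CoS markings when a Cisco IP Phone is detected via CDP.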
Congestion Management
Mechanisms for managing queues and giving preferential forwarding to delay-sensitive traffic.
• If the hardware queue (TxQ) is congested, the software queue (the configured queuing mechanism) takes over and schedules packets as they arrive at the interface.
• The TxQ ALWAYS uses FIFO and cannot be configured to use anything else.
• If the TxQ is not congested, packets arriving at the interface bypass the software queuing process and go directly to the hardware queue to be sent out the physical interface.
• Software interfaces (i.e., subinterfaces) only congest when the hardware queue for the physical interface has reached capacity.

Queuing Mechanisms:
• Priority Queuing (PQ) - obsolete
• Custom Queuing (CQ) - obsolete
• Weighted Fair Queuing (WFQ)
• Class-Based Weighted Fair Queuing (CBWFQ)
• Low-Latency Queuing (LLQ)

Weighted Fair Queuing (WFQ)
• Normally does not require any configuration
• Priority given to low-bandwidth traffic
• Allocates additional bandwidth to high-precedence flows
• Not ideal for Voice traffic

WFQ Explained — how does it work?
WFQ dynamically creates queues for each flow. A flow is identified by matching source & destination IP addresses, ports, and ToS values. A queue exists as long as the flow has packets to send; when the queue for a flow is empty and no more packets need to be sent, the queue is removed from the router's memory. Even though a connection might still be established with the other end, if no packets are being sent there is no queue for it.

WFQ Terms
• Hold-Queue Out limit (HQO) — maximum number of packets the WFQ system can hold per interface.
• Congestive Discard Threshold (CDT) — maximum length a single queue can reach before packets are dropped from it.
• Finish Time — used by the WFQ algorithm; packets with larger finish times are more likely to be discarded during congestion.

WFQ is turned on by default for serial interfaces at 2.048 Mbps and below, and requires no manual configuration by the administrator.
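The finish-time idea can be sketched as a toy model (heavily simplified — real IOS WFQ derives the weight from IP Precedence, but the scheduling principle is the same: smallest finish time goes first):

```python
import heapq

# Toy WFQ scheduler: each packet's finish time = the flow's previous finish
# time + packet size * weight; the scheduler dequeues by smallest finish time.
def wfq_order(packets):
    """packets: list of (flow_id, size_bytes, weight) in arrival order."""
    last_finish = {}                      # per-flow virtual finish time
    heap = []
    for seq, (flow, size, weight) in enumerate(packets):
        finish = last_finish.get(flow, 0) + size * weight
        last_finish[flow] = finish
        heapq.heappush(heap, (finish, seq, flow))
    return [flow for _, _, flow in sorted(heap)]

# A small low-bandwidth flow ("telnet") is serviced ahead of a bulk flow
order = wfq_order([("ftp", 1500, 1), ("telnet", 60, 1), ("ftp", 1500, 1)])
print(order)   # ['telnet', 'ftp', 'ftp']
```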
WFQ Configuration
fair-queue <cdt> — sets the Congestive Discard Threshold on an interface
fair-queue <cdt> <dynamic-queues> — also sets the total number of queues the WFQ system can create
fair-queue <cdt> <dynamic-queues> <reservable-queues> — also sets the number of queues reserved for RSVP
hold-queue <max-limit> out — sets the HQO for an interface

Class-Based WFQ (CBWFQ)
• Good for everything BUT Voice & Video
• Guarantees a chunk of bandwidth per class
• Not supported on subinterfaces
queue-limit <limit> — adjusts the queue size for a class by setting the maximum number of packets the queue can hold before congestion occurs and packets start to drop. The default queue size is 64.

bandwidth / bandwidth percent / bandwidth remaining percent
These commands make bandwidth reservations for a traffic class.
NOTE: once bandwidth is reserved to a class in kbps, the bandwidth percent command cannot be applied to other classes within the same policy-map; mixing the two would make the router's bandwidth-reservation calculations inconsistent.

max-reserved-bandwidth — changes the maximum bandwidth that can be reserved for user-defined classes (not the default class). By default, 75% of the link's bandwidth (or of the rate defined in the CIR agreement) can be reserved across classes; whatever is left is reserved for keepalives and the default class (unclassified traffic).

Low-Latency Queuing (LLQ)
• AKA CBWFQ + PQ
• Uses a priority queue
• Recommended for Voice
• Policed bandwidth for priority traffic
• WFQ or FIFO used for regular traffic
• The PQ is serviced entirely before the other queues

What does "policed" mean here?
Traffic in the PQ cannot consume more bandwidth than what is assigned to it; if the limit is exceeded, those packets are tail-dropped. Policing prevents starvation of the other classes.
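A minimal LLQ policy sketch combining a policed priority class with CBWFQ guarantees (class names and rates are hypothetical):

```
! Hypothetical LLQ policy: policed priority for voice, CBWFQ for the rest
policy-map WAN-EDGE
 class VOICE
  priority 128          ! priority bandwidth in kbps, policed during congestion
 class MISSION-CRITICAL
  bandwidth 256         ! minimum bandwidth guarantee in kbps
 class class-default
  fair-queue            ! WFQ for unclassified traffic
```

The policy would be applied outbound with service-policy output WAN-EDGE on the WAN interface.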
LLQ Configuration
priority <bandwidth in kbps> — guarantees "priority" bandwidth to a class.
The random-detect and queue-limit commands are not supported for priority classes.

Queuing on a Switch
• Contains up to four queues
• Some have configurable drop thresholds
• Packet drops occur in standard queues
• Packets are NEVER dropped in priority queues

Cisco Catalyst 2950
• Queue 4 is a high-priority queue used for mission-critical or Voice traffic
• Can be set as a 'Strict-Priority' queue
• Expedite queues are recommended for reducing delay with Voice

Weighted Round Robin (WRR)
Default queuing algorithm used by Cisco Catalyst switches. Services queues fairly by assigning 'weights'.
Example: Queue 2 has a weight of 7 and Queue 1 has a weight of 10, so 7 packets are sent from Queue 2 for every 10 packets sent from Queue 1. This prevents starvation of other applications, for example when a large download is in progress.

Is WRR good for Voice?
Voice is still degraded when plain WRR is used. WRR with a strict-priority queue resolves the delay problem with Voice: Queue 4 on the switch uses PQ while the remaining queues use WRR scheduling.

WRR Commands
wrr-queue bandwidth <weight1>...<weight4> — transmits a weighted number of packets from each of the four queues. If weight4 is set to zero (0), queue 4 is treated as a 'Strict-Priority' queue: packets in the other queues are not serviced until queue 4 is empty.
wrr-queue cos-map <queue-id> <cos1,cos2...> — tells the switch which queue to place packets with specific CoS markings in
show wrr-queue bandwidth — displays the bandwidth allocations for the four queues
show wrr-queue cos-map — displays the CoS-value-to-queue-ID mappings

Congestion Avoidance - Terms
TCP Slow Start
An algorithm in the TCP/IP protocol stack where a sender transmits segments of data and gradually increases its window size (cwnd) for each acknowledgment (ACK) received.
When an ACK is not received from the other device, this indicates a segment was lost in transit. The sender decreases its cwnd and the process starts over, until the sender determines the maximum amount of data it can send at a time without overwhelming the other end.

TCP Global Synchronization
Tail drop is an inefficient drop policy on large networks. Tail drops push TCP flows into a constant start-up/back-off cycle, because every flow throttles its transmission rate at the same time. This leaves repeated gaps of under-utilization on the network.

Random Early Detection (RED)
RED is a congestion-avoidance mechanism that starts discarding TCP packets as a queue begins to fill, rather than after it is full. Randomly dropping packets from different TCP flows prevents phenomena like global synchronization from occurring.

TCP Starvation
Because RED only drops packets from TCP-based flows, large UDP flows can quickly fill the queue and prevent the router from buffering possibly more critical traffic.

The Three RED Modes
1. No Drop: average queue size is less than the minimum drop threshold.
2. Random Drop: average queue size is between the minimum and maximum thresholds.
3. Full Drop: average queue size exceeds the maximum threshold; incoming packets are tail-dropped from the queue until congestion subsides back into Random Drop.

RED Limitations
RED does NOT differentiate flows or take packet markings into consideration, and will drop voice and mission-critical traffic the same as Best-Effort traffic. RED itself is not supported on Cisco routers; WRED is the congestion-avoidance alternative for devices running Cisco IOS.

Weighted RED (WRED)
• Differentiates flows by means of CBWFQ
• Drops less important packets based on marking
• Supports both DSCP and IP Precedence
• Enable DSCP-based WRED with: random-detect dscp-based

WRED only throttles congestion caused by TCP-based flows, since TCP has built-in mechanisms to detect and resend dropped packets. Dropping UDP packets has no throttling effect, so too many UDP flows can still cause congestion. Voice traffic is UDP-based.

Mark Probability Denominator (MPD)
Determines the fraction of packets dropped when the average queue depth reaches the maximum threshold. The drop probability is 1/MPD: an MPD of 4 translates to 1/4 = 25%, i.e., 1 in every 4 packets is dropped. If the average queue length exceeds the maximum threshold, the router reverts to the default drop policy, tail drop, and all incoming packets are dropped until the average queue length falls back below the maximum threshold.

WRED reduces congestion by dropping non-voice (data) traffic, which is the root cause of congestion in most networks. Voice traffic should NEVER be dropped!

Where to implement WRED?
WRED can be applied at aggregation points, WAN interfaces, and other potential areas of congestion.

Class-Based WRED (CB-WRED)
• Applies the same three RED drop modes to each class of traffic defined in an existing CBWFQ configuration
• Each class can have its drop thresholds set to different values
• Allows less important traffic (i.e., Best-Effort) to be dropped earlier, minimizing congestion for more important traffic
• Utilizes the Assured Forwarding PHB classes in DSCP

CB-WRED Commands
random-detect precedence <value> <min> <max> <mpd> — changes the default min, max, and MPD values for packets marked with an IP Precedence value
random-detect dscp <dscp-value> <min> <max> <mpd> — changes these values for specific DSCP markings; random-detect dscp-based must be entered before DSCP markings can be used with WRED
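The three RED/WRED modes and the MPD can be sketched numerically (a simplified model of the drop decision, assuming a linear ramp between the thresholds):

```python
def wred_drop_probability(avg_qlen, min_th, max_th, mpd):
    """Return the drop probability for the current average queue length."""
    if avg_qlen < min_th:
        return 0.0                      # No Drop mode
    if avg_qlen >= max_th:
        return 1.0                      # Full Drop mode (tail drop)
    # Random Drop mode: probability ramps linearly up to 1/MPD at max_th
    return (avg_qlen - min_th) / (max_th - min_th) * (1.0 / mpd)

print(wred_drop_probability(10, 20, 40, 4))  # 0.0   (below min threshold)
print(wred_drop_probability(30, 20, 40, 4))  # 0.125 (halfway to max: half of 1/4)
print(wred_drop_probability(40, 20, 40, 4))  # 1.0   (tail drop)
```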
show policy-map interface — verifies the WRED configuration on an interface

Explicit Congestion Notification (ECN)
Why use ECN?
Endpoints normally only know to slow their transmission rate once packet drops begin to occur in the router's output queue. ECN notifies endpoints that congestion is occurring beforehand, giving them a chance to reduce their transmission rate before packets need to be dropped.

Marking with ECN
ECN uses the last 2 bits of the DiffServ field:
00 = ECN not in use
01 or 10 = ECT bit (ECN-capable transport)
11 = CE bit (congestion has occurred)

ECN + WRED
When packets in a queue exceed the minimum drop threshold set for WRED, the router begins marking the ECN congestion bits on packets instead of dropping them. This informs the host sending the TCP segments that the router is experiencing congestion, signaling it to reduce its window size and transmission speed, and prevents tail drops from occurring.

Note about ECN
For ECN to be effective, applications need to support the ECN standard of IP, which many do not at this point in time. Tail drops can still occur if the average queue length goes beyond the maximum threshold.

ECN Commands
random-detect ecn — enables ECN + WRED for a traffic class
show policy-map / show policy-map interface <int> — displays WRED + ECN information and statistics

Policing & Shaping
What makes them different?
• Policing drops (or re-marks) excess traffic
• Shaping delays excess traffic
• Policing prevents starvation of application bandwidth
• Shaping prevents oversubscription of link bandwidth by "buffering" packets

Policing
By default, TCP/IP applications will consume as much bandwidth as is available, at the expense of others.
Policing limits how much bandwidth a flow (application) can consume before its packets are dropped from the queue or re-marked with a lower-priority QoS marking (e.g., 0 for Best-Effort). By dropping or lowering the priority of packets from aggressive flows you can effectively free up interface queues and prevent congestion. A common practice is to police non-mission-critical traffic such as peer-to-peer file-sharing applications (e.g., LimeWire).

Tokens
Both policing and shaping use a mathematical concept known as tokens and token buckets. A token represents a fixed amount of data; several tokens might be required to send a single packet. Every second, a number of tokens is placed into a bucket. For a packet to be sent, enough tokens must be present in the token bucket; if there are insufficient tokens to transmit the data, an exceed action occurs.

Tokens (cont'd)
With a single token bucket, when there are not enough tokens in it to send the packet, the packet is dropped. A way to soften this is to implement a dual-bucket model, where tokens can be taken from a second bucket when the first does not have enough to send the packet. The second bucket (Be) accumulates tokens while data is being sent below the CIR (Bc) of the first bucket. Today's networks that use policing use either a single or a dual token-bucket model.

Tokens – Example
A packet of 1500 bytes needs to be sent, requiring a total of 400 tokens. If there are 400 tokens or more available in bucket #1, the packet is transmitted. If fewer than 400 tokens are available, the packet is discarded. If a dual-bucket model is used and there are 400 or more tokens in the second bucket, tokens are taken from bucket #2 to transmit the packet. If there are insufficient tokens in either bucket, the packet is ultimately discarded.

Terminology
Conform-Action — when a bucket has enough tokens to send the packet.
The necessary number of tokens is subtracted from the total and the packet is transmitted out the interface.
Exceed-Action — when there are not enough tokens in the first bucket to send the packet; it is either dropped or re-marked with a lower priority (depending on the policy configured).
Violate-Action — when there are insufficient tokens in either bucket.

Dual-Rate Metering
Consists of a CIR bucket (Bc) and a Peak Information Rate (PIR) bucket (Be). Tokens taken from the CIR bucket are also subtracted from the PIR bucket when a conform-action is met. An exceed-action occurs when there are insufficient tokens in the CIR bucket but enough remain in the PIR bucket. Insufficient tokens in both buckets is a violate-action.

Policing (cont'd)
Service providers use policing (aka metering) to limit a customer's upload/download speed based on the level of service they pay for, called the Committed Information Rate (CIR). The actual link speed is called the access rate, rate-limited with Cisco's Committed Access Rate (CAR) feature. Policing is generally implemented in the access or distribution layer of a network, while shaping is deployed at the WAN edge.

Class-Based Policing
Bandwidth for a class of traffic can be policed in bits per second (bps) or as a percentage of the link bandwidth. The default is bits per second.
Using bps:
police <bps> conform-action <action> exceed-action <action> violate-action <action>
Using a percentage:
police cir percent <percentage> conform-action <action> exceed-action <action> violate-action <action>
By using a percentage rather than bps, the same policy can be applied to multiple interfaces regardless of their link capacity.

Defaults
The default unit for configuring policing is bits per second, the default conform-action is transmit, and the default exceed-action is drop.

Changing the default exceed-action
Packets that exceed their rate are dropped by default; administrators may choose to re-mark these packets to a lower QoS priority instead.
This command re-marks packets that do not conform to IP Precedence 0:
police 56000 conform-action transmit exceed-action set-prec-transmit 0
More than one QoS marking can be applied to a single packet when an exceed-action is triggered; these are called multi-action statements.

Traffic Shaping
A company's HQ is connected over the WAN via a 1 Gbps fiber link to a branch-office router using a 64 kbps serial link. Data sent from HQ would overwhelm the branch router because it arrives far faster than the branch can receive it. This is called oversubscription and results in congestion on the WAN. Shaping prevents this by buffering packets that are sent in excess of the speed of the link on the connected device. A policy can state that packets destined for the branch office are limited to a rate of 64 kbps instead of the full 1 Gbps link capacity.

How Shaping is Applied
• Shaping of any kind is always applied to an outbound (egress) interface; it cannot be applied inbound (ingress)
• Packets can be shaped to an average rate, the CIR (Bc), or to a peak rate (Bc + Be)
• Packets that exceed the average rate are eligible for discard in the event of congestion
Shaping by CIR (average): shape average <bps or percent>
Shaping by peak: shape peak <bps or percent>

Class-Based Traffic Shaping
bandwidth statements for a traffic class in MQC guarantee a minimum amount of bandwidth reserved for that class; shape statements used together with them enforce a maximum limit, which prevents one class from starving the others.

Traffic Shaping w/ Frame Relay Circuits
Frame Relay circuits carry two types of congestion notifications to inform other network devices when there is congestion.
• Forward Explicit Congestion Notification (FECN) — notification sent upstream (toward the receiving router) by the Frame Relay switch
• Backward Explicit Congestion Notification (BECN) — notification sent downstream (toward the sender) by the Frame Relay switch
FECN and BECN are identified by bits set within the data frames sent by hosts; they are not sent as separate frames.

BECN Adaptation
A shaping technique used on Frame Relay interfaces that reduces the current shaping rate by 25% each time frames marked with the BECN bit are received. When BECN-marked frames have not been received for a certain time interval, the shaping rate gradually increases back to the previous average. The command to enable this in MQC is shape adaptive <rate>. Traffic will not be shaped below the rate configured.

FECN-to-BECN Propagation
Notifies the original sender of congestion by having the receiver reply with a random frame of data, known as a Q.922 test frame, on which the Frame Relay switch sets the BECN bit. This tells the sender that congestion is occurring in the direction of the receiver and that it should reduce its transmission rate, even though no "real" data was sent back to it. The command to enable this is shape fecn-adapt.

Frame Relay Voice-Adaptive Traffic Shaping (FRF.VATS)
A feature that dynamically turns on adaptive traffic shaping and/or FRF.12 fragmentation. It fragments and interleaves data packets with voice when voice packets are detected on a congested Frame Relay circuit (PVC). When voice packets have not been detected for 30 seconds, data is transmitted normally again.
Voice packets are identified by:
• Packets present in the strict-priority queue
• Packets that contain H.323 protocol signaling

Link Efficiency - Compression
Payload Compression
• Shrinks the total size of the entire frame
• Ideal for transmitting large frames over slow links
Payload compression techniques (hardware):
• Stacker
• Predictor
• Microsoft Point-to-Point Compression (MPPC)

Benefits of Hardware Compression
Software compression introduces processing delay and makes the CPU work harder when forwarding packets; compression done in hardware is therefore recommended.

Header Compression
• Saves link bandwidth
• Reduces packet size and serialization delay
• Suppresses redundant IP & Layer 4 header fields
• Implemented on a per-link basis
• Ideal for low-bandwidth traffic (Voice, Telnet, etc.)
• cRTP reduces IP/UDP/RTP headers down to 2-4 bytes
• cTCP reduces TCP/IP overhead down to 3-5 bytes

RTP Header Compression (cRTP)
After the initial packets of an RTP session have been exchanged, an integer is used to identify the session. This integer, known as the Session Context Identifier, is transmitted inside subsequent packets. It is stored locally in a table on each device and is used to reference the session for the remainder of the conversation; only the changes to the headers are transmitted along with it.

Class-Based Header Compression
compression header ip rtp — cRTP configuration
compression header ip tcp — cTCP configuration
compression header ip — enables both cRTP and cTCP by default

Link Efficiency - LFI
Serialization Delay
The lower the capacity of a network link, the longer it takes to place a frame on the physical media. Serialization delay is calculated as:
delay = frame size in bits / link capacity
A 1500-byte frame takes 187.5 ms to put on a 64 kbps link: (1500 * 8) / 64,000 = 0.1875 s = 187.5 ms

What is LFI?
Link Fragmentation & Interleaving (LFI) comprises techniques used to reduce delay and jitter when serializing frames onto the WAN. Large frames are chopped into smaller fragments so that Voice and other delay-bound traffic can be placed in between them. On a slow link without LFI, a large frame must be transmitted in its entirety before the frames behind it can be sent; Voice cannot survive in this scenario.

LFI Mechanisms
• Multilink PPP LFI (MLP LFI)
• VoIP over Frame Relay (FRF.12)
• FRF.11 Annex C - Voice over Frame Relay (VoFR)
NOTE: LFI is not necessary on high-speed links (above T1 speed).

Rule for Fragment Sizing
Fragment sizes are calculated with the rule "80 bytes for every 64 kbps of the clocking rate." For example, a 256 kbps link needs fragments of 320 bytes:
64 * 4 = 256 kbps
80 * 4 = 320 bytes

MLP LFI - Configuration
ppp multilink — turns on Multilink PPP on a point-to-point interface
ppp multilink interleave — turns on interleaving of fragments
ppp multilink fragment delay <delay in ms> — configures the maximum fragment delay (default 30 ms); 10-15 ms is recommended for links carrying Voice

MLP LFI - Verification
show interfaces multilink <interface #> — displays MLP statistics, count of interleaved frames, etc.
debug ppp multilink fragments — outputs MLP LFI fragmentation in real time; good for troubleshooting correct fragmentation of frames

FRF.12 Fragmentation
FRF.12 can be configured on Frame Relay circuits to reduce latency for VoIP packets. The fragment size configured on a VC should be no smaller than a single frame carrying voice; if it is, voice will be fragmented along with the data packets and produce undesirable results. G.711 VoIP packets require 200 bytes, so provisioning a VC to fragment frames below that number will degrade a call using G.711.

"End-to-End" FRF.12 fragmentation is the only Frame Relay fragmentation option (for VoIP) available on Cisco IOS devices.
This means FRF.12 must be provisioned on both ends of a circuit for it to operate. Enabling Frame Relay Traffic Shaping (FRTS) or Distributed Traffic Shaping (DTS) on the interface (or DLCI) is also a prerequisite.
frame-relay traffic-shaping — enables FRTS on the interface

FRF.12 Configuration
map-class frame-relay <map name> — creates a Frame Relay map-class for specifying QoS parameters
frame-relay fragment <size> — sets the fragment size (in bytes) for both voice and data frames; configured inside the map-class
frame-relay class <map name> — applies the Frame Relay map-class to an interface or DLCI

Verifying FRF.12 Configuration
show frame-relay fragment — displays FRF.12 statistics for all interfaces and DLCIs
show frame-relay fragment <interface or dlci> — outputs the statistics for a specific circuit
show frame-relay pvc — also displays information related to FRF.12

Calculating Bandwidth for Voice
1. Calculate the packet size.
Note: include Layer 2 and other headers (e.g., IPsec) for a more accurate calculation.
IP header: 20 bytes
UDP header: 8 bytes
RTP header: 12 bytes
Sum of headers: 40 bytes (2-4 bytes with cRTP)

2. Add the payload size — the actual audio data in the packet — to the sum of the headers. Payload size depends on the codec used to compress the audio:
Payload size for G.711: 160 bytes
Payload size for G.729: 20 bytes
40 + 160 = 200 bytes total for this G.711 voice packet

3. Convert bytes to bits: multiply the packet size by 8.
200 * 8 = 1600 bits

4. Multiply by packets per second. Voice samples carry 20-30 ms of audio; 20 ms samples require 50 pps and 30 ms samples require 33 pps.
1600 bits * 50 pps = 80,000 bits per second (80 kbps)

Conclusion: one G.711 call consumes 80 kbps of bandwidth.
Voice bandwidth reference:
• G.711 codec: 80 kbps
• G.729 codec: 24 kbps
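The steps above can be sketched as a small function (no Layer 2 or cRTP overhead included, matching the worked example):

```python
# Per-call bandwidth = (headers + codec payload) * 8 bits * packets per second
IP_UDP_RTP_HEADERS = 20 + 8 + 12   # bytes, without cRTP

def voice_bandwidth_bps(payload_bytes, pps):
    packet_bits = (IP_UDP_RTP_HEADERS + payload_bytes) * 8
    return packet_bits * pps

print(voice_bandwidth_bps(160, 50))  # G.711 @ 20 ms samples: 80000 bps (80 kbps)
print(voice_bandwidth_bps(20, 50))   # G.729 @ 20 ms samples: 24000 bps (24 kbps)
```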