Designing and deploying a VoIP network
When ITU meets IETF
Thomas(at)Kernen.Net

A quick VoIP recap
• Directory Gatekeeper (DGK):
  • Performs call routing search at the highest level (e.g. country code).
  • Distributes country codes among other DGKs.
  • Forwards an LRQ (Location Request) to a partner DGK if the call doesn't terminate in the local SP DGK.
• Gatekeeper (GK):
  • Performs call routing search at the intermediate level (e.g. NPA-NXX).
  • Distributes NPA among other GKs.
  • Provides GW resource management (Resource Availability Indicator, gw-priority, ...).
• Gateway (GW):
  • Acts as the interface between the PSTN and IP.
  • Normalizes numbers from the PSTN before they enter IP.
  • Normalizes numbers from IP before they enter the PSTN.
  • Contains the dial-peer configuration.
  • Registers with the GK.

Basic H.323 Call
[Figure: call between Phone A / Gateway A / Gatekeeper A and Phone B / Gateway B / Gatekeeper B across an IP network. Messages shown: RRQ/RCF, ARQ, LRQ, LCF, ACF, H.225 (Q.931) Setup, H.225 (Q.931) Alert and Connect, H.245, RTP.]

Various Codec Bandwidth Consumptions
Encoding/Compression      Resulting Bit Rate
G.711 PCM A-Law/u-Law     64 kbps (DS0), the standard transmission rate for voice
G.726 ADPCM               16, 24, 32, 40 kbps
G.727 E-ADPCM             16, 24, 32, 40 kbps
G.729 CS-ACELP            8 kbps
G.728 LD-CELP             16 kbps
G.723.1 CELP              6.3/5.3 kbps, variable

Cisco Encoding Implementation
• 20-byte packet every 20 ms (50 pps) = 8 kbps data rate.
• Note: this 8 kbps is for the "voice payload" only!
• Add on 40 bytes of IP/UDP/RTP and you now have 24 kbps!
• RTP header compression will take this down to 11.2 kbps.
[Figure: voice sampled at 8 kHz (8,000 samples/sec), encoded, carried across the IP QoS WAN, and decoded.]

Voice Quality of Service (QoS) Requirements

Avoiding The 3 Main QoS Challenges
• Loss
• Delay
• Delay Variation (Jitter)

Loss and Delay Sources
Voice path: loss + delay + delay variation
• CODEC (Encode)
• Packetization
• Output queuing
• Access (up) link transmission
• Backbone network transmission
• Access (down) link transmission
• Input queuing
• Jitter buffer
• CODEC (Decode)

Delay: How Much Is Too Much?
[Figure: cumulative transmission path delay on a 0 to 800 msec time axis, with zones labelled High Quality, Satellite Quality, Fax Relay/Broadcast, and CB Zone.]
• Delay target: ITU's G.114 recommendation = 0 to 150 msec one-way delay.

Fixed Delay Components
• Propagation: six microseconds per kilometer
• Serialization: buffer to serial link
• Processing:
  • Coding/compression/decompression/decoding
  • Packetization

Variable Delay Components
• Queuing delay
• Dejitter buffers
• Variable packet sizes

Large Packets "Freeze Out" Voice
[Figure: 60-byte voice packets sent every 20 ms over 10 Mbps Ethernet share a 56 kbps WAN link with 1500-byte data packets. Each data packet has ~214 ms of serialization delay, so voice packets arrive more than 214 ms apart.]
• Large packets can cause playback buffer underrun, resulting in slight voice degradation.
• The jitter (playback) buffer can accommodate some delay and delay variation.

RTP Controlling Dejitter Buffer
[Figure: sender (Router A) and receiver (Router B) across an IP network. Packets A, B, C carry RTP timestamps 10, 30, 50 from Router A. Sent: interframe gap of 20 ms. In the network: variable interframe gap (jitter), shown as 20 ms and 80 ms. After the dejitter buffer: the variation is removed and the gap is 20 ms again.]

Calculate Delay Budget - Worst Case
[Figure: Site A to Site B over 128 kbps Frame Relay: coder delay 25 ms, queuing delay 4 ms, serialization delay 2 ms, propagation delay 8 ms, dejitter buffer 50 ms.]
Component                                                Delay
Coder delay, G.729 (5 msec look ahead)                   5 msec
Coder delay, G.729 (10 msec per frame)                   20 msec
Packetization delay                                      included in coder delay
Queuing delay, 128 kbps trunk                            4 msec
Serialization delay, 128 kbps trunk                      2 msec
Propagation delay (private lines) or network delay
  (e.g. public Frame Relay service)                      8 msec (min)
Dejitter buffer                                          50 msec
Total                                                    89 msec

Fragmentation and Interleaving
• Serialization delay for a 64 kbps link with an MTU of 1500 bytes:
  (1500 bytes x 8 bits/byte) / (64000 bits/sec) = 187.5 ms
• Fragmentation size: design for 10 ms fragments:
  (0.01 sec x 64000 bps) / (8 bits/byte) = 80 bytes
• It takes 10 ms to send an 80-byte packet or fragment over a 64 kbps link.

Fixed Frame Serialization Delay Matrix
Link Speed   1 Byte    64 Bytes  128 Bytes  256 Bytes  512 Bytes  1024 Bytes  1500 Bytes
56 kbps      143 us    9 ms      18 ms      36 ms      72 ms      144 ms      214 ms
64 kbps      125 us    8 ms      16 ms      32 ms      64 ms      128 ms      187 ms
128 kbps     62.5 us   4 ms      8 ms       16 ms      32 ms      64 ms       93 ms
256 kbps     31 us     2 ms      4 ms       8 ms       16 ms      32 ms       46 ms
512 kbps     15.5 us   1 ms      2 ms       4 ms       8 ms       16 ms       23 ms
768 kbps     10 us     640 us    1.28 ms    2.56 ms    5.12 ms    10.24 ms    15 ms
1536 kbps    5 us      320 us    640 us     1.28 ms    2.56 ms    5.12 ms     7.5 ms

Multilink PPP with Fragmentation and Interleave
[Figure: on a 64 kbps line, a real-time packet queued behind a 1500-byte elastic-traffic frame waits out 187 ms of serialization delay. With fragmentation and interleave, the elastic frame is split into smaller MTUs and the real-time packet is sent between the fragments.]
• Addendum to the PPP specification.

Media Link Layer Overhead
Layer 2 Media     Layer 2 Header Size
Ethernet          14 bytes
PPP/MLPPP         6 bytes
Frame Relay       6 bytes
ATM (AAL5)        5 bytes + waste
MLPPP over FR     14 bytes
MLPPP over ATM    5 bytes for every ATM cell + 20 bytes for MLPPP/AAL5

RTP Header Compression
[Figure: IP header fields (Version, IHL, Type of Service, Total Length, Identification, Flags, Fragment Offset, Time to Live, Protocol, Header Checksum, Source Address, Destination Address, Options, Padding), UDP header fields (Source Port, Destination Port, Length, Checksum), and RTP header fields (V=2, P, X, CC, M, PT, Sequence Number, Timestamp, Synchronization Source (SSRC) Identifier).]
• 20 ms at 8 kbps yields a 20-byte payload.
• IP header 20 bytes; UDP header 8 bytes; RTP header 12 bytes.
• Overhead is 2X the payload!
• Header compression: 40 bytes down to 2 to 4 bytes much of the time.
• Hop-by-hop on slow links (<512 kbps).
• CRTP: Compressed Real-time Protocol.

RTP Header compression details
• Can save a lot of bandwidth (>50%) per flow.
• Works on serial links between two routers.
• CPU intensive, may overload the routers.
• Limited to 256 sessions (128 calls) over FR.
• Limited to 1000 sessions (500 calls) over HDLC (checked in 12.2(8)T).
• Not recommended on links with data rates above E1.

Silence suppression
• VAD (Voice Activity Detection) (Cisco).
• Codec built-in silence suppression (G.729 Annex B / G.723.1 Annex A).
• Should not be taken into account for circuits carrying fewer than 24/30 calls, since the savings are based on aggregate volume, not individual calls.
• Should not be taken into account when engineering the network.

IP Precedence/DSCP
• DSCP: Differentiated Services Code Point (RFC 2474-2475).
• Set IP Precedence/DSCP higher for VoIP. Usually set to 5/101000.
• Set at the source (gateway) if possible for less hassle.

Queuing mechanisms (in Cisco's world)
• FIFO, First In First Out
  • Packets arrive and leave the queue in exactly the same order.
  • Simple configuration and fast operation.
  • No priority servicing or bandwidth guarantees possible.
• WFQ, Weighted Fair Queuing
  • A hashing algorithm places flows into separate queues, where weights determine how many packets are serviced at a time.
  • You define weights by setting IP Precedence and DSCP values.
  • Simple configuration.
  • No priority servicing or bandwidth guarantees possible.

Queuing mechanisms (2)
• CQ, Custom Queuing
  • Traffic is classified into multiple queues with configurable queue limits.
  • Has been available for a few years and allows approximate bandwidth allocation for different queues.
  • No priority servicing possible.
  • Bandwidth guarantees are approximate and there is a limited number of queues.
  • Configuration is relatively difficult.
• PQ, Priority Queuing
  • Traffic is classified into high, medium, normal and low priority queues. High priority traffic is serviced first, then medium priority traffic, followed by normal and low priority traffic.
  • Has been available for a few years and provides priority servicing.
  • Higher priority traffic can starve lower priority queues of bandwidth.
  • No bandwidth guarantees possible.
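
The starvation risk noted for PQ can be illustrated with a toy model. The following is a minimal Python sketch of a strict-priority scheduler, not Cisco's implementation: the class name, the one-packet-per-tick link model and the traffic pattern are all assumptions made for illustration. Only the four queue names come from the slide.

```python
from collections import deque

# Queue names follow the slide; everything else is a hypothetical illustration.
LEVELS = ("high", "medium", "normal", "low")


class StrictPriorityScheduler:
    """Toy strict-priority scheduler: always serve the highest non-empty queue."""

    def __init__(self):
        self.queues = {level: deque() for level in LEVELS}

    def enqueue(self, level, packet):
        self.queues[level].append(packet)

    def dequeue(self):
        # Scan from highest to lowest priority and send the first packet found.
        for level in LEVELS:
            if self.queues[level]:
                return level, self.queues[level].popleft()
        return None  # nothing is waiting


def simulate(ticks):
    """Assumed load: one high-priority packet arrives per tick, the link sends one per tick."""
    sched = StrictPriorityScheduler()
    sched.enqueue("low", "data-0")  # one low-priority packet waiting from the start
    sent = []
    for tick in range(ticks):
        sched.enqueue("high", f"voice-{tick}")
        sent.append(sched.dequeue())
    return sent, len(sched.queues["low"])


if __name__ == "__main__":
    sent, low_waiting = simulate(5)
    print("sent:", sent)
    print("low-priority packets still waiting:", low_waiting)
```

As long as the high queue is refilled every tick, the low-priority packet is never sent, which is the behaviour the slide warns about.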

Queuing mechanisms (3)
• CBWFQ, Class Based Weighted Fair Queuing
  • MQC is used to classify traffic. Classified traffic is placed into reserved-bandwidth queues or a default unreserved queue.
  • Similar to LLQ except there is no priority queue.
  • Simple configuration and ability to provide bandwidth guarantees.
  • No priority servicing possible.
• PQ-WFQ, Priority Queue-Weighted Fair Queuing (IP RTP Priority)
  • A single interface command provides priority servicing to all UDP packets destined to even port numbers within a specific range.
  • Simple, one-command config. Provides priority servicing to RTP packets.
  • All other traffic is treated with WFQ.
  • RTCP traffic is not prioritized.
  • No guaranteed bandwidth capability.
• Note: MQC = Modular QoS CLI

Queuing mechanisms (4)
• Low Latency Queueing (LLQ) = Priority Queue (PQ) + Class Based-Weighted Fair Queue (CB-WFQ).
• A strict priority queue lets a defined class of packets be prioritized over all other traffic.
• Simple config, ability to provide priority to multiple classes of traffic and to give upper bounds on priority bandwidth utilization.
• Can also configure bandwidth-guaranteed classes and a default class.
• All priority traffic is sent through the same priority queue, which can introduce jitter.
• Note: Cisco appears to be working on improving LLQ, and this is currently the #1 queuing mechanism according to SEs, TAC and updated documentation.

Traffic Engineering
• Busy Hour (BH) = Number of lines required to support the worst hour of the day.
• Grade of Service (GOS) = Percentage of lines that will experience a busy tone on the 1st attempt during the BH.
• A GOS of 0.05 means 5 out of 100 callers might get a busy tone.
• Erlang B: the most widely used traffic model to estimate the number of lines required for a specific GOS and BH of traffic. Based on various traffic assumptions such as call queueing, arrival rate, etc.
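
For readers who want to try the Erlang B model numerically, here is a minimal Python sketch using the standard Erlang B recurrence B(A, 0) = 1, B(A, n) = A·B(A, n-1) / (n + A·B(A, n-1)), where A is the offered busy-hour traffic in Erlangs and n the number of lines. The function names `erlang_b` and `lines_for_gos` are made up for this example and do not come from the slides.

```python
def erlang_b(traffic_erlangs, lines):
    """Blocking probability for `traffic_erlangs` offered to `lines` trunks (Erlang B)."""
    if traffic_erlangs < 0 or lines < 0:
        raise ValueError("traffic and lines must be non-negative")
    blocking = 1.0  # B(A, 0) = 1: with no lines every call is blocked
    for n in range(1, lines + 1):
        # Standard recurrence; avoids the large factorials of the closed form.
        blocking = traffic_erlangs * blocking / (n + traffic_erlangs * blocking)
    return blocking


def lines_for_gos(traffic_erlangs, gos):
    """Smallest number of lines whose blocking probability does not exceed `gos`."""
    if not 0 < gos < 1:
        raise ValueError("gos must be between 0 and 1")
    if traffic_erlangs == 0:
        return 0  # no traffic, no lines needed
    lines = 0
    while erlang_b(traffic_erlangs, lines) > gos:
        lines += 1
    return lines


if __name__ == "__main__":
    # Example values only: 10 Erlangs of busy-hour traffic at a GOS of 0.05.
    print(lines_for_gos(10, 0.05))
```

With 10 Erlangs of offered traffic this returns 15 lines for a GOS of 0.05 and 18 lines for a GOS of 0.01, which shows how quickly a tighter grade of service adds trunks.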
• 1 trunk in use for 1 hour = 1 Erlang = 36 CCS of traffic.
• 1 Centum Call Seconds (CCS) = 100 call seconds.
• 1 hour = 3600 seconds or 36 CCS = 1 Erlang.

Traffic Engineering (2)
Step 1: Obtain voice traffic data
• Sources of traffic information: CDRs (Call Detail Records) or carrier bills, carrier studies, traffic reports.
• Data needs to be adjusted for call processing, since a trunk in use = Dialing + Call setup + Ringing + Talking + Releasing.
• Other sources: Ring No Answer, Busy Signal, etc.
• Add 10% to 16% to all call lengths/total time estimates.

Traffic Engineering (3)
Step 2: Convert to Erlangs
• Adjusted total hours a month / business days * % of traffic in busy hour.
Step 3: Calculate the number of voice lines
• Based on a statistical model for the number of lines vs the grade of service desired.
Step 4: Calculate the data network bandwidth
• (Codec + protocol overhead) * number of voice lines = required bandwidth.

POP Sizing
• Calculate the number of gateways (GW) required to handle the anticipated call volume.
• Use the Busy Hour Call Attempts (BHCA) metric.
• Calculate the number of (Directory) Gatekeepers required to process the GW signaling.
• GWs: max E1s per GW, BHCA, CPS (Calls per Second).
• GKs: max CPS (check with the vendor; not an obvious figure to get, varies with each chassis/configuration/software release/DSP rev).

Tips & tricks
• Build GK redundancy by making sure all GWs have multiple GKs to reach.
• HSRP can be very useful in conjunction with multiple GW->GK destinations.
• Make sure the GWs normalize the format of the called numbers so the VoIP core deals with a single call format (E.164 = country+city+local).

Inter provider VoIP services
What happens when you want to extend the reach of your VoIP services by interconnecting with other ITSPs?
• Tandem coding (VoIP->PSTN->VoIP)
• Open Settlement Protocol

Tandem Coding
• When a call is passed back from the VoIP network to the PSTN and then resampled and compressed, the call has been sampled and compressed twice, and call quality degrades very rapidly.
• Examples:
  • VoIP to GSM via the PSTN.
  • VoIP to the PSTN via another carrier with compression gear.
  • The other VoIP carrier doesn't want to "risk" interconnects over VoIP (inter-ITSP QoS management issues).

Open Settlement Protocol (OSP)
• Open Settlement Protocol (OSP) is a client-server protocol defined by the ETSI TIPHON standards organization.
• Designed to offer billing and accounting record consolidation for voice calls that traverse ITSP boundaries.
• It also allows service providers to exchange traffic with each other without establishing multiple bilateral peering agreements, by using a 3rd party clearinghouse to extend the reach of their network.
• A 3rd party clearinghouse with an OSP server offers services such as route selection, call authorization, call accounting, and inter-carrier settlements, including all the complex rating and routing tables necessary for efficient and cost-effective interconnections.
• OSP-based clearinghouses provide least-cost and best-route selection algorithms based on a wide variety of parameters.

How it works
Step 1: The customer places a call via the PSTN to a VoIP gateway, which authenticates the customer by communicating with a RADIUS server.
Step 2: The originating VoIP gateway attempts to locate the termination point within its own network by communicating with a gatekeeper using H.323 RAS. If there's no appropriate route, the gatekeeper tells the gateway to search for a termination point elsewhere.
Step 3: The gateway contacts an OSP server at the 3rd party clearinghouse. The gateway establishes an SSL connection to the OSP server and sends an authorization request to the clearinghouse.
  The authorization request contains pertinent information about the call, including the destination number, the device ID, and the customer ID of the gateway.
Step 4: The OSP server processes the information and, assuming the gateway is authorized, returns routing details for the possible terminating gateways that can satisfy the request of the originating gateway.

How it works (2)
Step 5: The clearinghouse creates an authorization token, signs it with the certificate and private key, and then replies to the originating gateway with a token and up to 3 selected routes. The originating gateway uses the IP address supplied by the clearinghouse to set up the call.
Step 6: The originating gateway sends the token it received from the settlement server in the setup message to the terminating gateway.
Step 7: The terminating gateway accepts the call after validating the token and completes the call setup.

Voice Speech Quality (VSQ)
• MOS: ITU P.800 & P.830. Scale from 1 (bad) to 5 (excellent), based on human perception (subjective). The oldest model, and the one most widely used by VoIP vendors when comparing codec quality.
• PSQM (Perceptual Speech Quality Measurement): ITU P.861. Compares input and output speech (automated). Developed by KPN Research.
• PAMS (Perceptual Analysis Measurement System): developed by British Telecom. "Objectively" predicts the results of subjective speech quality tests.
• PESQ (Perceptual Evaluation of Speech Quality): ITU P.862, the latest standard (January 2001). Currently the most accurate model for automated voice quality perception; improves over PSQM and PAMS.

Sources of potential VSQ problems
• Delay jitter: variance in delay (zero, little or excessive delay)
• Encoding and decoding of voice (PCM/ADPCM/low bit-rate codecs/CELP)
• Time clipping (front-end clipping) introduced by Voice Activity Detectors (VAD)
• Temporal signal loss and dropouts introduced by packet loss
• Environmental noise, including background noise
• Signal attenuation and gain/attenuation variances
• Level clipping
• Transmission channel errors

Echo: What makes it a problem?
When all of the following conditions are true, echo is perceived as annoying:
• An analog leakage path between the analog Tx and Rx paths
• Sufficient delay in echo return
• Sufficient echo amplitude

How does packet voice impact echo perception?
[Figure: PSTN, WAN, PSTN. WAN segment: large delay, no echo sources. PSTN segments: low delay, potential echo sources.]
• Bits don't leak: echo is not introduced on digital links.
• The packet segment of the voice connection introduces a significant delay (typically 30 ms in each direction).
• The added delay makes echoes from analog tail circuits, which are normally indistinguishable from side tone, become perceptible.
• Because the delay introduced by packet voice is unavoidable, the voice gateways must prevent the echo.

Identify and Isolate the echo problem
• Identify the echo problem. Which side hears echo? Calls to which numbers hear echo?
• Isolate the problem as much as possible and try to find a scenario where the echo is reproducible.
• Whenever I hear echo, the problem is at the OTHER end!

Basic security
• ACLs on the GWs/GKs that filter on source IP (yes, it can be spoofed) appear to be the #1 protection against unauthorized calls.
• Run your VoIP network isolated from any public network using your preferred flavor (physical separation, VLAN, MPLS, etc.).
• VoIP packets are _not_ encrypted. If this is an issue, use IPSec!
• Beware that software crypto will add delay and jitter. Use hardware crypto for better performance (the delay and jitter it adds should be predictable).
• Note: CRTP doesn't work with IPSec, remember this when designing the bandwidth budget.

Questions?
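
As a back-of-the-envelope aid for the bandwidth budget mentioned in the security notes, here is a small Python sketch of the per-call arithmetic used earlier in the deck (20-byte payload, 50 packets per second, 40 bytes of IP/UDP/RTP). The function name and parameters are invented for this example. The 11.2 kbps case assumes a 2-byte compressed header plus a 6-byte Layer 2 header, which is consistent with the slide's figure but is not stated there. The IPSec overhead value is a placeholder only, since real overhead depends on mode, cipher and padding.

```python
def call_bandwidth_bps(payload_bytes=20, header_bytes=40, l2_bytes=0,
                       extra_bytes=0, packets_per_second=50):
    """One-way bandwidth of a single call, in bits per second."""
    packet_bytes = payload_bytes + header_bytes + l2_bytes + extra_bytes
    return packet_bytes * 8 * packets_per_second


# Hypothetical placeholder, NOT a measured value: replace with the overhead
# of the actual IPSec mode and cipher in use.
ASSUMED_IPSEC_OVERHEAD_BYTES = 50

if __name__ == "__main__":
    # G.729 payload plus uncompressed IP/UDP/RTP headers, no Layer 2 header.
    print("uncompressed:", call_bandwidth_bps(), "bps")
    # Assumed cRTP case: 2-byte compressed header over a 6-byte Layer 2 header.
    print("with cRTP:", call_bandwidth_bps(header_bytes=2, l2_bytes=6), "bps")
    # With IPSec, cRTP cannot be used, so the full 40-byte header stays.
    print("with IPSec:", call_bandwidth_bps(
        l2_bytes=6, extra_bytes=ASSUMED_IPSEC_OVERHEAD_BYTES), "bps")
```

The first two calls reproduce the 24 kbps and 11.2 kbps figures from the Cisco Encoding Implementation slide. The third only shows where the extra IPSec bytes enter the calculation.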