Overcoming VoIP Quality Challenges
Dr. Jan Linden, VP of Engineering
Global IP Solutions

Outline
VoIP Quality Challenges
Latency
Codec Choice
Conferencing
How to Measure Speech Quality

VoIP Design Considerations
[Diagram labels: Speech Quality, Time to Market, Ease of Use, Flexibility, Network Impairments, Power Consumption, Cost, Quality, Signaling, Infrastructure, Features, Device Considerations]

Major Challenges for VoIP End-point Design
Both Sides of the Call Need to be Considered
Speech Codec
Hardware Issues (Processor, OS, Acoustics, etc.)
Coping with Network Degradation
Power Consumption
Echo Cancellation
Additional Voice Processing Components
Environment – Background Noise, Room Acoustics, etc.
[Diagram: "VoIP Design Challenges" surrounded by Codec, Hardware, Network, Power, Echo, Voice, Environment]

Impact of IP Networks
Delay
 – Major effect is "stepping on each other's talk"
 – Usage scenario affects annoyance factor: higher delay can be tolerated for mobile devices
 – Long delays make echo more annoying
Packet Loss
 – Smooth concealment necessary
Network Jitter
 – Jitter buffer necessary to ensure continuous playout
 – Trade-off between delay and quality

Sources of Latency
Codec
Capture
Playout
Network delay
Jitter buffer
OS interaction
Transcoding
[Diagram: signal path. Sender: A/D, preprocessing, speech encoding, IP interface. IP network. Receiver: jitter buffer, speech decoding, postprocessing, D/A.]

Impact of Delay on Voice Quality
[Figure: Mean Opinion Score versus one-way transmission time [ms]. Data from ITU-T G.114.]
ITU-T (G.114) recommends:
 – Less than 150 ms one-way delay for most applications (up to 400 ms acceptable in special cases)
Users have become used to longer delays
 – Still, low delay is very important for high quality

Speech Codec
Many conflicting parameters affect choice of codec
Determines upper limit of quality
Support of several codecs necessary
 – Interoperability
 – Usage scenario
IPR issues a significant concern
[Diagram: "Speech Codec" surrounded by Complexity, Memory, Delay, Input Signal Robustness, Bit-rate, Packet-loss Robustness, Quality, Sampling Rate]

Audio Spectrum
Better than PSTN quality is achievable in VoIP
 – Utilizing the full 0 – 4 kHz band in narrowband
 – Wideband coding offers a more natural and crisper voice
[Figure: audio spectrum with the telephony band marked]

Audio Spectrum vs. Speech Quality
[Figure: speech quality versus frequency, with the frequency axis marked at 4 kHz, 8 kHz, 10 kHz, 16 kHz and 22.1 kHz. Labels: Narrowband Speech (PSTN), Wideband Speech, Super Wideband Speech, CD.]

Speech Codec Design for VoIP
Many standard codecs designed for bit errors, not packet loss
 – Error propagation issue for CELP codecs
Variable bit rate attractive for IP networks
Packet overhead significant (5 – 32 kb/s)
 – Makes low bit rate codecs less attractive
Packet loss concealment a must
Jitter buffer design has significant impact on quality
Alternatives to standards
 – De-facto standards like iSAC
 – Open source like Speex

Echo Cancellation
High delay in VoIP makes echo problem more prominent
Network/Line echo cancellation for gateways
Acoustic echo cancellation
 – Hands-free/speakerphone
 – Small devices
Biggest challenge is AEC for PC
 – Acoustic setup unknown and changing
 – Wideband speech
 – Very few solutions on the market

Effects of Transcoding
Transcoding occurs when the endpoints are using different codecs
 – Every transcoding introduces distortion
 – Low bit-rate codecs very sensitive to transcoding
Transcoding between networks
 VoIP to PSTN
  – Limited quality degradation since G.711 used on the PSTN side
 VoIP to Cellular
  – Severe quality degradation common since low bit-rate codecs typically used on both sides
 VoIP to VoIP
  – Usually occurs in Session Border Controllers
  – Can normally be avoided
Transcoding in conferencing
 – Mixing done in decoded domain results in transcoding

How to Make the VoIP Software Robust?
Very Quick Jitter Buffer Adaptation
 – Conditions Change Very Rapidly (on a millisecond basis)
Minimize Delay Everywhere
 – Every millisecond counts
Spot Jitter Patterns
Increase Delay to Keep Good Quality when Unavoidable
Packet Loss Concealment
 – Capable of Handling Several Lost Packets in a Row

Measuring Voice Quality
Subjective Methods
 – Test the "right thing", i.e. subjective quality
 – Takes all types of degradation into account
 – Time consuming and costly
 – Lack of repeatability
Objective Methods
 – Simple and affordable
 – Inaccurate but repeatable results
 – Sensitive to any processing (nonlinear filtering, echo cancellation, time warping etc.); time synchronization is a major challenge not yet solved
 – Sensitive to background and equipment impairments
 – One step behind development of codecs and error concealment
 – Next generation algorithm in standardization process (P.OLQA)

Audio Conferencing
Design includes a trade-off between quality and scalability
Client based or server based
 – Server based offers better scalability than client based
 – Can be combined
Transcoding often unavoidable
Two strategies:
 – Mix incoming signals to form one output signal
 – Only relay packets and mix at client side
Multi-codec support
 – In relay mode all endpoints need to support all codecs
Narrowband and wideband
 – Both can be present in a conference
 – Narrowband participant will hear everything in narrowband
 – Wideband participant hears others in narrowband or wideband
[Diagram: conference with participants A to E; each participant receives the mix of the other four: A receives B+C+D+E, B receives A+C+D+E, C receives A+B+D+E, D receives A+B+C+E, E receives A+B+C+D.]

Conclusions
Latency has a significant impact on the perceived quality in VoIP
 – Low latency, high quality (e.g. NetEQ) jitter buffer necessary
Choose the right codec for the usage scenario
 – Or a codec that can adapt like iSAC
Transcoding should be avoided, if possible
Significantly better quality than PSTN possible
 – Wideband coding
No good objective measure for speech quality exists
 – Always combine with subjective evaluation
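
Worked example: packet overhead
The 5 – 32 kb/s packet overhead quoted under "Speech Codec Design for VoIP" can be reproduced with simple arithmetic. The sketch below assumes uncompressed IPv4 (20 bytes), UDP (8 bytes) and RTP (12 bytes) headers and ignores link-layer framing. The function name and the packet intervals are illustrative choices, not taken from the slides.

```python
# Assumed header sizes (not stated in the slides).
IPV4_HEADER_BYTES = 20  # IPv4 without options
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12   # no CSRC list, no header extension


def header_overhead_kbps(packet_interval_ms: float) -> float:
    """Return IP/UDP/RTP header overhead in kb/s when one packet is
    sent every packet_interval_ms milliseconds."""
    header_bits = 8 * (IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES)
    packets_per_second = 1000.0 / packet_interval_ms
    return header_bits * packets_per_second / 1000.0


if __name__ == "__main__":
    for interval_ms in (10, 20, 30, 60):
        overhead = header_overhead_kbps(interval_ms)
        print(f"{interval_ms} ms packets: {overhead:.1f} kb/s of headers")
```

Under these assumptions the overhead runs from about 5.3 kb/s at 60 ms packets to 32 kb/s at 10 ms packets. An 8 kb/s codec sent in 20 ms packets would carry 16 kb/s of headers, twice its speech payload, which is why low bit rate codecs lose much of their appeal on IP networks.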
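
Worked example: tracking network jitter
The slides call for a jitter buffer that adapts quickly but do not describe an algorithm. As a starting point only, the sketch below computes the running interarrival jitter estimate defined for RTP in RFC 3550 and derives a playout delay from it. The delay policy (`target_buffer_delay_ms`, its `safety_factor` and the 20 ms packet interval) is a hypothetical illustration, and this is not how NetEQ or any particular product works. Note that the RFC 3550 estimate is smoothed with a gain of 1/16, so it reacts more slowly than the millisecond-scale adaptation the slides ask for.

```python
from typing import Iterable, List, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, float]]) -> List[float]:
    """Running RFC 3550 interarrival jitter estimate.

    packets: (send_time_ms, arrival_time_ms) pairs in arrival order.
    Returns the estimate (in ms) after each packet.
    """
    jitter = 0.0
    previous_transit = None
    estimates = []
    for send_ms, arrival_ms in packets:
        transit = arrival_ms - send_ms
        if previous_transit is not None:
            # Change in transit time between consecutive packets.
            deviation = abs(transit - previous_transit)
            # First-order smoothing with gain 1/16, as in RFC 3550.
            jitter += (deviation - jitter) / 16.0
        previous_transit = transit
        estimates.append(jitter)
    return estimates


def target_buffer_delay_ms(jitter_ms: float,
                           packet_interval_ms: float = 20.0,
                           safety_factor: float = 3.0) -> float:
    """Hypothetical policy: buffer one packet plus a multiple of the jitter."""
    return packet_interval_ms + safety_factor * jitter_ms


if __name__ == "__main__":
    # Third packet arrives 16 ms later than its predecessors' transit time.
    trace = [(0.0, 50.0), (20.0, 70.0), (40.0, 106.0)]
    for estimate in interarrival_jitter(trace):
        print(f"jitter {estimate:.2f} ms -> buffer {target_buffer_delay_ms(estimate):.1f} ms")
```

The trade-off from "Impact of IP Networks" is visible in the policy: a larger `safety_factor` means fewer late packets but more delay.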
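
Worked example: server-side conference mixing
The "mix incoming signals" strategy under "Audio Conferencing" gives each participant the sum of everyone else's audio. A minimal sketch follows, assuming the server has already decoded one frame of 16-bit PCM per participant at a common sampling rate. The function name `mix_minus` and the dictionary layout are illustrative, not from the slides.

```python
from typing import Dict, List

PCM_MIN, PCM_MAX = -32768, 32767  # 16-bit signed sample range


def mix_minus(frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """For each participant, return the sum of all other participants'
    frames, clipped to the 16-bit PCM range."""
    if not frames:
        return {}
    frame_length = len(next(iter(frames.values())))
    if any(len(frame) != frame_length for frame in frames.values()):
        raise ValueError("all frames must have the same length")

    # Sum every participant once, then subtract each one's own signal.
    total = [sum(samples) for samples in zip(*frames.values())]
    outputs = {}
    for name, own in frames.items():
        outputs[name] = [
            max(PCM_MIN, min(PCM_MAX, t - s)) for t, s in zip(total, own)
        ]
    return outputs


if __name__ == "__main__":
    decoded = {"A": [100, -200], "B": [10, 20], "C": [1, 2]}
    for name, mixed in mix_minus(decoded).items():
        print(name, mixed)
```

Because the sum is formed on decoded samples, each output has to be encoded again before it is sent, which is the transcoding step the slides warn about. The relay strategy avoids it at the cost of every endpoint supporting every codec.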