Quantization Noise is 1

Waveform Speech Coding Algorithms: An Overview
Technical Presentation

Outline
- Introduction
- Concepts
  - Quantization
  - PCM
  - DPCM
  - ADPCM
- Standards & Applications
  - G711
  - G726
- Performance Comparison & Examples
- Summary & Conclusion

Introduction / Motivation: What is Speech Coding?
- It is the procedure of representing a digitized speech signal as efficiently as possible while maintaining a reasonable level of speech quality.
- Why would we want to do that? To answer this, let's look at the structure of the coding system.
[Figure: structure of the coding system, with a callout "Our Guy"]

Introduction / Motivation: Filtering & Sampling (1) to (3)
[Figures only]

Introduction / Motivation: Filtering & Sampling (4)
- Most speech content lies between 300 and 3400 Hz.
- According to the Nyquist theorem, Fs >= 2 fm (to avoid aliasing).
- A sampling frequency of 8 kHz is selected (8 >= 2 * 3.4).
- For good quality, 16 bits are used to represent each sample.
- Input bit rate = 8 kHz * 16 bits = 128 kbps.
- The input rate can be even higher. Skype, for example, uses a 16 kHz sampling frequency, which at 16 bits per sample gives an input rate of 256 kbit/s.
- This is a waste of bandwidth that could be used by other services and applications. The remedy is source coding (speech coding in this context). [1]

Introduction / Motivation: Desirable Properties of a Speech Coder
- Low bit rate: a lower bit rate needs a smaller transmission bandwidth, leaving room for other services and applications.
- High speech quality: speech quality trades off against a low bit rate. The decoded speech quality must be acceptable for the target application.
- Low coding delay: speech coding introduces extra delay, which can affect applications that have real-time requirements.
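
The input-rate arithmetic from the Filtering & Sampling slide can be checked with a few lines of Python. This is only a sketch: the helper name `pcm_bit_rate` is made up for illustration, and 16 bits per sample is assumed for the 16 kHz case as well.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate of uncompressed PCM in bits per second."""
    return sample_rate_hz * bits_per_sample


# Nyquist check for telephone-band speech (highest frequency 3400 Hz).
highest_frequency_hz = 3_400
sample_rate_hz = 8_000
nyquist_ok = sample_rate_hz >= 2 * highest_frequency_hz

narrowband_rate = pcm_bit_rate(8_000, 16)   # 128 kbps input rate from the slide
wideband_rate = pcm_bit_rate(16_000, 16)    # 16 kHz sampling, 16 bits assumed
companded_rate = pcm_bit_rate(8_000, 8)     # 8-bit samples, as in G711

print(nyquist_ok, narrowband_rate, wideband_rate, companded_rate)
```

The gap between 128 kbps at the input and 64 kbps after 8-bit companding is the saving that the rest of the deck works toward.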
[1]

Introduction / Speech Coding Categories
What are the different categories of speech coding? Speech coding is divided into three categories:
- Waveform codecs (PCM, DM, APCM, DPCM, ADPCM)
- Vocoders (LPC, homomorphic, etc.)
- Hybrid codecs (CELP, SELP, RELP, APC, SBC, etc.)
[2]

Concepts / Quantization: What is Quantization?
- Quantization is the process of transforming the sample amplitude of a message into a discrete amplitude taken from a finite set of possible amplitudes. [3]
- Each sampled value is approximated with a quantized pulse. The approximation results in an error of magnitude no larger than q/2, where q is the quantization step size: at most +q/2 in the positive direction and -q/2 in the negative direction.

Concepts / Quantization: Understanding Quantization
To understand quantization a bit better, let's look at the following example:
[Figure: quantization example]

Concepts / Quantization: Classification of the Quantization Process
The quantization process is classified as follows:
- Uniform quantization: the representation levels are equally spaced.
  - Midtread type
  - Midrise type
- Non-uniform quantization: the representation levels have variable spacing from one another. [4]
But why do we need such a classification?

Concepts / Quantization: Human Speech, Excursion & Recap (1)
Speech can be broken into two categories:
- Voiced (zzzzz)
- Unvoiced (sssss)
Naturally occurring speech signals are a combination of these categories. Take the word "Goat", for example: [4]
"Goat" contains two voiced signals, followed by a partial closure of the vocal tract, and then an unvoiced signal. These occur at 3400-3900, 3900-5400, and 6300-6900, respectively.

Concepts / Quantization: why do we need such a classification?
(1) Human Speech, Excursion & Recap (2)
It should be noted that:
- The peak-to-peak amplitude of voiced signals is approximately ten times that of unvoiced signals.
- Unvoiced signals contain more information, and thus have higher entropy, than voiced signals.
- The telephone system must therefore provide higher resolution for lower-amplitude signals.
[Figure: statistics of speech signals, probability of occurrence versus amplitude of speech signals] [3] [6]

Concepts / Quantization: why do we need such a classification? (2) Quantization Noise
- The quantization process is lossy (erroneous).
- The error E is defined as the difference between the input signal M and the output signal V. This error is called the quantization noise.
- Consider the simple example:
  M = (3.117, 4.56, 2.31, 7.82, 1)
  V = (3, 3, 2, 7, 2)
  E = M - V = (0.117, 1.56, 0.31, 0.82, -1)
- How do we calculate the noise power?
- Consider an input m of continuous amplitude in the range (-M_max, M_max).
- Assume a uniform quantizer with L levels, so the step size is q = 2 M_max / L. If the error is uniformly distributed over (-q/2, q/2), the quantization noise power is q^2 / 12.

Concepts / Quantization: why do we need such a classification? (3) Comparison: Uniform vs. Non-Uniform Usage
- Speech signals do not require high quantization resolution at high amplitudes (50% vs. 15%).
- Is it wasteful to use a uniform quantizer? The goal is to increase the SQNR: more levels for low amplitudes, fewer levels for high ones.
- Maybe use a non-uniform quantizer?
[3]

Concepts / Quantization: More About Non-Uniform Quantizers (Companding)
- Non-uniform quantizer = use more levels where you need them.
- The human ear follows a logarithmic process in which high-amplitude sounds do not require the same resolution as low-amplitude sounds.
- One way to achieve non-uniform quantization is a technique called "companding".
- Companding = "compressing + expanding".
[Block diagram: compressor function, then uniform quantization, then expander function (the inverse of the compressor)]

Concepts / Quantization: What is the purpose of a compander?
- The purpose of a compander is to equalize the histogram of speech signals so that the reconstruction levels tend to be used equally often. [6]
- Two well-known companding techniques define the encoding law:
  - A-law companding
  - µ-law companding
[6]

Concepts / Quantization: A-Law Encoding and µ-Law Encoding
[Figures: A-law and µ-law encoding] [3]

Concepts / Quantization: Companding Approximation
- Logarithmic functions are slow to compute, so why not approximate them?
- 3 bits select one of 8 segments (chords) that approximate the curve.
- P is the sign bit of the output.
- The S bits are the segment code.
- The Q bits are the quantization code.
[3]

Concepts / Quantization: Companding Approximation, Algorithm
Encoding
- Add a bias of 33 to the absolute value of the input sample.
- Determine the bit position of the most significant 1 bit among bits 5 to 12 of the biased value.
- Subtract 5 from that position. The result is the segment code.
- Finally, the 4-bit quantization code is set to the 4 bits that follow the most significant 1 bit found above.
Decoding
- Multiply the quantization code by 2 and add the bias of 33 to the result.
- Multiply the result by 2 raised to the power of the segment code.
- Subtract the bias from the result.
- Use the P bit to determine the sign of the result.
Example?
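
The encoding and decoding steps above can be sketched in Python as follows. This is a minimal sketch of the segment approximation as the slides describe it, not a G.711-conformant codec: the function names are illustrative, the input is assumed to be a signed integer whose magnitude is clipped to 8158 so that the biased value fits in bits 0 to 12, and the bit inversion that G.711 applies to the transmitted µ-law byte is left out because the slides do not cover it.

```python
BIAS = 33
MAX_MAGNITUDE = 8191 - BIAS  # assumption: clip so the biased value fits in 13 bits


def mulaw_encode(sample: int) -> int:
    """Return the 8-bit code P S2 S1 S0 Q3 Q2 Q1 Q0 for a signed sample."""
    sign = 1 if sample < 0 else 0
    biased = min(abs(sample), MAX_MAGNITUDE) + BIAS
    # Position of the most significant 1 bit; always 5..12 because biased >= 33.
    position = biased.bit_length() - 1
    segment = position - 5
    # The 4 bits immediately below the most significant 1 bit.
    quant = (biased >> (position - 4)) & 0x0F
    return (sign << 7) | (segment << 4) | quant


def mulaw_decode(code: int) -> int:
    """Invert mulaw_encode up to the quantization error."""
    sign = (code >> 7) & 1
    segment = (code >> 4) & 0x07
    quant = code & 0x0F
    magnitude = ((2 * quant + BIAS) << segment) - BIAS
    return -magnitude if sign else magnitude


code = mulaw_encode(-656)
print(format(code, "08b"))   # prints 11000101
print(mulaw_decode(code))    # prints -655
```

The shift by `segment` is the "multiply by 2 raised to the power of the segment code" step, and the `+ BIAS` before the shift places the reconstructed value in the middle of the quantization interval rather than at its lower edge.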
[3]

Concepts / Quantization: µ-Law Encoding, Example
Example input: -656

  P   S2  S1  S0  Q3  Q2  Q1  Q0
  1   1   0   0   0   1   0   1

- The sample is negative, so bit P becomes 1.
- Add 33 to the absolute value to bias high input values (due to wrapping).
- The result after adding is 689 = 0 0010 1011 0001 in binary.
- The most significant 1 bit among positions 5 to 12 is at position 9.
- Subtracting 5 from the position yields 4, which is the segment code (100).
- Finally, the 4 bits after that position (0101) are inserted as the quantization code.

Concepts / Quantization: µ-Law Decoding, Example
Example code word (encoded from input -656)

  P   S2  S1  S0  Q3  Q2  Q1  Q0
  1   1   0   0   0   1   0   1

- The quantization code is 0101 = 5, so 5 * 2 + 33 = 43.
- The segment code is 100 = 4, so 43 * 2^4 = 688.
- Subtract the bias: 688 - 33 = 655.
- P is 1, so the final result is -655.
- The quantization error is 1 (very small).

Concepts / Quantization: µ-Law and A-Law Encoding
µ-Law Encoding
- Approximately linear for small input values and logarithmic for high input values.
- The value of µ used in practice is 255.
- Used for speech signals.
- Used for PCM telephone systems in the US, Canada and Japan.
A-Law Encoding
- Linear segments for low-level inputs and a logarithmic segment for high-level inputs.
- The value of A used in practice is 87.6.
- Used for PCM telephone systems in Europe.

Concepts / Pulse Code Modulation (PCM): PCM Description
- Sampling results in PAM.
- PCM uniformly quantizes PAM.
- The results of PCM are PCM words.
- Each PCM word has l = log2(L) bits, where L is the number of quantization levels.
[3]

Concepts / Differential Pulse Code Modulation (DPCM): DPCM Description
- Signals that are sampled at a high rate have highly correlated samples.
- The difference between successive samples will not be large.
- Instead of quantizing each sample, why not quantize the difference?
- This results in a quantizer that needs far fewer bits. [7]
- This is the simple form, where the prediction is the previous sample (first order).
- More than one past sample can be used in the prediction (N-th order).
- Problems with this approach?
[7]

Concepts / Differential Pulse Code Modulation (DPCM): DPCM Example
[7]
- It is clear from the table that the error adds up and produces an output signal that is completely different from the original one.

Concepts / Differential Pulse Code Modulation (DPCM): DPCM Prediction
- Previously, the input to the predictor in the encoder was different from the one in the decoder.
- The difference between the predictors led to the reconstruction error e[n] = x[n] - x'[n].
- To solve this problem completely, the same predictor that is used in the decoder is also used in the encoder.
[Block diagram: encoder, channel, decoder]
- Therefore the reconstruction error at the decoder output is the same as the quantization error at the encoder.
- There is no accumulation of quantization error.

Concepts / Adaptive Differential Pulse Code Modulation (ADPCM): ADPCM Description
- As can be inferred from the name, ADPCM combines PCM and DPCM and adds adaptation.
- The "A" in ADPCM stands for "Adaptive".
- In DPCM, the difference between x[k] and x[k-1] is transmitted instead of x[k].
- To further reduce the number of bits per sample, ADPCM adapts the quantization levels to the characteristics of the analog signal.
- The original 32 kbps ADPCM used 4 bits per sample. [9]

Standards, Examples & Applications / G711: G711 Description
- A waveform codec that was released in 1972.
- Its formal name is Pulse Code Modulation (PCM), since it uses PCM in its encoding.
- G711 achieves a 64 kbps bit rate (8 kHz sampling frequency x 8 bits per sample).
- G711 defines two main compression algorithms:
  - µ-law (used in North America and Japan)
  - A-law (used in Europe and the rest of the world)
- µ-law takes 14-bit and A-law takes 13-bit signed linear PCM samples as input, and both compress them to 8-bit samples.
Applications
- Public Switched Telephone Network (PSTN)
- WiFi phones (VoWLAN)
- Wideband IP telephony
- Audio & video conferencing
- H.320 & H.323 specifications

Standards, Examples & Applications / G726: G726 Description
- G726 converts a 64 kbps A-law or µ-law PCM channel to and from a 40, 32, 24 or 16 kbps channel.
- The conversion is applied to the raw PCM using the ADPCM encoding technique.
- The different rates are achieved by adapting the number of quantization levels:
  - 4 levels (2 bits, 16 kbps)
  - 7 levels (3 bits, 24 kbps)
  - 15 levels (4 bits, 32 kbps)
  - 31 levels (5 bits, 40 kbps)
- Incorporates the earlier G721 and G723.
[12]

Performance Comparison
[Figure: performance comparison of speech codecs] [1]

Summary & Conclusion
Summary
- We talked about quantization concepts in all their flavors.
- We discussed the category of waveform coding (PCM, DPCM and ADPCM).
- We presented the ITU standards (G711 and G726) and mentioned some examples and applications.
- Finally, we compared the most prominent speech codecs out there.
Conclusion
- Speech coding is an important concept that is required to use the existing bandwidth efficiently.
- There are many important metrics to keep in mind when doing speech coding. It is important for a good speech coder to balance those metrics.
  The most important ones are:
  - Data rate
  - Speech quality
  - Delay
- Waveform codecs achieve the best speech quality as well as low delays.
- Vocoders achieve a low data rate, but at the cost of delay and speech quality.
- Hybrid coders achieve acceptable speech quality, acceptable delay and an acceptable data rate.

References
1. Wai C. Chu: Speech Coding Algorithms: Foundation & Evolution of Standardized Coders
2. Speech Coding: http://www-mobile.ecs.soton.ac.uk/speech_codecs/
3. Sklar: Digital Communication Fundamentals And Applications
4. A-Law and mu-Law Companding Implementations Using the TMS320C54x
5. Michael Langer: Data Compression - Introduction to lossy compression
6. Signal Quantization and Compression Overview: http://www.ee.ucla.edu/~dsplab/sqc/over.html
7. Wajih Abu-Al-Saud: Ch. VI Sampling & Pulse Code Mod. Lecture 25
8. Yuli You: Audio Coding: Theory And Applications
9. Tarmo Anttalainen: Introduction to telecommunication Networks Engineering
10. Wikipedia G711: http://en.wikipedia.org/wiki/G.711
11. David Salomon: Data Communication the Complete Reference
12. ITU CCITT Recommendation G.726 ADPCM

Questions & Discussion
Thank you!
