1358 TRANSACTIONS IEEE COMMUNICATIONS, ON H. Murakami,Y.Hatori, and H. Yamamoto,“Comparisonbetween of DPCM ,and Hadamard transform coding efficiencies for the variation statistical characteristics of NTSC signals” (in Japanese), Paper Tech. Group, CS 78-53, IECE Japan, June 1978. N. Ahmed arid,K.R. Rao, Orthogonal Transformsfor Digital Signal Processing. New York:Springer-Verlag,1975. results on the performance of W.Mauersberger,“Experimental mismatched quantizers,” IEEE Trans. Irform. Theory, vol. IT-25, pp. 381-386, July 1979. J. Max, “Quantizing for minimum distortion,” IRE Trans. Inform. Theory, vol. IT-6, pp. 7-12, Mar. 1960. P. A: Wintz and A. J. Kuternback, “Waveform error control in PCM telemetry,” IEEE Trans. Inform. Theory, vol. IT-14, pp. 650-661, Sept.1968. and J. D. Gibson, “Distribution of the twoR.C.Reininger dimensional DCT coefficients for image,” ZEEE Trans. Commun., vol. COM-31. June 1983. Hybrid Adaptive Quantization for Speech Coding M. E. M. NASR AND C. V.CHAKRAVARTHY Abstract-This correspondence describes a new quantization technique called hybrid adaptive quantization (HAQ) that uses instantaneous [l], [21 as well as syllabic [3] adaptation of the stepsize.Twotypes of instantaneousadaptivealgorithmshavebeenused-Jayant’sadaptive quantizer(JAQ)andtheincrementaladaptivequantizer(IAQ). Computersimulationshave tieen performedfor a sine-wave, correlated (SNR) compuGaussian signal and digitized speech. Signal-to-noise ratio tation for PCM and DPCM coders indicatesthat the hybrid technique is the same ratio superior to the normal adaptive quantizer, when both have of maximum to minimum step size. I. INTRODUCTION Waveform encoding methods based on pulse code modulation (PCM) or differential PCM (DPCM) may employ linear, nonlinear,oradaptivequantizers [ 1 ] -[ 31 . Adaptivedelta modulation (ADM) i s a special case ofa DPCM coder. Adaptive DPCM coders, which use only half the bandwidth needed by conventional PCM coders, are being standarized for voice communication. While theexisting1-lawandA-lawcoders havefixednonuniformquantizers, ADPCM codersarebased upon adaptive quantizers. Adaptive quantization isof two types-instantaneous and syllabic. The former can change its step size for every sample, while.in the latter it is much slower (following the envelope variation in speech). Although instantaneous quantizers have good attack and decay times, they are prone to a little more VOL. COM-32, NO. 12, DECEMBER 1984 quantizing noise, unlike the slower syllabic adaptive systems. Un et al. [ 4 ] havesuccessfullyemployed a combination of both methods for ADM while we have investigated the hybrid techniquefor PCM and DPCM coders.Themainidea isas follows. Aninstantaneousadaptivequantizerchangesitsstepsize at each sampling instant, based upon a certain rule, subject t o a minimum step size A, and a maximum step size Amax, resulting in a range factor R = A m a x / & . In an HAQ these limitsarenotfixed values,’ but varyproportionally t o t h e envelope of theinput. If theenvelope is considered t o b e of A, and constantovershortperiodsoftime,thevalues Amax withintheseintervals will beconstant(butdifferent from interval to interval). Anothermethodthat is somewhatsimilar totheHAQ (called the“pseudosyllabic”adaptation)hasbeenreported by Raulln’ et aZ. [ 6 ] , [ 7 1 . In the simplest case the quantizer can be a linear one with a minimum step size of A,. Whenever the magnitude 1 B , I of the binary output exceeds a predeterminedvalue, a constantvalue of 256 is fed t o a low-pass filterwithtransferfunction H ( z ) = (1 - (511/512)z-l)-’, and a zero is fed otherwise. The effect is to increase the output (denoted by S ( n ) , say) by 256 when the threshold is exceeded, or to reduce it by a factor of 511/512 otherwise. The step size A ( n ) i s given by A(?) = A02’(“) where i ( n ) is defined as the largest integer in [(s(n)/2048) - 41. This allows the step size to vary over the range of A, to 2048 A,. It is easy t o see thatthestepsizecandoubleineightsamples (1 msfor a sampling rate of 8 kHz), while the step size reduces by half in a time that varies between 4 and 16 ms, depending upon the initial step size. As an example, a 4 bit quantizer can have 15 values of B , between -7 and +7 (since the implementation reported excludestheall-zerocode). If thethreshold is selectedas 5, the filter output is increasedby256everytime I E , I 5. However,whenthestep size isdoubled,eventhoughthe threshold is 5 , the absolute value represented by this threshold is doubled. To make the transition smoother, four nonlinearlawshavebeenused,eachwithdifferentthresholds. it takes It must benoted,however,thatduringthetime i ( n ) t o change by one unit (up or down), the quantizer will be the same. While itseemsto6esimilartothehybridscheme,the pseudosyllabicsystem is differentforthefollowingreasons. During the time of eight samples required for step increase or during the time required for step size decrease, the quantizer is fixed. On the other hand, in the hybrid scheme, even during on the instantaneous this time the step size can change based algorithm. In fact, if JAQ is used, the step size can double in just one or two samples. This is particularly efficient in avoiding clipping in PCM or slope overload in DPCM. Likewise, the timetakento halve thestepsizewouldbelongerinthe pseudosyllabic system, and during this time we have the fixed stepsize. Of the different instantaneous adaptive algorithms available, we have chosen two for our study-Jayant’s adaptive quantizer (JAQ) [ 11 and the incremental adaptive quantizer (IAQ) [2]. PCM and DPCM (single tap predictor) coders have been simulated on a DEC-IO computer using sinusoidal, correlated Gaussian random signal, and digitized speech inputs. In t h e case of the IAQ, the hybrid quantizer offers a substantial increase in dynamic range, while in t h e case of JAQ, there is not any significant increase. The reasons for and the implications of this are presented in the concluding section. > Paper approved by the Editor for Signal Processing and Communication Electronics of the IEEE Communications Society for publication without oral presentation. Manuscript received April 20, 1983; revised May 4, 1983. This work was supported in pan by a fellowship fromthe Government of India and the Government of Egypt. M. E. M. NasriswiththeDepartmentofElectricalCommunication Ehgineering, Indian Institute of Science, Bangalore 560012, India, on leave from the Faculty of Electronic Engineering, Menoufia University, Menouf, EgYPt. C. V. Chakravarthy is withtheDepartment of ElectricalandComputer Engineering, Louisiana State University, Baton Rouge, LA 70803, on leave from the Departmentof Electrical and Computer Engineering, Indian Institute ‘The authors thankan anonymous referee for bringing these references to of Science, Bangalore 560012, India. their notice and for providing a copy of the CCITT Study Group Report. 0090-6778/84/1200-1358$01.00 0 1984 IEEE 1359 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-32, NO. 12, DECEMBER 1984 11. QUANTIZER DESCRIPTION One of the forms of JAQ is shown in Fig. 1. The step size An of the quantizer (which need not necessarily be uniform) at the nth sampling time is OUTPUT t 5An/2 I B n P l 1. The HAQ is realized using the system shown in Fig. 2. The quantizer output is low-passfiltered toentracttheenvelope,which is used to multiply the minimum step size A,. In the simulation of JAQ [ 1 J , the value of A, used is 0.1 Aopt where Aopt is the optimum step size for a nonadaptive quantizer given by 2 ~is the variance and N the rule of thumb Aopt = 8 ~ / (u2 the number of bits). Starting with a step size of A,, if the envelope is constant for t l seconds, then at t = t l , the step size will be M beingamultiplierdependenton A n/2 I -4An I I I -3An -2An -An I An I I 2% 36, 4A, INPUT I " t -5 An11 t suchthat IT, = t l ( T , beingthesamplingperiod).During the next sampling time, if the envelope changes by a factor P, then the step size for the next sample is P A ( t , ) and not A ( t l ) . Thisassumes aninstantaneousresponse. In practice, however, the envelope change will take place in a finite time (albeitsmall)andduringthetransitionsthestepsizewould not have stabilized to the optimumvalue. The syllabic adaptation is achieved as follows. The output of the quantizer (analog samples) isrectifiedandfiltered to producetheenvelopewhichthenmultipliestheminimum step size. The filter is a two-pole LP filter with a 20 ms time constant. The adaptation rule for the incremental adaptive quantizer Fig. 1 . Jayant'sadaptivequantizer. FULL WAVE R E C T I F I E R AND ENVELOPE FILTER SYLLABIC CONTROL INPUT - QUANTIZER = IIA n - l An-1 +A,(n) S T E PS I Z E if IBn-1 ( > ( 2 N - 1 ) 3 / 4 otherwise. (3) A o ( n ) is theminimumstepsizeatthenthsamplinginstant (=A, if it is not a hybrid quantizer, and =Ao X envelope of the input at the( n - 1)th instant if it is a hybrid quantizer). The quantizer schematic was showninFig.2,andFig,4 shows the hardware realization. Both types of the HAQ have beenusedfor PCM andthefamiliarsingle tappredictor ADPCM (Fig. 3). Themultipliersusedforthe J A Q in PCM and DPCM codersaredifferent,asdiscussedbyJayant [ 11. The IAQ is shown in Fig. 5 . 111. SIMULATION AND RESULTS Figs. 6-10 highlight the performance of the hybrid quantizer.Figs.6and 7 providetheresultsfor PCM coderswith sinusoidal and random inputs, respectively, while Figs. 8-10 arefor DPCM coders,theinputsbeing,respectively,a1020 Hzsinewave,randomnoisebandlimited to within 300 and 3400Hz,andspeech.Thedigitizedspeechhasasampling rate of 10 kHz. For speech signals, the segment SNR [ 8 J is computed. All the values of SNR are computed over 10 000 samples. In thenormaladaptivequantizer,theratio of the maximum to minimum step size islimited t o 6 0 dB. In the hybrid quantizerthe ratio remains the same but with a difof the largestandsmalleststeps atany ference.Theratio instant is fixed at 100 for the instantaneous adaptation only. Therefore,whenevertheenvelope is constant,thisratio is 100 but the absolute values of the largest and smallest steps will vary. Thereisnotmuchdifferencebetweenthe JAQ andits A/ D INSTANTANEOUS CONTROL [31 is A, - LOGIC Fig. 2. Schematic of ahybridadaptivequantizer. hybridversion.Thereasonforthis is thattheadaptation algorithm is ideally suited for very quick changes in the step size, even when the starting value is grossly suboptimal. This is too is notthe case withthe IAQ. If thestepincrement small, the step size change is too slow. So the SNR degrades. On theotherhand, if theincrement is madeproportional to the envelope (hybrid version) the performance is excellent. While an ordinary version of JAQ seems enough, the hybrid version has certain advantages. While the computer simulation provides a continuum of step sizes, a hardware implementation has to restrict the values to a finite set, This disadvantage can be offset to some extent with a hybrid coder. IV. CONCLUSIONS A newmethod of realizing anadaptivequantizer, called thehybridadaptivequantizer,hasbeendescribed.Thestep size adaptation is both instantaneous as well as syllabic-the firstpartcontrolledbythedigitaloutputandthesecond partbytheenvelope of thequantizeroutput.Simulation results for deterministic, random, and speechsignals have been obtained in the case of 3 bit and 4 bit PCM and DPCM quantizers. The instantaneous adaptation algorithms used are (JAQ) and the incremental adapJayant's adaptive quantizer tive quantizer (IAQ). Although among them there is not much differenceinperformance,thehybridversionscanoffera large dynamic range and should prove to be especially useful for speech coding. IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-32, NO. 1 2 , DECEMBER 1984 1360 ENVELOPE DETECTOR SYLLABIC 5TEP SIZE CONTROL (SINE WAVE INPUT) HYBRID VERSION P C M CODER QUANTIZER SAMPLES . Xn-9 STEP SIZE CONTROL LOGIC c( - = P,.Cl, -40 -30 -20 -10 0 40 30 20 10 INPUT LEVEL, dB Fig. 3. DPCM coder witha single tap predictor. COMPARATOR - Sf H INCREMENTAL ADAPTIVE QUANTIZER AND ITS HYBRIDVERSION(SINE WAVE INPUT) FF -t- PCM 30 CODER HYBRID ,&-a -4 K 5 In 4 - - - - - - - - SIT .-----___ VI 3 BIT 10 I I od -40 Ill I -30 I -20 I I -10 0 I N P U TL E V E L , SAR U/D CONTROL I SAR ; : DIGITAL/ANALOG -20 40 - 30 - 40 30 Fig. 6. Comparison of SNR versusinputforsinusoidal quantizer). b DAC IO dB signal (PCM J A Y A N T ‘ S ADAPTIVE QUANTIZER AND ITS HYBRID VERSION (RANDOM INPUT) P C M QUANTIZER w HYBRID m S U C C E S S I VAEP P R O X I M A T I ORNE G l S T E R CONVERTER Fig. 4. Suggested hardwarerealization scheme forthe HAQ (incremental version). OUTPUT 0 t 0’ -40 4 I -30 -20 -IO I 0 INPUT LEVEL, 40 An I I -An An 2An I I -:An I - 3 A- 2n A n =An -do 4 -A0 I 3An 4An - I 40 I 20 30 dB INCREMENTAL ADAPTIVE QUANTIZER AND I T 5 HYBRID VERSION (RANDOM INPUT) P C M CODER HYBRID ’0 INPUT 2o 10 INPUT L E V E L , dB Fig. 7. Comparison of SNR versusinputforaGaussianinput(PCM quantizer. 40 1361 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-32, NO. 12, DECEMBER 1984 -4 DPCM QUANTIZER -40 -30 -20 PO 40 0 -10 INPUT HYBRID ‘1 3o 1 I Y 20 30 40 01 -40 JAYANT$ ADAPTIVE WANTIZER AND ITS HYBRID VERSION (SPEECH INPUT) D P C H CODER w HYBRID I -30 , I - 20 -40 0 20 10 30 40 INPUT L E V E L , dB L E V E L , dB IUCREMENTAL ADAPTIVE QUANTIZER AND ITS HVBRID VERSION (SINE WAVE INPUT) DPCM CODER CODD ER PCM -40 -20 -10 0 INPUTLEVEL, dB INPUT LEVEL, -30 W Fig. 8. ComparisonofSNRvariationwithinputlevel(sine-wave,DPCM quantizer). HYBRID $0 20 30 40 dB Fig.10.SNRvariationforspeechsignals(DPCMquantizer). ACKNOWLEDGMENT HYBRID VERSION The authors are grateful to the Computer Centre, I.I.sc., for thefacilities provided and to Ms. Bharati Devi of the School of Automationforproviding the digitized speech samples. The first author thanks the Government of India and Government of Egypt for providing a fellowship. The authors thank t h e reviewers for their suggestions. CRANDoM INPUT) DPCM QUANTI2E.R HYBRID REFERENCES -40 -30 -20 -10 0 20 10 30 40 30 40 INPUT LEVEL, dB INCREMENTAL ADAPTIVE QUANTIZER HYBRID VERSION (RANDOM INPUT) DPCH QUANTIZER 30 i.” 01 -40 M AND ITS HYBRID 1 -30 ~ 2 0 -10 0 INPUTLEVEL, Fig. 9. Comparisonof 40 20 dB SNR variationwith DPCM quantizer). inputlevel(randominput, N. S. Jayant, “Adaptive quantization with one work memory,” Bell Syst. Tech. J., vol.52,pp.1119-1144,Sept.1973. C. V. Chakravarthy, N.D. Georganas, and S. G . S. Shiva,“An incremental adaptive quantizer,”IEEE Trans. Commun., vol. COM29, pp. 1056-1061, July 1981. M. S. Sodhi, J. Das, and C. V. Chakravarthy, “An adaptive PCM with Conf. Rec.Nut. Telecommun. syllabicstepsizeadaptation,”in Conj., Washington, DC, 1979, pp. 4.3.1-4.3.6. C. K. Un,H. S. Lee, and J. S. Song,“Hybridcompandeddeltamodulation,” IEEE Trans. Commun., vol. COM-29, pp. 1337-1343, Sept.1981. B. S. Atal, N. S. Jayant, R. J. L. Flanagan, M.R.Schroeder, Crochiere, and J.M.Tribolet,“Speechcoding,” IEEE Trans. Commun., vol. COM-27, pp. 711-742, Apr. 1979. J. M. Raulin, G . Bonnerot, J. Jeandot, and R. Lacroix, “A 60 channel PCM-ADPCM converter,” ‘IEEE Trans. Commun., vol. COM-30, pp. 567-573, Apr. 1982. XVIII, no.35,“32kb/s ADPCM CCI’IT Contrib.StudyGroup codecs,”Mar.1981. P. Noll, “Adaptive quantization in speech coding systems,” in Proc. Int. Zurich Sem. Digital Commun., Zurich, Switzerland, 1974, pp. B3(1)-B3(6).