JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING

A Comparative Review of Various Types of Speech Coder for Mobile Communication

1 PROF. R. N. RATHOD, 2 PROF. M. V. MAKWANA, 3 MR. V. C. TOGADIYA

1 Asst. Professor and Head, Power Electronics Department, Morbi, Gujarat
2 Asst. Professor, Power Electronics Department, Morbi, Gujarat
3 M.E. [EC] Student, R.K. University, Rajkot, Gujarat

ABSTRACT: Speech coding is the process of obtaining a compact representation of voice signals for efficient transmission over band-limited wired and wireless channels and/or for storage. It can also be described as the process of reducing the bit rate of a digital speech representation for transmission or storage while maintaining a speech quality that is acceptable for the application. To achieve high-quality speech at a low bit rate, coding algorithms apply sophisticated methods to reduce redundancy and to remove perceptually irrelevant information from the speech signal. Speech coding is an important aspect of modern telecommunications. A lower bit rate implies that a smaller bandwidth is required for transmission, and in wireless and satellite communications bandwidth is limited. At the same time, multimedia communications and some other speech-related applications need to store the digitized voice. Reducing the bit rate implies that less memory is needed for storage. These two applications of speech compression make speech coding an attractive field of research.

Keywords: Analysis by Synthesis Coding, Excitation Detector, Vocal Cords, Quasi Periodic Sound.

I: INTRODUCTION

Speech coding is the process of reducing the bit rate of a digital speech representation for transmission or storage while maintaining a speech quality that is acceptable for the application. It is an important aspect of modern telecommunications. More generally, speech coding is the process of digitally representing a speech signal.
The primary objective of speech coding is to represent the speech signal with the fewest possible bits while maintaining a sufficient quality of the retrieved or synthesized speech at reasonable computational complexity. To achieve high-quality speech at a low bit rate, coding algorithms apply sophisticated methods to reduce redundancy and to remove perceptually irrelevant information from the speech signal. In addition, a lower bit rate implies that a smaller bandwidth is required for transmission. Although very large bandwidths are now available in wired communications as a result of the introduction of optical fiber, bandwidth remains limited in wireless and satellite communications. At the same time, multimedia communications and some other speech-related applications need to store the digitized voice. Reducing the bit rate implies that less memory is needed for storage. These two applications of speech compression make speech coding an attractive field of research [5]. Fig.1 depicts the block diagram of the speech coding method.

Fig.1. Block diagram of speech coding

Fig (2)

ISSN: 0975 – 6779| NOV 11 TO OCT 12 | VOLUME – 02, ISSUE - 01 Page 320

Voiced signal
Fig (2) shows a voiced signal: the amplitude of the speech is plotted against frequency and also against time.

Unvoiced signal
Fig (3) shows an unvoiced signal: the amplitude of the speech is plotted against frequency and also against time.

Fig (3)

II: SPEECH CODING METHODS

1. Waveform Coding
2. Source Coding
3. Hybrid Coding

Waveform Coding
Waveform coders attempt to code the exact shape of the speech signal waveform, without considering in detail the nature of human speech production and speech perception.
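As a minimal sketch of the waveform-coding idea, the Python fragment below compresses each sample with the continuous mu-law curve, quantizes it to an 8-bit code, and then inverts the process. The constant MU = 255, the function names and the default of 8 bits are illustrative assumptions; the code is not bit-exact with any standardized PCM format, since standardized companded PCM uses a segmented approximation of this curve.

```python
import numpy as np

MU = 255.0  # assumed companding constant (the value commonly used for mu-law)

def mulaw_encode(x, bits=8):
    """Compress samples in [-1, 1] with the mu-law curve and quantize to integer codes."""
    x = np.clip(np.asarray(x, dtype=float), -1.0, 1.0)
    y = np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)
    levels = 2 ** bits
    # Map the compressed value from [-1, 1] onto the codes 0 .. levels - 1.
    return np.round((y + 1.0) / 2.0 * (levels - 1)).astype(np.int64)

def mulaw_decode(codes, bits=8):
    """Invert mulaw_encode: map codes back to [-1, 1] and expand."""
    levels = 2 ** bits
    y = 2.0 * np.asarray(codes, dtype=float) / (levels - 1) - 1.0
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU

def pcm_bit_rate(sample_rate_hz, bits):
    """Bit rate in bits per second of a coder that sends `bits` per sample."""
    return sample_rate_hz * bits
```

Because the coder works sample by sample and makes no assumption about how the signal was produced, it handles speech and non-speech signals alike. Assuming telephone speech sampled at 8000 samples per second, pcm_bit_rate(8000, 8) gives 64000 bit/s.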
Waveform coders are most useful in applications that require the successful coding of both speech and non-speech signals. In the public switched telephone network (PSTN), for example, successful transmission of modem and fax signaling tones and of switching signals is nearly as important as the successful transmission of speech. The most commonly used waveform coding algorithms are uniform 16-bit PCM, companded 8-bit PCM, and ADPCM.

Source Coding
Vocoders are parametric digitizers that use certain properties of the human speech production mechanism. Human speech is produced by emitting sound pressure waves that are radiated primarily from the lips, although significant energy can also emanate from the nostrils, throat, etc. In human speech, the air compressed by the lungs excites the vocal cords in one of two modes [2]. When generating voiced sounds, the vocal cords vibrate and generate a quasi-periodic sound. In the case of lower-energy unvoiced sounds, the vocal cords do not participate in the voice production and the source acts like a noise generator. Instead of producing a close replica of the input signal at the output, a vocoder generates an appropriate set of source parameters that characterizes the input signal as closely as possible over a given period of time [3].

Hybrid Coding
Hybrid coding is an attractive trade-off between waveform coding and vocoders in terms of both speech quality and transmission bit rate, although generally at the price of higher complexity. This type of coder is also known as an analysis-by-synthesis (AbS) codec. Examples include LPAS and CELP. In an LPAS coder, the decoded speech is produced by filtering the signal produced by the excitation generator through both a long-term and a short-term predictor synthesis filter. In a CELP coder, both the encoder and the decoder store the same collection of possible code vectors of length L in a codebook. The excitation for each frame is described completely by the index of an appropriate vector.
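The codebook lookup in a CELP coder can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure of any particular standard: only a short-term synthesis filter is used, the filter state starts at zero, the error is a plain squared error without perceptual weighting, no gain is searched, and the names synthesize and search_codebook are hypothetical.

```python
import numpy as np

def synthesize(excitation, predictor):
    """All-pole synthesis filter: out[n] = excitation[n] + sum_k predictor[k-1] * out[n-k].

    `predictor` holds the short-term predictor coefficients [a1, ..., ap];
    the filter starts from a zero state (an assumption of this sketch).
    """
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k, a in enumerate(predictor, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out[n] = acc
    return out

def search_codebook(target, codebook, predictor):
    """Return the index of the code vector whose synthesized output is closest to `target`."""
    best_index, best_error = -1, np.inf
    for index, code_vector in enumerate(codebook):
        decoded = synthesize(code_vector, predictor)
        error = np.sum((target - decoded) ** 2)  # plain squared error
        if error < best_error:
            best_index, best_error = index, error
    return best_index
```

Only the returned index needs to be transmitted, because the decoder holds the same codebook and can run the same synthesis filter on the selected vector.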
This index is found by an exhaustive search over all possible code vectors, selecting the one that gives the smallest error between the original and decoded signals [6]. CELP coding can operate below 8 kbps; its goal is to transmit the minimum amount of speech information as codewords while keeping the error in the synthesized speech small [2]. The major application of speech compression in mobile communication is at the encoder, where it allows the speech signal to be transmitted at a low bit rate. It allows longer messages to be coded and allows more users to share the same bandwidth [3]. Code-excited linear prediction (CELP) was introduced by B. S. Atal and M. R. Schroeder in 1984 [4].

To increase transmission speed and data rates, a further coding technology was introduced in wireless communication: algebraic code-excited linear prediction (ACELP). With ACELP the speech signal can be transmitted at bit rates as low as 4 kbps, a further reduction in bit rate compared with other coding technologies. MATLAB is used to design the ACELP algorithm. The tool is user-friendly and provides a graphical user interface (GUI) that allows the student to study and verify graphically the various aspects of the algorithm, such as the LP analysis, the open-loop pitch search, the adaptive codebook search (pitch search), the fixed codebook search, and the bit allocation patterns. We choose MATLAB as the implementation platform because it allows the user to easily understand the complex parts of the algorithm [10].

The Adaptive Multi-Rate Wideband (AMR-WB) speech codec was recently selected by the Third Generation Partnership Project (3GPP) and the European Telecommunication Standards Institute (ETSI) for GSM and the third-generation WCDMA mobile communication system to provide wideband speech services.
The paper describes AMR-WB standardization history, the algorithmic description including novel techniques for efficient ACELP wideband speech coding, and subjective quality performance. The AMR-WB speech codec algorithm was selected in December 2000, and the corresponding specifications were approved in March 2001. AMR-WB was also adopted by ITU-T in its standardization activity for wideband speech coding around 16 kbit/s. The adoption of AMR-WB by ITU-T is significant because it is the first time that the same codec has been adopted for wireless as well as wireline services. AMR-WB extends the upper limit of the audio bandwidth from 3.4 kHz to 7 kHz and gives superior speech quality and voice naturalness compared to existing 2nd and 3rd generation mobile communication systems. The wideband speech service provided by the AMR-WB codec will give mobile communication a speech quality that also substantially exceeds (narrowband) wireline quality.

A moving vehicle encounters resistance from the air. This drag is made up of pressure drag and skin friction drag. The oncoming airflow pushes against the front, creating a high-pressure region, just as it does on the wheels and the front of the semi-trailer. The vehicle moving forward in the airflow creates a low-pressure region behind the tractor and the semi-trailer: these areas 'suck' the vehicle backwards, as it were. Interestingly, the high-pressure region at the front contributes just as much to drag as the low-pressure region at the rear: with or without crosswind, each accounts for about 1/3 of the overall drag. The remaining 1/3 of the overall drag is created by the vehicle's under-body [3]. The relative importance of aerodynamic design to reducing truck fuel consumption can be seen from an overview of the various factors involved in fuel consumption. It is important here to distinguish between different types of truck. The chart below shows the components that use energy in an articulated vehicle expressed as losses.
As the chart shows, 15% of the fuel is used to overcome mechanical friction in the engine, gearbox and drive shaft, and 45% of the fuel is used to overcome the rolling resistance. Drag is responsible for 40% of fuel consumption [4].

III: SPEECH CODEC ATTRIBUTES

Speech coders attempt to minimize the bit rate for transmission or storage of the signal while maintaining the required levels of speech quality, communication delay, and complexity of implementation (power consumption). Brief descriptions of these performance parameters follow, with particular reference to speech.

Speech Quality
In speech quality testing, quality is usually evaluated on a five-point scale known as the mean opinion score (MOS) scale. The score is an average over a large number of speech data, speakers, and listeners. The five points of quality are: bad, poor, fair, good, and excellent. Quality scores of 3.5 or higher generally imply high levels of intelligibility, speaker recognition and naturalness.

Transmission Bit Rate
Since the speech codec shares the communications channel with other data, the peak bit rate should be as low as possible so as not to use a disproportionate share of the channel. Codecs operating below 64 kbps were developed to increase the capacity of equipment used for narrow-bandwidth links.

Communication Delay
Speech coders often process speech in blocks, and such processing introduces communication delay. Depending on the application, the permissible total delay could be as low as 1 msec, as in network telephony, or as high as 500 msec, as in video telephony. Communication delay is irrelevant for one-way communication, such as voice mail.

Complexity
The complexity of a coding algorithm is the processing effort required to implement the algorithm. It is typically measured in terms of arithmetic capability and memory requirement, or equivalently in terms of cost. High complexity can result in high power consumption in the hardware.

IV.
CURRENT CAPABILITIES IN SPEECH CODING

Fig.2 shows the speech quality that is currently achievable at various bit rates from 2.4 to 64 kbps for narrowband telephone (300-3400 Hz) speech. The intelligibility of coded speech is sufficiently high at these bit rates and is not an important issue. The speech quality is expressed on the five-point MOS scale along the ordinate of the figure. PCM (pulse-code modulation) is the simplest coding system, a memoryless quantizer, and provides essentially transparent coding of telephone speech at 64 kbps. With a simple adaptive predictor, adaptive differential PCM (ADPCM) provides high-quality speech at 32 kbps. The speech quality is slightly inferior to that of 64 kbps PCM, although the telephone handset receiver tends to minimize the difference. ADPCM at 32 kbps is widely used for expanding the number of speech channels by a factor of two, particularly in private networks and international circuits. It is also the basis of low-complexity speech coding in several proposals for personal communication networks, including CT2 (Europe), UDPCS (USA) and Personal Handyphone (Japan) [8].

For rates of 16 kbps and lower, high speech quality is achieved by using more complex adaptive prediction, such as linear predictive coding (LPC) and pitch prediction, and by exploiting auditory masking and the underlying perceptual limitations of the ear. Important examples of such coders are multi-pulse excitation, regular-pulse excitation, and code-excited linear prediction (CELP) coders [1]. The CELP algorithm combines the high quality potential of waveform coding with the compression efficiency of model-based vocoders. At present, the CELP technique is the technology of choice for coding speech at bit rates of 16 kbps and lower.
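Linear prediction, named above as the basis of the coders at 16 kbps and lower, estimates each speech sample from a weighted sum of previous samples. The Python sketch below uses the standard autocorrelation method with the Levinson-Durbin recursion to estimate the weights for one frame. The Hamming window, the default order of 10 and the function name lpc_coefficients are illustrative assumptions and are not taken from any of the coders discussed here.

```python
import numpy as np

def lpc_coefficients(frame, order=10):
    """Estimate predictor coefficients a[1..order] with s[n] ~ sum_k a[k] * s[n-k].

    Returns (coefficients, residual_energy). The frame must be longer than
    `order`. Window choice and default order are assumptions of this sketch.
    """
    frame = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(frame)
    # Autocorrelation of the windowed frame for lags 0 .. order.
    r = np.array([np.dot(frame[:n - lag], frame[lag:]) for lag in range(order + 1)])
    a = np.zeros(order + 1)  # a[0] is unused
    error = r[0]
    if error <= 0.0:  # silent frame: nothing to predict
        return a[1:], 0.0
    for i in range(1, order + 1):
        # Reflection coefficient for stage i of the Levinson-Durbin recursion.
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / error
        previous = a.copy()
        a[i] = k
        a[1:i] = previous[1:i] - k * previous[i - 1:0:-1]
        error *= 1.0 - k * k
        if error <= 0.0:  # frame is perfectly predictable; stop early
            break
    return a[1:], error
```

In a predictive coder the encoder would send a quantized form of these coefficients together with a description of the prediction residual, rather than the speech samples themselves.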
At 16 kbps, a low-delay CELP (LD-CELP) algorithm provides both high quality, close to PCM, and low communication delay, and has been accepted as an international standard for transmission of speech over telephone networks [1].

At 8 kbps, which is the bit rate chosen for first-generation digital cellular telephony in North America, speech quality is good, although significantly lower than that of 64 kbps PCM speech. Both the North American and the Japanese first-generation digital standards are based on the CELP technique. The first European digital cellular standard is based on a regular-pulse excitation algorithm at 13 kbps.

Fig.2. The speech quality mean opinion score for various bit rates.

Present research is focused on meeting the critical need for high-quality speech transmission over digital cellular channels at 4 and 8 kbps. Low bit rate speech coders are fairly complex, but advances in VLSI and the availability of digital signal processors have made it possible to implement both encoder and decoder on a single chip [7].

V: EXPERIMENTAL DATA

Speech file    Algorithm used    File size / bit rate    Results
a_rajni        LPC, RELP         257 KB                  The wave file is 5.84 sec long
i_rajni        LPC, RELP         147 KB                  The wave file is 3.34 sec long
s1 of wb       LPC, RELP         256 kbps                The wave file is 2.85 sec long
s2 of wb       LPC, RELP         256 kbps                The wave file is 2.94 sec long
t_8k_2s        LPC, RELP         128 kbps                The wave file is 1.95 sec long
a_bina         LPC, RELP         352 kbps                The wave file is 4.75 sec long

Simulation Results Based on LPC
[Figure: original signal "s1ofwb" and synthesized speech of "s1ofwb" using the LPC algorithm]

Simulation Results Based on RELP
[Figure: original signal "t 8k2s" and RELP compressed output]
[Figure: original signal "s2ofwb" and synthesized speech of "s2ofwb" using the LPC algorithm]
[Figure: original signal "dantaleabrr" and RELP compressed output]

VI: CONCLUSION

From the above experiment and result analysis we can say that a coding algorithm brings the benefits of lower memory use and a reduced data rate. The length of the wave file can also be obtained by using the various algorithms. In this paper the simulation is performed for various wave files by applying various coding algorithms to them, and it is concluded that the synthesized speech of the coding algorithms differs, but the decoded speech of all the coding algorithms is the same.

Simulation result of SUB BAND coder of wave file "dantaleabrr.wav"

REFERENCES

[1] A. M. Kondoz, "Digital Speech-Coding for Low Bit Rate Communication Systems", John Wiley & Sons Ltd., 2001.
[2] L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals".
[3] B. S. Atal, M. R. Schroeder, and V. Stover, "Voice-Excited Predictive Coding System for Low Bit-Rate Transmission of Speech".
[4] C. J. Weinstein, "A Linear Predictive Vocoder with Voice Excitation".
[5] M. H. Johnson and A. Alwan, "Speech Coding: Fundamentals and Applications", to appear as a chapter in the Encyclopedia of Telecommunications, Wiley, December 2002.
[6] T. S. Rappaport, "Wireless Communication".
[7] J. Benesty, M. M. Sondhi, and Y. Huang, "Springer Handbook of Speech Processing".
[8] V. K. Garg, "Principles and Applications of GSM".
[9] ITU-T Recommendation G.728, Coding of Speech at 16 kb/s Using Low-Delay CELP, Sep.
1992.
[10] L. Shadhan, "ITU-T G.729/G.729A CS-ACELP 8 kbps Speech Coder".