MELP Vocoder

Outline
• Introduction
• MELP Vocoder Features
• Algorithm Description
• Parameters & Comparison

Introduction
• Traditional pitch-excited LPC vocoders drive the synthesis filter with either a periodic pulse train or white noise
• This yields intelligible speech at very low bit rates
• But the result sometimes sounds mechanical or buzzy and is prone to tonal noise

Introduction (cont'd)
• These problems arise from the inability of a simple pulse train to reproduce all kinds of voiced speech
• The MELP vocoder uses a mixed-excitation model that represents a richer ensemble of speech characteristics
• It produces more natural-sounding speech

MELP Vocoder
• Robust in background-noise environments
• Based on the traditional LPC model, with additional features:
  ○ Mixed excitation
  ○ Aperiodic pulses
  ○ Pulse dispersion
  ○ Adaptive spectral enhancement

MELP Vocoder: Encoder
[Block diagram. Labels, in extraction order: final pitch calculation; prediction-error filter; input speech; windowing of the prediction error; Fourier magnitude calculation; quantization of the Fourier magnitudes; calculation of voicing strengths and peakiness; MSVQ; LPC to LSF conversion; array of vectors; sorting of the quantized vector; algorithm enforcing a minimum 50 Hz separation; LSF vector; Hamming; pitch calculation; LPC calculation; bandwidth expansion; LPC; LSF; algorithm enforcing a minimum 50 Hz separation]

MELP Vocoder: Positions of the Analysis Windows
[Figure: positions of the analysis windows]

MELP Vocoder: Computing the Fourier Magnitudes
• The pulse-generation filter is responsible for producing the pulse train. This is done by applying an FFT to 200 samples of the signal and extracting the envelope of the impulse response.

MELP Vocoder: Computing the Voicing Strengths and Setting the Aperiodic Flag
1. First estimation stage, L = 40, 41, …, 160
2. Determine the voicing strength of the low band
3. Determine the voicing strengths of the other 4 bands

MELP Vocoder: Peakiness
\[
p = \frac{\sqrt{\dfrac{1}{160}\sum_{n=-80}^{79} e^{2}[n]}}{\dfrac{1}{160}\sum_{n=-80}^{79} \left|e[n]\right|}
\]
[Figure: example signals with P = 12.64, P = 6.77, P = 1.1, and P = 1.16]

MELP Vocoder: Bit Allocation Table

Parameter                        Voiced   Unvoiced
LSF coefficients                   25        25
Fourier magnitudes                  8         -
Gain (twice per frame)              8         8
Pitch period + VS1                  7         7
Voicing strengths                   4         -
Aperiodic flag                      1         -
Error protection                    -        13
Synchronization bit                 1         1
Total allocated bits               54        54

Mixed Excitation
• Mixed excitation is implemented using a multi-band mixing model
• This model can simulate frequency-dependent voicing strength
• It uses a mixture of periodic or aperiodic pulses and white noise as the excitation
• The primary effect of this unit is to reduce the buzz in broadband acoustic noise

Aperiodic Pulses
• When the input signal is voiced, the MELP vocoder can synthesize speech using either aperiodic or periodic pulses
• Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal
• They reproduce erratic glottal pulses without introducing tonal noise

Pulse Dispersion
• Pulse dispersion is implemented using a fixed pulse dispersion filter based on a flattened triangle pulse
• The pulse dispersion filter improves the match between bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance
• It spreads the excitation energy within a pitch period
• This reduces the harsh quality of the synthetic speech

Adaptive Spectral Enhancement Filter
• Based on the poles of the vocal tract filter
• Used to enhance the formant structure in the synthetic speech
• Improves the match between synthetic and natural bandpass waveforms
• Yields more natural speech output

MELP Algorithm Description (Encoder)
1. Filter out any low-frequency noise
2. Filter the result again to perform the initial pitch search for the pitch estimation
3. Perform the bandpass voicing analysis; this step decides whether to use a periodic or aperiodic pulse train or the white-noise model

MELP Algorithm Description (Encoder) cont'd
• In this stage, a voicing strength is estimated in each band, based on the normalized correlation function of the speech signal and of the smoothed rectified signal in the non-DC bands
• Let $s_k(n)$ denote the speech signal in band $k$, and let $u_k(n)$ denote the DC-removed, smoothed, rectified version of $s_k(n)$
• The correlation function:
\[
R_x(p) = \frac{\sum_{n=0}^{N-1} x(n)\,x(n+p)}{\left[\sum_{n=0}^{N-1} x^{2}(n)\,\sum_{n=0}^{N-1} x^{2}(n+p)\right]^{1/2}}
\]
• $P$: the pitch of the current frame
• $N$: the frame length
• The voicing strength for band $k$ is defined as $\max\left(R_{s_k}(P),\, R_{u_k}(P)\right)$

MELP Algorithm Description (Encoder) cont'd
• The jittery state is determined by the peakiness of the full-wave rectified LP residual $e(n)$:
\[
\text{Peakiness} = \frac{\left[\dfrac{1}{N}\sum_{n=0}^{N-1} e^{2}(n)\right]^{1/2}}{\dfrac{1}{N}\sum_{n=0}^{N-1} \left|e(n)\right|}
\]
• If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set)

MELP Algorithm Description (Encoder) cont'd
4. Apply LPC analysis
5. Calculate the final pitch estimate
6. Calculate the gain estimate
7. Quantize the LPC coefficients, pitch, gain, and bandpass voicing
8. Determine and quantize the Fourier magnitudes; the information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies

MELP Encoder
[Block diagram. Labels: Input signal; Prefilter; LPC Analysis Filter; Pitch Search; Final Pitch and Voicing Decision; Fourier Magnitude Calculation; Bandpass Voicing Decision; LSF Quantization; Gain Calculator; Quantize Gain, Pitch, Voicing, Jitter; Apply Forward Error Correction; Transmitted Bitstream]

MELP Algorithm (Decoder)
1. Decode the pitch
2. Apply gain attenuation
3. Linearly interpolate all of the synthesis parameters pitch-synchronously
4. Generate the mixed excitation

MELP Algorithm (Decoder) cont'd
5. Apply an adaptive spectral enhancement filter
6. Perform LPC synthesis and apply the gain factor
7. Apply dispersion filtering

MELP Decoder
[Block diagram. Labels: Received Bitstream; Decode Parameters; Noise Generator; Pulse Generator; Pulse Position Jitter; Noise Shaping Filter; Pulse Shaping Filter; + (summation); Adaptive Spectral Enhancement; LPC Synthesis Filter; Gain; Pulse Dispersion Filter; Synthesized Speech]

Parameter Quantization

Parameters                       Voiced   Unvoiced
LSF parameters                     25        25
Fourier magnitudes                  8         -
Gain (2 per frame)                  8         8
Pitch, overall voicing              7         7
Bandpass voicing                    4         -
Aperiodic flag                      1         -
Error protection                    -        13
Sync bit                            1         1
Total bits / 22.5 ms frame         54        54

54 bits every 22.5 ms corresponds to 54 / 0.0225 = 2400 bps.

Bit Transmission Order
[Figure: bit transmission order]

Comparison of the 2400 bps MELP with Other Standard Coders
• Diagnostic Acceptability Measure
• Two conditions: Quiet, Office
• Continuously Variable Slope Delta Modulation (CVSD), 16,000 bps
• Code Excited Linear Prediction (CELP), 4800 bps
  ○ FS1016
• Mixed Excitation Linear Prediction (MELP), 2400 bps
• Linear Predictive Coding (LPC), 2400 bps
  ○ FIPS Publication 137

Comparison of the 2400 bps MELP with Other Standard Coders (cont'd)
• Mean Opinion Score in six conditions:
  ○ Quiet: anechoic sound chamber, dynamic microphone
  ○ Quiet - H250: anechoic sound chamber, H250 microphone
  ○ 1% Random Bit Errors: anechoic sound chamber, dynamic microphone
  ○ 0.5% Random Block Errors: anechoic sound chamber, dynamic microphone, 50% errors within a 35 ms block
  ○ Office: modern office environment, dynamic microphone
  ○ Mobile Command Environment: field shelter, EV M87 microphone

Comparison of the 2400 bps MELP with Other Standard Coders (cont'd)
• Complexity, with three measurements:
  ○ RAM
  ○ ROM
  ○ MIPS

Voice Samples
• LPC 10

Voice Samples
• Original sound
• MELP 1800
• MELP 2000
• MELP 2200
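
Sketch: Encoder Voicing and Jitter Measures
The encoder slides define a normalized correlation function, a per-band voicing strength, and a peakiness measure for the LP residual. The Python below is a minimal sketch of those three formulas, not the reference implementation. The function names are hypothetical, the input is assumed to be a plain list of samples holding at least N + P values, and the jitter threshold is left as a caller-supplied argument because the slides only say "some threshold".

```python
import math


def normalized_correlation(x, p, n):
    """R_x(p) over n samples; x must hold at least n + p samples."""
    num = sum(x[i] * x[i + p] for i in range(n))
    energy_now = sum(x[i] ** 2 for i in range(n))
    energy_lag = sum(x[i + p] ** 2 for i in range(n))
    den = math.sqrt(energy_now * energy_lag)
    # Assumption: return 0 for an all-zero segment instead of dividing by zero.
    return num / den if den > 0 else 0.0


def band_voicing_strength(s_k, u_k, p, n):
    """max(R_sk(P), R_uk(P)) for one band, as defined on the slide."""
    return max(normalized_correlation(s_k, p, n),
               normalized_correlation(u_k, p, n))


def peakiness(e):
    """RMS value of the residual divided by its mean absolute value."""
    n = len(e)
    rms = math.sqrt(sum(v * v for v in e) / n)
    mean_abs = sum(abs(v) for v in e) / n
    # Assumption: an all-zero residual is treated as having zero peakiness.
    return rms / mean_abs if mean_abs > 0 else 0.0


def is_jittered(e, threshold):
    """Set the aperiodic flag when peakiness exceeds the given threshold."""
    return peakiness(e) > threshold
```

A signal of constant magnitude has peakiness 1, while a single impulse in a 160-sample window has peakiness sqrt(160), about 12.65, which is close to the largest example value shown on the peakiness slide (P = 12.64). A perfectly periodic signal evaluated at its own period gives a normalized correlation of 1.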