ch5.4 (MELP).ppt

MELP Vocoder
Outline
Introduction 
MELP Vocoder Features 
Algorithm Description 
Parameters & Comparison 
Introduction
Traditional pitch-excited LPC vocoders drive the synthesis filter with either a periodic pulse train or white noise
→ Intelligible speech at very low bit rates
But the result sometimes sounds mechanical or buzzy and is prone to tonal noise
Introduction
These problems arise from:
The inability of a simple pulse train to reproduce all kinds of voiced speech
The MELP vocoder uses a mixed-excitation model that represents a richer ensemble of speech characteristics
→ Produces more natural-sounding speech
MELP vocoder
Robust in background noise environments
Based on the traditional LPC model, with additional features:
- Mixed excitation
- Aperiodic pulses
- Pulse dispersion
- Adaptive spectral enhancement
MELP Vocoder
Encoder
[Block diagram of the encoder. Blocks: input speech; Hamming windowing; LPC calculation; LPC bandwidth expansion; LPC to LSF conversion; minimum 50 Hz LSF spacing algorithm; sorting; MSVQ over an array of vectors; quantized LSF vector; prediction error filter; prediction error; pitch calculation; final pitch calculation; voicing strength and peakiness calculation; Fourier magnitude calculation; quantization of Fourier magnitudes]
MELP Vocoder
Position of the analysis windows
MELP Vocoder
Fourier magnitude calculation
• The pulse-generation filter is responsible for producing the pulse train. This is done with an FFT over 200 samples of the signal, from which the envelope of the impulse response is extracted.
MELP Vocoder
Calculating the voicing strengths and setting the aperiodic flag
1. First estimation stage (L = 40, 41, …, 160)
2. Determine the voicing strength of the low band
3. Determine the voicing strengths of the other 4 bands
MELP Vocoder
Peakiness
\[
p = \frac{\sqrt{\dfrac{1}{160}\sum_{n=-80}^{79} e^{2}[n]}}{\dfrac{1}{160}\sum_{n=-80}^{79} \lvert e[n] \rvert}
\]
[Figure: panels labelled with peakiness values P=12.64, P=6.77, P=1.1 and P=1.16]
MELP Vocoder
Bit allocation table
Parameter                  Voiced   Unvoiced
LSF coefficients             25        25
Fourier magnitudes            8         -
Gain (2 per frame)            8         8
Pitch period + VS1            7         7
Voicing strengths             4         -
Aperiodic flag                1         -
Error protection              -        13
Synchronization bit           1         1
Total allocated bits         54        54
Mixed Excitation
Mixed excitation is implemented using a multi-band mixing model
This model can simulate frequency-dependent voicing strength
The excitation is a mixture of periodic or aperiodic pulses and white noise
The primary effect of this unit is to reduce buzz in broadband acoustic noise
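A minimal sketch of multi-band mixing, assuming five bands with edges at 0, 500, 1000, 2000, 3000 and 4000 Hz and an 8 kHz sampling rate. The band edges, the sampling rate, the frequency-domain mixing and the name `mix_excitation` are illustrative assumptions, not taken from the slides. Each band gets a voicing strength between 0 (all noise) and 1 (all pulse).

```python
import numpy as np

FS = 8000  # assumed sampling rate in Hz
BAND_EDGES_HZ = [0, 500, 1000, 2000, 3000, 4000]  # assumed band edges

def mix_excitation(pulse, noise, voicing):
    """Mix pulse and noise excitation band by band in the frequency domain.

    voicing holds one strength per band, from 0.0 (noise only) to 1.0 (pulse only).
    """
    n = len(pulse)
    freqs = np.fft.rfftfreq(n, d=1.0 / FS)
    gain = np.zeros_like(freqs)
    bands = zip(BAND_EDGES_HZ[:-1], BAND_EDGES_HZ[1:])
    for (lo, hi), strength in zip(bands, voicing):
        gain[(freqs >= lo) & (freqs < hi)] = strength
    # The Nyquist bin belongs to the top band.
    gain[freqs >= BAND_EDGES_HZ[-1]] = voicing[-1]
    mixed = gain * np.fft.rfft(pulse) + (1.0 - gain) * np.fft.rfft(noise)
    return np.fft.irfft(mixed, n)
```

With all strengths at 1 the output is the pulse train itself, and with all strengths at 0 it is the noise. Intermediate values give the frequency-dependent voicing described above.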
Aperiodic pulses
When the input signal is voiced, the MELP vocoder can synthesize speech using either aperiodic or periodic pulses
Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal
→ Erratic glottal pulses are produced without tonal noise
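One way to picture aperiodic pulses is to perturb each pitch period by a random fraction. The sketch below assumes a uniform jitter of up to 25% of the pitch period. That figure, the seed and the function name `pulse_train` are illustrative assumptions rather than values from the slides.

```python
import numpy as np

def pulse_train(n_samples, pitch_period, aperiodic, jitter=0.25, seed=0):
    """Unit pulses spaced by the pitch period, optionally with random jitter."""
    rng = np.random.default_rng(seed)
    excitation = np.zeros(n_samples)
    pos = 0.0
    while pos < n_samples:
        excitation[int(pos)] = 1.0
        period = float(pitch_period)
        if aperiodic:
            # Stretch or shrink this period by a random fraction (assumed range).
            period *= 1.0 + rng.uniform(-jitter, jitter)
        pos += period
    return excitation
```

With `aperiodic=False` the pulses are evenly spaced. With `aperiodic=True` the spacing varies from pulse to pulse, which breaks up the strict periodicity that can sound tonal.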
Pulse Dispersion
Pulse dispersion is implemented using a fixed pulse dispersion filter based on a flattened triangle pulse

The pulse dispersion filter improves the match between bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance

→ Spreads the excitation energy within a pitch period
- Reduces the harsh quality of the synthetic speech
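A minimal sketch of such a filter, reading "flattened" as spectral flattening: the triangle pulse keeps its phase but every frequency bin is set to unit magnitude. This reading, the filter length of 65 taps, the triangle width of 33 samples and both function names are assumptions made for illustration.

```python
import numpy as np

def dispersion_filter(length=65, triangle_width=33):
    """FIR impulse response of a spectrally flattened triangle pulse."""
    triangle = np.zeros(length)
    triangle[:triangle_width] = np.bartlett(triangle_width)
    spectrum = np.fft.rfft(triangle)
    # Keep the phase, force the magnitude to 1 in every bin.
    flattened = np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(flattened, n=length)

def disperse(excitation, h):
    """Apply the fixed dispersion filter, trimmed to the input length."""
    return np.convolve(excitation, h)[:len(excitation)]
```

Because the magnitude response is flat, the filter leaves the spectrum envelope alone and only smears each pulse in time, which is the energy-spreading effect described above.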
Adaptive spectral enhancement filter
Based on the poles of the vocal tract filter
Used to enhance the formant structure in the synthetic speech
This filter improves the match between synthetic and natural bandpass waveforms → more natural speech output
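A common way to build a formant-enhancing filter from the vocal tract poles is a pole-zero filter whose numerator and denominator are bandwidth-expanded copies of the LPC polynomial A(z). The sketch below follows that pattern. Whether the filter on this slide has exactly this form is an assumption, and the factors `zero_gamma` and `pole_gamma` are hypothetical values.

```python
import numpy as np

def bandwidth_expand(lpc, gamma):
    """Coefficients of A(z/gamma): coefficient k is scaled by gamma**k."""
    lpc = np.asarray(lpc, dtype=float)
    return lpc * gamma ** np.arange(len(lpc))

def pole_zero_filter(b, a, x):
    """Direct-form IIR filter; a[0] is assumed to be 1."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc
    return y

def enhance(x, lpc, zero_gamma=0.5, pole_gamma=0.8):
    """Filter x with A(z/zero_gamma) / A(z/pole_gamma)."""
    b = bandwidth_expand(lpc, zero_gamma)
    a = bandwidth_expand(lpc, pole_gamma)
    return pole_zero_filter(b, a, x)
```

Here `lpc` holds the coefficients of A(z) starting with 1. The poles sit closer to the unit circle than the zeros, so the formant peaks are sharpened while the zeros offset part of the added spectral tilt.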
MELP Algorithm Description (Encoder)
1. Filter out any low-frequency noise
2. The filtered speech is filtered again to perform the initial pitch search for pitch estimation
3. The next step is the bandpass voicing analysis
- This step decides whether to use a periodic or aperiodic pulse train or the white-noise model
MELP Algorithm Description (Encoder)
cont’d
In this stage a voicing strength is estimated in each band, based on the normalized correlation function of the speech signal and, in the non-DC bands, of the smoothed rectified signal
Let s_k(n) denote the speech signal in band k and u_k(n) the DC-removed, smoothed, rectified signal of s_k(n). The correlation function is:
\[
R_x(p) = \frac{\sum_{n=0}^{N-1} x(n)\,x(n+p)}{\left[\sum_{n=0}^{N-1} x^{2}(n)\,\sum_{n=0}^{N-1} x^{2}(n+p)\right]^{1/2}}
\]
P – the pitch of the current frame
N – the frame length
Voicing strength for band k – defined as max(R_{s_k}(P), R_{u_k}(P))
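The band voicing strength defined above can be sketched as follows. The moving-average smoother, its length `smooth_len` and the function names are assumptions made for illustration; the slides do not specify how the rectified signal is smoothed.

```python
import numpy as np

def normalized_correlation(x, p, n):
    """R_x(p) over n samples; x must hold at least n + p samples."""
    a = np.asarray(x[:n], dtype=float)
    b = np.asarray(x[p:p + n], dtype=float)
    denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

def rectified_envelope(s, smooth_len=8):
    """DC-removed, smoothed, full-wave rectified copy of s (assumed smoother)."""
    kernel = np.ones(smooth_len) / smooth_len
    env = np.convolve(np.abs(s), kernel, mode="same")
    return env - env.mean()

def band_voicing_strength(s_k, pitch, n):
    """max(R_sk(P), R_uk(P)) for one band signal s_k."""
    u_k = rectified_envelope(s_k)
    return max(normalized_correlation(s_k, pitch, n),
               normalized_correlation(u_k, pitch, n))
```

A band signal that repeats exactly every `pitch` samples yields a strength close to 1, while a noise-like band generally yields a much lower value.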


MELP Algorithm Description (Encoder) cont’d
The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n):
\[
\text{Peakiness} = \frac{\left[\dfrac{1}{N}\sum_{n=0}^{N-1} e^{2}(n)\right]^{1/2}}{\dfrac{1}{N}\sum_{n=0}^{N-1} \lvert e(n) \rvert}
\]
If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set)
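The peakiness measure is the ratio of the RMS value of the residual to its mean absolute value, so it can be sketched in a few lines. The threshold is left as a required argument because the slides do not give its value; the function names are illustrative.

```python
import numpy as np

def peakiness(residual):
    """RMS of the residual divided by its mean absolute value."""
    e = np.asarray(residual, dtype=float)
    rms = np.sqrt(np.mean(e ** 2))
    mean_abs = np.mean(np.abs(e))
    return float(rms / mean_abs) if mean_abs > 0 else 0.0

def is_jittered(residual, threshold):
    """True when the frame should be flagged as jittered (threshold is caller-supplied)."""
    return peakiness(residual) > threshold
```

A constant-magnitude residual has a peakiness of exactly 1, the lowest possible value. A single pulse in a frame of N samples gives the square root of N, so isolated strong pulses push the measure up sharply.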

MELP Algorithm Description (Encoder)
cont’d
4. Apply LPC analysis
5. Calculate the final pitch estimate
6. Calculate the gain estimate
7. Quantize the LPC coefficients, pitch, gain and bandpass voicing
8. Determine and quantize the Fourier magnitudes
- The information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies
MELP Encoder
[Block diagram. Blocks: input signal; pre-filter; LPC analysis filter; pitch search; final pitch and voicing decision; Fourier magnitude calculation; bandpass voicing decision; LSF quantization; gain calculator; quantization of gain, pitch, voicing and jitter; forward error correction; transmitted bitstream]
MELP Algorithm (Decoder)
1. Decoding the pitch
2. Applying gain attenuation
3. Interpolating all of the synthesis parameters linearly and pitch-synchronously
4. Generating the mixed excitation
MELP Algorithm (Decoder) cont’d
5. Applying an adaptive spectral enhancement filter
6. LPC synthesis and applying the gain factor
7. Dispersion filtering
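The pitch-synchronous linear interpolation in step 3 might look like the sketch below: parameters are blended between the previous and current frame according to where each pitch pulse starts. The frame length of 180 samples (22.5 ms at an assumed 8 kHz sampling rate), the dictionary layout and the function names are illustrative assumptions.

```python
import numpy as np

def interpolate_params(prev, curr, pulse_start, frame_len=180):
    """Linearly blend each parameter by the pulse position within the frame."""
    w = pulse_start / frame_len
    return {name: (1.0 - w) * np.asarray(prev[name], dtype=float)
                  + w * np.asarray(curr[name], dtype=float)
            for name in curr}

def pulse_starts(prev, curr, frame_len=180):
    """Start sample of each pitch period, stepping by the interpolated pitch."""
    starts = []
    t = 0
    while t < frame_len:
        starts.append(t)
        pitch = float(interpolate_params(prev, curr, t, frame_len)["pitch"])
        t += max(1, int(round(pitch)))
    return starts
```

Each synthesized pitch period then uses the parameter set evaluated at its own start position, so the parameters change smoothly across the frame instead of jumping at frame boundaries.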
MELP Decoder
[Block diagram. Blocks: received bitstream; decode parameters; noise generator; noise shaping filter; pulse generator; pulse position jitter; pulse shaping filter; adder (+); adaptive spectral enhancement; LPC synthesis filter; gain; pulse dispersion filter; synthesized speech]
Parameter Quantization
Parameters                     Voiced   Unvoiced
LSF parameters                   25        25
Fourier magnitudes                8         -
Gain (2 per frame)                8         8
Pitch, overall voicing            7         7
Bandpass voicing                  4         -
Aperiodic flag                    1         -
Error protection                  -        13
Sync bit                          1         1
Total bits / 22.5 ms frame       54        54
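As a quick check of the table, the per-parameter bits can be summed for each mode and divided by the 22.5 ms frame length to recover the bit rate. The dictionary keys and function names below are illustrative; the numbers are those in the table, with 0 standing in for "-".

```python
FRAME_MS = 22.5  # frame length from the table

BIT_ALLOCATION = {
    "voiced": {
        "lsf": 25, "fourier_magnitudes": 8, "gain": 8,
        "pitch_overall_voicing": 7, "bandpass_voicing": 4,
        "aperiodic_flag": 1, "error_protection": 0, "sync": 1,
    },
    "unvoiced": {
        "lsf": 25, "fourier_magnitudes": 0, "gain": 8,
        "pitch_overall_voicing": 7, "bandpass_voicing": 0,
        "aperiodic_flag": 0, "error_protection": 13, "sync": 1,
    },
}

def bits_per_frame(mode):
    """Total bits in one frame for the given mode."""
    return sum(BIT_ALLOCATION[mode].values())

def bit_rate_bps(mode):
    """Bits per second implied by the frame size and frame length."""
    return bits_per_frame(mode) * 1000.0 / FRAME_MS
```

Both modes come to 54 bits per frame, and 54 bits every 22.5 ms is 2400 bps. In unvoiced frames the 13 bits that voiced frames spend on Fourier magnitudes, bandpass voicing and the aperiodic flag go to error protection.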
Bit transmission order
Comparison of the 2400 BPS MELP with other Standard Coders
Diagnostic Acceptability Measure
Two Conditions
- Quiet
- Office
Coders compared
- Continuously Variable Slope Delta Modulation (CVSD), 16,000 bps
- Code Excited Linear Prediction (CELP), 4800 bps, FS1016
- Mixed Excitation Linear Prediction (MELP), 2400 bps
- Linear Predictive Coding (LPC), 2400 bps, FIPS Publication 137
Comparison of the 2400 BPS MELP with other Standard Coders (cont’d)
Mean Opinion Score in Six Conditions
- Quiet: anechoic sound chamber, dynamic microphone
- Quiet - H250: anechoic sound chamber, H250 microphone
- 1% Random Bit Errors: anechoic sound chamber, dynamic microphone
- 0.5% Random Block Errors: anechoic sound chamber, dynamic microphone, 50% errors within a 35 ms block
- Office: modern office environment, dynamic microphone
- Mobile Command Environment: field shelter, EV M87 microphone
Comparison of the 2400 BPS MELP with other Standard Coders (cont’d)
Complexity with three Measurements
- RAM
- ROM
- MIPS
Voice samples
LPC 10
Voice samples
Original Sound
MELP 1800
MELP 2000
MELP 2200