Speech Processing (Vocoders)

Vocoders
The Channel Vocoder (analyzer):

The channel vocoder employs a bank of bandpass filters,
each having a bandwidth between 100 Hz and 300 Hz.
Typically, 16-20 linear-phase FIR filters are used.

The output of each filter is rectified and lowpass filtered.
The bandwidth of the lowpass filter is selected to
match the time variations in the characteristics of the
vocal tract.

In addition to measurement of the spectral magnitudes, a
voicing detector and a pitch estimator are
included in the speech analysis.
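The analysis path of a single channel (bandpass filter, rectifier, lowpass filter) can be sketched as follows. This is only an illustration: the Hamming-windowed sinc design, the 101-tap length, the 8 kHz sampling rate, and the 160-sample moving average standing in for the lowpass filter are all assumptions, not values from the slides.

```python
import math

def bandpass_fir(f_lo, f_hi, fs, num_taps=101):
    """Linear-phase bandpass FIR: difference of two windowed-sinc lowpass filters."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * (f_hi - f_lo) / fs
        else:
            ideal = (math.sin(2 * math.pi * f_hi * t / fs)
                     - math.sin(2 * math.pi * f_lo * t / fs)) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    return taps

def fir_filter(taps, x):
    """Direct-form FIR convolution; output has the same length as x."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * x[n - k]
        y.append(acc)
    return y

def channel_envelope(x, f_lo, f_hi, fs, smooth_len=160):
    """Bandpass filter, full-wave rectify, then smooth with a moving average."""
    band = fir_filter(bandpass_fir(f_lo, f_hi, fs), x)
    rectified = [abs(v) for v in band]
    lowpass = [1.0 / smooth_len] * smooth_len  # crude lowpass (assumption)
    return fir_filter(lowpass, rectified)

fs = 8000
tone = [math.sin(2 * math.pi * 500 * n / fs) for n in range(1600)]
in_band = channel_envelope(tone, 350, 650, fs)     # channel containing the tone
out_band = channel_envelope(tone, 1350, 1650, fs)  # channel far from the tone
```

For a 500 Hz test tone, the channel centred on 500 Hz settles to a clearly non-zero magnitude while the distant channel stays near zero; a full analyzer would run 16-20 such channels side by side.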
The Channel Vocoder (analyzer block diagram):
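[Block diagram: the input S(n) feeds a bank of bandpass filters; each filter output passes through a rectifier, a lowpass filter, and an A/D converter to the encoder. A voicing detector and a pitch detector also operate on S(n) and feed the encoder, whose output goes to the channel.]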
The Channel Vocoder (synthesizer):

At the receiver the signal samples are passed
through D/A converters.

The outputs of the D/As are multiplied by the
voiced or unvoiced signal sources.

The resulting signals are passed through
bandpass filters.

The outputs of the bandpass filters are summed
to form the synthesized speech signal.
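The choice between the voiced and unvoiced sources, and the multiplication by a decoded channel magnitude, can be sketched as below. The function names, the unit-amplitude pulse train, the Gaussian noise source, and the numeric values are illustrative assumptions; the bandpass filtering and the summation over channels are left out.

```python
import random

def excitation(num_samples, voiced, pitch_period, seed=0):
    """Pulse train when voiced, zero-mean Gaussian noise when unvoiced."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(num_samples)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(num_samples)]

def modulate(envelope, source):
    """Multiply a decoded channel magnitude by the excitation, sample by sample."""
    return [e * s for e, s in zip(envelope, source)]

voiced_src = excitation(160, voiced=True, pitch_period=40)
unvoiced_src = excitation(160, voiced=False, pitch_period=40)

# Input to one channel's bandpass filter, for a constant magnitude of 0.5.
channel_in = modulate([0.5] * 160, voiced_src)
```

Each channel would apply its own magnitude to the same excitation before its bandpass filter.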
The Channel Vocoder (synthesizer block diagram):
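[Block diagram: the decoder takes the signal from the channel and drives one D/A converter per channel. A switch controlled by the voicing information selects either a pulse generator, driven by the pitch period, or a random noise generator as the excitation. Each D/A output is multiplied by the excitation and passed through a bandpass filter, and the filter outputs are summed (∑) to give the output speech.]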
The Phase Vocoder :

The phase vocoder is similar to the
channel vocoder.

However, instead of estimating the pitch,
the phase vocoder estimates the phase
derivative at the output of each filter.

By coding and transmitting the phase
derivative rather than the phase itself, this vocoder
discards the absolute phase; the synthesizer recovers
the phase only up to a constant by integration.
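As a rough illustration of the estimate, suppose the two lowpass-filtered quadrature signals of channel k, written here as a_k(n) and b_k(n), are already available. The short-term magnitude is sqrt(a² + b²), and the derivative of the phase atan(b/a) is (a b' - b a') / (a² + b²). The sketch below approximates the derivatives with central differences; the test signal and the sign convention for b are assumptions.

```python
import math

def magnitude_and_phase_derivative(a, b):
    """Short-term magnitude and phase derivative from quadrature signals a, b."""
    mags, dphis = [], []
    for n in range(1, len(a) - 1):
        da = (a[n + 1] - a[n - 1]) / 2.0  # central-difference differentiator
        db = (b[n + 1] - b[n - 1]) / 2.0
        power = a[n] ** 2 + b[n] ** 2
        mags.append(math.sqrt(power))
        dphis.append((a[n] * db - b[n] * da) / power if power > 0 else 0.0)
    return mags, dphis

# Quadrature pair whose phase advances by delta radians per sample.
delta = 0.05
a = [0.5 * math.cos(delta * n) for n in range(200)]
b = [0.5 * math.sin(delta * n) for n in range(200)]
mags, dphis = magnitude_and_phase_derivative(a, b)
```

Here the magnitude stays at 0.5 and the estimated phase derivative stays close to delta, which is the frequency offset of the component from the channel centre.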
The Phase Vocoder (analyzer block diagram):
cos k n
Short-term
magnitude
cos  n
Lowpass
Filter
k
ak n
Differentiator
Differentiator
k
sin k n
bk n
Compute
Short-term
Magnitude
And
Phase
Derivative
Encoder
S(n)
Lowpass
cos  n
Filter
sin k n
Decimator
To
Channel
Decimator
Short-term phase
derivative
The Phase Vocoder
(synthesizer block diagram, kth channel):
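[Block diagram: the decoder recovers the decimated short-term amplitude and the decimated short-term phase derivative from the channel. The phase derivative is integrated and its cosine and sine are formed; the cos and sin branches are interpolated, multiplied by cos ω_k n and sin ω_k n, and combined (∑).]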
The Formant Vocoder :

The formant vocoder can be viewed as a
type of channel vocoder that estimates the
first three or four formants in a segment of
speech.

It is this information plus the pitch period
that is encoded and transmitted to the
receiver.
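At the receiver, each transmitted formant can be turned back into a spectral peak. One common textbook choice, which is an assumption here and is not stated on the slides, is a two-pole resonator with pole radius r = exp(-πB/fs) and pole angle θ = 2πF/fs for a formant of frequency F and bandwidth B. The 500 Hz, 60 Hz, and 8 kHz values are illustrative.

```python
import cmath
import math

def resonator_coeffs(formant_hz, bandwidth_hz, fs):
    """Feedback coefficients of y[n] = x[n] + c1*y[n-1] + c2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / fs)
    theta = 2 * math.pi * formant_hz / fs
    return 2 * r * math.cos(theta), -r * r

def resonate(x, formant_hz, bandwidth_hz, fs):
    """Run an excitation through a single formant resonator."""
    c1, c2 = resonator_coeffs(formant_hz, bandwidth_hz, fs)
    y1 = y2 = 0.0
    out = []
    for v in x:
        y = v + c1 * y1 + c2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def gain_at(freq_hz, formant_hz, bandwidth_hz, fs):
    """Magnitude response of the resonator at freq_hz."""
    c1, c2 = resonator_coeffs(formant_hz, bandwidth_hz, fs)
    z = cmath.exp(-1j * 2 * math.pi * freq_hz / fs)
    return 1.0 / abs(1 - c1 * z - c2 * z * z)

# Locate the peak of the response on a 10 Hz grid.
peak_hz = max(range(100, 4000, 10), key=lambda f: gain_at(f, 500, 60, 8000))
impulse_response = resonate([1.0] + [0.0] * 99, 500, 60, 8000)
```

The response peaks at about the requested formant frequency; three or four such resonators, one per formant, would shape the excitation.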
The Formant Vocoder :

Example of formants:
(a): The spectrogram of the utterance “day one”
showing the pitch and the harmonic structure of speech.
(b): A zoomed spectrogram of the fundamental and
the second harmonic.
The Formant Vocoder (analyzer block diagram):
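[Block diagram: the input speech is analyzed to give the formant frequencies F1, F2, F3 and bandwidths B1, B2, B3; a pitch and V/U block (labelled “Decoder” on the slide) supplies the voiced/unvoiced decision V/U and the pitch F0.]
Fk: the frequency of the kth formant
Bk: the bandwidth of the kth formant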
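The Formant Vocoder (synthesizer block diagram):
[Block diagram: V/U and F0 control the excitation signal, which drives formant filters F3, F2, and F1 set by (F3, B3), (F2, B2), and (F1, B1); a summing node (∑) also appears in the diagram.]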
Linear Predictive Coding :

The objective of LP analysis is to estimate
parameters of an all-pole model of the vocal
tract.

Several methods have been devised for
generating the excitation sequence for speech
synthesis.

LPC-type speech analysis and synthesis methods
differ primarily in the type of excitation signal that
is generated for speech synthesis.
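One standard way to estimate the all-pole parameters is the autocorrelation method solved with the Levinson-Durbin recursion. The slides do not say which method is intended, so the sketch below is an assumption, and the second-order test filter is made up for the check.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor a[1..order] such that x[n] is approximated by sum a[k]*x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err                       # reflection coefficient
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= (1.0 - k_i * k_i)              # prediction error energy
    return a[1:], err

# Impulse response of a known all-pole filter: h[n] = 1.5 h[n-1] - 0.81 h[n-2] + d[n].
true_coeffs = [1.5, -0.81]
h = []
for n in range(400):
    value = 1.0 if n == 0 else 0.0
    if n >= 1:
        value += true_coeffs[0] * h[n - 1]
    if n >= 2:
        value += true_coeffs[1] * h[n - 2]
    h.append(value)

coeffs, err = levinson_durbin(autocorrelation(h, 2), 2)
```

The recovered coefficients match the filter that generated the signal, which is the sense in which LP analysis estimates the model.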
LPC 10 :

This method is called LPC-10 because
10 coefficients are typically employed.

LPC-10 partitions the speech into
180-sample frames.

Pitch and voicing decisions are determined
by using the AMDF and zero-crossing
measures.
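The two measures can be sketched on a single 180-sample frame as below. The AMDF (average magnitude difference function) has a deep minimum at the pitch period, while a high zero-crossing count suggests unvoiced speech. The lag range of 20 to 156 samples and the synthetic test frames are assumptions made for this example.

```python
import math
import random

def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by lag."""
    n_terms = len(frame) - lag
    return sum(abs(frame[n] - frame[n + lag]) for n in range(n_terms)) / n_terms

def estimate_pitch_period(frame, min_lag=20, max_lag=156):
    """Lag (in samples) at which the AMDF is smallest."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))

def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for p, q in zip(frame, frame[1:]) if (p >= 0) != (q >= 0))

FRAME = 180
# Synthetic "voiced" frame with an 80-sample period, and a noise frame.
voiced = [math.sin(2 * math.pi * n / 80) + 0.5 * math.sin(4 * math.pi * n / 80)
          for n in range(FRAME)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME)]

period = estimate_pitch_period(voiced)
voiced_zc = zero_crossings(voiced)
noise_zc = zero_crossings(noise)
```

A practical voicing decision would compare these measures against thresholds; none are given on the slide, so none are assumed here.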
Residual Excited LP Vocoder :

Speech quality can be
improved at the expense of a higher bit
rate by computing and transmitting a
residual error, as done in the case of DPCM.

In one method, the LPC model and
excitation parameters are first estimated from
a frame of speech.
Residual Excited LP Vocoder :

The speech is synthesized at the transmitter and
subtracted from the original speech signal to
form the residual error.

The residual error is quantized, coded, and
transmitted to the receiver.

At the receiver the signal is synthesized by
adding the residual error to the signal generated
from the model.
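The residual path can be sketched with a uniform quantizer, under the assumption that the transmitter and receiver generate the same model output. The sample values, the model output, and the step size below are hypothetical numbers chosen only to show the arithmetic.

```python
def quantize(values, step):
    """Uniform quantizer: round each value to the nearest multiple of step."""
    return [step * round(v / step) for v in values]

def relp_encode(speech, model_output, step):
    """Transmitter: residual = original minus model synthesis, then quantize."""
    residual = [s - m for s, m in zip(speech, model_output)]
    return quantize(residual, step)

def relp_decode(model_output, coded_residual):
    """Receiver: add the decoded residual to the model synthesis."""
    return [m + r for m, r in zip(model_output, coded_residual)]

speech = [0.10, 0.42, 0.77, 0.51, -0.08, -0.46]        # hypothetical samples
model_output = [0.00, 0.40, 0.70, 0.60, 0.00, -0.50]   # hypothetical LP synthesis
step = 0.05

coded = relp_encode(speech, model_output, step)
decoded = relp_decode(model_output, coded)
max_error = max(abs(s - d) for s, d in zip(speech, decoded))
```

The reconstruction error is bounded by half the quantizer step, so a finer step buys quality at a higher bit rate.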
RELP Block Diagram :
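[Block diagram: S(n) passes through a buffer and window to LP analysis, which supplies the LP parameters to the encoder and to the LP synthesis model. The model, driven by the excitation parameters, produces synthesized speech that is subtracted (∑) from the buffered speech to form the residual, which is also encoded and sent to the channel.]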
Code Excited LP :

CELP is an analysis-by-synthesis method
in which the excitation sequence is
selected from a codebook of zero-mean
Gaussian sequences.

The bit rate of the CELP is 4800 bps.
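A stripped-down version of the analysis-by-synthesis search is sketched below: every codebook sequence is passed through the LP synthesis filter, the best gain is computed, and the entry with the smallest squared error is kept. The pitch synthesis filter and the perceptual weighting filter are deliberately omitted, and the codebook size, sequence length, and LP coefficients are assumptions.

```python
import random

def lp_synthesize(excitation, lp_coeffs):
    """All-pole synthesis: y[n] = e[n] + sum_k a[k] * y[n-k]."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc += a * y[n - k]
        y.append(acc)
    return y

def search_codebook(target, codebook, lp_coeffs):
    """Return (index, gain, error) of the entry that best matches target."""
    best = None
    for index, code in enumerate(codebook):
        synth = lp_synthesize(code, lp_coeffs)
        energy = sum(v * v for v in synth)
        gain = sum(t * v for t, v in zip(target, synth)) / energy  # least-squares gain
        error = sum((t - gain * v) ** 2 for t, v in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best

rng = random.Random(1)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(64)]
lp_coeffs = [1.5, -0.81]

# Build a target from a known entry and gain, then check the search finds them.
target = [2.0 * v for v in lp_synthesize(codebook[7], lp_coeffs)]
index, gain, error = search_codebook(target, codebook, lp_coeffs)
```

Only the winning index and gain need to be transmitted, because the receiver holds the same codebook.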
CELP (analysis-by-synthesis coder) :
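[Block diagram: speech samples enter a buffer and LP analysis stage, which supplies the LP parameters as side information. Sequences from the Gaussian excitation codebook are scaled by a gain and passed through the pitch synthesis filter and the spectral envelope (LP) synthesis filter. The result is subtracted (∑) from the speech samples, the difference is filtered by the perceptual weighting filter W(z), and the energy of the error (square and sum) is computed to select the index of the excitation sequence.]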
CELP (synthesizer) :
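[Block diagram: the decoder output from the channel goes to a buffer and controller, which selects a sequence from the Gaussian excitation codebook and supplies the LP parameter, gain, and pitch estimate updates. The selected sequence drives the pitch synthesis filter followed by the LP synthesis filter.]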
Vector Sum Excited LP :

The VSELP coder and decoder differ from CELP basically in
the method by which the excitation sequence is
formed.

In the block diagram of the VSELP decoder, there are
three excitation sources.

One excitation is obtained from the state of the
long-term (pitch) filter.

The other two excitation sources are obtained
from two codebooks.
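How the three sources might be combined is sketched below. In a vector-sum codebook, each codevector is a sum of basis vectors weighted by +1 or -1 according to the bits of the codeword. The slides do not describe this, so the 7 basis vectors per codebook, the 40-sample length, and the gain values are assumptions; 7 bits per codebook would account for 14 bits of excitation codewords per subframe.

```python
import random

def vector_sum_codevector(basis, index):
    """Sum of basis vectors, each weighted +1 or -1 by one bit of index."""
    length = len(basis[0])
    out = [0.0] * length
    for m, vec in enumerate(basis):
        sign = 1.0 if (index >> m) & 1 else -1.0
        for n in range(length):
            out[n] += sign * vec[n]
    return out

def vselp_excitation(pitch_state, code1, code2, gains):
    """Weighted sum of the long-term filter state and two codevectors."""
    beta, g1, g2 = gains
    return [beta * p + g1 * c1 + g2 * c2
            for p, c1, c2 in zip(pitch_state, code1, code2)]

rng = random.Random(2)
basis1 = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(7)]
basis2 = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(7)]
pitch_state = [rng.gauss(0.0, 1.0) for _ in range(40)]  # stand-in for past excitation

code1 = vector_sum_codevector(basis1, 0b0101101)
code2 = vector_sum_codevector(basis2, 0b1110001)
excitation = vselp_excitation(pitch_state, code1, code2, (0.8, 0.5, 0.3))

# Complementing every bit flips the sign of the codevector.
all_minus = vector_sum_codevector(basis1, 0b0000000)
all_plus = vector_sum_codevector(basis1, 0b1111111)
```

With this structure only the basis vectors have to be stored, and the 128 codevectors of each codebook come in sign-flipped pairs.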
Vector Sum Excited LP :

The bit rate of VSELP is about 8000 bps.

Bit allocations for 8000-bps VSELP:

Parameters                                       Bits/5-ms frame   Bits/20 ms
10 LPC coefficients                              -                 38
Average speech energy                            -                 5
Excitation codewords from two VSELP codebooks    14                56
Gain parameters                                  8                 32
Lag of pitch filter                              7                 28
Total                                            29                159
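The "about 8000 bps" figure can be checked directly from the bit allocation: 159 bits every 20 ms. The dictionary keys below are shorthand invented for this check.

```python
# Bits per 20 ms, taken from the bit allocation for 8000-bps VSELP.
BITS_PER_20_MS = {
    "lpc_coefficients": 38,
    "average_speech_energy": 5,
    "excitation_codewords": 56,
    "gain_parameters": 32,
    "pitch_filter_lag": 28,
}

# Bits sent once per 5-ms frame (four such frames in 20 ms).
BITS_PER_5_MS = {"excitation_codewords": 14, "gain_parameters": 8, "pitch_filter_lag": 7}

total_bits = sum(BITS_PER_20_MS.values())
per_frame_bits = sum(BITS_PER_5_MS.values())
bit_rate_bps = total_bits * 1000 // 20   # bits per second
```

The totals come to 159 bits per 20 ms and 29 bits per 5-ms frame, giving 7950 bps, which the slide rounds to about 8000 bps.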
VSELP Decoder :
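[Block diagram: three excitation sources, the long-term filter state and codebooks 1 and 2, are each scaled and summed (∑). The sum drives the pitch synthesis filter, then the spectral envelope (LP) synthesis filter, then the spectral post filter, which outputs the synthetic speech.]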