A Simple Regular Pulse Excited and Multi Pulse Excited LPC

A Simple Regular Pulse Excited and Multi Pulse Excited LPC encoder-decoder
Introduction
In this project, regular pulse excited and multi pulse excited LPC encoder-decoder algorithms were
implemented and applied to a sine wave and a real speech file. All code files and speech samples are
enclosed.
This project consists of four main parts:
1) A simple multi pulse excitation algorithm for a sine wave
2) A simple regular pulse excitation algorithm for a sine wave
3) Application of step 1 to a speech wave
4) Application of step 2 to a speech wave
There are two kinds of multi pulse algorithms. The first one (step 1) recursively calculates the gain and pulse
positions, finds the error signal and repeats the same operation on the error signal. The aim is to find the
excitation signal that minimises the error.
In the second algorithm (step 2), the pulse positions for every excitation frame are predetermined. The pulse
positions are as follows.
Pulse locations for 10 excitation pulses (Frame size is 160).
L10= [1 17 33 49 66 82 98 114 130 146]
Pulse locations for 23 excitation pulses (Frame size is 160).
L23= [1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106 113 120 127 134 141 148 155]
As the locations are known, only the gains remain unknown, and they are found by solving the correlation matrix.
Algorithm
For this project, LPC synthesis and analysis filters were needed. These are the 1/A(z) and A(z) filters. The MATLAB
code for both is given below:
function sn=az160(a,e)
% 1/A(z)
% this is the synthesising block
%
% a=lpc parameters
% e=excitation signal
% signal output->sn
%
N=160;
if N==0
    N=160;
end
a=-a(2:11);
sn(1:N)=0;
for n=1:N,
    temp=0;
    for p=1:10,
        if (n-p)>0
            temp=temp+a(p)*sn(n-p);
        end
    end
    sn(n)=e(n)+temp;
end

az160.m (1/A(z) filter)

function e=iaz160(a,s)
% A(z)
% this is the analysing block
%
% a=lpc parameters
% s=original signal
% excitation output->e
%
N=160;
if N==0
    N=160;
end
a=-a(2:11);
e(1:N)=0;
for n=1:N,
    temp=0;
    for p=1:10,
        if (n-p)>0
            temp=temp+a(p)*s(n-p);
        end
    end
    e(n)=s(n)-temp;
end

iaz160.m (A(z) filter)
After the az160 and iaz160 functions were implemented, they were tested. The excitation frame from the A(z)
filter was fed into the 1/A(z) filter, and the synthesised waveform was subtracted from the original signal to find the
difference. Then only the first 10 values of the excitation frame were used to synthesise a waveform, and the difference
was plotted again.
Top to bottom: 1) Original signal, 2) Excitation from the analysis filter, 3) Synthesised waveform for the excitation signal
above, 4) Difference between 1 and 3, 5) First 10 values of the excitation, 6) Synthesised waveform from the excitation
signal in panel 5, 7) Difference between the original signal and panel 6.
The MATLAB code for this figure is code1.m
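A minimal sketch of the same round-trip test is given below for reference. It is not the enclosed code1.m; it assumes a 160-sample frame s is already in the workspace as a row vector and that MATLAB's lpc function is available.

% Round-trip test of the analysis/synthesis filters (sketch, not code1.m)
a   = lpc(s,10);                 % 10th-order LPC parameters (11 coefficients)
e   = iaz160(a,s);               % excitation from the analysis filter A(z)
sn  = az160(a,e);                % resynthesis through 1/A(z)
d1  = s - sn;                    % difference between panels 1 and 3

e10 = [e(1:10) zeros(1,150)];    % keep only the first 10 values of the excitation
sn2 = az160(a,e10);              % synthesis from the truncated excitation
d2  = s - sn2;                   % difference plotted in panel 7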
After implementing the LPC filters, the gain and positioning algorithms were decided. In the book Digital Speech
(A. M. Kondoz) there are two algorithms. The first is the recursive calculation of gain and position. The
other finds suitable pulse positions and obtains the gain values from the correlation matrix (page 163).
Algorithm 1:
The input waveform is a sine wave with a period of 160 samples.
1) Find the pulse position that minimises the squared error. (The error is the difference between the original
and the synthesised waveform.)
2) For this pulse position, find the corresponding gain value.
3) Given the pulse position and gain, synthesise its contribution using az160.m and subtract it from the error signal.
4) Repeat steps 1-3 twenty times.
Top to bottom: 1) Original signal, 2) Excitation calculated using algorithm 1, 3) Original signal (green) and synthesised
signal (blue), 4) Difference between the two signals.
The MATLAB code for this figure is code2.m
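A minimal sketch of this recursive search is given below. It is not the enclosed code2.m; the LPC vector a and the target frame s (one 160-sample period of the sine wave) are assumed to be in the workspace, and az160.m is used to obtain the contribution of each candidate pulse.

% Recursive multi pulse search (sketch, not code2.m)
N   = 160;                        % frame size
Np  = 20;                         % number of excitation pulses
err = s(:)';                      % running error (starts as the target frame)
exc = zeros(1,N);                 % excitation being built up
for k = 1:Np
    bestE = inf;
    for m = 1:N                   % step 1: try every candidate pulse position
        d = zeros(1,N); d(m) = 1;
        h = az160(a,d);           % contribution of a unit pulse at position m
        g = (err*h')/(h*h');      % step 2: optimal gain for this position
        E = sum((err - g*h).^2);  % squared error left after adding this pulse
        if E < bestE
            bestE = E; bestM = m; bestG = g;
        end
    end
    exc(bestM) = exc(bestM) + bestG;    % keep the best pulse
    d = zeros(1,N); d(bestM) = 1;
    err = err - bestG*az160(a,d);       % step 3: subtract its contribution
end
sn = az160(a,exc);                % synthesised frame from the final excitation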
Algorithm 2:
The main problem of algorithm 1 is its inaccuracy when the number of pulses per frame increases
(Kondoz, page 163). As a result, the waveform shapes match but the gains do not. In the second algorithm, the
pulse positions were given as below.
L10= [1 17 33 49 66 82 98 114 130 146]
The corresponding gain values are calculated using the correlation matrix. The algorithm consists of two steps. Again, the
input waveform is a sine wave with a period of 160 samples.
1) Extract the values of the correlation matrix for the given pulse positions and solve it for the gains.
2) Find the synthesised signal using az160.m.
Top to bottom: 1) Original signal (green circles) and synthesised signal (blue), 2) Excitation calculated using algorithm 2,
3) Difference between the two signals. As seen from the figures, the original and the synthesised waveforms match quite well.
The MATLAB code for this figure is code3.m
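A minimal sketch of the correlation matrix approach is given below. It is not the enclosed code3.m; a and s are again assumed to be in the workspace, and the gains for all ten fixed positions are found by solving one linear system, so the interaction between pulses is taken into account.

% Correlation matrix approach (sketch, not code3.m)
N   = 160;
L10 = [1 17 33 49 66 82 98 114 130 146];   % fixed pulse positions
Np  = length(L10);
H   = zeros(Np,N);
for i = 1:Np                      % step 1: response of a unit pulse at each position
    d = zeros(1,N); d(L10(i)) = 1;
    H(i,:) = az160(a,d);
end
Phi = H*H';                       % correlation matrix of the pulse responses
psi = H*s(:);                     % cross-correlation with the target frame
g   = Phi\psi;                    % all gains solved jointly
exc = zeros(1,N);
exc(L10) = g;                     % place the gains at the fixed positions
sn  = az160(a,exc);               % step 2: synthesise with az160.m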
Application of the Algorithms to Speech
After applying these algorithms to sine waves, speech samples were used to test the implementations.
Firstly, single frames were tested; afterwards, the whole speech file was encoded and re-synthesised.
Algorithm 1
The input waveform is ‘miners.au’; the speech segment is between the 950th and 1109th samples. 20
excitation pulses were calculated.
Top to bottom: 1) Original signal, 2) Excitation calculated using algorithm 1, 3) Original signal (green) and synthesised
signal (blue circles), 4) Difference between the two signals.
The MATLAB code for this figure is code4.m
The main problem of algorithm 1 appears in this figure. Algorithm 1 exaggerates peaks and valleys.
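For reference, the speech frame used in these tests could be prepared as sketched below. The file name and sample range are those given in the text; audioread and lpc are standard MATLAB functions, a mono file is assumed, and this is not the enclosed code4.m.

% Preparing the speech frame (sketch, not code4.m)
[x,fs] = audioread('miners.au');  % read the speech file (mono assumed)
s = x(950:1109)';                 % the 160-sample segment named in the text
a = lpc(s,10);                    % 10th-order LPC analysis of the frame
e = iaz160(a,s);                  % residual from the analysis filter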
Algorithm 2
The input waveform is ‘miners.au’; the speech segment is between the 950th and 1109th samples. 10
excitation pulses were calculated.
Top to bottom: 1) Original signal (blue), synthesised signal (black circles), 2) Excitation calculated using algorithm 2,
3) Difference between the two signals.
The MATLAB code for this figure is code5.m
The synthesised waveforms from the two algorithms are different. Clearly, algorithm 2 works better
than algorithm 1. In the next part the algorithms are compared.
Comparisons of the Algorithms
Although the differences between the algorithms are obvious, another piece of code was implemented to
examine their similarities and differences. In the first set of figures, the original signal, the signal from
algorithm 1, the signal from algorithm 2 and the difference between the synthesised signals are given in turn.
The performances of the algorithms are quite different: the first algorithm (recursive) is very slow, while the
second algorithm (correlation matrix) is at least ten times faster than the first.
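One way such a timing comparison could be made is sketched below; mpe_recursive and mpe_corrmatrix are hypothetical wrappers around algorithms 1 and 2, and pos stands for whichever position grid is in use.

% Hypothetical timing comparison (mpe_recursive, mpe_corrmatrix and pos are placeholders)
tic; sn1 = mpe_recursive(a,s,20);   t1 = toc;   % algorithm 1, 20 pulses
tic; sn2 = mpe_corrmatrix(a,s,pos); t2 = toc;   % algorithm 2, same number of pulses
fprintf('recursive: %.3f s, correlation matrix: %.3f s (ratio %.1f)\n',t1,t2,t1/t2);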
Top to bottom: 1) Original signal, 2) Original signal (green) and synthesised signal (blue circles) for algorithm 1 (recursive),
3) Original signal (green) and synthesised signal (blue circles) for algorithm 2 (correlation matrix), 4) Difference between the two
synthesised signals. All the calculations are for 20 excitation pulses per frame.
The MATLAB code for this figure is code6.m
In the next set of figures, excitation signals are compared. The results are quite different as well.
Top to bottom: 1) Excitation signal derived from algorithm 1,
2) Excitation signal derived from algorithm 2.
The MATLAB code for this figure is code6.m
From the figures, it is obvious that the excitation signals are different. In the first algorithm, the pulse
positions are the locations where the squared error is minimised, whereas in the second algorithm the pulse positions are
given and the gain values for these positions are calculated. The second algorithm is better because it
considers the interaction between the pulses (Kondoz, page 163): solving the correlation matrix gives all the gains jointly,
so the overlap between the responses of neighbouring pulses is taken into account, while the recursive search optimises each pulse in isolation.
Summary and Conclusion
In this project, two different multi pulse algorithms were examined. First, the LPC filters were
implemented and then the algorithms were coded. Both algorithms were tested on sine wave and speech signals, and
the results were compared. The main difference between the algorithms is the treatment of the interaction
between the pulses, which is accounted for in algorithm 2 (correlation matrix). The results show that
algorithm 2 works better both in terms of speed and waveform matching.
Appendix
The Application of Algorithms on Speech Files
The algorithms described above were used on a whole speech file. Because of the MATLAB Student Version’s
limitations, segment sizes were kept below 15000 samples. There are four extra m-files:
code7.m: This file reads ‘miners.au’ and performs algorithm 2 on it. The process takes 2
minutes. The synthesised speech is intelligible. The waveforms plotted by this program are given below. The name of
the output wave file is “code7-algorithm2-N82-E23.wav”.
The synthesised speech (above) and the original speech (below).
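A minimal sketch of the frame-by-frame loop that such a file-level coder performs is given below. It is not the enclosed code7.m; mpe_corrmatrix is a hypothetical wrapper around algorithm 2, a mono input file is assumed, and the frame count and pulse grid follow the text.

% Whole-file processing with algorithm 2 (sketch, not code7.m)
[x,fs]  = audioread('miners.au');           % mono file assumed
L23     = [1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106 113 120 127 134 141 148 155];
nFrames = 82;  N = 160;                     % 82 frames of 160 samples, as in the text
out     = zeros(1,nFrames*N);
for f = 1:nFrames
    s  = x((f-1)*N+(1:N))';                 % current frame
    a  = lpc(s,10);                         % LPC analysis per frame
    sn = mpe_corrmatrix(a,s,L23);           % hypothetical wrapper around algorithm 2
    out((f-1)*N+(1:N)) = sn;
end
audiowrite('code7-algorithm2-N82-E23.wav', out/max(abs(out)), fs);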
code8.m: This file reads ‘miners.au’ and performs algorithm 1 on it. The process takes 8
minutes. The synthesised speech is intelligible, but the quality is very low. This code was run for 62 frames. The
name of the output wave file is “code8-algorithm1-N62-E23.wav”.
code9.m: This code reads ‘boys.au’ and performs algorithm 2 on it. At the end it writes the
data to a file named “code9-N82-EXCITATION23.bin”. The first 2 bytes of the file give the number
of frames and the next 2 bytes the number of excitation pulses per frame. The rest of the file contains the LPC parameters and the
excitation gains (the pulse locations are constant, so they are not stored).
code10.m: This code reads “code9-N82-EXCITATION23.bin”, extracts the necessary
parameters, reconstructs the waveform and plays it. The waveform generated from the binary file is
“code9-boys-algorithm2-N83-E23.wav”.
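A minimal sketch of reading this binary format is given below. The 2-byte fields are read as uint16; the element type and exact per-frame layout of the LPC parameters and gains are assumptions, so the enclosed code10.m may differ in detail.

% Reading the binary file written by code9.m (sketch, layout partly assumed)
fid     = fopen('code9-N82-EXCITATION23.bin','r');
nFrames = fread(fid,1,'uint16');     % first 2 bytes: number of frames
nPulses = fread(fid,1,'uint16');     % next 2 bytes: excitation pulses per frame
for f = 1:nFrames
    a = fread(fid,11,'double')';     % assumed: 11 LPC parameters per frame, stored as double
    g = fread(fid,nPulses,'double'); % assumed: one gain per fixed pulse position
    % ...rebuild the excitation on the fixed pulse grid and synthesise with az160.m...
end
fclose(fid);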