Assignment 9: LPC Analysis, re-synthesis and formant extraction

Linguistics 582
Basics of Digital Signal Processing
Fall 2007
Reading:
K. Johnson, Sections 2.3.1, 2.3.4, 4.5
Wakita, H. (1976). Instrumentation for the study of speech acoustics. In N. Lass (Ed.),
Contemporary Issues in Experimental Phonetics. NY: Academic Press, pp. 3-37.
Read: pp. 3-20 (omitting the section on the Cepstrum Method, pp. 7-10) and pp. 29-33.
Exercises:
(1) Record yourself producing a short phrase or sentence (with mostly voiced
consonants), using whatever audio software is available on the machine you are
using to do this assignment. Save the audio data in WAV format. Read this audio
file into MATLAB using wavread:
[myphrase, srate] = wavread('myfile');
Make sure you can play it back using:
soundsc(myphrase, srate);
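In recent MATLAB releases wavread is no longer available, and audioread plays the same role. The following is a minimal sketch under that assumption: it writes a short synthetic tone to a temporary WAV file (a stand-in for your recording; the tone and the temporary file name are purely illustrative) and reads it back.

```matlab
% Stand-in for a recorded phrase: 0.5 s of a 220 Hz tone at 16 kHz.
srate_out = 16000;
t = (0:round(0.5*srate_out)-1)' / srate_out;
tone = 0.5 * sin(2*pi*220*t);

% Write it to a temporary WAV file (hypothetical file name).
wavfile = [tempname '.wav'];
audiowrite(wavfile, tone, srate_out);

% audioread returns the samples (one column per channel) and the sampling rate.
[myphrase, srate] = audioread(wavfile);

% Uncomment to listen (needs an audio device):
% soundsc(myphrase, srate);

delete(wavfile);   % remove the temporary file
```

With your own recording, only the audioread line is needed, with your file name in place of wavfile.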
(2) Write a MATLAB function to extract the LPC filter coefficients and Gain for an
input signal.
function [A, G] = get_lpc(signal, srate, M, window_ms, slide_ms)
Output arguments:
    A            filter coefficients (M+1 rows x nframes columns)
    G            gain coefficients (vector of length nframes)
Input arguments:
    signal       input signal
    srate        sampling rate in Hz
    M            LPC filter order
    window_ms    duration of analysis window (milliseconds)
    slide_ms     number of milliseconds between successive analyses (frames)
Create a loop to go through the signal starting from the beginning. On each pass
through the loop, analyze a sequence of signal values corresponding to window_ms.
A typical value for speech is 20 milliseconds. Analyze that sequence of signal values
using the MATLAB function lpc:
[A,G] = lpc(X,M)
    A    vector of filter coefficients (length M+1; A(1)=1, see Eqs. 3, 4)
    G    gain
    X    input signal
    M    filter order
Take the real part of the vector of A coefficients, transpose it, and append it as a new
column to a matrix that you are building up. Take the real part of G and append to a
vector of Gains that you are building up.
On the next pass through the loop, advance the beginning of the analysis window by
slide_ms. A typical value for this would be 10ms. Each pass through the loop can
be thought of as determining the lpc coefficients for one frame of data. If window_ms
is longer than slide_ms (which is usually the case), then the successive frames are
overlapping.
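The frame bookkeeping described above can be sketched as follows. The sampling rate, the placeholder noise signal, and the variable names window_pts, slide_pts, and first_sample are assumptions made for illustration. Handling of a final partial window is left out, so only complete windows are counted.

```matlab
srate     = 10000;            % sampling rate in Hz (example value)
window_ms = 20;               % analysis window duration
slide_ms  = 10;               % step between successive frames
signal    = randn(1, 1000);   % placeholder signal: 100 ms of noise

% Convert durations from milliseconds to samples.
window_pts = round(window_ms * srate / 1000);   % 200 samples
slide_pts  = round(slide_ms  * srate / 1000);   % 100 samples

% Number of complete windows that fit in the signal.
nframes = floor((length(signal) - window_pts) / slide_pts) + 1;

first_sample = zeros(1, nframes);
for k = 1:nframes
    start = (k - 1) * slide_pts + 1;                 % first sample of frame k
    data  = signal(start : start + window_pts - 1);  % one window of samples
    first_sample(k) = start;                         % kept only for inspection
    % ... analyze data here ...
end
```

With these values each frame overlaps the previous one by half a window, and the last frame ends exactly at the final sample.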
One wrinkle. Usually, the analysis window is weighted so it is influenced more by
points near the middle of the window, rather than at the ends (to avoid
discontinuities at the ends). So before performing the lpc analysis on a window of
data, multiply the window of data by a Hamming weighting function, using the
MATLAB hamming function. If data is the sequence of data points you wish to
analyze, and its length (in samples) is window_pts, then transform the data as
follows:
Hamming_data = hamming(window_pts) .* data
Note: the hamming function returns a column vector, so if data is a row vector
(which it usually is), you will need to transpose the Hamming window:
Hamming_data = hamming(window_pts)' .* data
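As an illustration of one pass through the loop, the sketch below windows a single frame and appends the result to the output arrays. The frame contents (a decaying 500 Hz oscillation plus a little noise) and the values of srate and M are placeholders, and hamming and lpc require the Signal Processing Toolbox.

```matlab
srate = 10000;  M = 14;          % example values
window_pts = 200;
n = 0:window_pts-1;
% Placeholder frame (row vector): decaying 500 Hz oscillation plus weak noise.
data = exp(-0.01*n) .* cos(2*pi*500*n/srate) + 0.01*randn(1, window_pts);

w = hamming(window_pts);         % hamming returns a column vector
if isrow(data)
    w = w';                      % match the orientation of data
end
Hamming_data = w .* data;

[a, g] = lpc(Hamming_data, M);   % a is a row vector of length M+1

A = [];  G = [];                 % arrays built up across frames
A = [A, real(a)'];               % append coefficients as a new column
G = [G, real(g)];                % append the gain value
```

Checking the orientation with isrow, rather than always transposing, is one way to make the same code work for both row and column signals.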
(3) Write a MATLAB function to synthesize a signal using the filter coefficients derived
from an LPC analysis.
function [signal, t] = syn_lpc (srate,f0,frame_dur,A,G)
Output arguments:
    signal       generated signal
    t            time value of each sample
Input arguments:
    srate        sampling rate (Hz)
    f0           fundamental frequency (Hz)
    frame_dur    duration of a synthesis frame (milliseconds)
    A            filter coefficient matrix (M+1 rows x nframes cols)
    G            gain vector (length = nframes)
The structure of this function should be almost identical to syn.m that you wrote for
Assignment 8. The major difference from syn.m is that the filter coefficients are
supplied by the A matrix and the G vector, instead of being computed from formant
frequencies. (Each column of A contains the a coefficients of the filter, the denominator
of the transfer function, while the b coefficient, the numerator of the transfer
function, is set to the corresponding Gain value in G.) As in syn.m, use make_buzz to
synthesize a source function of the appropriate duration (based on the number of frames
and frame_dur). On each pass through the loop, take a sequence of source samples
corresponding to one frame, filter it through the LPC-determined filter, and
concatenate it with the output of previous frames. Note: the laryngeal filter and the
radiation filter are both built into the LPC filter; you do not need to do those steps.
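The frame-by-frame filtering can be sketched as below. This is not syn.m: since make_buzz is specific to the course, a plain impulse train stands in for the source, and the A and G values are hypothetical (three frames with one resonance each) so that the block runs on its own.

```matlab
srate = 10000;  f0 = 100;  frame_dur = 10;   % Hz, Hz, ms (example values)

% Hypothetical coefficients: 3 frames, one resonance each (M = 2).
Fres = [500 700 900];                  % resonance frequency per frame (Hz)
r = 0.97;                              % pole radius
nframes = numel(Fres);
A = zeros(3, nframes);
for k = 1:nframes
    p = r * exp(1i*2*pi*Fres(k)/srate);
    A(:, k) = real(poly([p conj(p)]))';   % [1; a1; a2] for a conjugate pole pair
end
G = ones(1, nframes);

frame_pts = round(frame_dur * srate / 1000);   % samples per frame
nsamp = nframes * frame_pts;

% Stand-in source: impulse train at f0 (make_buzz output would go here).
source = zeros(1, nsamp);
source(1 : round(srate/f0) : end) = 1;

signal = [];
for k = 1:nframes
    idx = (k-1)*frame_pts + (1:frame_pts);            % samples of frame k
    frame_out = filter(G(k), A(:, k), source(idx));   % b = G(k), a = A(:,k)
    signal = [signal, frame_out];                     % concatenate with earlier frames
end
t = (0:nsamp-1) / srate;                              % time of each sample (s)
```

In this sketch the filter starts from rest in every frame. The filter function also accepts and returns filter state (`[y, zf] = filter(b, a, x, zi)`), which could be used to carry the state across frame boundaries if the restarts are audible. Note also that MATLAB's documentation describes the second output of lpc as the prediction error variance; the sketch follows the assignment and uses G directly as the numerator.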
Test your function on the phrase you recorded (with voiced consonants) by
analyzing it with get_lpc and re-synthesizing it with syn_lpc. You should set the
value of M for get_lpc based on the length of your vocal tract. For a relatively long
vocal tract, there will be approximately one resonance every 1000 Hz, so dividing
the Nyquist frequency by 1000 gives the number of formants. Since each formant
has two coefficients (poles), this number should be multiplied by 2. Thus, the basic
number of coefficients is srate/1000 (that is, 2*Nyquist/1000). Add two more coefficient
pairs to that: one for shaping the laryngeal pulse, another for cases (like
nasalization) where there are essentially “extra” poles. For a shorter vocal tract,
reduce the number of coefficients by 2.
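As a worked example of this rule of thumb, assuming a sampling rate of 10000 Hz (the variable names are illustrative):

```matlab
srate = 10000;                   % example sampling rate in Hz
nyquist = srate / 2;             % 5000 Hz
nformants = nyquist / 1000;      % about one resonance per 1000 Hz: 5
M_basic = 2 * nformants;         % two poles per formant: 10 (= srate/1000)
M_long  = M_basic + 4;           % two extra coefficient pairs: 14
M_short = M_long - 2;            % shorter vocal tract: 12
```

If srate is not a multiple of 1000, the result would need rounding to an integer before being passed to lpc.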
(4) Write a MATLAB function that calculates formant frequencies and bandwidths from
a vector of filter coefficients, acoef.
function [F, BW] = formants(acoef, srate)
Output arguments:
    F        formant frequencies in Hz
    BW       formant bandwidths in Hz
Input arguments:
    acoef    vector of a filter coefficients
    srate    sampling rate
Calculate formant frequencies as the angle of the roots of the filter polynomial.
Convert from radians per sample to Hz. Sort into ascending order, and remove all the
negative angles. To sort into ascending order, use the MATLAB sort function.
[Y,I] = sort(X) sorts into ascending order and also returns an index matrix I.
If X is a vector, then Y = X(I).
Calculate the formant bandwidths from the magnitude of the roots. Use the sort
index I returned from sorting the formant frequencies to reorder the bandwidths.
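The root-to-formant conversion can be checked on a filter whose resonances are known in advance. In the sketch below the test frequencies and bandwidths are hypothetical, and the bandwidth formula BW = -ln|r| * srate/pi is a commonly used relation between pole radius and bandwidth that is assumed here rather than taken from the assignment.

```matlab
srate = 10000;

% Build a test filter with known resonances (hypothetical values).
F_true  = [500 1500];                       % Hz
BW_true = [80 120];                         % Hz
radius  = exp(-pi * BW_true / srate);       % pole radii for those bandwidths
poles   = radius .* exp(1i * 2*pi * F_true / srate);
acoef   = real(poly([poles conj(poles)]));  % filter polynomial coefficients

% Recover frequencies and bandwidths from the roots.
r = roots(acoef);
r = r(imag(r) > 0);                     % drop negative angles (and purely real roots)
freqs = angle(r) * srate / (2*pi);      % radians per sample -> Hz
bws   = -log(abs(r)) * srate / pi;      % pole radius -> bandwidth in Hz
[F, I] = sort(freqs);                   % ascending formant frequencies
BW = bws(I);                            % reorder bandwidths to match
```

Removing the unwanted roots before sorting keeps the index I consistent between the frequency and bandwidth vectors.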
(5) Write a MATLAB function to plot the formants from an lpc analysis.
function fplot (signal,sr,A,G)
Input arguments:
    signal    signal
    sr        sampling rate in Hz
    A         filter coefficient matrix (M+1 rows x nframes cols)
    G         gain vector (length = nframes)
Use the formants function you wrote in (4) to calculate the formant frequencies for
each frame. Plot the waveform and the first four formants in separate subplots of a
figure window. Try to get them time-aligned.
% To plot in the upper plot of a figure with two rows of plots
% and one column:
subplot(2,1,1)
plot(…) % your plot command
% To plot in the lower one:
subplot(2,1,2)
plot(…) % your plot command
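One possible way to get the two plots time-aligned is to give each its own time axis in seconds and link the x axes. The sketch below uses placeholder data (noise for the waveform, jittered constants for the formant tracks) and assumes the frames are evenly spaced over the signal duration, since the frame step is not among the arguments.

```matlab
sr = 10000;
signal  = randn(1, 5000);                        % placeholder waveform (0.5 s)
nframes = 49;
Ftracks = [500; 1500; 2500; 3500] * ones(1, nframes) ...
          + 50 * randn(4, nframes);              % placeholder formant tracks (4 x nframes)

t_wave = (0:length(signal)-1) / sr;              % time of each sample (s)
% Assumption: frame centers evenly spaced over the signal duration.
t_frame = ((1:nframes) - 0.5) * (t_wave(end) / nframes);

figure;
ax1 = subplot(2,1,1);
plot(t_wave, signal);
ylabel('Amplitude');

ax2 = subplot(2,1,2);
plot(t_frame, Ftracks', '.');                    % one column per formant
xlabel('Time (s)');
ylabel('Frequency (Hz)');

linkaxes([ax1 ax2], 'x');                        % keep the time axes in step
xlim(ax1, [0 t_wave(end)]);
```

MATLAB also ships a built-in function named fplot, so a user-defined fplot.m on the path will take precedence over it while you work on this assignment.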
It should look something like this:
[Figure: example fplot output]