Linguistics 582 Basics of Digital Signal Processing Fall 2007 Assignment 9: LPC Analysis, re-synthesis and formant extraction Reading: K. Johnson, Sections 2.3.1, 2.3.4, 4.5 Wakita, H. (1976). Instrumentation for the study of speech acoustics. In Lass, N. , Contemporary Issues in Experimental Phonetics, NY: Academic Press., pp. 3-37. READ:pp. 3-20, leaving out section on Cepstrum Method (pp. 7-10) pp. 29-33 Exercises: (1) Record yourself producing a short phrase or sentence (with mostly voiced consonants), using whatever audio software is available on the machine you are using to do this assignment. Save the audio data in WAV format. Read this audio file into MATLAB using wavread: [myphrase, srate] = wavread(‘myfile’); Make sure you can play it back using: soundsc(myphrase, srate); (2) Write a MATLAB function to extract the LPC filter coefficients and Gain for an input signal. function [A, G] = get_lpc(signal, srate, M, window_ms, slide_ms) Output arguments: A filter coefficients (M+1 rows x nframes columns) G Gain coefficients (vector length = nframes) Input arguments: signal input signal srate sampling rate in Hz M LPC filter order window_ms duration of analysis window (milliseconds) slide_ms number of milliseconds between successive analyses (frames) Create a loop to go through the signal starting from the beginning. On each pass through the loop, analyze a sequence of signal values corresponding to window_ms. A typical value for speech is 20 milliseconds. Analyze that sequence of signal values using the MATLAB function lpc: [A,G] = LPC(X,M) A G X M vector of filter coefficients (length M+1; A(1)=1, see Eqs.3,4) gain input signal filter order Take the real part of the vector of A coefficients, transpose it, and append it as a new column to a matrix that you are building up. Take the real part of G and append to a vector of Gains that you are building up. On the next pass through the loop, advance the beginning of the analysis window by slide_ms. A typical value for this would be 10ms. Each pass through the loop can be thought of as determining the lpc coefficients for one frame of data. If window_ms is longer than slide_ms (which is usually the case), then the successive frames are overlapping. One wrinkle. Usually, the analysis window is weighted so it is influenced more by points near the middle of the window, rather than at the ends (to avoid discontinuities at the ends). So before performing the lpc analysis on a window of data, multiply the window of data by a Hamming weighting function, using the MATLAB hamming function. If data is the sequence of data points you wish to analyze, and its length (in samples) is window_pts, then transform the data as follows: Hamming_data = hamming(window_pts) .* data Note: the hamming function returns a column vector, so if data is a row vector (which it usually is), you will need to tranpose the hamming window: Hamming_data = hamming(window_pts)’ .* data (3) Write a MATLAB function to synthesize a signal using the filter coefficients derived from the an LPC analysis. function [signal, t] = syn_lpc (srate,f0,frame_dur,A,G) Output arguments: signal generated signal t time value of each sample Input arguments: srate sampling rate (Hz) f0 fundamental frequency (Hz) frame_dur duration of a synthesis frame (milliseconds) A filter coefficient matrix (M+1 rows x nframes cols) G gain vector (length = nframes) The structure of this function should be almost identical to syn.m that you wrote for Assignment 8. The major difference from syn.m is that the filter coefficients are supplied by the A and G matrices, instead of being computed from formant frequencies. (The A matrix contains the a coefficients of the filter, the denominator of the transfer function, while the b coefficient, the numerator of the transfer function is set to the Gain parameter, G). As in syn.m, use make_buzz to synthesize a source function of the appropriate duration (based on the number of frames and the frame_dur). On each pass through the loop, take a sequence of source samples corresponding to one frame, filter it through the lpc-determined filter and concatenate it with the output of previous frames. Note: the laryngeal filter and the radiation filter are all built into the lpc filter; you do not need to do those steps. Test your function on the phrase you recorded (with voiced consonants) by analyzing it with get_lpc, and re-synthesizing it with syn_lpc. You should set the value of M for get_lpc based on the length of your vocal tract. For a relatively long vocal tract, there will be approximately one resonance every 1000 Hz, so dividing the Nyquist frequency by 1000 gives the number of formants. Since each formant has two coefficients (poles), this number should be multiplied by 2. Thus, the basic number of coefficients is srate/1000 or (2*Nyquist/1000). Add two more coefficient pairs to that—one for shaping the laryngeal pulse, another for cases (like nasalization) where there are essentially “extra” poles. For a shorter vocal tract, reduce the number coefficients by 2. (4) Write a MATLAB function that calculates formant frequencies and bandwidths from a vector of filter coefficients, acoef. function [F, BW] = formants (acoef, srate) Output arguments: F formant frequencies in Hz BW formant bandwidths in Hz Input arguments: acoef vector of a filter coefficients srate sampling rate Calculate formant frequencies as the angle of the roots of the filter polynomial. Convert from radians per sample to Hz. Sort into ascending order, and remove all the negative angles. To sort into ascending order, use the MATLAB sort function. [Y,I] = SORT(X) sorts into ascending order and also returns an index matrix I. If X is a vector, then Y = X(I). Calculate the formant bandwidths from the magnitude of the roots. Use the sort index I returned from sorting the formant frequencies to reorder the bandwidths. (5) Write a MATLAB function to plot the formants from an lpc analysis. function fplot (signal,sr,A,G) Input arguments: signal sr A G signal sampling rate in Hz filter coefficient matrix (M+1 rows x nframes cols) gain vector (length = nframes) Use the formants function you wrote in (4) to calculate the formant frequencies for each frame. Plot the waveform and the first four formants in separate subplots of a figure window. Try to get them time-aligned. % To plot in the upper of plot of a figure with two rows of plots % and one colum: subplot (2,1,1) plot (…) % your plot command% To plot in the lower one: subplot (2,1,2) plot(…) % your plot commannd It should look something like this: QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture.