Linear Prediction Coding (LPC)

• History: originally developed to compress (code) speech
• Broader implications
  – Models the harmonic resonances of the vocal tract
  – Provides features useful for speech recognition
  – Forms part of speech synthesis algorithms
  – Acts as an IIR filter that can remove noise from a signal
• Concept
  – Predict samples with linear combinations of previous values to create an LPC signal
  – The residue, or error, is the difference between the signal and the prediction

LPC Calculations

• Predict the value of the next sample: ŝ[n] = ∑k=1,P ak s[n−k]
  – P is the LPC order
  – For an accurate vocal tract model: P = (sample rate / 1000) + 2
  – The LPC algorithm computes the ak coefficients
• The error signal e[n] is called the LPC residual:
  e[n] = s[n] − ŝ[n] = s[n] − ∑k=1,P ak s[n−k]
• Goal: find the ak coefficients that minimize the LPC residual

Linear Predictive Compression (LPC)

• Concept
  – For a frame of the signal, find the optimal coefficients for predicting the next values from sets of previous values
  – Instead of outputting the actual data, output the residual plus the coefficients
  – Fewer bits are needed, which results in compression
• Pseudocode

    WHILE not EOF
        READ frame of signal
        compute the LPC coefficients for the frame
        prediction = predict(frame)      // apply the coefficients
        error = frame − prediction       // the residual
        WRITE LPC coefficients
        WRITE error

Linear Algebra Background

• N linearly independent equations, P unknowns
• If N < P, there are infinitely many potential solutions
  – x + y = 5: one equation, two unknowns; solutions lie along the line y = 5 − x
• If N = P, there is at most one unique solution
  – x + y = 5 and x − y = 3: solution x = 4, y = 1
• If N > P, there is generally no exact solution
  – No solution for: x + y = 4, x − y = 3, 2x + y = 7
  – The best we can do is find the closest fit

Least Squares: Minimize the Error

• First approach (linear algebra): find orthogonal projections of vectors onto the best fit
• Second approach (calculus): find where the derivative has zero slope to locate the best fit

Numerical Algorithms for Solving n Equations in n Unknowns

• Gaussian elimination: complexity O(n³)
• Successive iteration: complexity varies
• Cholesky decomposition: more efficient, but still O(n³)
• Levinson-Durbin: complexity O(n²); requires symmetric Toeplitz matrices

Definitions for Any Matrix A

• Transpose (Aᵀ): replace every aij by aji
• Symmetric: Aᵀ = A
• Toeplitz: each diagonal descending to the right has equal values
• Lower/upper triangular: no nonzero values above/below the diagonal

Symmetric Toeplitz Matrices

• Flipping rows and columns produces the same matrix
• Every diagonal descending to the right contains the same value

Levinson-Durbin Algorithm: Worked Example

Solve the symmetric Toeplitz system whose first row is {1, 2, 3, 4} (r0 = 1, r1 = 2, r2 = 3, r3 = 4) with right-hand side {2, 3, 4, 5} (r1 through r4, where r4 = 5):

  Step 0: E0 = 1 [r0, initial value]
  Step 1: k1 = 2 [r1/E0]
          a11 = 2 [k1]
          E1 = −3 [(1 − k1²)E0]
  Step 2: k2 = 1/3 [(r2 − a11 r1)/E1]
          a21 = 4/3 [a11 − k2 a11]; a22 = 1/3 [k2]
          E2 = −8/3 [(1 − k2²)E1]
  Step 3: k3 = 1/4 [(r3 − a21 r2 − a22 r1)/E2]
          a31 = 5/4 [a21 − k3 a22]; a32 = 0 [a22 − k3 a21]; a33 = 1/4 [k3]
          E3 = −5/2 [(1 − k3²)E2]
  Step 4: k4 = 1/5 [(r4 − a31 r3 − a32 r2 − a33 r1)/E3]
          a41 = 6/5 [a31 − k4 a33]; a42 = 0 [a32 − k4 a32]; a43 = 0 [a33 − k4 a31]; a44 = 1/5 [k4]
          E4 = −12/5 [(1 − k4²)E3]

Verify the results by plugging a41, a42, a43, a44 back into the equations:

  6/5(1) + 0(2) + 0(3) + 1/5(4) = 2        6/5(2) + 0(1) + 0(2) + 1/5(3) = 3
  6/5(3) + 0(2) + 0(1) + 1/5(2) = 4        6/5(4) + 0(3) + 0(2) + 1/5(1) = 5

Levinson-Durbin Pseudocode

    E[0] = r[0]
    FOR step = 1 TO P
        k = r[step]
        FOR i = 1 TO step−1
            k = k − a[step−1][i] * r[step−i]
        k = k / E[step−1]
        E[step] = (1 − k²) * E[step−1]
        a[step][step] = k
        FOR i = 1 TO step−1
            a[step][i] = a[step−1][i] − k * a[step−1][step−i]

  Note: the r[i] are the first-row coefficients of the matrix (r[1..P] also form the right-hand side, as in the worked example above).
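The recursion above translates directly into Python. The following is a minimal sketch (NumPy assumed; the function name and array layout are my own, not from the slides); running it on the worked example's values reproduces a41..a44 = 6/5, 0, 0, 1/5 and E4 = −12/5.

```python
import numpy as np

def levinson_durbin(r, P):
    """Solve the symmetric Toeplitz system with first row r[0..P-1]
    and right-hand side r[1..P], using the recursion above."""
    a = np.zeros(P + 1)   # a[1..P] hold the coefficients; a[0] is padding
    E = r[0]              # step 0: E0 = r0
    for step in range(1, P + 1):
        # k_step = (r_step - sum_{i=1..step-1} a_i * r_{step-i}) / E_{step-1}
        k = (r[step] - np.dot(a[1:step], r[step - 1:0:-1])) / E
        prev = a.copy()
        for i in range(1, step):
            # a_{step,i} = a_{step-1,i} - k_step * a_{step-1,step-i}
            a[i] = prev[i] - k * prev[step - i]
        a[step] = k                 # a_{step,step} = k_step
        E *= 1.0 - k * k            # E_step = (1 - k_step^2) * E_{step-1}
    return a[1:], E

a, E = levinson_durbin(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), P=4)
print(a, E)   # [1.2  0.  0.  0.2] -2.4  ->  a = (6/5, 0, 0, 1/5), E4 = -12/5
```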
Cholesky Decomposition

• Requirements
  – Symmetric (flipping rows and columns gives the same matrix)
  – Positive definite: a real matrix A is positive definite if and only if xᵀAx > 0 for all x ≠ 0
• Solution
  – Factor matrix A into A = LLᵀ, where L is lower triangular
  – Forward substitution: solve L[xi] = [bk] for the intermediate vector [xi]
  – Backward substitution: use the resulting vector [xi] to solve Lᵀ[ak] = [xi] for the coefficients [ak]
• Complexity
  – Factoring step: O(n³/3)
  – Forward and backward substitution: O(n²)

Cholesky Factorization Pseudocode

    FOR k = 1 TO n−1
        l[k][k] = sqrt(a[k][k])
        FOR j = k+1 TO n
            l[j][k] = a[j][k] / l[k][k]
        FOR j = k+1 TO n
            FOR i = j TO n
                a[i][j] = a[i][j] − l[i][k] * l[j][k]
    l[n][n] = sqrt(a[n][n])

  • Column index: k
  • Row indices: i, j
  • Elements of matrix A: a[i][j]
  • Elements of matrix L: l[i][j]

Illustration: Linear Prediction

Signal: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}

• Goal: estimate yn using the three previous values: yn ≈ a1 yn−1 + a2 yn−2 + a3 yn−3
• Three ak coefficients and a frame size of 16
• Thirteen equations and three unknowns

LPC Basics

• Predict yn from yn−1, …, yn−P
  – en = yn − ∑k=1,P ak yn−k
  – en is the error between the prediction and the actual value
  – The goal is to find the coefficients that produce the smallest en values
• Concept
  – Square the error
  – Take the partial derivative with respect to each ak
  – Optimize (the minimum has a derivative of zero)
  – Result: P equations and P unknowns
  – Solve using either the Cholesky or the Levinson-Durbin algorithm

LPC Derivation

• One linear prediction equation: en = yn − ∑k=1,P ak yn−k
  – Over a whole frame there are N equations and P unknowns
• Sum the squared error over the entire frame:
  E = ∑n=0,N−1 (yn − ∑k=1,P ak yn−k)²
• Take the partial derivative with respect to each aj (like a regular derivative, treating only aj as a variable); this generates P equations:
  ∂E/∂aj = −2 ∑n=0,N−1 (yn − ∑k=1,P ak yn−k) yn−j
  (calculus chain rule: if y = y(u(x)), then dy/dx = dy/du * du/dx)
• Set each derivative to zero to find the minimum; for j = 1 to P:
  0 = ∑n=0,N−1 (yn − ∑k=1,P ak yn−k) yn−j   (j indicates the equation)
• Rearrange terms for each j of the P equations:
  ∑n=0,N−1 yn yn−j = ∑n=0,N−1 ∑k=1,P ak yn−k yn−j = ∑k=1,P ak ∑n=0,N−1 yn−k yn−j
• Yule-Walker equations: if φ(j,k) = ∑n=0,N−1 yn−k yn−j, then φ(j,0) = ∑k=1,P ak φ(j,k)
• Result: P equations and P unknowns ak, with one solution for the best prediction

LPC: the Covariance Method

• From the derivation: φ(j,k) = ∑n=0,N−1 yn−k yn−j
• Equation j: φ(j,0) = ∑k=1,P ak φ(j,k)
• Now we have P equations and P unknowns
• Because φ(j,k) = φ(k,j), the matrix is symmetric
• The solution requires O(n³) operations (e.g., Cholesky decomposition)
• Why "covariance"? The method is not probabilistic, but the matrix has the same form as a covariance matrix
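Before the worked example that follows, here is a minimal Python sketch of the covariance method (NumPy/SciPy assumed; the function name and indexing are illustrative, not from the slides). It builds φ straight from the definition and then applies the Cholesky factor/forward/backward steps described above; on the frame below it produces the same φ values worked out in the example.

```python
import numpy as np
from scipy.linalg import solve_triangular

def lpc_covariance(y, start, N, P):
    """Covariance-method LPC for the frame y[start : start+N].
    Requires P samples of history before `start`; returns a1..aP."""
    n = np.arange(start, start + N)

    # phi(j, k) = sum over the frame of y[n-k] * y[n-j]
    def phi(j, k):
        return float(np.dot(y[n - k], y[n - j]))

    # P equations in P unknowns: phi(j, 0) = sum_k a_k * phi(j, k)
    A = np.array([[phi(j, k) for k in range(1, P + 1)] for j in range(1, P + 1)])
    b = np.array([phi(j, 0) for j in range(1, P + 1)])

    L = np.linalg.cholesky(A)                  # factor: A = L L^T
    x = solve_triangular(L, b, lower=True)     # forward substitution: L x = b
    a = solve_triangular(L.T, x, lower=False)  # backward substitution: L^T a = x
    return a

# Frame {-5, -2, 0, 1, 2, 4, 3, 1} with its preceding history samples
y = np.array([3.0, 2.0, -1.0, -3.0, -5.0, -2.0, 0.0, 1.0, 2.0, 4.0, 3.0, 1.0])
a = lpc_covariance(y, start=4, N=8, P=3)
```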
Covariance Example

Recall: φ(j,k) = ∑n=start,start+N−1 yn−k yn−j, where equation j is φ(j,0) = ∑k=1,P ak φ(j,k)

• Signal: {…, 3, 2, -1, -3, -5, -2, 0, 1, 2, 4, 3, 1, 0, -1, -2, -4, -1, 0, 3, 1, 0, …}
• Frame: {-5, -2, 0, 1, 2, 4, 3, 1}; number of coefficients: 3
• φ(1,1) = -3*-3 + -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 + 4*4 + 3*3 = 68
• φ(2,1) = -1*-3 + -3*-5 + -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 + 4*3 = 50
• φ(3,1) = 2*-3 + -1*-5 + -3*-2 + -5*0 + -2*1 + 0*2 + 1*4 + 2*3 = 13
• φ(1,2) = -3*-1 + -5*-3 + -2*-5 + 0*-2 + 1*0 + 2*1 + 4*2 + 3*4 = 50
• φ(2,2) = -1*-1 + -3*-3 + -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 + 4*4 = 60
• φ(3,2) = 2*-1 + -1*-3 + -3*-5 + -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 = 36
• φ(1,3) = -3*2 + -5*-1 + -2*-3 + 0*-5 + 1*-2 + 2*0 + 4*1 + 3*2 = 13
• φ(2,3) = -1*2 + -3*-1 + -5*-3 + -2*-5 + 0*-2 + 1*0 + 2*1 + 4*2 = 36
• φ(3,3) = 2*2 + -1*-1 + -3*-3 + -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 = 48
• φ(1,0) = -3*-5 + -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 + 4*3 + 3*1 = 50
• φ(2,0) = -1*-5 + -3*-2 + -5*0 + -2*1 + 0*2 + 1*4 + 2*3 + 4*1 = 23
• φ(3,0) = 2*-5 + -1*-2 + -3*0 + -5*1 + -2*2 + 0*4 + 1*3 + 2*1 = -12

Auto-Correlation Method

• Assume all signal values outside the frame (outside 0 ≤ n ≤ N−1) are zero
• Correlate from −∞ to ∞ (most products are 0)
• The LPC formula for φ becomes: φ(j,k) = ∑n=0,N−1−(j−k) yn yn+(j−k) = R(j−k)
• The matrix is now in Toeplitz form
  – The Levinson-Durbin algorithm applies
  – Implementation complexity: O(n²)

Auto-Correlation Example

Recall: φ(j,k) = ∑n=0,N−1−(j−k) yn yn+(j−k) = R(j−k), where equation j is R(j) = ∑k=1,P R(j−k) ak

• Signal: {…, 3, 2, -1, -3, -5, -2, 0, 1, 2, 4, 3, 1, 0, -1, -2, -4, -1, 0, 3, 1, 0, …}
• Frame: {-5, -2, 0, 1, 2, 4, 3, 1}; number of coefficients: 3
• R(0) = -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 + 4*4 + 3*3 + 1*1 = 60
• R(1) = -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 + 4*3 + 3*1 = 35
• R(2) = -5*0 + -2*1 + 0*2 + 1*4 + 2*3 + 4*1 = 12
• R(3) = -5*1 + -2*2 + 0*4 + 1*3 + 2*1 = -4

LPC Transfer Function

• Predict the value of the next sample: ŝ[n] = ∑k=1,P ak s[n−k]
• The error signal e[n] is the LPC residual:
  e[n] = s[n] − ŝ[n] = s[n] − ∑k=1,P ak s[n−k]
• Take the Z-transform of both sides:
  E(z) = S(z) − ∑k=1,P ak S(z) z^−k
• Factor out S(z):
  E(z) = S(z)[1 − ∑k=1,P ak z^−k] = S(z)A(z)
• Compute the transfer function: S(z) = E(z)/A(z), i.e., the synthesis filter is 1/A(z)
• Conclusion: LPC is an all-pole IIR filter

Speech and the LPC Model

• LPC all-pole IIR filter: yn = G xn + ∑k=1,P ak yn−k (the residual drives the filter 1/A(z))
  – The residual models the glottal source
  – The summation approximates the vocal tract harmonics
• Challenges (problems in synthesis)
  – The residual does not accurately model the source (the glottis)
  – The filter does not model radiation from the lips
  – The filter does not account for nasal resonances
• Possible solutions
  – Additional poles can somewhat increase the accuracy
    • One pole pair for each 1 kHz of sampling rate
    • Two more pairs can better estimate the source and lips
  – Introduce zeroes into the model
  – Perform more robust analysis of the glottal source and lip radiation

Vocal Tract Tube Model

• A series of short uniform tubes connected end to end
  – Each slice has a fixed area
  – Adding slices makes the model more continuous
• Analysis
  – Uses the physics of gas flow through pipes
  – LPC turns out to be equivalent to this model
  – Equations exist to compute the pipe diameters from the LPC coefficients

The LPC Spectrum

1. Perform an LPC analysis
2. Find the poles
3. Plot the spectrum around the z-plane unit circle
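These three steps can be run end to end on the frame from the auto-correlation example above. Here is a minimal sketch (NumPy/SciPy assumed; variable names are illustrative); scipy.linalg.solve_toeplitz applies the Levinson-Durbin recursion internally.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz

# Frame from the auto-correlation example, with P = 3 coefficients
frame = np.array([-5.0, -2.0, 0.0, 1.0, 2.0, 4.0, 3.0, 1.0])
P, N = 3, 8

# R(j) = sum_{n=0..N-1-j} y[n] * y[n+j]  ->  [60, 35, 12, -4] as above
R = np.array([np.dot(frame[: N - j], frame[j:]) for j in range(P + 1)])

# Step 1: solve the Toeplitz system R(j) = sum_k a_k R(j-k)
a = solve_toeplitz(R[:P], R[1:])

# A(z) = 1 - sum_k a_k z^-k  ->  polynomial coefficients [1, -a1, ..., -aP]
A = np.concatenate(([1.0], -a))

# Step 2: the poles are the roots of z^P * A(z)
poles = np.roots(A)

# Step 3: evaluate 1/A(z) around the unit circle
w, h = freqz([1.0], A, worN=512)
spectrum_db = 20.0 * np.log10(np.abs(h) + 1e-12)
```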
What do we find concerning the LPC spectrum?

1. Adding poles matches speech better, up to about 18 poles for a 16 kHz sampling rate
2. The peaks tend to be overly sharp ("spiky"), because small changes in pole radius greatly alter the widths of the pole skirts
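The second finding can be seen numerically. This illustrative sketch (not from the slides; the pole angle and radii are arbitrary) builds a single-resonance all-pole filter at two slightly different pole radii and compares the peak height and width:

```python
import numpy as np
from scipy.signal import freqz

# One resonance: A(z) = 1 - 2 r cos(theta) z^-1 + r^2 z^-2
theta = np.pi / 4
for r in (0.95, 0.99):
    A = np.array([1.0, -2.0 * r * np.cos(theta), r * r])
    w, h = freqz([1.0], A, worN=4096)
    mag = np.abs(h)
    peak = mag.max()
    # Rough 3 dB width: number of frequency bins within 1/sqrt(2) of the peak
    width = int(np.sum(mag >= peak / np.sqrt(2.0)))
    print(f"r = {r}: peak = {20 * np.log10(peak):5.1f} dB, 3 dB width = {width} bins")
```

Moving the pole only slightly toward the unit circle (0.95 to 0.99) raises the resonance peak by roughly 14 dB and makes it several times narrower, which is why LPC spectra look spiky compared with the smoother spectra of real speech.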