Linear Predictive Coding (LPC)
• History: Originally developed to compress (code) speech
• Broader Implications
– Models the harmonic resonances of the vocal tract
– Provides features useful for speech recognition
– Part of speech synthesis algorithms
– An IIR filter that can eliminate noise from a signal
• Concept
– Predict samples with linear combinations of previous values to create an LPC signal
– The residue, or error, is the difference between the signal and the prediction
LPC Calculations
• Predict the values of the next sample
ŝ[n] = ∑k=1,P ak s[n−k]
– P is the LPC order
– Accurate vocal tract model: P = sample rate/1000 + 2 (e.g., P = 18 at 16 kHz)
– The LPC algorithm computes the ak coefficients
• The error signal (e[n]) is called the LPC residual
e[n] = s[n] − ŝ[n] = s[n] − ∑k=1,P ak s[n−k]
• Goal: find ak coefficients that minimize the LPC residual
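A minimal Python sketch of this computation, assuming the ak coefficients are already known (the helper name lpc_residual is our own):

def lpc_residual(s, a):
    """Compute the LPC residual e[n] = s[n] - sum_{k=1..P} a[k-1]*s[n-k].

    s: list of samples, a: list of P predictor coefficients.
    The first P samples lack a full history, so start at n = P.
    """
    P = len(a)
    e = []
    for n in range(P, len(s)):
        prediction = sum(a[k - 1] * s[n - k] for k in range(1, P + 1))
        e.append(s[n] - prediction)
    return e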
Linear Predictive Compression (LPC)
Concept
• For a frame of the signal, find the optimal coefficients, predicting the next values using sets of previous values
• Instead of outputting the actual data, output the residual plus the coefficients
• Fewer bits are needed, which results in compression
Pseudo Code
WHILE not EOF
  READ frame of signal
  COMPUTE the LPC coefficients a1..aP for the frame
  FOR each sample s[n] in the frame: error[n] = s[n] − ∑k=1,P ak s[n−k]
  WRITE LPC coefficients
  WRITE error
Linear Algebra Background
• N linearly independent equations; P unknowns
• If N < P, there are infinitely many solutions
x + y = 5 // one equation, two unknowns
Solutions lie along the line y = 5 - x
• If N = P, there is exactly one solution
x + y = 5 and x - y = 3; solution: x = 4, y = 1
• If N > P, there is generally no exact solution
No solution satisfies: x + y = 4, x - y = 3, 2x + y = 7
The best we can do is find the closest fit
Least Squares: minimize error
• First approach: Linear algebra - find orthogonal projections of vectors onto the best fit
• Second approach: Calculus - use a zero-slope derivative to find the best fit
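As a quick sketch of the calculus-free route, numpy's lstsq finds the least-squares "closest fit" to the three inconsistent equations above:

import numpy as np

# Overdetermined system from above: x + y = 4, x - y = 3, 2x + y = 7
A = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [2.0, 1.0]])
b = np.array([4.0, 3.0, 7.0])

# lstsq minimizes ||Ax - b||^2
x, residual, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(x)   # approximately [3.36, 0.43]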
Solving n equations and n unknowns
Numerical Algorithms
• Gaussian Elimination
– Complexity: O(n³)
• Successive Iteration
– Complexity varies
• Cholesky Decomposition
– More efficient, still O(n³)
• Levinson-Durbin
– Complexity: O(n²)
– Requires symmetric Toeplitz matrices
Definitions for any matrix A
Transpose (Aᵀ): Replace each aij with aji
Symmetric: Aᵀ = A
Toeplitz: Each diagonal descending to the right has equal values
Lower/Upper triangular: No nonzero values above/below the diagonal
Symmetric Toeplitz Matrices
Example
• Flipping rows and columns produces the same matrix
• Every diagonal to the right contains the same value
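The example matrix itself was lost in extraction; reconstructed from the Levinson-Durbin worked example and verification below (first row 1, 2, 3, 4):
1 2 3 4
2 1 2 3
3 2 1 2
4 3 2 1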
Levinson-Durbin Algorithm
Worked example: the symmetric Toeplitz system above with r0…r4 = 1, 2, 3, 4, 5 (matrix first row r0…r3; right-hand side r1…r4)
Step 0
E0 = 1 [r0, initial value]
Step 1
k1 = 2 [r1/E0]
a11 = 2 [k1]
E1 = -3 [(1-k1²)E0]
Step 2
k2 = 1/3 [(r2 - a11r1)/E1]
a21 = 4/3 [a11 - k2a11]
a22 = 1/3 [k2]
E2 = -8/3 [(1-k2²)E1]
Step 3
k3 = 1/4 [(r3 - a21r2 - a22r1)/E2]
a31 = 5/4 [a21 - k3a22]
a32 = 0 [a22 - k3a21]
a33 = 1/4 [k3]
E3 = -5/2 [(1-k3²)E2]
Step 4
k4 = 1/5 [(r4 - a31r3 - a32r2 - a33r1)/E3]
a41 = 6/5 [a31 - k4a33]
a42 = 0 [a32 - k4a32]
a43 = 0 [a33 - k4a31]
a44 = 1/5 [k4]
Verify results by plugging a41, a42, a43, a44 back into the equations:
6/5(1) + 0(2) + 0(3) + 1/5(4) = 2,  6/5(2) + 0(1) + 0(2) + 1/5(3) = 3
6/5(3) + 0(2) + 0(1) + 1/5(2) = 4,  6/5(4) + 0(3) + 0(2) + 1/5(1) = 5
Levinson-Durbin Pseudo Code
E[0] = r[0]
FOR step = 1 TO P
  k[step] = r[step]
  FOR i = 1 TO step-1: k[step] -= a[step-1][i] * r[step-i]
  k[step] /= E[step-1]
  E[step] = (1 - k[step]²) * E[step-1]
  a[step][step] = k[step]
  FOR i = 1 TO step-1: a[step][i] = a[step-1][i] - k[step] * a[step-1][step-i]
Note: r[i] are the first-row coefficients of the Toeplitz matrix
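A direct, runnable Python transcription of this pseudo code (a minimal sketch; the name levinson_durbin is our own):

def levinson_durbin(r, P):
    """Solve a symmetric Toeplitz system via the Levinson-Durbin recursion.

    r: first-row coefficients [r0, r1, ..., rP]
    Returns the coefficients a[P][1..P] of the final step.
    """
    E = r[0]                              # E0 = r0
    a = [[0.0] * (P + 1) for _ in range(P + 1)]
    for step in range(1, P + 1):
        k = r[step]
        for i in range(1, step):
            k -= a[step - 1][i] * r[step - i]
        k /= E
        E = (1.0 - k * k) * E             # E_step = (1 - k²) E_{step-1}
        a[step][step] = k
        for i in range(1, step):
            a[step][i] = a[step - 1][i] - k * a[step - 1][step - i]
    return a[P][1:]

# Reproduces the worked example: r = [1, 2, 3, 4, 5], P = 4
print(levinson_durbin([1, 2, 3, 4, 5], 4))   # approximately [6/5, 0, 0, 1/5]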
Cholesky Decomposition
• Requirements:
– Symmetric (the same matrix when rows and columns are flipped)
– Positive definite matrix
Matrix A is real positive definite if and only if xᵀAx > 0 for all x ≠ 0
• Solution (for A[ak] = [bk])
– Factor matrix A into A = LLᵀ, where L is lower triangular
– Forward substitution: solve L[xi] = [bk] (since L(Lᵀ[ak]) = [bk])
– Backward substitution: solve Lᵀ[ak] = [xi] for the coefficients
• Complexity
– Factoring step: O(n³/3)
– Forward and backward substitution: O(n²)
Cholesky Factorization
Result: a lower triangular matrix L with A = LLᵀ
Cholesky Factorization Pseudo Code
FOR k = 1 TO n-1
  l[k][k] = sqrt(a[k][k])
  FOR j = k+1 TO n
    l[j][k] = a[j][k] / l[k][k]
  FOR j = k+1 TO n
    FOR i = j TO n
      a[i][j] = a[i][j] - l[i][k] * l[j][k]
l[n][n] = sqrt(a[n][n])
• Column index: k
• Row indices: i, j
• Elements of matrix A: a[i][j]
• Elements of matrix L: l[i][j]
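A runnable Python sketch of the full solve (factor, then forward and backward substitution). It uses the equivalent inner-product form of the factorization so matrix A is left intact; cholesky_solve is our own name:

import math

def cholesky_solve(A, b):
    """Solve A x = b for a symmetric positive definite A via A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    # Factor A = L L^T (inner-product form; leaves A intact)
    for k in range(n):
        L[k][k] = math.sqrt(A[k][k] - sum(L[k][m] ** 2 for m in range(k)))
        for j in range(k + 1, n):
            L[j][k] = (A[j][k] - sum(L[j][m] * L[k][m] for m in range(k))) / L[k][k]
    # Forward substitution: solve L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][m] * y[m] for m in range(i))) / L[i][i]
    # Backward substitution: solve L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[m][i] * x[m] for m in range(i + 1, n))) / L[i][i]
    return x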
Illustration: Linear Prediction
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}
Goal: Estimate yn using the three previous values
yn ≈ a1 yn-1 + a2 yn-2 + a3 yn-3
Three ak coefficients, Frame size of 16
Thirteen equations and three unknowns
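A numpy sketch of this illustration. Note the linear ramp is exactly predictable (e.g., a = (2, -1, 0) gives zero error); because the three columns here are linearly dependent, lstsq returns the minimum-norm zero-error solution:

import numpy as np

y = np.arange(1.0, 17.0)      # {1, 2, ..., 16}
P = 3
# One equation per sample with full history: y[n] ≈ a1*y[n-1] + a2*y[n-2] + a3*y[n-3]
A = np.array([[y[n - 1], y[n - 2], y[n - 3]] for n in range(P, len(y))])
b = y[P:]                      # 13 equations, 3 unknowns
a, *rest = np.linalg.lstsq(A, b, rcond=None)
print(a)                       # approximately [4/3, 1/3, -2/3]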
LPC Basics
• Predict y[n] from y[n-1], … , y[n-P]
– en = yn - ∑k=1,P ak yn-k
– en is the error between the prediction and the actual value
– The goal is to find the coefficients that produce the smallest en values
• Concept
– Square the error
– Take the partial derivative with respect to each ak
– Optimize (the minimum occurs where the derivative is zero)
– Result: P equations and P unknowns
– Solve using either the Cholesky or Levinson-Durbin algorithms
LPC Derivation
• One linear prediction equation: en = yn − ∑k=1,P ak yn−k
Over a whole frame we have N equations and P unknowns
• Sum the squared errors over the entire frame: E = ∑n=0,N−1 (yn − ∑k=1,P ak yn−k)²
• Take the partial derivative with respect to each aj; this generates P equations
Like a regular derivative, treating only aj as a variable
∂E/∂aj = −2 ∑n=0,N−1 (yn − ∑k=1,P ak yn−k) yn−j
Calculus chain rule: if y = y(u(x)) then dy/dx = dy/du * du/dx
• Set each partial derivative to zero to find the minimum error
for j = 1 to P: 0 = ∑n=0,N−1 (yn − ∑k=1,P ak yn−k) yn−j (j indicates the equation)
• Rearrange terms: for each j of the P equations,
∑n=0,N−1 yn yn−j = ∑n=0,N−1 ∑k=1,P ak yn−k yn−j = ∑k=1,P ak ∑n=0,N−1 yn−k yn−j
• Yule-Walker equations: IF φ(j,k) = ∑n=0,N−1 yn−j yn−k, THEN φ(j,0) = ∑k=1,P ak φ(j,k)
• Result: P equations and P unknowns (ak); one solution gives the best prediction
LPC: the Covariance Method
• Result from the previous derivation: φ(j,k) = ∑n=0,N−1 yn−k yn−j
• Equation j: φ(j,0) = ∑k=1,P ak φ(j,k)
• Now we have P equations and P unknowns
• Because φ(j,k) = φ(k,j), the matrix is symmetric
• Solution requires O(n³) operations (e.g., Cholesky decomposition)
• Why "covariance"? It's not probabilistic, but the matrix looks similar to a covariance matrix
Covariance Example
Recall: φ(j,k) = ∑n=start,start+N-1 yn-kyn-j
Where equation j is: φ(j,0) = ∑k=1,Pakφ(j,k)
• Signal: { … , 3, 2, -1, -3, -5, -2, 0, 1, 2, 4, 3, 1, 0, -1, -2, -4, -1, 0, 3, 1, 0, … }
• Frame: {-5, -2, 0, 1, 2, 4, 3, 1}, Number of coefficients: 3
• φ(1,1) = -3*-3 + -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 + 4*4 + 3*3 = 68
• φ(2,1) = -1*-3 + -3*-5 + -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 + 4*3 = 50
• φ(3,1) = 2*-3 + -1*-5 + -3*-2 + -5*0 + -2*1 + 0*2 + 1*4 + 2*3 = 13
• φ(1,2) = -3*-1 + -5*-3 + -2*-5 + 0*-2 + 1*0 + 2*1 + 4*2 + 3*4 = 50
• φ(2,2) = -1*-1 + -3*-3 + -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 + 4*4 = 60
• φ(3,2) = 2*-1 + -1*-3 + -3*-5 + -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 = 36
• φ(1,3) = -3*2 + -5*-1 + -2*-3 + 0*-5 + 1*-2 + 2*0 + 4*1 + 3*2 = 13
• φ(2,3) = -1*2 + -3*-1 + -5*-3 + -2*-5 + 0*-2 + 1*0 + 2*1 + 4*2 = 36
• φ(3,3) = 2*2 + -1*-1 + -3*-3 + -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 = 48
• φ(1,0) = -3*-5 + -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 + 4*3 + 3*1 = 50
• φ(2,0) = -1*-5 + -3*-2 + -5*0 + -2*1 + 0*2 + 1*4 + 2*3 + 4*1 = 23
• φ(3,0) = 2*-5 + -1*-2 + -3*0 + -5*1 + -2*2 + 0*4 + 1*3 + 2*1 = -12
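A short Python check of these values (a sketch; phi and start are our own names, with start indexing the first frame sample):

# Signal from the example; start indexes the first frame sample (-5)
signal = [3, 2, -1, -3, -5, -2, 0, 1, 2, 4, 3, 1, 0, -1, -2, -4, -1, 0, 3, 1, 0]
start, N, P = 4, 8, 3

def phi(j, k):
    # phi(j,k) = sum over the frame of y[n-k] * y[n-j]
    return sum(signal[start + n - k] * signal[start + n - j] for n in range(N))

for j in range(1, P + 1):
    print([phi(j, k) for k in range(P + 1)])
# Prints [50, 68, 50, 13], [23, 50, 60, 36], [-12, 13, 36, 48]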
Auto-Correlation Method
• Assume all signal values outside the frame (0 ≤ n ≤ N−1) are zero
• Correlate from -∞ to ∞ (most values are 0)
• The LPC formula for φ becomes: φ(j,k) = ∑n=0,N−1−(j−k) yn yn+(j−k) = R(j−k)
• The matrix is now in Toeplitz form
– The Levinson-Durbin algorithm applies
– Implementation complexity: O(n²)
Auto-Correlation Example
Recall: φ(j,k) = ∑n=0,N−1−(j−k) yn yn+(j−k) = R(j−k)
Where equation j is: R(j) = ∑k=1,P R(j−k) ak
• Signal: {…, 3, 2, -1, -3, -5, -2, 0, 1, 2, 4, 3, 1, 0, -1, -2, -4, -1, 0, 3, 1, 0, …}
• Frame: {-5, -2, 0, 1, 2, 4, 3, 1}, Number of coefficients: 3
• R(0) = -5*-5 + -2*-2 + 0*0 + 1*1 + 2*2 + 4*4 + 3*3 + 1*1 = 60
• R(1) = -5*-2 + -2*0 + 0*1 + 1*2 + 2*4 + 4*3 + 3*1 = 35
• R(2) = -5*0 + -2*1 + 0*2 + 1*4 + 2*3 + 4*1 = 12
• R(3) = -5*1 + -2*2 + 0*4 + 1*3 + 2*1 = -4
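A Python sketch of the same computation, feeding into the earlier levinson_durbin() sketch (both names are our own):

frame = [-5, -2, 0, 1, 2, 4, 3, 1]
N, P = len(frame), 3

def R(m):
    # Autocorrelation; samples outside the frame are assumed zero
    return sum(frame[n] * frame[n + m] for n in range(N - m))

r = [R(m) for m in range(P + 1)]
print(r)   # [60, 35, 12, -4]
# The Toeplitz system R(j) = sum_k R(j-k)*a_k can now be solved in O(n²),
# e.g., levinson_durbin(r, P)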
LPC Transfer Function
• Predict the values of the next sample
ŝ[n] = ∑k=1,P ak s[n−k]
• The error signal e[n] is the LPC residual
e[n] = s[n] − ŝ[n] = s[n] − ∑k=1,P ak s[n−k]
• Perform a z-transform of both sides
E(z) = S(z) − ∑k=1,P ak S(z)z−k
• Factor out S(z)
E(z) = S(z)[1 − ∑k=1,P ak z−k] = S(z)A(z)
• Compute the transfer function: S(z) = E(z)/A(z)
• Conclusion: LPC is an all-pole IIR filter
Speech and the LPC model
• LPC all-pole IIR filter: yn = Gxn + ∑k=1,P ak yn−k (the synthesis form of S(z) = E(z)/A(z))
– The residual models the glottal source
– The summation approximates the vocal tract harmonics
• Challenges (problems in synthesis)
– The residual does not accurately model the source (glottis)
– The filter does not model radiation from the lips
– The filter does not account for nasal resonances
• Possible solutions
– Additional poles can somewhat increase the accuracy
• 1 pole pair for each 1 kHz of sampling rate
• 2 more pairs can better estimate the source and lips
– Introduce zeroes into the model
– More robust analysis of the glottal source and lip radiation
Vocal Tract Tube Model
• A sequence of short uniform tubes connected in series
– Each slice has a fixed cross-sectional area
– Adding slices makes the model more continuous
• Analysis
– Uses the physics of gas flow through pipes
– LPC turns out to be equivalent to this model
– Equations exist to compute pipe diameters from LPC coefficients
The LPC Spectrum
1. Perform an LPC analysis
2. Find the poles
3. Plot the spectrum around the z-plane unit circle
What do we find concerning the LPC spectrum?
1. Adding poles better matches speech, up to about 18 poles for a 16 kHz sampling rate
2. The peaks tend to be overly sharp ("spiky") because small changes in pole radius greatly alter the widths of the pole skirts
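A minimal numpy sketch of step 3, evaluating the all-pole response |1/A(z)| on the unit circle (lpc_spectrum is our own name; the coefficients a follow the document's A(z) = 1 − ∑ ak z−k):

import numpy as np

def lpc_spectrum(a, n_points=512):
    """Magnitude of 1/A(z) on the upper unit circle, in dB."""
    w = np.linspace(0.0, np.pi, n_points)       # frequencies 0 .. Nyquist
    A = np.ones(n_points, dtype=complex)
    for k, ak in enumerate(a, start=1):
        A -= ak * np.exp(-1j * k * w)           # A(e^{jw}) = 1 - sum ak e^{-jkw}
    return 20.0 * np.log10(np.abs(1.0 / A))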