
Warped Linear Prediction
• Concept: Warp the spectrum to emulate human
perception; then perform linear prediction on the result
• Approaches to warp the spectrum:
1. Fourier transform, warp, and transform back
2. Bank of overlapping band-pass filters. We saw this in
one of the VAD algorithms
3. All-pass time-domain filters: all frequencies pass through,
but the spectrum and phases are warped
• Why? The aim is to model human speech more closely,
leaving smaller residues.
• Applications: Speech coding, recognition, synthesis
All-pass filter
• A pole of an all-pass filter lies inside the
unit circle and the matching zero is outside.
• The magnitudes of the matching poles and
zeros cancel along the unit circle
• They lie on the same radius line, so the polar
coordinate angle is the same.
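• First-order all-pass filter transfer function:
$H(z) = \frac{B(z)}{A(z)} = \frac{z^{-1} - p_0^*}{1 - p_0 z^{-1}} = \frac{z^{-1} - s e^{-j\varphi}}{1 - s e^{j\varphi} z^{-1}}$, where $p_0 = s e^{j\varphi}$
For a real pole ($\varphi = 0$, $p_0 = \lambda$) this reduces to $H(z) = \frac{z^{-1} - \lambda}{1 - \lambda z^{-1}}$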
• Example: if p0 = ½ + ½i, then the zero is at
1/p0* = 1/(½ - ½ i) = ( ½ + ½ i)/(1/2) = 1 + i
[Figure: pole-zero diagram; labels s, r, φ, p0*]
Note: p0* = conjugate of p0
• Higher-order all-pass filter:
$H(z) = \frac{a_n + a_{n-1} z^{-1} + a_{n-2} z^{-2} + \dots + a_1 z^{-(n-1)} + z^{-n}}{1 + a_1 z^{-1} + a_2 z^{-2} + \dots + a_{n-1} z^{-(n-1)} + a_n z^{-n}}$
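A quick numerical check can make the pole/zero pairing concrete. The sketch below (Python with NumPy; the function name `allpass_response` and the second-order coefficients are illustrative assumptions, not part of the slides) evaluates the first-order H(z) on the unit circle for the example pole p0 = ½ + ½i, and a second-order filter built by reversing the denominator coefficients. In both cases the magnitude should come out as 1 at every frequency.

```python
import numpy as np

def allpass_response(p0, w):
    """Evaluate H(z) = (z^-1 - conj(p0)) / (1 - p0 z^-1) at z = exp(jw)."""
    z_inv = np.exp(-1j * w)
    return (z_inv - np.conj(p0)) / (1.0 - p0 * z_inv)

p0 = 0.5 + 0.5j                      # example pole from the slide
zero = 1.0 / np.conj(p0)             # matching zero, outside the unit circle
w = np.linspace(0.0, 2.0 * np.pi, 512, endpoint=False)
H1 = allpass_response(p0, w)

# Higher-order case: the numerator is the denominator with its coefficients reversed.
a = np.array([1.0, -0.9, 0.5])       # hypothetical denominator 1 + a1 z^-1 + a2 z^-2
z_inv = np.exp(-1j * w)
numerator = np.polyval(a, z_inv)         # a2 + a1 z^-1 + z^-2
denominator = np.polyval(a[::-1], z_inv) # 1 + a1 z^-1 + a2 z^-2
H2 = numerator / denominator

print("zero location:", zero)
print("same angle as pole:", np.isclose(np.angle(p0), np.angle(zero)))
print("first-order  max ||H| - 1|:", np.max(np.abs(np.abs(H1) - 1.0)))
print("second-order max ||H| - 1|:", np.max(np.abs(np.abs(H2) - 1.0)))
```

Only the phase differs between the two filters; the flat magnitude is what lets an all-pass chain reshape the frequency axis without coloring the signal.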
All pass Filter Visualization
All-pass Filter Phase Response
• Real coefficients
– λ controls the location of the
pole (p) and the zero (1/p).
– No phase shift at frequencies 0,
π, 2π; only a signal delay
• Complex coefficients
– Similar phase responses
– Coefficients alter diagonal
crossing frequency: fx
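$f_x = \frac{f_s}{2\pi} \arccos(\lambda)$, where $f_s$ is
the sampling rate
– Phase response:
$w + 2\arctan\left(\frac{\lambda \sin(w)}{1 - \lambda \cos(w)}\right)$
[Figure: all-pass phase response for λ = 0.8, axes marked at π and 2π]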
Note: The cross over point is
where there is no frequency
warping, only a delay
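The phase expression on this slide can be checked against the phase computed directly from the real-coefficient H(z) = (z^-1 - λ)/(1 - λz^-1). The sketch below is a minimal check, assuming NumPy and λ = 0.8 as in the plot; the sampling rate of 44100 Hz is a hypothetical value used only to express fx in Hz. It also estimates the slope of the phase curve at w = arccos(λ), which is the group delay in samples at that frequency.

```python
import numpy as np

lam = 0.8                                  # value shown in the slide's plot
w = np.linspace(0.0, np.pi, 2001)          # normalized frequency, radians/sample

# Closed-form phase expression from the slide.
phase_formula = w + 2.0 * np.arctan(lam * np.sin(w) / (1.0 - lam * np.cos(w)))

# Direct evaluation: negated, unwrapped phase of H(e^{jw}).
z_inv = np.exp(-1j * w)
H = (z_inv - lam) / (1.0 - lam * z_inv)
phase_direct = -np.unwrap(np.angle(H))

# Slope of the curve (group delay in samples) at w_x = arccos(lam).
w_x = np.arccos(lam)
slope = np.gradient(phase_formula, w)
slope_at_wx = np.interp(w_x, w, slope)

fs = 44100.0                               # hypothetical sampling rate in Hz
f_x = fs / (2.0 * np.pi) * w_x

print("max formula/direct mismatch:", np.max(np.abs(phase_formula - phase_direct)))
print("phase at w = 0 and w = pi:", phase_formula[0], phase_formula[-1])
print("f_x in Hz:", f_x, " slope there:", slope_at_wx)
```

The endpoints 0 and π map to themselves, and the slope at fx comes out as one sample, so frequencies near fx keep their original spacing.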
Frequency Warping
• All-pass filter: magnitude remains constant, but the
phase and frequency are warped
• Group delay
– Definition: change of phase with respect to change of
frequency
– Interpretation: Different frequencies pass through a filter
at different speeds. Therefore, a frequency warping
operation occurs.
– Formula: $w' = \arctan\left(\frac{(1 - \lambda^2)\sin(w)}{(1 + \lambda^2)\cos(w) - 2\lambda}\right)$
where w is the angle of the original frequency, w' is the angle of the
warped frequency, and λ is the all-pass coefficient
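The warping map is easy to tabulate. The sketch below assumes the form w' = arctan((1 - λ²) sin w / ((1 + λ²) cos w - 2λ)), evaluated with a four-quadrant arctangent so that w' stays in [0, π]; the function name `warp_frequency` is an illustrative choice. With λ = 0 the map is the identity, and for positive λ under this sign convention low frequencies are spread out while high frequencies are compressed.

```python
import numpy as np

def warp_frequency(w, lam):
    """Map angle w (radians/sample) to the warped angle w' for coefficient lam."""
    num = (1.0 - lam**2) * np.sin(w)
    den = (1.0 + lam**2) * np.cos(w) - 2.0 * lam
    return np.arctan2(num, den)            # arctan2 keeps w' in [0, pi]

w = np.linspace(0.0, np.pi, 9)
for lam in (0.0, 0.4, 0.8):
    print(lam, np.round(warp_frequency(w, lam), 3))
```

This closed form agrees with the phase expression w + 2 arctan(λ sin w / (1 - λ cos w)) from the phase-response slide, so either can be used to draw the warping curve.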
Illustration
Application to LPC
• Warping to match the human auditory system
– $\lambda = 1.0674 \left(\frac{2}{\pi} \arctan(0.06583\, f_s / 1000)\right)^{1/2} - 0.1916$, with $f_s$ in Hz
– Significant at higher sampling rates: > 8 kHz
– CELP coding:
• Degradation Mean Opinion Score (DMOS): 0.3 < λ < 0.4
• Best Bark Scale match: λ = 0.57
• Modified LPC: $x' = d * f$; $y_n \approx \sum_{k=1}^{N} a_k x'_{n-k}$
– Convolve the frame, f, with the all-pass filter, d
– Apply linear prediction to the frequency-warped signal
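One common way to realize warped linear prediction is to replace each unit delay of the autocorrelation method with a first-order all-pass section. The sketch below follows that idea under stated assumptions: the function names are hypothetical, the all-pass output is truncated to the frame length, and the λ formula takes the arctan argument as 0.06583·fs with fs in kHz (this reading gives λ ≈ 0.57 at 16 kHz, in line with the Bark-scale value quoted above). It is not the toolbox implementation.

```python
import numpy as np

def bark_lambda(fs_hz):
    """Warping coefficient approximating the Bark scale; fs_hz is in Hz."""
    return 1.0674 * np.sqrt((2.0 / np.pi) * np.arctan(0.06583 * fs_hz / 1000.0)) - 0.1916

def allpass(x, lam):
    """First-order all-pass D(z) = (z^-1 - lam) / (1 - lam z^-1), zero initial state."""
    y = np.zeros(len(x))
    x_prev = 0.0
    y_prev = 0.0
    for n, xn in enumerate(x):
        y[n] = -lam * xn + x_prev + lam * y_prev
        x_prev, y_prev = xn, y[n]
    return y

def warped_lpc(frame, order, lam):
    """Autocorrelation-method LPC with every unit delay replaced by an all-pass."""
    frame = np.asarray(frame, dtype=float)
    r = np.zeros(order + 1)
    delayed = frame.copy()
    for k in range(order + 1):
        r[k] = np.dot(frame, delayed)      # warped autocorrelation at "lag" k
        delayed = allpass(delayed, lam)
    # Solve the Toeplitz normal equations R a = r[1:].
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[idx], r[1:])

rng = np.random.default_rng(0)
frame = rng.standard_normal(512)           # stand-in for one windowed speech frame
lam = bark_lambda(16000)
print("lambda at 16 kHz:", lam)
print("warped LPC coefficients:", warped_lpc(frame, order=4, lam=lam))
```

With lam = 0 the all-pass section is a plain one-sample delay, so `warped_lpc` falls back to standard autocorrelation LPC.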
Evaluation
• Extra processing is minimal
• The LPC estimate is more accurate than when
warping is not used
• For coding operations
– Save one bit per sample at 48 kHz and 32 kHz
– Save 0.6 bits per sample at 16 kHz
– Save 0.3 bits per sample at 8 kHz
• Less peaky residue spectrum than standard methods
• Insignificant improvement for more than 30 LPC
coefficients
Matlab Toolbox: http://www.acoustics.hut.fi/software/warp
Inverse LPC Filter
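• Transfer function: $Y(z) = H(z) X(z)$
– $X(z)$ is the original signal
– $H(z)$ is the LPC analysis (prediction-error) filter, $\left(1 - \sum_{i=1}^{P} a_i z^{-i}\right) / G$
– $Y(z)$ is the filtered signal (residue)
• Inverse filter: $Y(z) / H(z) = X(z)$
– $Y(z)$ is the filtered output
– $1/H(z) = G / \left(1 - \sum_{i=1}^{P} a_i z^{-i}\right)$ is the all-pole LPC synthesis filter
– $X(z)$ is the restored signal
• Convolve the filtered signal with the impulse response of $1/H(z)$ to restore the
original signal from the residue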
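The round trip from signal to residue and back can be sketched in a few lines. The example below assumes G = 1 and uses hypothetical coefficients; the residue is produced by the prediction-error filter 1 - Σ a_i z^-i and the signal is restored by the all-pole filter 1/(1 - Σ a_i z^-i). Whichever of the pair is labeled H(z), the two are exact inverses of each other.

```python
import numpy as np

def lpc_residue(x, a):
    """Prediction-error filter: r[n] = x[n] - sum_i a[i-1] * x[n-i]."""
    x = np.asarray(x, dtype=float)
    r = x.copy()
    for i, ai in enumerate(a, start=1):
        r[i:] -= ai * x[:-i]
    return r

def lpc_restore(r, a):
    """All-pole inverse: x[n] = r[n] + sum_i a[i-1] * x[n-i]."""
    x = np.zeros(len(r))
    for n in range(len(r)):
        acc = r[n]
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:
                acc += ai * x[n - i]
        x[n] = acc
    return x

a = [1.2, -0.5]                            # hypothetical stable predictor coefficients
rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
residue_signal = lpc_residue(signal, a)
restored = lpc_restore(residue_signal, a)
print("max restoration error:", np.max(np.abs(restored - signal)))
```

Because the restoration is recursive, any change made to the residue (for example, smoothing out a click) propagates into the restored signal through the same filter.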
Click Detection using WLP
• Definition: A click is a short, localized discontinuity,
typically less than 1 ms long, that corrupts a signal
• Click detection with both warped and standard
linear prediction
– LPC: $y_k = \sum_{n=1}^{P} a_n x_{k-n} + r_k + c_k$
– $r_k$ is the residue and $c_k$ is the energy introduced by clicks
– Looking for spikes ($c_k$) in the residue locates the click points
• The warped linear prediction coefficient: λ
– A value of 0.0 reverts to standard linear prediction
– Positive values increase higher frequency resolution
– Negative values increase lower frequency resolution
Click Detection Algorithm
• Compute the standard deviation (σ) of the audio signal's LPC
residue (i.e., the amount of residue that we expect to remain)
• FOR each frame
– Perform the linear prediction with various λ values
– Consider a click present in the frame when the residue magnitude exceeds
the threshold K σ, where K is an empirically set gain factor.
– Approach 1
• Throw away frames determined to contain clicks
• Disadvantage: some distortion is present
– Approach 2
• Use interpolation to smooth the residue signal of clicks
• Restore signal: Convolve the residue with the inverse LPC filter
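The detection steps above can be sketched as follows. This is a minimal illustration, not the published algorithm: it uses standard LPC (the λ = 0 case) on a single frame, assumes a sample is flagged when its residue magnitude exceeds K σ, and repairs flagged residue samples by linear interpolation (Approach 2). The function names, the synthetic AR(2) test signal, the click positions, and K = 5 are all assumptions made for the example.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Standard autocorrelation-method LPC (the lambda = 0 case)."""
    full = np.correlate(x, x, mode="full")
    r = full[len(x) - 1 : len(x) + order]
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[idx], r[1:])

def lpc_residue(x, a):
    """r[n] = x[n] - sum_i a[i-1] * x[n-i]."""
    r = x.copy()
    for i, ai in enumerate(a, start=1):
        r[i:] -= ai * x[:-i]
    return r

def detect_clicks(x, order=2, K=5.0):
    """Return indices where |residue| > K * sigma, plus the residue itself."""
    a = lpc_coefficients(x, order)
    r = lpc_residue(x, a)
    sigma = np.std(r)
    return np.flatnonzero(np.abs(r) > K * sigma), r

def interpolate_residue(r, clicks):
    """Replace flagged residue samples by linear interpolation of their neighbors."""
    good = np.setdiff1d(np.arange(len(r)), clicks)
    repaired = r.copy()
    repaired[clicks] = np.interp(clicks, good, r[good])
    return repaired

# Synthetic test signal: a hypothetical AR(2) process with three inserted clicks.
rng = np.random.default_rng(0)
excitation = 0.1 * rng.standard_normal(4096)
x = np.zeros(4096)
for n in range(2, 4096):
    x[n] = 0.5 * x[n - 1] - 0.3 * x[n - 2] + excitation[n]
click_positions = [500, 2000, 3500]
x[click_positions] += 5.0

detected, r = detect_clicks(x)
repaired = interpolate_residue(r, detected)
print("detected indices:", detected)
```

Samples just after a click may also be flagged, because the click leaks into the next few prediction errors. The repaired residue would then be passed through the inverse LPC filter to rebuild the signal.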
Does WLP have an effect?
• Prediction Gain (improvement in signal to noise ratio)
– Divide clean signal energy by residue energy
– Note: The residue is computed by applying WLP to the noisy signal
– The higher the result, the better the detection
– $G_p = 10 \log_{10}\left(\sum_{n=1}^{N} |x_n|^2 / \sum_{n=1}^{N} |r_n|^2\right)$
• Experiment
– 44 kHz sample rate, 215 frames of 1024 samples, musical signal
corrupted with known click points, λ values varied between -0.8
and +0.8
– Result: choice of λ affects the ratio between clean signal and
residue with clicks
λ         -0.8    -0.4    0.0     0.4     0.8
Gp (dB)   35.51   27.49   22.37   17.30   11.04
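The prediction gain on this slide is a one-line computation once the clean signal and the residue are available. The sketch below assumes the logarithm is base 10 (so Gp is in dB) and uses a hypothetical function name; the toy inputs are chosen only to show the scale, not to reproduce the experiment.

```python
import numpy as np

def prediction_gain_db(clean, residue):
    """Gp = 10 log10(sum |x_n|^2 / sum |r_n|^2), in dB."""
    clean = np.asarray(clean, dtype=float)
    residue = np.asarray(residue, dtype=float)
    return 10.0 * np.log10(np.sum(np.abs(clean) ** 2) / np.sum(np.abs(residue) ** 2))

# Toy example: the residue carries one hundredth of the clean signal's energy.
clean = np.ones(4)
residue = 0.1 * np.ones(4)
print("Gp in dB:", prediction_gain_db(clean, residue))
```

In the experiment, the residue would come from applying WLP with a given λ to the click-corrupted signal, while the numerator uses the clean signal.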
Experiment
• Approach 1: Throw away click frames
• Approach 2: Interpolate click frames
• Results:
– Both LPC and WLP can detect clicks
– WLP with warping coefficient -0.7 reduces false detections
– LPC and WLP miss approximately the same number of clicks