Lecture 6: Linear Prediction (LPC)

advertisement
ELEN E4896 MUSIC SIGNAL PROCESSING
Lecture 6:
Linear Prediction (LPC)
1.
2.
3.
4.
Resonance & The Source-Filter Model
Linear Prediction (LPC)
LP Representations
LP Synthesis & Modification
Dan Ellis
Dept. Electrical Engineering, Columbia University
dpwe@ee.columbia.edu
E4896 Music Signal Processing (Dan Ellis)
http://www.ee.columbia.edu/~dpwe/e4896/
2013-02-25 - 1 /16
1. Resonance
• Resonance is ubiquitous
in physical systems
e.g. plucked/struck
string, drum head
room “coloration”
vocal tract
A
B
C
D
Poles
easily implemented
in LTI filters
E4896 Music Signal Processing (Dan Ellis)
0
Position
Time
• Resonances
0
Velocity
2013-02-25 - 2 /16
Singe Pole-Pair Resonance
•
imag
Simple resonance =
Second-order IIR
Band Pass Filter
1
0.5
+
y[n]
z-1
2rcosωc
z-1
-r2
r
ωc
0
x[n]
-0.5
-1
z-plane
|H(ejw)|
-1
0
ωc
0
-0.5
0
0.5
1 real
ω
Q= c
B
B ≈ 2(1-r)
0.1
0.2
E4896 Music Signal Processing (Dan Ellis)
0.3
ω/π
impulse_bpf.pd
2013-02-25 - 3 /16
Resonances in Speech
• Vocal Tract (throat + tongue + lips)
acts as variable resonator
freq / Hz
resonances = “formants”
4000
3000
2000
1000
hε z
e
0
0.5
w
has a
E4896 Music Signal Processing (Dan Ellis)
tcl
^
0
1
c θ
watch
1.5
n
time
I z/ s I
thin
as a
I
d
ay
m
dime
2013-02-25 - 4 /16
Source-Filter Model
• Separation of:
Source: fine structure in time/frequency
Filter: subsequent shaping by physical resonances
Formants
Pitch
t
Glottal pulse
train
Voiced/
unvoiced
Vocal tract
resonances
+
Radiation
characteristic
Speech
f
Frication
noise
t
Source
•
Filter
Advantages
Good match to real signals
Salient pieces
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 5 /16
2. Linear Prediction (LPC)
• LPC = Linear Predictive Coding
remove redundancy in signal
try to predict next point
as linear combination of previous values
p
s[n] =
ak s[n
k] + e[n]
k=1
{ak } are pth order linear predictor coefficients
e[n] is residual “innovation” a/k/a prediction error
• Transfer function
S(z)
=
E(z)
1
1
p
k=1
ak z
k
1
=
A(z)
all-pole “autoregressive” (AR) modeling
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 6 /16
Voice Modeling & LPC
• Direct expression of source-filter model
p
s[n] =
ak s[n
k] + e[n]
k=1
Pulse/noise
excitation
e[n]
H(z)
Vocal tract
H(z) = 1/A(z)
s[n]
|H(ejω)|
z-plane
f
acoustic tube model of vocal tract is all-pole
vocal tract resonances change slowly ~ 10-20ms
but: nasals
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 7 /16
• You can “see”
resonances in
a spectral slice:
energy / dB
Estimating LPC Models
-40
-60
-80
-100
0
1000
2000
• We can find LPC coefficients {ae[n]}
to minimize energy of residual
k
n
s[n]
n
freq / Hz
:
2
p
e2 [n] =
3000
ak s[n
k]
k=1
differentiate w.r.t. ak & solve
end up with p linear equations involving
s[n j]s[n
autocorrelations rss (|j k|) =
k]
n
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 8 /16
LPC Illustration
0.1
0
windowed original
-0.1
-0.2
LPC residual
-0.3
0
dB
50
100
150
200
250
300
350
400
time / samp
original spectrum
0
LPC spectrum
-20
-40
-60
residual spectrum
0
1000
2000
3000
4000
5000
6000
7000
freq / Hz
• Actual
poles:
z-plane
E4896 Music Signal Processing (Dan Ellis)
E4896_L06_lpc_diary
2013-02-25 - 9 /16
Short-Time LP Analysis
freq / kHz
• Solve LPC for each ~20 ms frame
8
6
4
2
0
20
Imaginary Part
1
10
5
12
0
0
-0.5
-1
freq / kHz
15
0.5
8
-5
-10
-1
0
Real Part
1
-15
0
0.2
0.4
0.6
0.8
1
6
4
2
0
0.5
1
E4896 Music Signal Processing (Dan Ellis)
1.5
2
2.5
3
time / s
2013-02-25 - 10/16
3. LP Representations
• Can interpret LPC filter fit many ways:
• Picking out resonances
if signal was source + resonances, should find them
approximation
• Low-ordere spectral
[n]
|E(e )|
j
2
minimizing 2 also minimizes
different from e.g. Fourier approximation...
• Finding & removing smooth spectrum
1
A(z)
is smooth approximation of S(z)
S(z)
1
=
E(z)
A(z)
E(z) = S(z)A(z)
is “unsmoothed” S(z)
• Signal whitening
removing linear dependence makes residual
like white noise (iid, flat spectrum)
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 11/16
•
Alternative Forms
Many formulations for pth order all-pole IIR:
predictor coefficients {ak } / polynomial A(z)
roots {
i
= ri ej i } of A(z)
reflection coefficients (for lattice filter structure)
Line Spectral Frequencies (LSF)
• Choice depends on:
mathematical convenience
numerical stability
statistical properties (e.g. for coding)
opportunities for modification
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 12/16
4. LPC Synthesis
• LP analysis on ~20ms frames gives
prediction filter A(z) and residual e[n]
recombining them should yield perfect s[n]
coding applications further compress e[n]
Encoder
Decoder
Filter coefficients {ai}
|1/A(ej!)|
Input
s[n]
Represent
& encode
f
LPC
analysis
Excitation
generator
Represent
& encode
Residual
e[n]
^
e[n]
All-pole
filter
H(z) =
t
Output
^
s[n]
1
1 - "aiz-i
e.g. simple pitch tracker ➝ “buzz-hiss” encoding
Pitch period values
16 ms frame boundaries
100
50
0
-50
1.3
1.35
E4896 Music Signal Processing (Dan Ellis)
1.4
1.45
1.5
1.55
1.6
1.65
1.7
1.75
time / s
2013-02-25 - 13/16
LPC Warping
• Replacing delays
with allpass elements
warps frequencies but not magnitudes
z-1
z+
z+1
Original
W
Frequency
8000
P
0.8
A = 0.6
6000
4000
2000
0
0.6
0.5
1
1.5
2
Time
Warped LPC resynth,
0.4
2.5
3
2.5
3
= -0.2
A = -0.6
0
0
0.2
0.4
0.6
0.8
^
W
P
Frequency
8000
0.2
6000
4000
2000
0
0.5
1
1.5
2
Time
http://www.ee.columbia.edu/~dpwe/resources/matlab/polewarp/
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 14/16
Cross-Synthesis
• Mix residual (source) of one signal
with resonances (filter) of another
or: just use white noise as excitation
formants carry phonemes ➝ vocoder
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 15/16
Summary
• Resonances (poles) color sound
• Source + Filter model
decouples excitation and resonances
• Linear Prediction is a simple way to model
and implement resonances (filter)
• Many interpretations, representations,
modifications
E4896 Music Signal Processing (Dan Ellis)
2013-02-25 - 16/16
Download