Adaptive Filtering Algorithms - Home pages of ESAT

advertisement
Digital Audio Signal Processing
Lecture-4: Acoustic Echo Cancellation
Marc Moonen
Dept. E.E./ESAT-STADIUS, KU Leuven
marc.moonen@esat.kuleuven.be
homes.esat.kuleuven.be/~moonen/
Outline
• Introduction
– Acoustic echo cancellation (AEC) problem & applications
– Acoustic channels
• Adaptive filtering algorithms for AEC
– NLMS
– Frequency domain adaptive filters
– Affine projection algorithm (APA)
•
•
•
•
Control algorithm
Post-processing
Loudspeaker non-linearity
Stereo AEC
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 2
Introduction
AEC problem/applications
Suppress echo
– to guarantee normal conversation conditions
– to prevent the closed-loop system from becoming unstable
Applications
– Teleconferencing
– Hands-free telephony
– Handsets
– …
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 3
Introduction
AEC standardization
ITU-T recommendations (G.167) on acoustic echo controllers state that
– Input/output delay of the AEC should be smaller than 16 ms
– Far-end signal suppression should reach 40..45 dB (depending on
application), if no near-end signal is present
– In presence of near-end signals the suppression should be at least
25 dB
– Many other requirements …
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 4
Introduction
Room Acoustics (I)
• Propagation of sound waves in an acoustic environment results in
– signal attenuation
– spectral distortion
• Propagation can be modeled
quite well as a linear filtering
operation
• Non-linear distortion mainly stems from the loudspeakers. This is often
a second order effect and mostly not taken into account explicitly
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 5
Introduction
Room Acoustics (II)
The linear filter model of the acoustic path between loudspeaker and
microphone is represented by the acoustic impulse response
Observe that :
– First there is a dead time
– Then come the direct path impulse
and some early reflections, which
depend on the geometry of the room
– Finally there is an exponentially decaying tail called reverberation, coming
from multiple reflections on walls, objects,...
Reverberation mainly depends on ‘reflectivity’ (rather than geometry) of the room…
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 6
Introduction
Room Acoustics (III)
To characterize the ‘reflectivity’ of a recording room the reverberation
time ‘RT60’ is defined
– RT60 = time which the sound pressure level or intensity needs
to decay to -60dB of its original value
– For a typical office room RT60 is between 100 and 400 ms,
for a church RT60 can be several seconds
ESAT speech laboratory :
Begijnhofkerk Leuven :
RT60  120 ms
RT60  3730 ms
Original speech signal :
Acoustic room impulse responses are highly time-varying !!!!
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 7
Introduction
Acoustic Impulse Response : FIR or IIR ?
• If the acoustic impulse response is modeled as
– an FIR filter  many hundreds to several thousands of filter taps
are needed
– an IIR filter  filter order can be reduced, but still hundreds of filter
coeffs (num. + denom.) may be needed (sigh!)
• Hence FIR models are typically used in practice because...
– these are guaranteed to be stable
– in a speech comms set-up the acoustics are highly time-varying,
hence adaptive filtering techniques are called for (see DSP-CIS):
• FIR adaptive filters : simple adaptation rules, no local minima,..
• IIR adaptive filters : more complex adaptation, local minima
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 8
Introduction
`Conventional’ AEC Techniques
•
•
•
•
Directional loudspeakers and microphones
Voice controlled switching, loss control
Howling control : stability margin improvement of the closed loop by
– frequency shifting
– using comb filters
– removing resonant peaks
Non-linear post-processing, e.g. center clipping
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 9
Outline
• Introduction
– Acoustic echo cancellation (AEC) problem & appls
– Acoustic channels
• Adaptive filtering algorithms for AEC
– NLMS
– Frequency domain adaptive filters
– Affine projection algorithm (APA)
•
•
•
•
Control algorithm
Post-processing
Loudspeaker non-linearity
Stereo AEC
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 10
Adaptive filtering algorithms for AEC
Basic set-up:
• Adaptive filter produces a model for acoustic room impulse response +
an estimate of the echo contribution in microphone signal, which is
then subtracted from the microphone signal
• Thanks to adaptivity
– time-varying acoustics can be tracked
– performance superior to performance of `conventional’ techniques
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 11
Adaptive filtering algorithms for AEC
• Algorithms to be discussed
– Normalized LMS
– Frequency-domain adaptive filter (FDAF)
& partitioned block freq-domain adaptive filter (PB-FDAF)
– Affine projection algorithm (APA)
& fast affine projection algorithm
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 12
Adaptive Filtering Algorithms: NLMS
• NLMS update equations
y[k ]  xTk w k
e[k ]  d [k ]  y[k ]
w k 1  w k 

x xk  
T
k
x k e[k ]
in which
x[k ] 

,
x k  


 x[k  N  1]
 w[0] 
w k    
 w[ N  1]
N is the adaptive filter length,  is the adaptation stepsize,
 is a regularization parameter and k is the discrete-time index
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 13
Adaptive Filtering Algorithms : NLMS
• Pros and cons of NLMS
+ cheap algorithm : O(N)
+ small input/output delay (= 1 sample)
– for colored far-end signals (such as speech) convergence of the
NLMS algorithm is slow
(cfr lambda_max versus lambda_min, etc…., see DSP-CIS)
– large N then means even slower convergence
¤ NLMS is thus often used for the cancellation of short echo paths
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 14
Adaptive Filtering Algorithms
• As some input/output delay is acceptable in AEC (cfr ITU..),
algorithms can be derived that are even cheaper than
NLMS, by exchanging implementation cost for extra
processing delay, sometimes even with improved
performance :
• Frequency-domain adaptive filtering (FDAF)
• Partitioned Block FDAF (PB-FDAF)
+ cost reduction
+ optimal (stepsize) tuning for each subband/frequency bin
separately results in improved performance
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 15
Adaptive Filtering Algorithms: Block-LMS
• To derive the frequency-domain adaptive filter the BLMS
algorithm is considered first
en  dn  X w n
T
n
w n 1  w n  X n e n
in which

x[(n  1) L] 
 x[nL  1]
 d [nL  1] 
 w[0] 
, d  
, w   

X n  




n
n





 x[nL  N  2]  x[(n  1) L  N  1]
d [(n  1) L]
 w[ N  1]
N is # filter taps, L is block length, n is block time index
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 16
Adaptive Filtering Algorithms: Block-LMS
• Both the BLMS convolution and correlation operation are
computationally demanding. They can be implemented
more efficiently in the frequency domain using fast
convolution techniques, i.e. overlap-save/overlap-add :
en  dn  X w n
T
n
convolution
w n 1  w n   X ne n correlation
with
X
(n)
  x[(n  1) L  M  1] 

 ,
 diagF 


 


x
[(
n

1
)
L
]

 
Digital Audio Signal Processing
Version 2013-2014
w
(n)
0 M  L 0  1 ( n ) ( n )
F X w
 0

IL 

overlap-save
0  1 ( n )*  0 
I N
0 0
 F X F e 
M N 

 n
w n 
 F  ,
0
F(m, n)  e
Lecture-4: Acoustic Echo Cancellation
j
2
mn
M
DFT matrix
p. 17
Adaptive Filtering Algorithms: FDAF
Overlap-save FDAF
X
(n)
  x[( n  1) L  M  1] 


 diag F 



 
x[( n  1) L]
 
 
0
y (n )   M L
 0
0  1 ( n ) ( n )
F X w

IL 
0 
d ( n )   M  L ,
 dn 
Will only work if
 d [nL  1] 

dn  



d [( n  1) L]
M  N  L 1
(M is FFT-size)
e( n )  d( n )  y ( n )
w
( n 1)
Digital Audio Signal Processing
w
(n)
I N
 F 
0
Version 2013-2014
0  1 ( n )* ( n )
F X Fe

0M N 
Lecture-4: Acoustic Echo Cancellation
p. 18
Adaptive Filtering Algorithms: FDAF
¤ Typical parameter setting for the FDAF :
N  L,
M  2L,
M  2 , p
p
¤ FDAF is functionally equivalent to BLMS
+ FDAF is significantly cheaper than (B)LMS
for a typical parameter setting
cost LMS
» 20
If N=1024 :
cost FDAF
(=estimate only, in practice <20)
- Input/output delay is equal to 2L-1=2N-1, which may be
unacceptably large for realistic parameter settings : e.g. if
N=1024 and fs=8000Hz  delay is 256 ms !
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 19
Adaptive Filtering Algorithms: PB-FDAF
• Overlap-save PB-FDAF : N-tap full-band filter split into (N/P)
filter sections, P-taps each, then apply overlap-save to each section,
etc. (`P takes the place of N’).
(n)
Xp
Will only work if y
(n)
M  P  L 1
  x[( n  1) L  pP  M  1] 

 ,
 diag F 


 


x
[(
n

1
)
L

pP
]




0 M  L

 0
0
d ( n )   ,
d n 
p  0:
N
1
P
N
P
0  1 1 ( n ) ( n )
F Xp wp
I L 
p 0
 d [nL  1] 

d n  


d [( n  1) L]
e(n)  d (n)  y (n)
( n 1)
(n)
p
p
Version 2013-2014
w
Digital Audio Signal Processing
w
0  1 ( n )* ( n )
I P
 F 
F X p Fe ,

0 M  P Echo
Lecture-4:
 0 Acoustic
 Cancellation
p  0:
N
1
P p. 20
Adaptive Filtering Algorithms: PB-FDAF
¤ Typical parameter setting :
P  L,
M  2L,
M  2 , q
q
¤ PB-FDAF is intermediate between LMS and FDAF (P/N=1)
+ PB-FDAF is functionally equivalent to BLMS
+ PB-FDAF is cheaper than LMS :
If N=1024, P=L=128, M=256 :
cost LMS
6
cost PBFDAF
(estimate)
+ Input/output delay is 2L-1 which can be chosen small, in
the example above the delay is 32 ms, if fs=8000Hz
¤ used in commercial AECs
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 21
Adaptive Filtering Algorithms : PB-FDAF
• PS: Instead of a simple stepsize , subband dependent
stepsizes can be applied
– stepsizes dependent on the subband energy (`subband
normalization’)
– convergence speed increased at only a small extra cost
• PS: PB-FDAF algorithm can be simplified by leaving
I P
F
0
0  1
F

0M P 
out of the weight updating equation (=`unconstrained updating’)
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 22
Adaptive Filtering Algorithms: APA
Affine Projection Algorithm
=intermediate between RLS and NLMS, complexity- as well as performance-wise
NLMS (delta=0) :
APA :
X k  x k
ek   d k   wTk x k
x k 1  x k  P 1 
e k  d k  XTk w k
w k 1  w k  x k (xTk x k ) 1 ek 
w k 1  w k  X k ( XTk X k ) 1 e k
if =1
e' k   d k   wTk 1x k
0
a-posteriori error is 0
Digital Audio Signal Processing
Version 2013-2014
e'k  dk  X wk 1  0
T
k
P last a-posteriori errors are 0
Lecture-4: Acoustic Echo Cancellation
p. 23
Adaptive Filtering Algorithms: APA
Problem with APA : near-end noise amplification
dk
d k noisy  d k  n k
nk
is echo-signal
is near-end noise
U k , Vk
Xk  U k Σk V
T
k
Σk
orthogonal
contains sorted singular
values on diagonal
nk , multiplied by 1  i , appears as `noise in the filter weights ’
w k 1    U k diag(
1
,
1
1  2
, ,
1
P
)VkT n k
Solution : replace ( XTk X k ) 1 by ( XTk X k  I) 1 in update formula
(=`regularization’, similar to delta in NLMS-formula)
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 24
Adaptive Filtering Algorithms: APA
Effect on near-end noise amplification
w k 1    U k diag(
1
,
2
 12    22  
, ,
P
 P2  
)VkT n k
Smaller if more regularization
Effect on adaptation speed

1
2
P
T
w k 1  I  U k diag( 2
, 2
, , 2
)U k  w k
1    2  
P 


 
Slower if more regularization
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 25
Adaptive Filtering Algorithms: Fast-APA
APA complexity, i.e.O(P.N), may be reduced to (roughly)
LMS complexity, i.e. O(N) :
1. `Recursive ’ error vector calculation (delta=0) :
e k   ek  
ek  d k  X w k     

(
1


)
e
*
  
k 1 
Ignore steps 2 & 3
T
k
Ex: mu=1, then lower components
were already nulled @ time k-1
2. Delayed filter vector update :
accumulate filter adaptations based
on vector x_k, apply only when x_k
`leaves ’ the X_k matrix (at time k+P-1)
3. Recursive updating scheme for inverse in
ε k  ( XTk X k ) 1 e k
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 26
Outline
• Introduction
– Acoustic echo cancellation (AEC) problem & appls
– Acoustic channels
• Adaptive filtering algorithms for AEC
– NLMS
– Frequency domain adaptive filters
– Affine projection algorithm (APA)
•
•
•
•
Control algorithm
Post-processing
Loudspeaker non-linearity
Stereo AEC
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 27
Control Algorithm
• Adaptation speed ( ) should be adjusted…
– to the far-end signal power, in order to avoid instability
of the adaptive filter
 stepsize normalization (e.g. NLMS)
– to the amount of near-end activity, in order to prevent
the filter to move away from the optimal solution (see
DSP-II)
 double-talk detection
Double talk refers to the situation where both the far-end and the
near-end speaker are active.
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 28
Control Algorithm
3 modes of operation:
1. Near-end activity (single or double talk) (Ed large)
 e  d  y,   small or 0  FILT
2. No near-end activity, only far-end activity (Ex large, Ed small)
 e  d  y,    max
 FILT+ADAPT
3. No near-end activity, no far-end activity (Ex small, Ed small)
 e  d,   0
 NOP
•Ex is short-time energy of
the far-end signal (p.36)
•Ed is short-time energy of
the desired signal
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 29
Control Algorithm
Double-talk Detection (DTD)
• Difficult problem: detection of speech during speech
• Desired properties
– Limited number of false alarms
– Small delay
– Low complexity
• Different approaches exist in the literature which are
based on
–
–
–
–
Energy
Correlation
Spectral contents
…
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 30
Control Algorithm
Energy-based DTD
Compare short-time energy of far-end and near-end
channel Ex and Ed :
L
2
Ex

x
[
k

i
]
, etc.
– Method 1 :

i 0
If Ed >  Ex  double talk
 is a well-chosen threshold
– Method 2 :
Ex Ee
 2
2
Ex  Ey
If  > 1  double talk
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 31
Post-processing
• Error suppression obtained in practice will
be limited to +/- 30 dB, due to
–
–
–
–
–
–
non-linearities in the signal path (loudspeakers)
time-variations of the acoustic impulse responses
finite length of the adaptive filter
local background noise
failing double-talk detection
…
• A post-processing unit is added to further reduce the
residual signal, e.g. `center clipping’
eout
Digital Audio Signal Processing
Version 2013-2014
ein  

 0
e  
 in
ein  
   ein  
  ein
Lecture-4: Acoustic Echo Cancellation
p. 32
Loudspeaker Non-linearity
If loudspeaker non-linearity is significant (e.g. consumer applications),
then this should be compensated for
• Solution-1: Non-linear model (fixed) in cancellation path
x
Non-linear model
Adaptive filter
y
d
e
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 33
Loudspeaker Non-linearity
• Solution-2: Inverse non-linear model in forward path
Advantage = if successful, also improves loudspeaker
characteristic/sound quality..
x
Inverse non-linear model
Adaptive filter
y
d
e
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 34
Outline
• Introduction
– Acoustic echo cancellation (AEC) problem & appls
– Acoustic channels
• Adaptive filtering algorithms for AEC
– NLMS
– Frequency domain adaptive filters
– Affine projection algorithm (APA)
•
•
•
•
Control algorithm
Post-processing
Loudspeaker non-linearity
Stereo AEC
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 35
S-AEC Problem Statement
Multi-microphone/multi-loudspeaker
systems : complexity for ‘prewhitening’ (APA, RLS) of x can be
shared amongst microphone
channels. Apart from this, different
microphone signals are processed
independently
Hence from now on consider S-AEC
on one microphone only.
Other microphone(s) similarly
(but independently) processed
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 36
S-AEC Problem Statement
Conditioning Problem:
S-AEC input vectors are

x k  xTk ,1

2 N 1
 x1[k ]
x Tk , 2

T
x1[k  1]
...
x1[k  N  1]
| x2 [k ]
...
x2 [k  N  1]
T
Mono : autocorrelation of x-signal (e.g. speech) has
an impact on convergence (see DSP-CIS)
Stereo : also cross-correlation between signals x1 and x2
plays a role now…
Large(r) eigenvalue spread (large(r) condition number)
of correlation matrix -> large(r) impact on convergence !
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 37
S-AEC Problem Statement
Non-uniqueness Problem:
Consider transmission room
impulse responses G1,G2 (length Q)
Assume N  Q then :
x
T
k ,1
x
T
k ,2

 G2 
.
0

  G1 
Hence filter input data matrix X will be singular (with `null-space’)
-> LS solution non-unique, and solutions depend on (changes
in) transmission room (G1,G2) !
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 38
S-AEC Problem Statement
In practice : N  Q
Hence
x
T
k ,1
truncated


G2
T
x k , 2 .
  0
truncated 
  G1


So that X will be (only) ill-conditioned
(instead of rank-deficient)
which however is still bad news…
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 39
S-AEC Fixes
-Reduce correlation between the loudspeaker signals by…
• Complementary comb filters
• White noise insertion (naive solution - large distortion)
• Colored (masked) noise insertion
• Non-linear processing
Disadvantages :
• Signal distortion
• Stereo perception may be affected
-In addition : use algorithms that are less sensitive to the
condition number than NLMS, e.g. RLS, APA, ...
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 40
S-AEC Fixes: Complementary comb filters
Comb-1 for x1, comb-2 for x2
Two channels are decorrelated, BUT stereo image is
distorted if applied below 1 kHz (=psycho-acoustics)
Can be combined with another technique below 1 kHz
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 41
S-AEC Fixes: Noise insertion
Remove all signal content below the masking threshold
Fill with noise (both channels independently)
Correlation between input channels decreases
• Poor performance for speech
• Good performance for music
• Computationally intensive
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 42
S-AEC Fixes: Non-linear processing
xi' (k )  xi (k )  f i ( xi (k )) i  1..2
f i () is often a half wave rectifier

f1 () 
2
 
f 2 () 
2
  0.5 is necessary for good performance, but audible
Good results for speech, audible artifacts in music
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 43
S-AEC Fixes: Non-linear processing
Loudspeakers play original signal
Mismatch
Loudspeakers play processed signal
Time
Digital Audio Signal Processing
Version 2013-2014
Lecture-4: Acoustic Echo Cancellation
p. 44
Download