ch9 (enhancement).ppt

Speech Enhancement
Wiener Filtering:
A linear estimate of the clean signal from the noisy signal, using the MMSE criterion.
$y_t$: clean speech ($K \times 1$)
$v_t$: noise ($K \times 1$)
$z_t$: noisy speech ($K \times 1$)
For additive noise:
$$z_t = y_t + v_t$$
$$\hat{y}_t = E\{y_t \mid z_t\} = a z_t = a\,(y_t + v_t)$$
Projection Theorem:
The mean square error
$$E\{\|y_t - \hat{y}_t\|^2\}$$
is minimum if $a$ is selected such that the error
$$\varepsilon = y_t - \hat{y}_t = y_t - a\,(y_t + v_t)$$
is orthogonal to the noisy signal, i.e.:
$$[\,y_t - a\,(y_t + v_t)\,] \perp (y_t + v_t)$$
$$E\{y_t (y_t + v_t)^H\} = E\{a\,(y_t + v_t)(y_t + v_t)^H\}$$
$H$: Hermitian transposition
Assuming $v_t$ and $y_t$ to be zero-mean and uncorrelated, i.e.,
$$E\{y_t v_t^H\} = E\{v_t y_t^H\} = 0$$
Then we'll have:
$$E\{y_t y_t^H\} = a\,E\{y_t y_t^H + v_t v_t^H\}$$
$$a = E\{y_t y_t^H\}\,\left(E\{y_t y_t^H + v_t v_t^H\}\right)^{-1}$$
Since $y_t$ and $v_t$ are zero-mean:
$$E\{y_t y_t^H\} = \Sigma_{y_t}, \qquad E\{v_t v_t^H\} = \Sigma_{v_t}$$
$$a = \Sigma_{y_t}\,(\Sigma_{y_t} + \Sigma_{v_t})^{-1}$$
This is called the time-domain Wiener filter.
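As a minimal sketch of the time-domain Wiener filter in the scalar case ($K = 1$), the coefficient reduces to $a = \sigma_y^2 / (\sigma_y^2 + \sigma_v^2)$. The variances, sample count, and function names below are illustrative assumptions, not values from the slides. The sketch compares the closed-form coefficient with the least-squares coefficient measured on simulated data.

```python
import random

def wiener_coefficient(var_y, var_v):
    """Scalar case of a = Sigma_y (Sigma_y + Sigma_v)^-1."""
    return var_y / (var_y + var_v)

def empirical_coefficient(y, z):
    """Least-squares a that minimizes sum (y - a z)^2, i.e. E{y z} / E{z z}."""
    num = sum(yi * zi for yi, zi in zip(y, z))
    den = sum(zi * zi for zi in z)
    return num / den

rng = random.Random(0)            # fixed seed so the run is repeatable
var_y, var_v = 4.0, 1.0           # assumed variances, chosen for illustration
n = 20000
y = [rng.gauss(0.0, var_y ** 0.5) for _ in range(n)]   # clean signal
v = [rng.gauss(0.0, var_v ** 0.5) for _ in range(n)]   # uncorrelated noise
z = [yi + vi for yi, vi in zip(y, v)]                  # noisy observation

a_theory = wiener_coefficient(var_y, var_v)   # 4 / (4 + 1)
a_sample = empirical_coefficient(y, z)        # should be close to a_theory
print(a_theory, round(a_sample, 3))
```

The two numbers agree up to sampling error, which is the orthogonality condition of the projection theorem checked numerically.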
We are looking for a frequency-domain Wiener filter, called the non-causal Wiener filter, such that:
$$\hat{y}(t) = \int_{-\infty}^{\infty} h(t - \tau)\, z(\tau)\, d\tau = \int_{-\infty}^{\infty} z(t - \tau)\, h(\tau)\, d\tau$$
According to the projection theorem, for the error
$$E\{[\,y(t) - \hat{y}(t)\,]^2\}$$
to be minimum, the difference
$$\varepsilon(t) = y(t) - \hat{y}(t)$$
has to be orthogonal to the noisy input:
$$E\{[\,y(t) - \hat{y}(t)\,]\, z(\xi)\} = 0 \quad \text{for all } \xi$$



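or
$$E\left\{\left[\,y(t) - \int_{-\infty}^{\infty} z(t - \tau)\, h(\tau)\, d\tau\right] z(\xi)\right\} = 0$$
$$E\{y(t)\, z(\xi)\} = E\left\{\int_{-\infty}^{\infty} z(t - \tau)\, h(\tau)\, d\tau \; z(\xi)\right\}$$
or
$$R_{yz}(t - \xi) = \int_{-\infty}^{\infty} R_{zz}(t - \tau - \xi)\, h(\tau)\, d\tau$$
Substituting $\eta = t - \xi$:
$$R_{yz}(\eta) = \int_{-\infty}^{\infty} R_{zz}(\eta - \tau)\, h(\tau)\, d\tau$$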
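Writing the lag as $\eta$:
$$R_{yz}(\eta) = R_{zz}(\eta) * h(\eta) \quad \text{(convolution)}$$
$$S_{yz}(\omega) = S_{zz}(\omega)\, H(j\omega) \;\Rightarrow\; H(j\omega) = \frac{S_{yz}(\omega)}{S_{zz}(\omega)}$$
$S_{zz}(\omega)$: spectrum of $z(t)$
$S_{yz}(\omega)$: cross-spectrum between $y(t)$ and $z(t)$
$$S_{yz}(\omega) = S_{yy}(\omega)$$
since
$$R_{yz}(\eta) = E\{y(t)\,[\,y(t - \eta) + v(t - \eta)\,]\} = E\{y(t)\, y(t - \eta)\} + E\{y(t)\, v(t - \eta)\} = R_{yy}(\eta)$$
(the second term vanishes because $y$ and $v$ are uncorrelated).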
$$S_{zz}(\omega) = S_{yy}(\omega) + S_{vv}(\omega)$$
$$H(j\omega) = \frac{S_{yy}(\omega)}{S_{yy}(\omega) + S_{vv}(\omega)} = \frac{S_{zz}(\omega) - S_{vv}(\omega)}{S_{zz}(\omega)}$$
Popular form of the Wiener filter
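The popular form $H = (S_{zz} - S_{vv}) / S_{zz}$ can be sketched per frequency bin as follows. The function name and the `floor` parameter are assumptions added for illustration: with estimated spectra the numerator can become negative, so this sketch clamps the gain at a floor, which the slides do not discuss.

```python
def wiener_gain(s_zz, s_vv, floor=0.0):
    """Per-bin Wiener gain H = (S_zz - S_vv) / S_zz.

    s_zz: noisy-signal power spectrum, one value per frequency bin.
    s_vv: noise power spectrum, same length.
    floor: assumed lower bound on the gain (not part of the slides).
    """
    gains = []
    for szz, svv in zip(s_zz, s_vv):
        if szz <= 0.0:
            gains.append(floor)      # empty bin: nothing to recover
            continue
        gains.append(max((szz - svv) / szz, floor))
    return gains

s_zz = [10.0, 4.0, 1.0]   # illustrative noisy spectrum
s_vv = [1.0, 1.0, 2.0]    # illustrative noise spectrum
print(wiener_gain(s_zz, s_vv))  # [0.9, 0.75, 0.0]
```

The third bin shows the clamping: the noise estimate exceeds the observed power, so the gain is set to the floor instead of going negative.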
Spectral Subtraction
$$z_t = y_t + v_t$$
$$Z_t = Y_t + V_t$$
(capital letters denote the DFTs of the corresponding frames)
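In noise-only frames, where $Z_t \approx V_t$, the noise power estimate is updated recursively with a smoothing factor $\lambda_n$:
$$|\hat{V}_t|^2 = \lambda_n\, |\hat{V}_{t,\mathrm{old}}|^2 + (1 - \lambda_n)\, |Z_t|^2$$
The enhanced spectrum keeps the noisy phase:
$$\hat{Y}_t = \left(|Z_t|^2 - |\hat{V}_t|^2\right)^{1/2} e^{\,j \angle Z_t}$$
Equivalently, as a gain applied to the noisy spectrum:
$$H_t = \left(\frac{|Z_t|^2 - |\hat{V}_t|^2}{|Z_t|^2}\right)^{1/2}$$
$$\hat{Y}_t = H_t \cdot Z_t$$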
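A minimal sketch of power spectral subtraction on one frame of DFT coefficients is given below. The function names, the default smoothing value, and the clamping of negative differences to zero (half-wave rectification) are assumptions for illustration; the slides only give the update and gain formulas.

```python
import cmath

def update_noise_power(noise_power_old, z_frame, smoothing=0.9):
    """Recursive noise power update, meant for frames judged to be noise only.

    smoothing plays the role of the smoothing factor in the noise update;
    0.9 is an assumed default, not a value from the slides.
    """
    return [smoothing * n_old + (1.0 - smoothing) * abs(zk) ** 2
            for n_old, zk in zip(noise_power_old, z_frame)]

def spectral_subtract(z_frame, noise_power):
    """Apply H = sqrt((|Z|^2 - |V|^2) / |Z|^2) to each bin, keeping the phase of Z."""
    out = []
    for zk, nk in zip(z_frame, noise_power):
        power = abs(zk) ** 2
        if power == 0.0:
            out.append(0j)
            continue
        # Negative differences are clamped to zero (assumed rectification).
        gain = (max(power - nk, 0.0) / power) ** 0.5
        out.append(gain * zk)        # real gain, so the phase is unchanged
    return out

z_frame = [3 + 4j, 1 + 0j]           # illustrative DFT bins of a noisy frame
noise_power = [9.0, 4.0]             # illustrative noise power estimate
y_hat = spectral_subtract(z_frame, noise_power)
print([abs(c) for c in y_hat])
print(cmath.phase(y_hat[0]), cmath.phase(z_frame[0]))
```

Because the gain is real and non-negative, multiplying by it is the same as replacing the magnitude and reusing the noisy phase.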
1 - Maximum Likelihood (ML): $P(\text{Observations} \mid \text{Parameters})$
2 - Maximum a Posteriori (MAP): $P(\text{Parameters} \mid \text{Observations})$
3 - MMSE: $E\{\text{Parameters} \mid \text{Observations}\}$
$z = \{z_t,\; t = 0, 1, \dots, T-1\}$
$y = \{y_t,\; t = 0, 1, \dots, T-1\}$
1 - ML: $P(z \mid y)$
2 - MAP: $P(y \mid z)$
3 - MMSE: $E\{y \mid z\}$
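To make the difference between the three criteria concrete, here is a small sketch on a discrete toy problem. The candidate values, prior, and likelihood are made-up numbers chosen so that the three estimates differ; none of them come from the slides.

```python
def posterior(prior, likelihood):
    """Bayes rule on a discrete grid: P(y | z) is proportional to P(z | y) P(y)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]

values = [0.0, 1.0, 2.0]          # candidate clean values y (assumed)
prior = [0.5, 0.3, 0.2]           # P(y), assumed
likelihood = [0.1, 0.4, 0.5]      # P(z | y) for the observed z, assumed

post = posterior(prior, likelihood)
idx = range(len(values))

ml_est = values[max(idx, key=lambda i: likelihood[i])]   # maximizes P(z | y)
map_est = values[max(idx, key=lambda i: post[i])]        # maximizes P(y | z)
mmse_est = sum(v * p for v, p in zip(values, post))      # E{y | z}

print(ml_est, map_est, round(mmse_est, 3))
```

Here ML picks the value with the largest likelihood, MAP picks the posterior mode after the prior is taken into account, and MMSE returns the posterior mean, which need not be one of the candidate values.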
MAP Speech Enhancement
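$y_t \in R^K$, $v_t \in R^K$, $z_t \in R^K$
$y = \{y_t,\; t = 0, 1, \dots, T-1\}$
$v = \{v_t,\; t = 0, 1, \dots, T-1\}$
$z = \{z_t,\; t = 0, 1, \dots, T-1\}$
$s = \{s_t,\; t = 0, 1, \dots, T-1\}$, $s_t \in \{1, \dots, M\}$ (state sequence)
$m = \{m_t,\; t = 0, 1, \dots, T-1\}$, $m_t \in \{1, \dots, L\}$ (mixture sequence)
Weight sequence: $q_t(\beta, \gamma \mid y(k))$, $t = 0, \dots, T-1$, $1 \le \beta \le M$, $1 \le \gamma \le L$
$q_t(\beta, \gamma \mid y(k))$: posterior probability of state $\beta$ and mixture $\gamma$ at time $t$, given the current estimate $y(k)$
$$\max_{y} \ln p_{yv}(y, z) = \max_{y} \ln \sum_{s=1}^{M} \sum_{m=1}^{L} p_{yv}(s, m, y, z)$$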
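$$\max_{y} \ln p_{yv}(y \mid z)$$
$y(k) = \{y_t(k),\; t = 0, 1, \dots, T-1\}$, $y_t(k) \in R^K$ (estimate at iteration $k$)
$$\phi(y(k+1)) = \sum_{s,m} p_{yv}(s, m \mid y(k)) \, \ln p_{yv}(s, m, y(k+1) \mid z)$$
Maximizing $\phi(y(k+1))$ gives
$$\ln p(y(k+1) \mid z) \ge \ln p(y(k) \mid z)$$
$$y_t(k+1) = \left[\sum_{\beta,\gamma} q_t(\beta, \gamma \mid y(k)) \, H_{\beta,\gamma}^{-1}\right]^{-1} z_t, \qquad 0 \le t < T$$
$$H_{\beta,\gamma} = \Sigma_{\beta,\gamma} \, (\Sigma_{\beta,\gamma} + \Sigma_{v})^{-1}$$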
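The re-estimation step above can be sketched in the scalar case ($K = 1$), where each $H_{\beta,\gamma}$ is a single Wiener gain. This is only an illustration under strong assumptions: the weights $q_t$ are taken as given (computing them requires the HMM posterior, which is not shown), and the function and argument names are hypothetical.

```python
def wiener_gain(var_speech, var_noise):
    """Scalar H = Sigma (Sigma + Sigma_v)^-1 for one state/mixture pair."""
    return var_speech / (var_speech + var_noise)

def map_update(z_t, weights, speech_vars, noise_var):
    """One frame of y_t(k+1) = [sum q H^-1]^-1 z_t in the scalar case.

    weights: q_t for each (state, mixture) pair, flattened into one list
             and assumed to sum to 1.
    speech_vars: speech variance of each pair, in the same order.
    """
    inv_sum = sum(q / wiener_gain(var, noise_var)
                  for q, var in zip(weights, speech_vars))
    return z_t / inv_sum

# Two equally weighted pairs with gains 0.5 and 0.75 (illustrative numbers).
y_next = map_update(2.0, [0.5, 0.5], [1.0, 3.0], 1.0)
print(y_next)
```

With a single pair of weight 1 the update reduces to the ordinary Wiener filter for that pair; with several pairs it is a weighted harmonic combination of their gains.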
$$\max_{s,m,y} \ln p_{yv}(s, m, y, z)$$
$$\max_{s,m,y} \ln p_{yv}(s, m, y \mid z)$$
$$\max_{s,m} \ln p_{yv}(s, m, y, z)$$
MMSE Speech Enhancement
We try to optimize the function:
$$\hat{g}(y_t) = E\{g(y_t) \mid z_0^t\}$$
$g(\cdot)$ is a function on $R^K$ and
$$z_0^t = \{z_0, \dots, z_t\}$$
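$$\hat{g}(y_t) = \sum_{\beta=1}^{M} \sum_{\gamma=1}^{L} \sum_{\delta=1}^{N} \sum_{\eta=1}^{P} W_t(\beta, \gamma, \delta, \eta \mid z_0^t) \; E\{g(y_t) \mid z_t, s_t = \beta, m_t = \gamma, n_t = \delta, p_t = \eta\}$$
$(s_t, m_t)$: state and mixture of speech; $(n_t, p_t)$: state and mixture of noise
$$W_t(\beta, \gamma, \delta, \eta \mid z_0^t) = P(s_t = \beta, m_t = \gamma, n_t = \delta, p_t = \eta \mid z_0^t) = \frac{G_t(\beta, \gamma, \delta, \eta, z_0^t)}{\sum_{\beta=1}^{M} \sum_{\gamma=1}^{L} \sum_{\delta=1}^{N} \sum_{\eta=1}^{P} G_t(\beta, \gamma, \delta, \eta, z_0^t)}$$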
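The weighting step can be sketched as a normalization followed by a weighted sum. In this sketch the joint values $G_t$ and the conditional expectations are assumed to be already available as dictionaries keyed by $(\beta, \gamma, \delta, \eta)$; the names and numbers are illustrative only.

```python
def posterior_weights(G):
    """W_t = G_t / sum of G_t over all (beta, gamma, delta, eta)."""
    total = sum(G.values())
    return {key: g / total for key, g in G.items()}

def mmse_estimate(G, conditional_means):
    """Weighted sum of E{g(y_t) | z_t, s_t, m_t, n_t, p_t} with weights W_t."""
    W = posterior_weights(G)
    return sum(W[key] * conditional_means[key] for key in G)

# Keys are (speech state, speech mixture, noise state, noise mixture).
G = {(1, 1, 1, 1): 0.02, (2, 1, 1, 1): 0.06}          # assumed joint values
means = {(1, 1, 1, 1): 1.0, (2, 1, 1, 1): 3.0}         # assumed conditional means

print(posterior_weights(G))
print(mmse_estimate(G, means))
```

Only the ratios of the $G_t$ values matter, so in practice they can be rescaled at each frame to avoid numerical underflow without changing the estimate.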


  


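$$G_0(\beta, \gamma, \delta, \eta, z_0) = \pi_{\beta} \, c_{\gamma|\beta} \, \pi_{\delta} \, c_{\eta|\delta} \, b(z_0 \mid \beta, \gamma, \delta, \eta)$$
$$G_t(\beta, \gamma, \delta, \eta, z_0^t) = \sum_{s_0^{t-1}:\, s_t = \beta} \; \sum_{m_0^{t-1}:\, m_t = \gamma} \; \sum_{n_0^{t-1}:\, n_t = \delta} \; \sum_{p_0^{t-1}:\, p_t = \eta} \; \prod_{\tau=0}^{t} a_{s_{\tau-1} s_{\tau}} \, c_{m_{\tau}|s_{\tau}} \, a_{n_{\tau-1} n_{\tau}} \, c_{p_{\tau}|n_{\tau}} \, b(z_{\tau} \mid s_{\tau}, m_{\tau}, n_{\tau}, p_{\tau})$$
(for $\tau = 0$, $a_{s_{-1} s_0}$ and $a_{n_{-1} n_0}$ are read as the initial probabilities $\pi_{s_0}$ and $\pi_{n_0}$)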



 
$$b(z_t \mid s_t, m_t, n_t, p_t) = \frac{\exp\left\{-\frac{1}{2}\, z_t^{Tr} \left(\Sigma_{s_t,m_t} + \Sigma_{n_t,p_t}\right)^{-1} z_t\right\}}{(2\pi)^{K/2} \, \det^{1/2}\left(\Sigma_{s_t,m_t} + \Sigma_{n_t,p_t}\right)}$$
$$E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t\} = \int g(y_t) \, p_{yv}(y_t \mid z_t, s_t, m_t, n_t, p_t) \, dy_t \qquad \text{(Eqn 1)}$$
The computation of Eqn 1 is generally difficult.
For some specific functions, Eqn 1 has been derived.
For instance, when $g(\cdot)$ is defined to be:
$$g_1(y_t) = Y_t(k), \quad k = 0, 1, \dots, K-1$$
where $Y_t(k)$ is the $k$th coefficient of the DFT of $y_t$,
Eqn 1 is equivalent to the popular Wiener filter.
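Recursive Formula for G:
$$G_t(\beta, \gamma, \delta, \eta, z_0^t) = \sum_{\beta'=1}^{M} \sum_{\gamma'=1}^{L} \sum_{\delta'=1}^{N} \sum_{\eta'=1}^{P} G_{t-1}(\beta', \gamma', \delta', \eta', z_0^{t-1}) \; a_{\beta'\beta} \, a_{\delta'\delta} \, c_{\gamma|\beta} \, c_{\eta|\delta} \, b(z_t \mid \beta, \gamma, \delta, \eta)$$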
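The recursion for $G$ is a forward pass over the composite speech/noise model. The sketch below assumes scalar observations ($K = 1$), so the likelihood is a zero-mean Gaussian whose variance is the sum of the speech and noise variances. All function names, argument layouts, and example numbers are assumptions made for illustration.

```python
import math
from itertools import product

def gaussian(z, var):
    """Zero-mean scalar Gaussian density, the K = 1 case of b(z_t | ...)."""
    return math.exp(-0.5 * z * z / var) / math.sqrt(2.0 * math.pi * var)

def forward_G(z_seq, pi_s, a_s, c_s, var_s, pi_n, a_n, c_n, var_n):
    """Return G_t for the last frame, keyed by (beta, gamma, delta, eta).

    c_s[beta][gamma] is the speech mixture weight c_{gamma|beta} and
    var_s[beta][gamma] the speech variance; *_n are the noise counterparts.
    Indices are 0-based here.
    """
    M, L = len(c_s), len(c_s[0])
    N, P = len(c_n), len(c_n[0])
    keys = list(product(range(M), range(L), range(N), range(P)))

    def emit(z, be, ga, de, et):
        # Independent additive Gaussians: the variances add.
        return (c_s[be][ga] * c_n[de][et]
                * gaussian(z, var_s[be][ga] + var_n[de][et]))

    # t = 0: initial probabilities take the place of the transitions.
    G = {(be, ga, de, et): pi_s[be] * pi_n[de] * emit(z_seq[0], be, ga, de, et)
         for be, ga, de, et in keys}

    for z in z_seq[1:]:
        G_prev, G = G, {}
        for be, ga, de, et in keys:
            # Sum over every previous (beta', gamma', delta', eta').
            reach = sum(g * a_s[k[0]][be] * a_n[k[2]][de]
                        for k, g in G_prev.items())
            G[(be, ga, de, et)] = reach * emit(z, be, ga, de, et)
    return G

# Two speech states with one mixture each, one noise state (assumed numbers).
G_last = forward_G(
    z_seq=[0.2, -1.5, 3.0],
    pi_s=[0.6, 0.4], a_s=[[0.9, 0.1], [0.2, 0.8]],
    c_s=[[1.0], [1.0]], var_s=[[0.5], [4.0]],
    pi_n=[1.0], a_n=[[1.0]], c_n=[[1.0]], var_n=[[1.0]],
)
total = sum(G_last.values())
print({k: g / total for k, g in G_last.items()})   # the weights W_t
```

The cost per frame is quadratic in the number of composite states, instead of growing exponentially with $t$ as the explicit sum over all sequences would.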




Automatic Noise Type Selection
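Nonstationary-State HMM
$y_t$ ($K \times 1$):
$$y_t = g_t + N_t$$
$g_t$: deterministic function
$N_t$: stationary residual (assumed to be an i.i.d. zero-mean Gaussian source)
NS-HMM parameters:
$$\lambda = \{\pi, a, c, \Theta, \Sigma\}$$
$\Theta = \{\theta_{i,m}\}$: parameters of the deterministic function $g_t(\theta_{i,m})$
$\Sigma = \{\Sigma_{i,m}\}$: covariance of $N_t$
$i = 1, 2, \dots, M$; $m = 1, 2, \dots, L$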
Nonstationary-State HMM
For example, if the deterministic function is assumed to be polynomial,
$$y_t = \sum_{r=0}^{R} B_{i,m}(r) \, h_r(t - \tau_i) + N_t(\Sigma_{i,m}), \qquad m = 1, 2, \dots, L$$
$\tau_i$: the starting time of the visit to the $i$th state
$h_r$: an $r$th-order polynomial (usually orthogonal)
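As a minimal sketch of the polynomial trend, the function below evaluates the state-dependent mean $\sum_r B(r)\, h_r(d)$ for a scalar observation, where $d = t - \tau_i$ is the time already spent in the state. Plain monomials $h_r(d) = d^r$ are used as an assumption to keep the example short; the slides note that orthogonal polynomials are the usual choice, and any such basis can be passed in through the hypothetical `basis` argument.

```python
def trend(B, d, basis=None):
    """Deterministic part of y_t for one state and mixture.

    B: coefficients B(0), ..., B(R) of that state/mixture.
    d: time since the state was entered, d = t - tau_i.
    basis: callable (r, d) -> h_r(d); defaults to monomials d**r (assumed).
    """
    if basis is None:
        basis = lambda r, x: x ** r
    return sum(b_r * basis(r, d) for r, b_r in enumerate(B))

B = [1.0, 0.5, -0.1]                 # illustrative second-order trend
for d in range(4):
    print(d, trend(B, d))            # mean trajectory inside the state
```

With $R = 0$ the trend is a constant and the model falls back to an ordinary stationary-state HMM mean.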
$$b_t(j, m, d) = \frac{1}{(2\pi)^{K/2} \, |\Sigma_{j,m}|^{1/2}} \exp\left\{-\frac{1}{2} \left[y_t - \sum_{r=0}^{R} B_{j,m}(r) \, h_r(d)\right]^{Tr} \Sigma_{j,m}^{-1} \left[y_t - \sum_{r=0}^{R} B_{j,m}(r) \, h_r(d)\right]\right\}$$
Segmentation Algorithm in NS-HMM
$(s_0, s_1, \dots, s_{T-1})$: state sequence
$(y_0, y_1, \dots, y_{T-1})$: observation sequence
$(d_0, d_1, \dots, d_{T-1})$: duration sequence
$$\delta_t(j, m, d) = \max_{s_0, s_1, \dots, s_{t-1}} p(s_0, s_1, \dots, s_t = j, m_t = m, d_t = d, y_0, y_1, \dots, y_t \mid \lambda)$$
$$\psi_t(j, m, d) = \arg\max_{s_0, s_1, \dots, s_{t-1}} p(s_0, s_1, \dots, s_t = j, m_t = m, d_t = d, y_0, y_1, \dots, y_t \mid \lambda)$$
$\psi_t$ stores the best predecessor as a triple $(i, v, \tau)$ of state, mixture and duration.
Segmentation Algorithm in NS-HMM
1 - Initialization:
$$\delta_0(j, m, 0) = \pi_j \, c_{m|j} \, b_0(j, m, 0 \mid \lambda), \qquad 1 \le m \le L,\; 1 \le j \le M$$
2 - Recursion for $d = 0$ (entering a new Markov state):
$$\delta_t(j, m, 0) = \max_{0 \le \tau \le t-1} \; \max_{1 \le v \le L} \; \max_{i \ne j} \; \delta_{t-1}(i, v, \tau) \, a_{ij} \, c_{m|j} \, b_t(j, m, 0 \mid \lambda)$$
$$\psi_t(j, m, 0) = \arg\max_{i, v, \tau} \; \delta_{t-1}(i, v, \tau) \, a_{ij}$$
for $0 < t < T$, $1 \le j \le M$, $1 \le m \le L$
3 - Recursion step for $d > 0$ (self-looping):
$$\delta_t(j, m, d) = \delta_{t-1}(j, m, d-1) \, a_{jj} \, c_{m|j} \, b_t(j, m, d \mid \lambda)$$
$$\psi_t(j, m, d) = (j, m, d-1)$$
for $0 < t < T$, $1 \le j \le M$, $1 \le m \le L$, $0 < d \le t$
(assuming the mixture is not changed within a state)
4 - Termination:
$$p^* = \max_{1 \le i \le M} \; \max_{1 \le m \le L} \; \max_{0 \le d \le T-1} \; \delta_{T-1}(i, m, d)$$
$$(s^*_{T-1}, m^*_{T-1}, d^*_{T-1}) = \arg\max_{1 \le i \le M} \; \max_{1 \le m \le L} \; \max_{0 \le d \le T-1} \; \delta_{T-1}(i, m, d)$$
5 - Backtracking:
$$(s^*_t, m^*_t, d^*_t) = \psi_{t+1}(s^*_{t+1}, m^*_{t+1}, d^*_{t+1})$$
for $t = T-2, T-3, \dots, 0$
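The five steps of the segmentation algorithm can be sketched as a Viterbi search over (state, mixture, duration) triples. This is a simplified illustration, not the authors' implementation: observations are scalar, the trend uses monomials $h_r(d) = d^r$, probabilities are handled in the log domain to avoid underflow, and every name and number is an assumption.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """log(p) with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def log_b(y, B, var, d):
    """Log of b_t(j, m, d) for scalar y with a monomial trend (assumed basis)."""
    mean = sum(b_r * d ** r for r, b_r in enumerate(B))
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (y - mean) ** 2 / var

def segment(y_seq, pi, a, c, B, var):
    """Return (best log-probability, best path of (state, mixture, duration))."""
    T, M, L = len(y_seq), len(pi), len(c[0])
    delta = [dict() for _ in range(T)]
    psi = [dict() for _ in range(T)]
    # 1 - Initialization
    for j in range(M):
        for m in range(L):
            delta[0][(j, m, 0)] = (safe_log(pi[j]) + safe_log(c[j][m])
                                   + log_b(y_seq[0], B[j][m], var[j][m], 0))
    for t in range(1, T):
        for j in range(M):
            for m in range(L):
                # 2 - d = 0: enter state j from the best (i, v, tau), i != j
                best, arg = NEG_INF, None
                for (i, v, tau), score in delta[t - 1].items():
                    cand = score + safe_log(a[i][j])
                    if i != j and cand > best:
                        best, arg = cand, (i, v, tau)
                if arg is not None:
                    delta[t][(j, m, 0)] = (best + safe_log(c[j][m])
                                           + log_b(y_seq[t], B[j][m], var[j][m], 0))
                    psi[t][(j, m, 0)] = arg
                # 3 - d > 0: self-loop, mixture kept fixed within the state
                for d in range(1, t + 1):
                    prev = (j, m, d - 1)
                    if prev in delta[t - 1]:
                        delta[t][(j, m, d)] = (delta[t - 1][prev] + safe_log(a[j][j])
                                               + safe_log(c[j][m])
                                               + log_b(y_seq[t], B[j][m], var[j][m], d))
                        psi[t][(j, m, d)] = prev
    # 4 - Termination
    last = max(delta[T - 1], key=delta[T - 1].get)
    # 5 - Backtracking
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return delta[T - 1][last], path

# State 0: constant mean 0. State 1: linear trend 5 + d. One mixture each.
y_seq = [0.1, -0.2, 5.1, 5.9, 7.2]
score, path = segment(
    y_seq, pi=[1.0, 0.0], a=[[0.9, 0.1], [0.1, 0.9]],
    c=[[1.0], [1.0]], B=[[[0.0]], [[5.0, 1.0]]], var=[[1.0], [1.0]],
)
print(path)
```

In this toy run the path stays in state 0 for two frames and then enters state 1, with the duration counter restarting at 0 on entry so that the trend is evaluated relative to the segment start.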
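Now we generalize the MMSE formulae for the NS-HMM:
$$E\{g(y_t) \mid z_0^t\} = \sum_{\beta=1}^{M} \sum_{\gamma=1}^{L} \sum_{\delta=1}^{N} \sum_{\eta=1}^{P} \sum_{d=1}^{t} W_t(\beta, \gamma, \delta, \eta, d \mid z_0^t) \int g(y_t) \, f_y(y_t \mid s_t = \beta, m_t = \gamma, n_t = \delta, p_t = \eta, d_t = d) \, dy_t$$
$$= \sum_{\beta, \gamma, \delta, \eta, d} W_t(\beta, \gamma, \delta, \eta, d \mid z_0^t) \; E\{g(y_t) \mid s_t = \beta, m_t = \gamma, n_t = \delta, p_t = \eta, d_t = d\}$$
$$W_t(\beta, \gamma, \delta, \eta, d \mid z_0^t) = \frac{G_t(\beta, \gamma, \delta, \eta, d, z_0^t)}{\sum_{\beta} \sum_{\gamma} \sum_{\delta} \sum_{\eta} \sum_{d} G_t(\beta, \gamma, \delta, \eta, d, z_0^t)}$$
for $1 \le \beta \le M$, $1 \le \gamma \le L$, $1 \le \delta \le N$, $1 \le \eta \le P$, $1 \le d \le t$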
For the calculation of $E\{g \mid \dots\}$, $g(y_t)$ has to be specified. It has been shown that for
$$g(y_t) = y_t(k), \quad k = 0, \dots, K-1$$
($y_t(k)$: $k$th component of the DFT of $y_t$)
the computation cost is lower than for other functions.
A linear estimation using the MMSE criterion has shown that the conditional density of the $k$th component of $g$ is Gaussian, i.e., in
$$E\{g(k) \mid z_t, s_t, m_t, n_t, p_t, d_t\} = \int Y_t(k) \, f_{yv}(Y_t(k) \mid z_t, s_t, m_t, n_t, p_t, d_t) \, dY_t(k)$$
the density $f_{yv}$ is Gaussian with mean
$$\tilde{H}_{s_t, m_t, n_t, p_t, d_t}(k) \, Z_t(k)$$
where $Z_t(k)$ is the $k$th component of the DFT of $z_t$ and $\tilde{H}_{s_t, m_t, n_t, p_t, d_t}(k)$ is the $k$th component of the Wiener filter for the corresponding state, mixture and duration of speech and noise.
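Recursive calculation of G, with duration constraints.
For entering a new state:
$$G_t(\beta, \gamma, \delta, \eta, 0, z_0^t) = \sum_{\beta'=1}^{M} \sum_{\gamma'=1}^{L} \sum_{\delta'=1}^{N} \sum_{\eta'=1}^{P} \sum_{d=0}^{t-1} G_{t-1}(\beta', \gamma', \delta', \eta', d, z_0^{t-1}) \; a_{\beta'\beta} \, a_{\delta'\delta} \, c_{\gamma|\beta} \, c_{\eta|\delta} \, b(z_t \mid \beta, \gamma, \delta, \eta, 0)$$
For staying in the old state:
$$G_t(\beta, \gamma, \delta, \eta, d, z_0^t) = \sum_{\delta'=1}^{N} \sum_{\eta'=1}^{P} G_{t-1}(\beta, \gamma, \delta', \eta', d-1, z_0^{t-1}) \; a_{\beta\beta} \, a_{\delta'\delta} \, c_{\gamma|\beta} \, c_{\eta|\delta} \, b(z_t \mid \beta, \gamma, \delta, \eta, d), \qquad 1 \le d \le t$$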



