3. NPCA-based Variable Forgetting Factor RLS Algorithm

advertisement
A Nonlinear PCA-based Variable Forgetting
Factor RLS Algorithm for Blind Source
Separation
Liu Yang, Hang Zhang
Institute of Communication Engineering, PLA University of Science and
Technology, Nanjing, 210007, China
Abstract
The forgetting factor plays an important role in the recursive least-squares (RLS)
algorithm. This parameter leads to a compromise between the tracking capability
and the stability. In this paper, the influence aroused by the forgetting factor is
discussed and a variable forgetting factor RLS algorithm based on nonlinear
principal component analysis (NPCA) is proposed for blind source separation
(BSS). Simulations are performed in static and non-stationary environments,
respectively. The results show that the new RLS algorithm has faster
convergence rate and better tracking capability than the fixed forgetting factor
one. And in the meantime, it has good anti-jamming performance.
Keywords:variable forgetting factor, recursive least-squares (RLS), blind source
separation (BSS), nonlinear principal component analysis (NPCA)
1.
Introduction
Recent years, BSS has gained great popularity due to its adaptation to a wider
range of applications. For it has the ability to recover a set of source signals only
from the output observed signals without any training data or priori information.
BSS techniques have found research in a variety of areas including wireless
communication, speech processing, bioelectricity processing and image
processing. One of the valuable applications of BSS is the communication antijamming which is just what will be discussed in this paper.
The convergence rate, tracking capability and stability of BSS problems have
always been receiving considerable attention by people. And lots of efforts have
been made to promote the performance. The RLS algorithm, as one typical
adaptive algorithm known for its faster convergence rate and better stability than
the LMS algorithm has been extensively used. By incorporating nonlinear PCA
(NPCA) with the project approximation subspace tracking (PAST) algorithm [1],
Pajunen et al. [2] presented the RLS algorithm for BSS. By using the natural
gradient on the Stiefel manifold to minimize a NPCA criterion, X. Zhu [3]
proposes a new adaptive RLS algorithm with prewhitening for BSS. In addition,
Kalman filtering-based RLS algorithms have also been discussed in [4, 5].
However, the papers are lack of the detailed discussion about the influence
aroused by the forgetting factor. The forgetting factor represents the degree of
forgetfulness for past data, if it is large, the algorithm “forgets” the past data
slowly, which leads to a better stability but slower convergence rate. And if it is
small, the case is on the contrary. Then the question arises, if there is a way to
make the forgetting factor change adaptively according to the separating matrix
and observed signals, the performance of convergence speed and tracking
capability may be improved under the condition of the guaranteed stability. As a
result, in this paper, we propose a RLS algorithm with variable forgetting factor,
and the mechanism that controls the forgetting factor is based on the NPCA
criterion. It has been proved that both in static and non-stationary environment,
the variable forgetting factor RLS algorithm has faster convergence rate and
better tracking capability than the fixed forgetting factor one. In addition, it is
demonstrated that the new algorithm has good anti-jamming ability.
The rest of the paper is organized as follows: Section 2 introduces the BSS
model for instantaneous mixture and the NPCA criterion. Section 3 presents the
NPCA-based variable forgetting factor RLS algorithm and the simulation results
are given in section 4. Finally, some conclusions are drawn in section 5.
2.
Problem Formulation and the NPCA Criterion
2.1 The BSS model
We consider the simple instantaneous mixture of BSS and its universal model is
shown in Fig. 1.
Sources
n1 (t ) Mixtures
x1 (t )
s1 (t )
s2 (t )
+
…
Mixing
system
A(t)
+
n2 (t ) …
+
sN (t )
x2 (t )
nN (t )
xN (t )
Whitened Mixtures
v1 (t )
Whitening v (t )
2
system
V(t)
…
Separating
system
W(t)
vN (t )
Estimated
Sources
y1 (t )
y2 (t )
…
yN (t )
Unkown
Fig. 1 Instantaneous mixture model of BSS
Mathematically, the BSS problem is formulated as
x(t )  A(t )s(t )  n(t )
(1)
where s(t )=[s1 (t),s2 (t),... ,s N (t)]T is the vector consists of N unknown source
signals which are assumed to be mutually independent. x(t )=[x1 (t),x 2 (t),... ,x N (t)]T
and n(t )=[n1 (t),n 2 (t),... ,n N (t)]T represent the observed signals and noise,
respectively. A (t ) denotes a n  n mixing matrix and if A (t ) changes by time,
then the system is time-varying.
In general, especially for the NPCA criterion which we will introduce later
needs a pre-whitening for the mixtures x(t ) . We adopt the RLS-type whitening
algorithm to prewhiten x(t ) on-line. Define v(t)=[v1 (t), v2 (t),... , vN (t)]T as the
whitened mixture, then
v(t )  V (t )x(t )
(2)
where V (t) is the whitening matrix which is adaptively updated in the form of
V (t +1)=
where  

1 
1

 [V (t ) 
v(t ) vT (t )
V (t )]
  vT (t ) v(t )
(3)
, and  denotes the forgetting factor we will discuss later.
The purpose of BSS is to recover the source signals from the mixtures by the
following model
y (t )=W(t)v(t )
(4)
T
where W(t) denotes the separating matrix, and y(t )=[y1 (t),y2 (t),... ,y N (t)] is an
estimation of source signals. As a matter of the uncertainty of the amplitude and
the order of BSS, it is accounted to be an effective BSS only if the overall
separating matrix B  WVA is a generalized permutation matrix in which only
one nonzero element is at each row or each column.
2.2 The NPCA Criterion
Once the observed signals are pre-whitened, it is easy to prove that the
separating matrix is orthogonal [6]. Then we can use the NPCA criterion to
determine the orthogonal separating matrix. The contrast function of NPCA is
formulated as follow:

J (W(t))=E v(t )  WT (t ) f ( W(t ) v(t ))
2

(5)
where E  denotes the expectation operation and f () is a nonlinear activation
function which is determined by the distribution of the sources. And through the
nonlinear function, high order statistics information is lead in to solve the BSS
problem by principle component analysis.
3.
NPCA-based Variable Forgetting Factor RLS Algorithm
Since the expectation operation in (5) is not easy for computation, and in order to
consider the influence aroused by both of the current data and the past data,
forgetting factor is introduced. Replacing the mean-square error in (5) by
exponentially weighted sum of error squares, and approximating the unknown
vector f (W (t)v (t)) by f (W(t-1)v(t)) [1], we have the least-squares type criterion
[2]
t
 W J (W(t))=  t -i v(i)  WT (t ) f ( W(t  1) v(i ))
(6)
i=1
in which  is the positive number less than unity equals to which we have just
mentioned in the pre-whitening part.
The ordinary gradient of J (W(t)) in (6) with respect to W(t) is
t
 W J (W(t))=  t -i f  y (i)  f [y (i)]T W(t)  v (i)T 
(7)
i=1
It is widely recognized that for Euclidean space, stochastic gradient denotes the
steepest descent direction. However, in Riemannian space, the natural gradient
represents the true steepest descent direction [7] .As a result, we use the natural
gradient instead of the stochastic gradient in classical RLS algorithm. Then the
natural gradient with orthogonality constraint is given by
~
W J (W(t))=W(t)WT (t )W J (W(t))-WW J (W(t))W
(8)
Computing the above natural gradient, and letting it equal to zero, we will get the
optimal weighting matrix at time t described by
t
t
i 1
i 1
W(t )opt  [  t  i y (i)zT (i)]1  [  t  i z(i) vT (i)]
(9)
Applying the matrix inverse lemma to the solution of the optimal separating
matrix, the RLS algorithm can be summarized as follows:
z (t )  f ( W(t  1) v(t ))  f (y (t ))
(10)
G (t ) 
H(t  1)y (t )
  zT (t )H(t  1)y (t )
H(t )   1[H(t  1)  G(t )zT (t )H(t 1)]
(11)
(12)
(13)
W(t )  W(t  1)  H(t )z(t ) vT (t )  G(t )zT (t )W(t 1)
The forgetting factor  is a key parameter for the above iteration. A small 
leads to a quick forgetfulness for the past data but the stability is declined. On
the contrary, a large  makes it “forget” the past data slowly and the stability
will increase simultaneously. To accelerate the convergence speed and tracking
capability of the RLS algorithm under the condition of not declining the stability,
we make the forgetting factor  change adaptively in accordance with the cost
function in (5). To simplify the computing, we use the current observed data
instead of the expectation data. Then the cost function of W(t ) can be written as
1
J (W (t))= ||  (t ) ||2
2
Where  (t )  v(t )  WT (t  1)z(t ) .
The gradient of J (W(t)) in (14) with respect to  (t ) is
 (t )
 J (W(t))=
  (t )  zT (t )ψ(t  1) (t )
 (t )
in which
ψ (t  1) 
W(t  1)
 (t  1)
(14)
(15)
(16)
Let S (t ) denotes the derivative of H (t ) with respect to  (t )
S(t ) 
H(t )
 (t )
(17)
Substituting (12) into (11), we can obtain
G (t )  H(t )y (t )
then using (13), (17), and (18) in (16) yields
(18)
ψ(t )  [I  G(t )zT (t )]ψ(t  1)  S(t )z(t ) vT (t )  S(t )y(t )zT (t )W(t 1)
For the recursion to compute S (t ) , we first use (11) and (12) to write
H (t )   1 (t ) H(t  1) 
 1 (t )H(t  1) y (t ) z T (t ) H(t  1)
 (t )  z T (t ) H(t  1) y (t )
(19)
(20)
Hence, S (t ) can be updated by the following equation
S(t )   1 (t )[I  G(t )zT (t )]S(t  1)[I  z(t )GT (t )]   1(t )G(t )GT (t )   1(t )H(t ) (21)
As a result, the forgetting factor may be adaptively updated by the following
equation
(22)
 (t )   (t  1)   J (W(t ))   (t  1)   zT (t )ψ(t  1) (t )
Where  is a small, positive learning-rate. Thus, we may summarize the RLS
algorithm with variable forgetting factor as follows:
Start with the initial values W(1) , P(1) , S (1) , ψ (1) and  (1) , compute for
t  1 :the first four steps are from (10) to (13), and the following three steps are
(19), (21) and (22). We should notice that to avoid the case that  (t ) is too large
or too small, we should give a upper level  and a lower level  of truncation
through experimentation.
4.
Simulation results
In this part, section A shows the performance of the NPCA-based variable
forgetting factor RLS algorithm compared with the fixed forgetting factor one
both in static and non-stationary environment; and section B shows the antijamming performance of the algorithms in non-stationary environment. For
convenience, the two algorithms are called V-RLS and F-RLS, respectively. The
sources include one multi-tone interference and one BPSK signal, in which the
sample frequency is f s =76.8kHz , the carrier frequency is fc =9.6kHz , the bit rate
is fb =2.4kb/s , the shaping filter is raised cosine filter with the roll off factor of
  0.5 , the up sampling rate is nsamp  4 of the BPSK signal; And the
frequencies of the multi-tone interference are f1  8.4kHz , f 2  9.6kHz and
f3  10.6kHz .
In order to evaluate the performance of the algorithms, we use the
performance index (PI) which is a function of the overall separating matrix B
given as

 1 n  n
bij
bij
1 n  n
 1    
 1
(30)



 n j 1  i 1 max bkj
n i 1  j 1 max bik
k


k


in which bij is the ij th element of B . Just as we have mentioned above, only
PI(B)=
when B  WVA is a generalized permutation matrix, the algorithm achieves the
perfect performance. In other words, the smaller of PI(B) , the better of the
performance.
A. PI Curves in Static and Non-stationary Environment
We set the forgetting factor of F-RLS at   0.995 ,   0.985 and   0.975 for
comparison, the initial forgetting factor of V-RLS is  (1)  0.975 , and the upper
level is   0.9999 , the lower level is   0.96 .The learning rate for V-RLS is
  1  106 . Both of the two algorithms use the same initial matrix: W (1)  I ,
P (1)  I , S(1)  I and ψ (1)  I . Furthermore, the same nonlinear function
f (t )  tanh(t ) is used, and the JSR is 5dB, SNR is 18dB for both. We perform
100 independent runs in each simulation.
Fig. 2 shows the PI curves for static environment, in which the mixing matrix
A is constant over each simulation. We can see from the figure that the forgetting
factor heavily affects the convergence rate and the stability. Firstly, when
  0.995 for F-RLS, its stability is worse than or almost the same as V-RLS, but
it converges much slower than V-RLS; secondly, when   0.985 for F-RLS, its
convergence speed is accelerated but in the meantime, the stability is decreased.
Thirdly, when   0.975 for F-RLS, which is the same as the initial value of VRLS, it is clear that the two algorithms have almost the same convergence speed,
but the stability of F-RLS is the worst. Then we can conclude that when the same
stability is achieved, V-RLS has a faster convergence speed than F-RLS.
1.4
F-RLS(=0.995)
F-RLS(=0.985)
1.2
F-RLS(=0.975)
V-RLS((1)=0.975)
1
0.8
PI
0.08
0.06
0.6
0.04
0.02
0.4
0
2000
2100
2200
2300
0.2
0
0
500
1000
1500
2000
2500 3000
Iterations
3500
4000
4500
5000
Fig.2 PI curves in static environment
Fig. 3 and Fig. 4 show the PI curves for non-stationary environment, in which
the mixing matrix A (t ) changes by the way in [8]:
A(t )  A 0  R (t )
R (t )   R (t  1)   σ
(31)
where R (t ) represents a small perturbation matrix having the same size as A 0 ,
and the initial value of R (t ) is set as zeros with dimension n  n ,   0.9 ,  is
the parameter that controls the changing speed of the environment, and it is set at
  0.01 and   0.05 in Fig. 3 and Fig. 4, respectively. σ denotes a normal
distribution random variable with zero mean and standard deviation one. We can
see that the larger of  , the worse of the stability. Although the stability is worse
than that of the static environment, we can achieve the similar conclusion as that
of Fig. 2. Furthermore, we can conclude that the proposed V-RLS algorithm has
better tracking capability in time-varying system than the F-RLS one.
1.4
F-RLS(=0.995)
F-RLS(=0.985)
1.2
F-RLS(=0.975)
V-RLS((1)=0.975)
1
0.8
PI
0.1
0.08
0.6
0.06
0.4
0.04
1800
1900
2000
2100
0.2
0
0
500
1000
1500
2000
2500 3000
Iterations
3500
4000
4500
5000
Fig.3 PI curves in non-stationary environment (  =0.01)
1.4
F-RLS(=0.995)
F-RLS(=0.985)
1.2
F-RLS(=0.975)
V-RLS((1)=0.975)
1
0.24
0.8
PI
0.22
0.2
0.6
0.18
0.16
2900
0.4
3000
3100
3200
0.2
0
0
500
1000
1500
2000
2500 3000
Iterations
3500
4000
4500
5000
Fig.4 PI curves in non-stationary environment (  =0.05)
B. Anti-jamming Performance
Concerning that the anti-jamming performance in steady environment is surely
better than that for non-stationary environment, for simplicity, we just give the
performance of BER versus SNR in non-stationary environment. The other
parameters are the same as that of Section A,   0.01 . The SNR is in [0, 10] dB.
From Fig. 5, we can see that both the two algorithms perform better than the
unhandled mixed signal. And the performance is not quite sensitive to the
forgetting factor. Then we can come to the conclusion that under the condition of
faster convergence rate and better tracking capability, the V-RLS blind source
separation algorithm has good anti-jamming performance.
0
10
-1
10
-2
BER
10
-3
10
-4
Mixed signal
10
BPSK separated by V-RLS((1)=0.975)
BPSK separated by F-RLS( =0.995)
-5
BPSK separated by F-RLS( =0.985)
10
BPSK separated by F-RLS( =0.975)
-6
10
0
1
2
3
4
5
SNR(dB)
6
7
8
9
10
Fig.5 Anti-jamming performance in non-stationary environment
5.
Summary
In this paper, a NPCA-based variable forgetting factor RLS algorithm is
proposed. Simulations using sources of communication signal and multi-tone
interference in both static environment and non-stationary environment show
that it has faster convergence rate and better tracking capability than the fixed
forgetting factor one. In addition, the proposed algorithm has good anti-jamming
performance.
References
[1] B. Yang, “Projection approximation subspace tracking”, IEEE Trans. Signal
Processing, vol. 43, no. 1, pp. 95-107, Jan. 1995.
[2] P. Pajunen and J. Karhunen, “Least-squares methods for blind source
separation based on nonlinear PCA”, Int. J. Neural Syst., vol. 8, no. 5-6, pp.
601-612, Dec. 1998.
[3] X. Zhu and X. Zhang, “Adaptive RLS algorithm for blind source separation
using a natural gradient”, IEEE Signal Processing Letters, vol. 9, no. 12, pp.
432-435, Dec. 2002.
[4] Q. Lv, X. Zhang and Y. Jia, “Kalman filtering algorithm for blind source
separation”, IEEE Intel. Conf. on ICASSP, pp. 257-260, March 18-23, 2005.
[5] F. Gu, H. Zhang and Y. Xiao, “Kalman filtering algorithm for blind
separation of convolutive mixtures”, Intel. Conf. on ISSPA, pp. 1045-1049,
July 2, 2012.
[6] A. Hyvarinen, J. Karhunen and E. Oja, “Independent component analysis”,
Beijing: Prentice Hall, pp. 103-104, 2007.
[7] S. Amari, “Natural gradient works efficiently in learning”, Neural Comput,
vol. 10, pp. 251-276, 1998.
[8] F. Gu, H. Zhang, X. Tan et al., “RLS algorithm for blind source separation in
non-stationary environments”, IEEE Intel. Conf. on CIICT, July 4-5, 2012.
Author A was born in Jiangsu Province, China, in 1978. He received
the B.S. degree from the University of Science and Technology of
China (USTC), Hefei, in 2001 and the M.S. degree from the
University of Florida (UF), Gainesville, in 2003, both in electrical
engineering. He is currently pursuing the Ph.D. degree with the
Department of Electrical and Computer Engineering, UF. His
research interests include spectral estimation, array signal processing,
and information theory
Firstname Lastname includes the biography here.
Download