Matrix Inversion Lemma and Information Filter

Mohammad Emtiyaz Khan
Honeywell Technology Solutions Lab, Bangalore, India.

Abstract

In this paper, we discuss two important matrix inversion lemmas and their application in deriving the information filter from the Kalman filter. Advantages of the information filter over the Kalman filter are also discussed.

1 The Model

Consider the following state-space model:

$x_{t+1} = A_t x_t + B_t w_t$, $\quad y_t = C_t x_t + v_t$    (1)

where $x_t \in \Re^n$ is the state vector, $y_t \in \Re^m$ is the output vector, and $w_t \in \Re^n$ and $v_t \in \Re^m$ are the state and measurement noise vectors, respectively. We assume that the initial state vector and the noise vectors are i.i.d. Gaussian random variables: $x_0 \sim N(\mu_0, \Sigma_0)$, $w_t \sim N(0, Q_t)$, $v_t \sim N(0, R_t)$, where $\Sigma_0$, $Q_t$, and $R_t$ are symmetric, positive definite matrices. For simplicity, $E(v_t w_t^T) = 0$, $E(x_0 w_t^T) = 0$, and $E(x_0 v_t^T) = 0$, where $E(\cdot)$ is the expectation operator. We define $Y_s \equiv \{y_1, \ldots, y_s\}$. Further, we use the following definitions for the conditional expectations of the states and the corresponding error covariances: $\hat{x}_{t|s} = E(x_t \mid Y_s)$ and $P_{t|s} = E\big((x_t - \hat{x}_{t|s})(x_t - \hat{x}_{t|s})^T \mid Y_s\big)$.

2 Matrix-Inversion Lemma

Consider $P \in \Re^{n \times n}$. Assuming the inverses exist, we have the following matrix inversion lemmas:

1. $(I + P C^T R^{-1} C)^{-1} P = (P^{-1} + C^T R^{-1} C)^{-1} = P - P C^T (C P C^T + R)^{-1} C P$    (2)

2. $(I + P C^T R^{-1} C)^{-1} P C^T R^{-1} = (P^{-1} + C^T R^{-1} C)^{-1} C^T R^{-1} = P C^T (C P C^T + R)^{-1}$    (3)

The second equation is a variant of Eq. (2). Proofs of both are given in the Appendix.

3 Classical Kalman Filter

The filtered state estimates are given as (time subscripts on $C_t$ and $R_t$ are dropped for brevity):

$\hat{x}_{t|t} = \hat{x}_{t|t-1} + P_{t|t-1} C^T (C P_{t|t-1} C^T + R)^{-1} (y_t - C \hat{x}_{t|t-1})$    (4)

$P_{t|t} = P_{t|t-1} - P_{t|t-1} C^T (C P_{t|t-1} C^T + R)^{-1} C P_{t|t-1}$    (5)

where the Kalman gain is $K_t = P_{t|t-1} C^T (C P_{t|t-1} C^T + R)^{-1}$. The one-step-ahead state prediction is obtained as

$\hat{x}_{t+1|t} = A_t \hat{x}_{t|t}$, $\quad P_{t+1|t} = A_t P_{t|t} A_t^T + B_t Q_t B_t^T$    (6)

with initial conditions $\hat{x}_{1|0} = A_0 \mu_0$ and $P_{1|0} = A_0 \Sigma_0 A_0^T + B_0 Q_0 B_0^T$, obtained by applying Eq. (6) to the prior on $x_0$.

4 Information Filter

Define $Z_{t|s} = P_{t|s}^{-1}$ and $z_{t|s} = P_{t|s}^{-1} \hat{x}_{t|s}$, called the information matrix and the information state vector. As described in (Mutambara 1998), in the Gaussian case the inverse of the covariance matrix (also called the Fisher information) provides a measure of the information about the state present in the observations. Direct application of the matrix inversion lemmas of Eqs. (2) and (3) to the Kalman filter gives the following correction and prediction equations for the information filter:

$z_{t|t} = z_{t|t-1} + C_t^T R_t^{-1} y_t$, $\quad Z_{t|t} = Z_{t|t-1} + C_t^T R_t^{-1} C_t$    (7)

$z_{t+1|t} = (I - J_t B_t^T) A_t^{-T} z_{t|t}$, $\quad Z_{t+1|t} = (I - J_t B_t^T) S_t$    (8)

where $S_t = A_t^{-T} Z_{t|t} A_t^{-1}$ and $J_t = S_t B_t (B_t^T S_t B_t + Q_t^{-1})^{-1}$; note that this form requires $A_t$ to be invertible. The Kalman gain is given as $K_t = Z_{t|t}^{-1} C_t^T R_t^{-1}$. For detailed proofs see (Anderson and Moore 1979) or (Mutambara 1998). Another form, as given in (Mutambara 1998) and stated for $B_t = I$, is as follows:

$\hat{z}_{t|t} = \hat{z}_{t|t-1} + i_t$, $\quad Z_{t|t} = Z_{t|t-1} + I_t$    (9)

$\hat{z}_{t+1|t} = L_{t+1|t} \hat{z}_{t|t}$, $\quad Z_{t+1|t} = [A_t Z_{t|t}^{-1} A_t^T + Q_t]^{-1}$    (10)

where $L_{t+1|t} = Z_{t+1|t} A_t Z_{t|t}^{-1}$. Here $i_t = C_t^T R_t^{-1} y_t$ and $I_t = C_t^T R_t^{-1} C_t$ are called the "information vector and matrix associated with the observation $y_t$".

Some advantages of the information filter over the Kalman filter are listed here:

1. Computationally simpler correction step.

2. Although the prediction equations are more complex, the prediction depends on a propagation coefficient which is independent of the observations. This makes the filter easy to decouple and decentralize.

3. There are no gain or innovation covariance matrices, and the maximum dimension of a matrix to be inverted is the state dimension, which is usually smaller than the observation dimension in a multi-sensor system; compare $(C P_{t|t-1} C^T + R)$ in the Kalman filter with $(B_t^T S_t B_t + Q_t^{-1})$ in the information filter. Hence, in such systems, it is preferable to employ the information filter and invert the smaller information matrices rather than use the Kalman filter and invert the larger innovation covariance matrices.

A few more points are discussed in (Anderson and Moore 1979), page 140.
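To make the correspondence between Eqs. (4)-(6) and Eqs. (7)-(8) concrete, the following sketch runs one correction-prediction cycle of both filters on the same data and checks that they carry the same information, i.e., $Z_{t|t} = P_{t|t}^{-1}$ and $z_{t|t} = Z_{t|t} \hat{x}_{t|t}$. This is only an illustration, not part of the derivation above: it assumes NumPy, and the model matrices, dimensions, and variable names are our own hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2

# Hypothetical model matrices; A must be invertible for Eq. (8).
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, n))
Q = np.eye(n)           # process noise covariance
R = 0.5 * np.eye(m)     # measurement noise covariance

# Common one-step prediction (prior) and a measurement.
x_pred = rng.standard_normal(n)
P_pred = (lambda X: X @ X.T + n * np.eye(n))(rng.standard_normal((n, n)))
y = rng.standard_normal(m)

# --- Kalman filter: correction (Eqs. 4-5), prediction (Eq. 6) ---
S = C @ P_pred @ C.T + R                  # innovation covariance
K = P_pred @ C.T @ np.linalg.inv(S)       # Kalman gain
x_filt = x_pred + K @ (y - C @ x_pred)
P_filt = P_pred - K @ C @ P_pred
x_next = A @ x_filt
P_next = A @ P_filt @ A.T + B @ Q @ B.T

# --- Information filter: correction (Eq. 7), prediction (Eq. 8) ---
Rinv = np.linalg.inv(R)
Z_pred = np.linalg.inv(P_pred)
z_pred = Z_pred @ x_pred
Z_filt = Z_pred + C.T @ Rinv @ C
z_filt = z_pred + C.T @ Rinv @ y
Ainv_T = np.linalg.inv(A).T
S_t = Ainv_T @ Z_filt @ np.linalg.inv(A)
J = S_t @ B @ np.linalg.inv(B.T @ S_t @ B + np.linalg.inv(Q))
z_next = (np.eye(n) - J @ B.T) @ Ainv_T @ z_filt
Z_next = (np.eye(n) - J @ B.T) @ S_t

# Both routes carry the same information, for correction and prediction.
assert np.allclose(Z_filt, np.linalg.inv(P_filt))
assert np.allclose(z_filt, Z_filt @ x_filt)
assert np.allclose(Z_next, np.linalg.inv(P_next))
assert np.allclose(z_next, Z_next @ x_next)
print("Kalman and information filters agree over one full cycle.")
```

Note how the information-filter correction is a pair of additions (advantage 1 above), while the matrix inversions are confined to the prediction step and are at most of the state dimension.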
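Similarly, the lemmas of Eqs. (2) and (3), proved algebraically in the appendix below, can be sanity-checked numerically on random positive definite matrices. Again a minimal sketch assuming NumPy, with all names our own:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3

# Random symmetric positive definite P (n x n) and R (m x m).
P = (lambda X: X @ X.T + n * np.eye(n))(rng.standard_normal((n, n)))
R = (lambda X: X @ X.T + m * np.eye(m))(rng.standard_normal((m, m)))
C = rng.standard_normal((m, n))

I = np.eye(n)
Rinv = np.linalg.inv(R)

# Eq. (2): (I + P C^T R^-1 C)^-1 P = (P^-1 + C^T R^-1 C)^-1
#                                  = P - P C^T (C P C^T + R)^-1 C P
lhs = np.linalg.inv(I + P @ C.T @ Rinv @ C) @ P
mid = np.linalg.inv(np.linalg.inv(P) + C.T @ Rinv @ C)
rhs = P - P @ C.T @ np.linalg.inv(C @ P @ C.T + R) @ C @ P
assert np.allclose(lhs, mid) and np.allclose(mid, rhs)

# Eq. (3): (I + P C^T R^-1 C)^-1 P C^T R^-1 = P C^T (C P C^T + R)^-1
lhs3 = np.linalg.inv(I + P @ C.T @ Rinv @ C) @ P @ C.T @ Rinv
rhs3 = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
assert np.allclose(lhs3, rhs3)
print("Eqs. (2) and (3) verified numerically.")
```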
A Proof for the matrix inversion lemma

For Eq. (2), the first step is immediate, since $(I + P C^T R^{-1} C)^{-1} P = [P^{-1}(I + P C^T R^{-1} C)]^{-1} = (P^{-1} + C^T R^{-1} C)^{-1}$. For the second step, we verify that $(I + P C^T R^{-1} C)[I - P C^T (C P C^T + R)^{-1} C] = I$:

L.H.S. $= (I + P C^T R^{-1} C)[I - P C^T (C P C^T + R)^{-1} C]$
$= (I + P C^T R^{-1} C) - (I + P C^T R^{-1} C)[P C^T (C P C^T + R)^{-1} C]$
$= (I + P C^T R^{-1} C) - (I + P C^T R^{-1} C)(I + C^{-1} R C^{-T} P^{-1})^{-1}$
$= (I + P C^T R^{-1} C) - P C^T R^{-1} C (C^{-1} R C^{-T} P^{-1} + I)(I + C^{-1} R C^{-T} P^{-1})^{-1}$
$= (I + P C^T R^{-1} C) - P C^T R^{-1} C = I$

which completes the proof of Eq. (2). (The third step uses $P C^T (C P C^T + R)^{-1} C = (I + C^{-1} R C^{-T} P^{-1})^{-1}$, which holds when $C$ is invertible, as assumed in Section 2.)

For Eq. (3), multiply Eq. (2) on the right by $C^T R^{-1}$:

$(I + P C^T R^{-1} C)^{-1} P C^T R^{-1} = (P^{-1} + C^T R^{-1} C)^{-1} C^T R^{-1} = [P - P C^T (C P C^T + R)^{-1} C P] C^T R^{-1}$    (11)

Now note that $C P C^T R^{-1} = (C P C^T + R) R^{-1} - I$; substituting this into the above equation, we get Eq. (3).

References

Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. 1st ed. Prentice-Hall, Inc., Englewood Cliffs, N.J.

Mutambara, A. G. O. (1998). Decentralized Estimation and Control for Multisensor Systems. 1st ed. CRC Press LLC, Boca Raton, Florida.