Cholesky-Based Reduced-Rank Square-Root Kalman Filtering

J. Chandrasekar, I. S. Kim, D. S. Bernstein, and A. Ridley

This research was supported by the National Science Foundation under grants CNS-0539053 and ATM-0325332 to the University of Michigan, Ann Arbor, USA. J. Chandrasekar, I. S. Kim, and D. S. Bernstein are with the Department of Aerospace Engineering, and A. Ridley is with the Department of Atmospheric, Oceanic and Space Sciences, University of Michigan, Ann Arbor, MI. dsbaero@umich.edu

Abstract— We consider a reduced-rank square-root Kalman filter based on the Cholesky decomposition of the state-error covariance. We compare the performance of this filter with that of the reduced-rank square-root filter based on the singular value decomposition.

I. INTRODUCTION

The problem of state estimation for large-scale systems has gained increasing attention due to computationally intensive applications such as weather forecasting [1], where state estimation is commonly referred to as data assimilation. For these problems, there is a need for algorithms that are computationally tractable despite the enormous dimension of the state. These problems also typically entail nonlinear dynamics and model uncertainty [2], although these issues are outside the scope of this paper.

One approach to obtaining more tractable algorithms is to consider reduced-order Kalman filters. These reduced-complexity filters provide state estimates that are suboptimal relative to the classical Kalman filter [3-7]. Alternative reduced-order variants of the classical Kalman filter have been developed for computationally demanding applications [8-11], where the classical Kalman filter gain and covariance are modified so as to reduce the computational requirements. A comparison of several techniques is given in [12].

A widely studied technique for reducing the computational requirements of the Kalman filter for large-scale systems is the reduced-rank filter [13-16]. In this method, the error-covariance matrix is factored to obtain a square root, whose rank is then reduced through truncation. This factorization-and-truncation method has direct application to the problem of generating a reduced ensemble for use in particle filter methods [17, 18]. Reduced-rank filters are closely related to the classical factorization techniques [19, 20], which provide numerical stability and computational efficiency, as well as a starting point for reduced-rank approximation.

The primary technique for truncating the error-covariance matrix is the singular value decomposition (SVD) [13-18], wherein the singular values provide guidance as to which components of the error covariance are most relevant to the accuracy of the state estimates. Approximation based on the SVD is largely motivated by the fact that error-covariance truncation is optimal with respect to approximation in unitarily invariant norms, such as the Frobenius norm. Despite this theoretical grounding, there appear to be no theoretical criteria to support the optimality of approximation based on the SVD within the context of recursive state estimation. The difficulty is due to the fact that optimal approximation depends on the dynamics and measurement maps in addition to the components of the error covariance.

In the present paper we begin by observing that the Kalman filter update depends on the product C_k P_k, where C_k is the measurement map and P_k is the error covariance.
This observation suggests that approximation of C_k P_k may be more suitable than approximation based on P_k alone. To develop this idea, we show that approximation of C_k P_k leads directly to truncation based on the Cholesky decomposition. Unlike the SVD, however, the Cholesky decomposition does not possess a natural measure of magnitude analogous to the singular values arising in the SVD. Nevertheless, filter reduction based on the Cholesky decomposition provides state-estimation accuracy that is competitive with, and in many cases superior to, that of the SVD. In particular, we show that, in special cases, the accuracy of the Cholesky-decomposition-based reduced-rank filter is equal to the accuracy of the full-rank filter, and we demonstrate examples for which the Cholesky-decomposition-based reduced-rank filter provides acceptable accuracy, whereas the SVD-based reduced-rank filter provides arbitrarily poor accuracy.

A fortuitous advantage of using the Cholesky decomposition in place of the SVD is that the Cholesky decomposition is computationally less expensive than the SVD, requiring O(n^3/6) operations [21], and thus offers an asymptotic computational advantage over the SVD by a factor of 12. An additional advantage is that the entire matrix need not be factored; instead, by arranging the states so that those states that contribute directly to the measurement correspond to the initial columns of the lower-triangular square root, only the leading submatrix of the error covariance must be factored, yielding yet further savings over the SVD. Once the factorization is performed, the algorithm effectively retains only the initial "tall" columns of the full Cholesky factorization and truncates the "short" columns.

II. THE KALMAN FILTER

Consider the time-varying discrete-time system

  x_{k+1} = A_k x_k + G_k w_k,   (2.1)
  y_k = C_k x_k + H_k v_k,   (2.2)

where x_k ∈ R^n, w_k ∈ R^{d_w}, y_k ∈ R^p, v_k ∈ R^{d_v}, and A_k, G_k, C_k, and H_k are known real matrices of appropriate sizes. We assume that w_k and v_k are zero-mean white processes with unit covariances. Define Q_k ≜ G_k G_k^T and R_k ≜ H_k H_k^T, and assume that R_k is positive definite for all k ≥ 0. Furthermore, we assume that w_k and v_k are uncorrelated for all k ≥ 0. The objective is to obtain an estimate of the state x_k using the measurements y_k. The Kalman filter provides the optimal minimum-variance estimate of the state x_k.

The Kalman filter can be expressed in two steps, namely, the data assimilation step, where the measurements are used to update the states, and the forecast step, which uses the model. These steps can be summarized as follows.

Data Assimilation Step

  K_k = P_k^f C_k^T (C_k P_k^f C_k^T + R_k)^{-1},   (2.3)
  P_k^{da} = P_k^f − P_k^f C_k^T (C_k P_k^f C_k^T + R_k)^{-1} C_k P_k^f,   (2.4)
  x_k^{da} = x_k^f + K_k (y_k − C_k x_k^f).   (2.5)

Forecast Step

  x_{k+1}^f = A_k x_k^{da},   (2.6)
  P_{k+1}^f = A_k P_k^{da} A_k^T + Q_k.   (2.7)

The matrices P_k^f ∈ R^{n×n} and P_k^{da} ∈ R^{n×n} are the state-error covariances, that is,

  P_k^f = E[e_k^f (e_k^f)^T],   (2.8)
  P_k^{da} = E[e_k^{da} (e_k^{da})^T],   (2.9)

where e_k^f ≜ x_k − x_k^f and e_k^{da} ≜ x_k − x_k^{da}.

In the following sections, we consider reduced-rank square-root filters that propagate approximations of a square root of the error covariance instead of the actual error covariance.
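The cycle (2.3)-(2.7) translates directly into a few lines of linear algebra. The following is a minimal NumPy sketch of one data-assimilation/forecast cycle; the function name and interface are ours, not part of the paper.

import numpy as np

def kalman_cycle(x_f, P_f, y, A, C, Q, R):
    # Data assimilation step (2.3)-(2.5)
    G = np.linalg.inv(C @ P_f @ C.T + R)   # inverse innovation covariance
    K = P_f @ C.T @ G                      # gain (2.3)
    P_da = P_f - P_f @ C.T @ G @ C @ P_f   # updated covariance (2.4)
    x_da = x_f + K @ (y - C @ x_f)         # updated state (2.5)
    # Forecast step (2.6)-(2.7)
    x_f_next = A @ x_da
    P_f_next = A @ P_da @ A.T + Q
    return x_f_next, P_f_next, x_da, P_da

For a state of dimension n, propagating the full n-by-n covariance in (2.4) and (2.7) is precisely the cost that the reduced-rank filters below seek to avoid.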
III. SVD-BASED REDUCED-RANK SQUARE-ROOT FILTER

Note that the Kalman filter propagates the error covariances P_k^{da} and P_k^f using (2.4) and (2.7). For computational efficiency, we construct a suboptimal filter that uses reduced-rank approximations of the error covariances P_k^{da} and P_k^f. Specifically, we consider reduced-rank approximations P̂_k^{da} and P̂_k^f of the error covariances P_k^{da} and P_k^f such that ‖P_k^{da} − P̂_k^{da}‖_F and ‖P_k^f − P̂_k^f‖_F are minimized, where ‖·‖_F is the Frobenius norm. To achieve this approximation, we compute singular value decompositions of the error covariances at each time step.

Let P ∈ R^{n×n} be positive semidefinite, let σ_1 ≥ ··· ≥ σ_n be the singular values of P, and let u_1, ..., u_n ∈ R^n be corresponding orthonormal eigenvectors. Next, define U_q ∈ R^{n×q} and Σ_q ∈ R^{q×q} by

  U_q ≜ \begin{bmatrix} u_1 & \cdots & u_q \end{bmatrix},   Σ_q ≜ diag(σ_1, ..., σ_q).   (3.1)

With this notation, the singular value decomposition of P is given by

  P = U_n Σ_n U_n^T,   (3.2)

where U_n is orthogonal. For q ≤ n, let Φ_SVD(P, q) ∈ R^{n×q} denote the SVD-based rank-q approximation of the square root of P given by

  Φ_SVD(P, q) ≜ U_q Σ_q^{1/2}.   (3.3)

The following standard result shows that SS^T, where S ≜ Φ_SVD(P, q), is the best rank-q approximation of P in the Frobenius norm.

Lemma III.1. Let P ∈ R^{n×n} be positive semidefinite, and let σ_1 ≥ ··· ≥ σ_n be the singular values of P. If S = Φ_SVD(P, q), then

  min_{rank(P̂)=q} ‖P − P̂‖_F^2 = ‖P − SS^T‖_F^2 = σ_{q+1}^2 + ··· + σ_n^2.   (3.4)

The data assimilation and forecast steps of the SVD-based rank-q square-root filter are given by the following steps.

Data Assimilation Step

  K_{s,k} = P̂_{s,k}^f C_k^T (C_k P̂_{s,k}^f C_k^T + R_k)^{-1},   (3.5)
  P̃_{s,k}^{da} = P̂_{s,k}^f − P̂_{s,k}^f C_k^T (C_k P̂_{s,k}^f C_k^T + R_k)^{-1} C_k P̂_{s,k}^f,   (3.6)
  x_{s,k}^{da} = x_{s,k}^f + K_{s,k} (y_k − C_k x_{s,k}^f),   (3.7)

where

  S̃_{s,k}^f ≜ Φ_SVD(P̃_{s,k}^f, q),   (3.8)
  P̂_{s,k}^f ≜ S̃_{s,k}^f (S̃_{s,k}^f)^T.   (3.9)

Forecast Step

  x_{s,k+1}^f = A_k x_{s,k}^{da},   (3.10)
  P̃_{s,k+1}^f = A_k P̂_{s,k}^{da} A_k^T + Q_k,   (3.11)

where

  S̃_{s,k}^{da} ≜ Φ_SVD(P̃_{s,k}^{da}, q),   (3.12)
  P̂_{s,k}^{da} ≜ S̃_{s,k}^{da} (S̃_{s,k}^{da})^T,   (3.13)

and P̃_{s,0}^f is positive semidefinite.

Next, define the forecast and data-assimilation error covariances P_{s,k}^f and P_{s,k}^{da} of the SVD-based rank-q square-root filter by

  P_{s,k}^f ≜ E[(x_k − x_{s,k}^f)(x_k − x_{s,k}^f)^T],   (3.14)
  P_{s,k}^{da} ≜ E[(x_k − x_{s,k}^{da})(x_k − x_{s,k}^{da})^T].   (3.15)

Using (2.1), (3.7), and (3.10), it can be shown that

  P_{s,k}^{da} = (I − K_{s,k} C_k) P_{s,k}^f (I − K_{s,k} C_k)^T + K_{s,k} R_k K_{s,k}^T,   (3.16)
  P_{s,k+1}^f = A_k P_{s,k}^{da} A_k^T + Q_k.   (3.17)

Note that S̃_{s,k}^f (S̃_{s,k}^f)^T ≤ P̃_{s,k}^f and S̃_{s,k}^{da} (S̃_{s,k}^{da})^T ≤ P̃_{s,k}^{da}. Hence, even if P̃_{s,0}^f = P_{s,0}^f, it does not necessarily follow that P̃_{s,k}^f = P_{s,k}^f and P̃_{s,k}^{da} = P_{s,k}^{da} for all k ≥ 0. Therefore, since K_{s,k} does not use the true error covariance P_{s,k}^f, the SVD-based rank-q square-root filter is generally not equivalent to the Kalman filter.
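A minimal sketch of the truncation operator Φ_SVD of (3.3), together with a numerical check of the optimality property (3.4); since P is symmetric positive semidefinite, an eigendecomposition serves as the SVD. The helper name phi_svd and the test matrix are ours.

import numpy as np

def phi_svd(P, q):
    # Rank-q SVD-based square root Phi_SVD(P, q) = U_q Sigma_q^{1/2}, cf. (3.3)
    w, U = np.linalg.eigh(P)           # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:q]      # indices of the q largest
    return U[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Check Lemma III.1: ||P - S S^T||_F^2 equals the sum of the squares
# of the discarded singular values.
rng = np.random.default_rng(0)
P = np.cov(rng.standard_normal((6, 100)))   # a 6-by-6 PSD test matrix
S = phi_svd(P, 3)
err = np.linalg.norm(P - S @ S.T, 'fro')**2
tail = np.sort(np.linalg.eigvalsh(P))[::-1][3:]
assert np.isclose(err, np.sum(tail**2))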
However, under certain conditions, the SVD-based rank-q square-root filter is equivalent to the Kalman filter. Specifically, we have the following result.

Proposition III.1. Assume that P̃_{s,k}^f = P_{s,k}^f and rank(P_{s,k}^f) ≤ q. Then P̃_{s,k}^{da} = P_{s,k}^{da} and P̃_{s,k+1}^f = P_{s,k+1}^f. If, in addition, P_{s,k}^f = P_k^f, then K_{s,k} = K_k and P_{s,k+1}^f = P_{k+1}^f.

Proof. Since rank(P̃_{s,k}^f) ≤ q, it follows from Lemma III.1 that

  P̂_{s,k}^f = S̃_{s,k}^f (S̃_{s,k}^f)^T = P̃_{s,k}^f.   (3.18)

Hence, it follows from (3.5) that

  K_{s,k} = P_{s,k}^f C_k^T (C_k P_{s,k}^f C_k^T + R_k)^{-1},   (3.19)

while substituting (3.18) into (3.6) yields

  P̃_{s,k}^{da} = P̃_{s,k}^f − P̃_{s,k}^f C_k^T (C_k P̃_{s,k}^f C_k^T + R_k)^{-1} C_k P̃_{s,k}^f.   (3.20)

Next, substituting (3.19) into (3.16) and using P̃_{s,k}^f = P_{s,k}^f in the resulting expression yields

  P̃_{s,k}^{da} = P_{s,k}^{da}.   (3.21)

Since rank(P̃_{s,k}^f) ≤ q, it follows from (3.20) that rank(P̃_{s,k}^{da}) ≤ q, and therefore Lemma III.1 implies that

  P̂_{s,k}^{da} = S̃_{s,k}^{da} (S̃_{s,k}^{da})^T = P̃_{s,k}^{da}.   (3.22)

Hence, it follows from (3.11), (3.17), and (3.21) that

  P̃_{s,k+1}^f = P_{s,k+1}^f.   (3.23)

Finally, if, in addition, P_{s,k}^f = P_k^f, then it follows from (2.3) and (3.19) that

  K_{s,k} = K_k.   (3.24)

Therefore, (2.4) and (3.16) imply that

  P_{s,k}^{da} = P_k^{da}.   (3.25)

Hence, it follows from (2.7), (3.17), and (3.25) that P_{s,k+1}^f = P_{k+1}^f. □

Corollary III.1. Assume that x_{s,0}^f = x_0^f, P̃_{s,0}^f = P_{s,0}^f, and rank(P_0^f) ≤ q. Furthermore, assume that, for all k ≥ 0, rank(A_k) + rank(Q_k) ≤ q. Then, for all k ≥ 0, K_{s,k} = K_k and x_{s,k}^f = x_k^f.

Proof. Since x_{s,0}^f = x_0^f, (2.8) and (3.14) imply that P_{s,0}^f = P_0^f. It follows from (3.16) that if rank(P_{s,k}^f) ≤ q, then rank(P_{s,k}^{da}) ≤ q, and hence (3.17) implies that rank(P_{s,k+1}^f) ≤ q. Therefore, using Proposition III.1 and induction, it follows that K_{s,k} = K_k for all k ≥ 0. Hence, (2.5), (2.6), (3.7), and (3.10) imply that x_{s,k}^f = x_k^f for all k ≥ 0. □

IV. CHOLESKY-DECOMPOSITION-BASED REDUCED-RANK SQUARE-ROOT FILTER

The Kalman filter gain K_k depends on a particular subspace of the range of the error covariance. Specifically, K_k depends only on C_k P_k^f and not on the entire error covariance. We thus consider a filter that uses reduced-rank approximations P̂_k^{da} and P̂_k^f of the error covariances P_k^{da} and P_k^f such that ‖C_k (P_k^{da} − P̂_k^{da})‖_F and ‖C_k (P_k^f − P̂_k^f)‖_F are minimized. To achieve this minimization, we compute a Cholesky decomposition of both error covariances at each time step.

Since P ∈ R^{n×n} is positive semidefinite, the Cholesky decomposition yields a lower-triangular Cholesky factor L ∈ R^{n×n} of P that satisfies

  L L^T = P.   (4.1)

Partition L as

  L = \begin{bmatrix} L_1 & \cdots & L_n \end{bmatrix},   (4.2)

where, for i = 1, ..., n, the column L_i ∈ R^n has the form

  L_i = \begin{bmatrix} 0_{1×(i−1)} & L_{i,1} & \cdots & L_{i,n−i+1} \end{bmatrix}^T.   (4.3)

Truncating the last n − q columns of L yields the reduced-rank Cholesky factor

  Φ_CHOL(P, q) ≜ \begin{bmatrix} L_1 & \cdots & L_q \end{bmatrix} ∈ R^{n×q}.   (4.4)

Lemma IV.1. Let P ∈ R^{n×n} be positive definite, define S ≜ Φ_CHOL(P, q) and P̂ ≜ SS^T, and partition P and P̂ as

  P = \begin{bmatrix} P_1 & P_{12} \\ P_{12}^T & P_2 \end{bmatrix},   P̂ = \begin{bmatrix} P̂_1 & P̂_{12} \\ P̂_{12}^T & P̂_2 \end{bmatrix},   (4.5)

where P_1, P̂_1 ∈ R^{q×q} and P_2, P̂_2 ∈ R^{r×r} with r ≜ n − q. Then P̂_1 = P_1 and P̂_{12} = P_{12}.

Proof. Let L be the Cholesky factor of P. It follows from (4.3) that L_i L_i^T ∈ R^{n×n} has the structure

  L_i L_i^T = \begin{bmatrix} 0_{(i−1)×(i−1)} & 0_{(i−1)×(n−i+1)} \\ 0_{(n−i+1)×(i−1)} & X_i \end{bmatrix},   (4.6)

where X_i ∈ R^{(n−i+1)×(n−i+1)}. Therefore,

  \sum_{i=q+1}^{n} L_i L_i^T = \begin{bmatrix} 0_{q×q} & 0_{q×r} \\ 0_{r×q} & Y_r \end{bmatrix},   (4.7)

where Y_r ∈ R^{r×r}. Since

  P = \sum_{i=1}^{n} L_i L_i^T,   (4.8)

it follows from (4.4) that

  P = P̂ + \sum_{i=q+1}^{n} L_i L_i^T.   (4.9)

Substituting (4.7) into (4.9) yields P̂_1 = P_1 and P̂_{12} = P_{12}. □

Lemma IV.1 implies that, if S = Φ_CHOL(P, q), then the first q rows and columns of SS^T and P are equal.

The data assimilation and forecast steps of the Cholesky-based rank-q square-root filter are given by the following steps.

Data Assimilation Step

  K_{c,k} = P̂_{c,k}^f C_k^T (C_k P̂_{c,k}^f C_k^T + R_k)^{-1},   (4.10)
  P̃_{c,k}^{da} = P̂_{c,k}^f − P̂_{c,k}^f C_k^T (C_k P̂_{c,k}^f C_k^T + R_k)^{-1} C_k P̂_{c,k}^f,   (4.11)
  x_{c,k}^{da} = x_{c,k}^f + K_{c,k} (y_k − C_k x_{c,k}^f),   (4.12)

where

  S̃_{c,k}^f ≜ Φ_CHOL(P̃_{c,k}^f, q),   (4.13)
  P̂_{c,k}^f ≜ S̃_{c,k}^f (S̃_{c,k}^f)^T.   (4.14)

Forecast Step

  x_{c,k+1}^f = A_k x_{c,k}^{da},   (4.15)
  P̃_{c,k+1}^f = A_k P̂_{c,k}^{da} A_k^T + Q_k,   (4.16)

where

  S̃_{c,k}^{da} ≜ Φ_CHOL(P̃_{c,k}^{da}, q),   (4.17)
  P̂_{c,k}^{da} ≜ S̃_{c,k}^{da} (S̃_{c,k}^{da})^T,   (4.18)

and P̃_{c,0}^f is positive semidefinite.
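The truncation Φ_CHOL of (4.4), used in (4.13) and (4.17), amounts to keeping the first q columns of the Cholesky factor, and Lemma IV.1 can be verified numerically. A minimal sketch, assuming P is positive definite so that np.linalg.cholesky applies; the helper name and test data are ours.

import numpy as np

def phi_chol(P, q):
    # Rank-q Cholesky square root Phi_CHOL(P, q): the first q columns
    # of the lower-triangular Cholesky factor of P, cf. (4.4)
    return np.linalg.cholesky(P)[:, :q]

# Check Lemma IV.1: the leading q rows (and, by symmetry, columns)
# of S S^T coincide with those of P.
rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
P = M @ M.T + 1e-6 * np.eye(8)     # positive definite test matrix
q = 3
S = phi_chol(P, q)
assert np.allclose((S @ S.T)[:q, :], P[:q, :])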
Next, define the forecast and data-assimilation error covariances P_{c,k}^f and P_{c,k}^{da}, respectively, of the Cholesky-based rank-q square-root filter by

  P_{c,k}^f ≜ E[(x_k − x_{c,k}^f)(x_k − x_{c,k}^f)^T],   (4.19)
  P_{c,k}^{da} ≜ E[(x_k − x_{c,k}^{da})(x_k − x_{c,k}^{da})^T],   (4.20)

that is, P_{c,k}^f and P_{c,k}^{da} are the error covariances when the Cholesky-based rank-q square-root filter is used. Using (2.1), (4.12), and (4.15), it can be shown that

  P_{c,k}^{da} = (I − K_{c,k} C_k) P_{c,k}^f (I − K_{c,k} C_k)^T + K_{c,k} R_k K_{c,k}^T,   (4.21)
  P_{c,k+1}^f = A_k P_{c,k}^{da} A_k^T + Q_k.   (4.22)

Note that S̃_{c,k}^f (S̃_{c,k}^f)^T ≤ P̃_{c,k}^f and S̃_{c,k}^{da} (S̃_{c,k}^{da})^T ≤ P̃_{c,k}^{da}. Hence, even if P̃_{c,0}^f = P_{c,0}^f, the Cholesky-based rank-q square-root filter is generally not equivalent to the Kalman filter.
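Assembling (4.10)-(4.18) gives the following sketch of one cycle of the Cholesky-based rank-q filter. The interface and the small ridge eps added before each factorization (the truncated covariances are only positive semidefinite, while np.linalg.cholesky requires positive definiteness) are our assumptions, not part of the algorithm as stated.

import numpy as np

def cholesky_rr_cycle(x_f, P_f_tilde, y, A, C, Q, R, q, eps=1e-12):
    # One cycle of the Cholesky-based rank-q square-root filter,
    # cf. (4.10)-(4.18).  Assumes the states are ordered so that the
    # measured components come first, so that Lemma IV.1 preserves
    # the rows that C picks out.
    n = len(x_f)
    # (4.13)-(4.14): truncate the Cholesky factor to its first q columns
    S_f = np.linalg.cholesky(P_f_tilde + eps * np.eye(n))[:, :q]
    P_f = S_f @ S_f.T
    G = np.linalg.inv(C @ P_f @ C.T + R)
    K = P_f @ C.T @ G                            # gain (4.10)
    P_da_tilde = P_f - P_f @ C.T @ G @ C @ P_f   # update (4.11)
    x_da = x_f + K @ (y - C @ x_f)               # update (4.12)
    # (4.17)-(4.18): truncate again after the data-assimilation update
    S_da = np.linalg.cholesky(P_da_tilde + eps * np.eye(n))[:, :q]
    # Forecast step (4.15)-(4.16)
    x_f_next = A @ x_da
    P_f_tilde_next = A @ (S_da @ S_da.T) @ A.T + Q
    return x_f_next, P_f_tilde_next, x_da, K

Only the n-by-q factors need be stored between steps; the full n-by-n products above are formed here purely for readability.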
However, in certain cases, the Cholesky-based rank-q square-root filter is equivalent to the Kalman filter. Specifically, we have the following result.

Proposition IV.1. Let A_k and C_k have the structure

  A_k = \begin{bmatrix} A_{1,k} & 0 \\ A_{21,k} & A_{2,k} \end{bmatrix},   C_k = \begin{bmatrix} I_p & 0 \end{bmatrix},   (4.23)

where A_{1,k} ∈ R^{p×p} and A_{2,k} ∈ R^{r×r} with r ≜ n − p. Partition P_k^f and P̃_{c,k}^f as

  P_k^f = \begin{bmatrix} P_{1,k}^f & (P_{21,k}^f)^T \\ P_{21,k}^f & P_{2,k}^f \end{bmatrix},   P̃_{c,k}^f = \begin{bmatrix} P̃_{c,1,k}^f & (P̃_{c,21,k}^f)^T \\ P̃_{c,21,k}^f & P̃_{c,2,k}^f \end{bmatrix},   (4.24)

where P_{1,k}^f, P̃_{c,1,k}^f ∈ R^{p×p} and P_{2,k}^f, P̃_{c,2,k}^f ∈ R^{r×r}, and assume that q = p, P̃_{c,1,k}^f = P_{1,k}^f, and P̃_{c,21,k}^f = P_{21,k}^f. Then K_{c,k} = K_k, P̃_{c,1,k+1}^f = P_{1,k+1}^f, and P̃_{c,21,k+1}^f = P_{21,k+1}^f.

Proof. Let P_k^{da} have entries

  P_k^{da} = \begin{bmatrix} P_{1,k}^{da} & (P_{21,k}^{da})^T \\ P_{21,k}^{da} & P_{2,k}^{da} \end{bmatrix},   (4.25)

where P_{1,k}^{da} ∈ R^{p×p} is positive semidefinite and P_{2,k}^{da} ∈ R^{r×r}. It follows from (2.4) that

  P_{1,k}^{da} = P_{1,k}^f − P_{1,k}^f (P_{1,k}^f + R_k)^{-1} P_{1,k}^f,   (4.26)
  P_{21,k}^{da} = P_{21,k}^f − P_{21,k}^f (P_{1,k}^f + R_k)^{-1} P_{1,k}^f.   (4.27)

Substituting (4.23) into (2.3) yields

  K_k = \begin{bmatrix} P_{1,k}^f \\ P_{21,k}^f \end{bmatrix} (P_{1,k}^f + R_k)^{-1}.   (4.28)

Furthermore, (2.7) and (4.23) imply that

  P_{1,k+1}^f = A_{1,k} P_{1,k}^{da} A_{1,k}^T + Q_{1,k},   (4.29)
  P_{21,k+1}^f = A_{2,k} P_{21,k}^{da} A_{1,k}^T + A_{21,k} P_{1,k}^{da} A_{1,k}^T + Q_{21,k},   (4.30)

where Q_k has entries

  Q_k = \begin{bmatrix} Q_{1,k} & (Q_{21,k})^T \\ Q_{21,k} & Q_{2,k} \end{bmatrix}.   (4.31)

Define P̂_{c,k}^f and P̂_{c,k}^{da} by (4.14) and (4.18), and let P̂_{c,k}^f and P̂_{c,k}^{da} have entries

  P̂_{c,k}^f = \begin{bmatrix} P̂_{c,1,k}^f & (P̂_{c,21,k}^f)^T \\ P̂_{c,21,k}^f & P̂_{c,2,k}^f \end{bmatrix},   P̂_{c,k}^{da} = \begin{bmatrix} P̂_{c,1,k}^{da} & (P̂_{c,21,k}^{da})^T \\ P̂_{c,21,k}^{da} & P̂_{c,2,k}^{da} \end{bmatrix},   (4.32)

where P̂_{c,1,k}^f, P̂_{c,1,k}^{da} ∈ R^{p×p} are positive semidefinite and P̂_{c,2,k}^f, P̂_{c,2,k}^{da} ∈ R^{r×r}. Substituting (4.32) into (4.10) yields

  K_{c,k} = \begin{bmatrix} P̂_{c,1,k}^f \\ P̂_{c,21,k}^f \end{bmatrix} (P̂_{c,1,k}^f + R_k)^{-1}.   (4.33)

Since S̃_{c,k}^f = Φ_CHOL(P̃_{c,k}^f, q) and, by assumption, q = p, P̃_{c,1,k}^f = P_{1,k}^f, and P̃_{c,21,k}^f = P_{21,k}^f, it follows from Lemma IV.1 that

  P̂_{c,1,k}^f = P_{1,k}^f,   P̂_{c,21,k}^f = P_{21,k}^f.   (4.34)

Hence, K_{c,k} = K_k. Next, substituting (4.14) into (4.11) yields

  P̃_{c,k}^{da} = P̂_{c,k}^f − P̂_{c,k}^f C_k^T (C_k P̂_{c,k}^f C_k^T + R_k)^{-1} C_k P̂_{c,k}^f.   (4.35)

Let P̃_{c,k}^{da} have entries

  P̃_{c,k}^{da} = \begin{bmatrix} P̃_{c,1,k}^{da} & (P̃_{c,21,k}^{da})^T \\ P̃_{c,21,k}^{da} & P̃_{c,2,k}^{da} \end{bmatrix},   (4.36)

where P̃_{c,1,k}^{da} ∈ R^{p×p} is positive semidefinite and P̃_{c,2,k}^{da} ∈ R^{r×r}. Therefore, it follows from (4.23) and (4.32) that

  P̃_{c,1,k}^{da} = P̂_{c,1,k}^f − P̂_{c,1,k}^f (P̂_{c,1,k}^f + R_k)^{-1} P̂_{c,1,k}^f,   (4.37)
  P̃_{c,21,k}^{da} = P̂_{c,21,k}^f − P̂_{c,21,k}^f (P̂_{c,1,k}^f + R_k)^{-1} P̂_{c,1,k}^f.   (4.38)

Substituting (4.34) into (4.37) and (4.38) and using (4.26) and (4.27) yields

  P̃_{c,1,k}^{da} = P_{1,k}^{da},   P̃_{c,21,k}^{da} = P_{21,k}^{da}.   (4.39)

Moreover, since S̃_{c,k}^{da} = Φ_CHOL(P̃_{c,k}^{da}, q), it follows from Lemma IV.1 that

  P̂_{c,1,k}^{da} = P̃_{c,1,k}^{da},   P̂_{c,21,k}^{da} = P̃_{c,21,k}^{da}.   (4.40)

It follows from (4.16) and (4.23) that

  P̃_{c,1,k+1}^f = A_{1,k} P̂_{c,1,k}^{da} A_{1,k}^T + Q_{1,k},   (4.41)
  P̃_{c,21,k+1}^f = A_{2,k} P̂_{c,21,k}^{da} A_{1,k}^T + A_{21,k} P̂_{c,1,k}^{da} A_{1,k}^T + Q_{21,k}.   (4.42)

Therefore, (4.29), (4.30), (4.39), and (4.40) imply that P̃_{c,1,k+1}^f = P_{1,k+1}^f and P̃_{c,21,k+1}^f = P_{21,k+1}^f. □

Corollary IV.1. Assume that A_k and C_k have the structure in (4.23). Let q = p, P̃_{c,1,0}^f = P_{1,0}^f, P̃_{c,21,0}^f = P_{21,0}^f, and x_{c,0}^f = x_0^f. Then, for all k ≥ 0, K_{c,k} = K_k, and hence x_{c,k}^f = x_k^f.

Proof. Using induction and Proposition IV.1 yields K_{c,k} = K_k for all k ≥ 0. Hence, it follows from (2.5), (2.6), (4.12), and (4.15) that x_{c,k}^f = x_k^f for all k ≥ 0. □

V. LINEAR TIME-INVARIANT SYSTEMS

In this section, we consider a linear time-invariant system, and hence assume that, for all k ≥ 0, A_k = A, C_k = C, G_k = G, H_k = H, Q_k = Q, and R_k = R. Furthermore, we assume that p < n and (A, C) is observable, so that the observability matrix O ∈ R^{pn×n} defined by

  O ≜ \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n−1} \end{bmatrix}   (5.1)

has full column rank. Next, without loss of generality, we consider a basis such that

  O = \begin{bmatrix} I_n \\ 0_{(p−1)n×n} \end{bmatrix}.   (5.2)

Therefore, (5.1) and (5.2) imply that, for i ∈ {1, ..., n} such that ip ≤ n,

  C A^{i−1} = \begin{bmatrix} 0_{p×p(i−1)} & I_p & 0_{p×(n−pi)} \end{bmatrix}.   (5.3)

The following result shows that the Cholesky-decomposition-based rank-q square-root filter is equivalent to the optimal Kalman filter for a specific number of time steps.

Proposition V.1. Let r > 0 be an integer such that 0 < q = pr < n, and let k_0 ≥ 0 be such that P̃_{c,k_0}^f = P_{k_0}^f. Then, for all k = k_0, ..., k_0 + r − 1, K_{c,k} = K_k.

Proof. Substituting (2.4) into (2.7) yields

  P_{k+1}^f = A P_k^f A^T − A P_k^f C^T (C P_k^f C^T + R)^{-1} C P_k^f A^T + Q.   (5.4)

Hence, for i = 1, ..., r − 1,

  P_{k+1}^f (A^{i−1})^T C^T = ψ(P_k^f (A^i)^T C^T, P_k^f C^T, A, Q, C, R, i),   (5.5)

where

  ψ(X, Y, A, Q, C, R, i) ≜ AX − AY (CY + R)^{-1} CX + Q (A^{i−1})^T C^T.   (5.6)

Therefore, there exists a map ϕ, obtained by iterating ψ, such that

  P_{k+i}^f C^T = ϕ(P_k^f (A^i)^T C^T, P_k^f (A^{i−1})^T C^T, ..., P_k^f C^T, A, Q, C, i).   (5.7)

Define P̂_{c,k}^f and P̂_{c,k}^{da} by

  P̂_{c,k}^f ≜ S̃_{c,k}^f (S̃_{c,k}^f)^T,   P̂_{c,k}^{da} ≜ S̃_{c,k}^{da} (S̃_{c,k}^{da})^T.   (5.8)

It follows from Lemma IV.1 and (5.3) that, for all k ≥ 0 and i = 1, ..., r,

  C A^{i−1} P̂_{c,k}^f = C A^{i−1} P̃_{c,k}^f,   C A^{i−1} P̂_{c,k}^{da} = C A^{i−1} P̃_{c,k}^{da}.   (5.9)

Note that

  P̃_{c,k+1}^f = A P̂_{c,k}^{da} A^T + Q.   (5.10)

Multiplying (5.10) by (A^{i−1})^T C^T yields

  P̃_{c,k+1}^f (A^{i−1})^T C^T = A P̂_{c,k}^{da} (A^i)^T C^T + Q (A^{i−1})^T C^T.   (5.11)

Substituting (4.40) into (5.11) yields

  P̂_{c,k+1}^f (A^{i−1})^T C^T = A P̃_{c,k}^{da} (A^i)^T C^T + Q (A^{i−1})^T C^T   (5.12)

for i = 1, ..., r − 1. Using (4.11) in (5.12) yields

  P̂_{c,k+1}^f (A^{i−1})^T C^T = A [P̂_{c,k}^f − P̂_{c,k}^f C^T (C P̂_{c,k}^f C^T + R)^{-1} C P̂_{c,k}^f] (A^i)^T C^T + Q (A^{i−1})^T C^T.   (5.13)

Therefore, (5.6) implies that, for all i = 1, ..., r − 1,

  P̂_{c,k+1}^f (A^{i−1})^T C^T = ψ(P̂_{c,k}^f (A^i)^T C^T, P̂_{c,k}^f C^T, A, Q, C, R, i).   (5.14)

Hence, it follows from (5.5)-(5.7) that, for i = 1, ..., r − 1,

  P̂_{c,k+i}^f C^T = ϕ(P̂_{c,k}^f (A^i)^T C^T, P̂_{c,k}^f (A^{i−1})^T C^T, ..., P̂_{c,k}^f C^T, A, Q, C, i).   (5.15)

Since P̃_{c,k_0}^f = P_{k_0}^f, it follows from Lemma IV.1 that, for i = 1, ..., r,

  C A^{i−1} P̂_{c,k_0}^f = C A^{i−1} P_{k_0}^f.

Hence, it follows from (5.7) and (5.15) that, for i = 1, ..., r − 1,

  P̂_{c,k_0+i}^f C^T = P_{k_0+i}^f C^T.   (5.16)

Finally, (2.3) and (4.10) imply that, for i = 0, ..., r − 1,

  K_{c,k_0+i} = K_{k_0+i}.   (5.17)

□

However, in general, P_{c,k}^f ≠ P_k^f for k = k_0, ..., k_0 + r − 1, and hence the equality K_{c,k} = K_k need not persist beyond these r steps.
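The change of basis assumed in (5.2) can be constructed explicitly in the single-output case p = 1, where the observability matrix is square and the transformation T = O renders it the identity. A hedged sketch (function name ours):

import numpy as np

def observability_basis(A, C):
    # For a single-output observable pair (A, C), change coordinates so
    # that the observability matrix becomes the identity, i.e., the
    # p = 1 case of (5.2).  Returns (A_new, C_new, T) with x_new = T x.
    n = A.shape[0]
    O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
    assert np.linalg.matrix_rank(O) == n, "(A, C) must be observable"
    T_inv = np.linalg.inv(O)
    return O @ A @ T_inv, C @ T_inv, O

# In the new basis, C A^{i-1} picks out the i-th coordinate, cf. (5.3).
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
C = rng.standard_normal((1, 4))
A2, C2, T = observability_basis(A, C)
O2 = np.vstack([C2 @ np.linalg.matrix_power(A2, i) for i in range(4)])
assert np.allclose(O2, np.eye(4))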
VI. EXAMPLES

We compare the performance of the SVD-based and Cholesky-based reduced-rank square-root Kalman filters using a compartmental model [22] and a 10-DOF mass-spring-damper system.

A schematic diagram of the compartmental model is shown in Fig. 1. The number n of cells is 20, with one state per cell. All η_ii and all η_ij (i ≠ j) are set to 0.1. We assume that the state of the 9th cell is measured and that disturbances enter all cells, so that the disturbance covariance Q has full rank.

Fig. 1. Compartmental model involving interconnected subsystems.

We simulate three cases of disturbance covariance and compare the costs J ≜ E(e_k^T e_k) in Figure 3, where (a) shows the cost when Q = I, (b) shows the cost when Q is diagonal with entries

  Q_{i,i} = 4.0 if i = 9, and Q_{i,i} = 1.0 otherwise,   (6.1)

and (c) shows the cost when Q is diagonal with entries

  Q_{i,i} = 0.25 if i = 9, and Q_{i,i} = 1.0 otherwise.   (6.2)

We reduce the rank of the square-root covariance to 2 in all three cases. In (a) and (b) of Figure 3, the Cholesky-based reduced-rank square-root Kalman filter exhibits almost the same performance as the optimal filter, whereas the SVD-based reduced-rank square-root Kalman filter shows degraded performance in (a). Meanwhile, in (c), the SVD-based filter has a large transient and a large steady-state offset from the optimal cost, whereas the Cholesky-based reduced-rank square-root Kalman filter remains close to the full-order filter.

Fig. 3. Time evolution of the cost J ≜ E(e_k^T e_k) for the compartmental system, panels (a)-(c). Here R = 10^{-4}, P_0 = 10^2 I, the rank of the reduced-rank filters is fixed at 2, and an energy measurement is taken at compartment 9. Disturbances enter compartments 1, 2, ..., 20. In (a), the disturbance covariance is Q = I_{20×20}; (b) is the case in which Q_{(9,9)} is changed to 4.0; and (c) is the case in which Q_{(9,9)} is changed to 0.25.

The mass-spring-damper model is shown in Figure 2. The total number of masses is 10, with two states (displacement and velocity) per mass. For i = 1, ..., 10, m_i = 1 kg, and, for j = 1, ..., 11, k_j = 1 N/m and c_j = 0.01 Ns/m. We assume that the displacement of the 5th mass is measured and that disturbances enter all states, so that the disturbance covariance Q has full rank.

Fig. 2. Mass-spring-damper system.

We simulate three cases of disturbance covariance and compare the costs J in (a), (b), and (c) of Figure 4: (a) shows the cost when Q = I, (b) shows the cost when Q is given by (6.1), and (c) shows the cost when Q is given by (6.2). We reduce the rank of the square-root covariance to 2 in all three cases. In (a) and (b), the costs for the Cholesky-based and SVD-based reduced-rank square-root Kalman filters are close to each other but larger than the optimal cost. Meanwhile, in (c), the SVD-based filter shows unstable behavior, while the Cholesky-based filter remains stable and nearly optimal.

Fig. 4. Time evolution of the cost J ≜ E(e_k^T e_k) for the mass-spring-damper system, panels (a)-(c). Here R = 10^{-4}, P_0 = 100^2 I, the rank of the reduced-rank filters is fixed at 2, and a displacement measurement of m_5 is available. Velocity and force disturbances enter masses 1, 2, ..., 10. In (a), the disturbance covariance is Q = I_{20×20}; (b) is the case in which Q_{(9,9)} is changed to 4.0; and (c) is the case in which Q_{(9,9)} is changed to 0.25.
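To indicate how the second test model might be assembled, the following sketch builds a discrete-time realization of the 10-mass chain of Figure 2 with the stated parameter values. The chain topology (springs and dampers to both walls and between neighbors, matching the eleven k_j and c_j), the zero-order-hold discretization, and the sample time dt are our assumptions; the paper does not specify how the model was discretized.

import numpy as np
from scipy.linalg import expm

def mass_spring_damper(n_mass=10, m=1.0, ks=1.0, cs=0.01, dt=0.1):
    # Tridiagonal stiffness/damping matrices for a chain anchored to
    # walls at both ends (11 springs/dampers for 10 masses)
    n = n_mass
    Kmat = ks * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
    Cmat = (cs / ks) * Kmat          # dampers share the spring topology
    # Continuous-time dynamics of [displacements; velocities],
    # discretized by a zero-order hold over the sample time dt
    Ac = np.block([[np.zeros((n, n)), np.eye(n)],
                   [-Kmat / m, -Cmat / m]])
    A = expm(Ac * dt)
    C = np.zeros((1, 2 * n))
    C[0, 4] = 1.0                    # displacement of the 5th mass
    return A, C

With (A, C) in hand, the filters of Sections II-IV can be run side by side and the empirical cost J estimated by averaging e_k^T e_k over Monte Carlo runs.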
VII. CONCLUSIONS

We developed a Cholesky-decomposition method for obtaining reduced-rank square-root Kalman filters. We showed that the SVD-based reduced-rank square-root Kalman filter is equivalent to the optimal filter when the reduced rank is greater than or equal to the rank of the error covariance, while the Cholesky-based filter is equivalent to the optimal filter when the dynamics are lower block triangular, the observation matrix C has the form [I_p 0], and p equals the reduced rank q. Furthermore, for linear time-invariant systems, the Cholesky-based rank-q square-root filter is equivalent to the optimal Kalman filter for a specific number of time steps. In general, the Cholesky-based filter does not always perform better than the SVD-based filter, and vice versa. Finally, using simulation examples, we showed that the Cholesky-based filter exhibits more stable performance than the SVD-based filter, which can become unstable when strong disturbances enter system states that are not measured.

REFERENCES

[1] M. Lewis, S. Lakshmivarahan, and S. Dhall, Dynamic Data Assimilation: A Least Squares Approach. Cambridge, 2006.
[2] G. Evensen, Data Assimilation: The Ensemble Kalman Filter. Springer, 2006.
[3] D. S. Bernstein, L. D. Davis, and D. C. Hyland, "The optimal projection equations for reduced-order, discrete-time modelling, estimation and control," AIAA J. Guid. Contr. Dyn., vol. 9, pp. 288-293, 1986.
[4] D. S. Bernstein and D. C. Hyland, "The optimal projection equations for reduced-order state estimation," IEEE Trans. Autom. Contr., vol. AC-30, pp. 583-585, 1985.
[5] W. M. Haddad and D. S. Bernstein, "Optimal reduced-order observer-estimators," AIAA J. Guid. Contr. Dyn., vol. 13, pp. 1126-1135, 1990.
[6] P. Hippe and C. Wurmthaler, "Optimal reduced-order estimators in the frequency domain: The discrete-time case," Int. J. Contr., vol. 52, pp. 1051-1064, 1990.
[7] C.-S. Hsieh, "The unified structure of unbiased minimum-variance reduced-order filters," in Proc. Contr. Dec. Conf., Maui, HI, December 2003, pp. 4871-4876.
[8] J. Ballabrera-Poy, A. J. Busalacchi, and R. Murtugudde, "Application of a reduced-order Kalman filter to initialize a coupled atmosphere-ocean model: Impact on the prediction of El Niño," J. Climate, vol. 14, pp. 1720-1737, 2001.
[9] B. F. Farrell and P. J. Ioannou, "State estimation using a reduced-order Kalman filter," J. Atmos. Sci., vol. 58, pp. 3666-3680, 2001.
[10] P. Fieguth, D. Menemenlis, and I. Fukumori, "Mapping and pseudo-inverse algorithms for data assimilation," in Proc. Int. Geoscience Remote Sensing Symp., 2002, pp. 3221-3223.
[11] L. Scherliess, R. W. Schunk, J. J. Sojka, and D. C. Thompson, "Development of a physics-based reduced state Kalman filter for the ionosphere," Radio Science, vol. 39, RS1S04, 2004.
[12] I. S. Kim, J. Chandrasekar, H. J. Palanthandalam-Madapusi, A. Ridley, and D. S. Bernstein, "State estimation for large-scale systems based on reduced-order error-covariance propagation," in Proc. Amer. Contr. Conf., New York, June 2007.
[13] S. Gillijns, D. S. Bernstein, and B. De Moor, "The reduced rank transform square root filter for data assimilation," in Proc. 14th IFAC Symp. System Identification (SYSID 2006), Newcastle, Australia, 2006.
[14] A. W. Heemink, M. Verlaan, and A. J. Segers, "Variance reduced ensemble Kalman filtering," Monthly Weather Review, vol. 129, pp. 1718-1728, 2001.
[15] D. Treebushny and H. Madsen, "A new reduced rank square root Kalman filter for data assimilation in mathematical models," Lecture Notes in Computer Science, vol. 2657, pp. 482-491, 2003.
[16] M. Verlaan and A. W. Heemink, "Tidal flow forecasting using reduced rank square root filters," Stochastic Hydrology and Hydraulics, vol. 11, pp. 349-368, 1997.
[17] J. L. Anderson, "An ensemble adjustment Kalman filter for data assimilation," Monthly Weather Review, vol. 129, pp. 2884-2903, 2001.
[18] M. K. Tippett, J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, "Ensemble square root filters," Monthly Weather Review, vol. 131, pp. 1485-1490, 2003.
[19] G. J. Bierman, Factorization Methods for Discrete Sequential Estimation. Dover (reprint), 2006.
[20] M. Morf and T. Kailath, "Square-root algorithms for least-squares estimation," IEEE Trans. Autom. Contr., vol. AC-20, pp. 487-497, 1975.
[21] G. W. Stewart, Matrix Algorithms Volume I: Basic Decompositions. SIAM, 1998.
[22] D. S. Bernstein and D. C. Hyland, "Compartmental modeling and second-moment analysis of state space systems," SIAM J. Matrix Anal. Appl., vol. 14, no. 3, pp. 880-901, 1993.