Gauss-Markov Theorem

Theorem: Let x be an n-vector and y an m-vector of random variables with second moment matrices

    E[x x^T] = C_x,    E[x y^T] = C_{xy},    E[y y^T] = C_y                                  (1)

where C_y is positive definite. Then the linear minimum mean square estimate¹, x̂, of x given the data, y, is

    \hat{x} = C_{xy} C_y^{-1} y                                                              (2)

Note that the inverse of an m×m matrix (C_y^{-1}) is required. The associated error matrix C_e of the estimation error is

    C_e = E[e e^T] = E[(\hat{x} - x)(\hat{x} - x)^T]
        = E[(C_{xy} C_y^{-1} y - x)(C_{xy} C_y^{-1} y - x)^T]                                (4)
        = C_x - C_{xy} C_y^{-1} C_{xy}^T

In addition, if the condition

    E(\hat{x}) = C_{xy} C_y^{-1} E(y) = \bar{x}                                              (5)

holds, then x̂ is the linear minimum variance unbiased estimate of x.

The proof of the Gauss-Markov Theorem is straightforward. We wish to find the linear estimator

    \hat{x} = A y                                                                            (6)

that minimizes C_e:

    C_e = E[(\hat{x} - x)(\hat{x} - x)^T] = E[(A y - x)(A y - x)^T]
        = A E[y y^T] A^T + E[x x^T] - A E[y x^T] - E[x y^T] A^T
        = A C_y A^T + C_x - A C_{xy}^T - C_{xy} A^T                                          (7)

The minimum of C_e is found by taking the variation with respect to A:

    \delta C_e = \delta A \, C_y A^T + A C_y \, \delta A^T - \delta A \, C_{xy}^T - C_{xy} \, \delta A^T
               = \delta A (C_y A^T - C_{xy}^T) + (A C_y - C_{xy}) \delta A^T = 0             (8)

For arbitrary \delta A this requires that A C_y - C_{xy} = 0, or

    A = C_{xy} C_y^{-1},    i.e.    \hat{x} = C_{xy} C_y^{-1} y                              (9)

For x̂ to be unbiased,

    E(\hat{x}) = C_{xy} C_y^{-1} E(y) = \bar{x}                                              (10)

Some other properties of minimum mean square error linear estimators

Geometrically, we expect x̂ to be the true value of x projected into the measurement (y) space.

[Figure: the true value x, its estimate x̂ lying in the measurement space spanned by y1, y2, y3, and the error vector e joining them.]

The error vector e = x̂ - x (an n×1 vector) is perpendicular to the measurement space defined by y, i.e.

    e \perp y                                                                                (11)

To show this, multiply e by y^T and take the expected value:

    e y^T = \hat{x} y^T - x y^T
    E[e y^T] = E[\hat{x} y^T] - E[x y^T]                                                     (12)

But

    E[\hat{x} y^T] = E[C_{xy} C_y^{-1} y y^T] = C_{xy} C_y^{-1} C_y = C_{xy}                  (13)

Hence,

    E[e y^T] = C_{xy} - C_{xy} = 0                                                           (14)

Also, this means that

    C_{e \hat{x}} = 0                                                                        (15)

i.e. x̂ is orthogonal to e, as seen in the figure. To demonstrate this we can show that E[e \hat{x}^T] = 0 as follows:

    E[e \hat{x}^T] = E[(\hat{x} - x) \hat{x}^T]
                   = E[\hat{x} \hat{x}^T] - E[x \hat{x}^T]
                   = C_{xy} C_y^{-1} E[y y^T] C_y^{-1} C_{xy}^T - E[x y^T] C_y^{-1} C_{xy}^T
                   = C_{xy} C_y^{-1} C_{xy}^T - C_{xy} C_y^{-1} C_{xy}^T = 0

Special Cases of the Gauss-Markov Theorem

Assume that we have a linear observation/state relationship,

    y = H x + \varepsilon                                                                    (16)

where H is an m×n matrix and ε is the observation noise. From equation (2),

    \hat{x} = C_{xy} C_y^{-1} y                                                              (17)

We can compute C_{xy} and C_y, i.e.

    C_{xy} = E[x y^T] = E[x (H x + \varepsilon)^T] = E[x x^T] H^T + E[x \varepsilon^T]
           = C_x H^T + C_{x\varepsilon}                                                      (18)

and

    C_y = E[y y^T] = E[(H x + \varepsilon)(H x + \varepsilon)^T]
        = H E[x x^T] H^T + H E[x \varepsilon^T] + E[\varepsilon x^T] H^T + E[\varepsilon \varepsilon^T]
        = H C_x H^T + H C_{x\varepsilon} + C_{x\varepsilon}^T H^T + C_\varepsilon             (19)

Substituting into equation (17), the estimate x̂ becomes

    \hat{x} = (C_x H^T + C_{x\varepsilon})(H C_x H^T + H C_{x\varepsilon} + C_{x\varepsilon}^T H^T + C_\varepsilon)^{-1} y     (20)

Using the expression for C_e from the Gauss-Markov Theorem, C_e = C_x - C_{xy} C_y^{-1} C_{xy}^T,

    C_e = C_x - (C_x H^T + C_{x\varepsilon})(H C_x H^T + H C_{x\varepsilon} + C_{x\varepsilon}^T H^T + C_\varepsilon)^{-1} (C_x H^T + C_{x\varepsilon})^T     (21)

Note that in the case of linear observations we require knowledge of C_x, C_xε, and C_ε in addition to the data y. The computation of these matrices can be a difficult task. Usually we would assume that C_x is diagonal with large diagonal elements, which implies that nothing is known about the true state.

If we know the joint density function, f(x_1, x_2, ..., x_n), then C_x can be computed from its definition

    C_x = \int \cdots \int x x^T f(x_1, x_2, \ldots, x_n) \, dx_1 \, dx_2 \cdots dx_n         (22)

If we know COV(x) = E[(x - \bar{x})(x - \bar{x})^T], where \bar{x} is the mean of x, we can compute C_x from

    C_x = COV(x) + \bar{x} \bar{x}^T                                                         (23)

As stated, we generally use a diagonal matrix with large numerical values for C_x.
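The result above can be checked numerically. The following NumPy sketch is not part of the original notes; the dimensions, the observation matrix H, and the noise level are arbitrary illustrative choices. It draws samples of x and y related by a linear model as in equation (16), forms the sample second moment matrices of equation (1), builds the estimate of equation (2), and confirms the orthogonality property (14) and the error matrix of equation (4).

import numpy as np

# Minimal numerical sketch of the Gauss-Markov estimator (illustrative values only).
rng = np.random.default_rng(0)
n, m, N = 2, 3, 200_000                   # state size, data size, Monte Carlo samples

H = rng.standard_normal((m, n))           # hypothetical observation matrix
x = rng.standard_normal((n, N))           # zero-mean state samples
eps = 0.5 * rng.standard_normal((m, N))   # observation noise samples
y = H @ x + eps                           # linear observations, as in equation (16)

Cx = x @ x.T / N                          # sample second moment matrices, equation (1)
Cxy = x @ y.T / N
Cy = y @ y.T / N

A = Cxy @ np.linalg.inv(Cy)               # optimal gain, equation (9)
x_hat = A @ y                             # linear minimum mean square estimate, equation (2)
e = x_hat - x                             # estimation error

print("E[e y^T]:\n", e @ y.T / N)                                  # ~ 0, equation (14)
print("Ce (sample):\n", e @ e.T / N)
print("Ce (formula):\n", Cx - Cxy @ np.linalg.inv(Cy) @ Cxy.T)     # equation (4)

Because the gain is built from the same sample moments, the printed E[e y^T] vanishes to machine precision, which is precisely the orthogonality argument of equations (12)-(14).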
If we have information on the joint density function, g, of the components of x and ε, then

    C_x = \int \cdots \int x x^T g(x_1, \ldots, x_n, \varepsilon_1, \ldots, \varepsilon_m) \, dx_1 \cdots dx_n \, d\varepsilon_1 \cdots d\varepsilon_m     (24)

The second moment matrix of the noise, C_ε, is usually assumed known; in terms of the noise density h it is

    C_\varepsilon = \int \cdots \int \varepsilon \varepsilon^T h(\varepsilon_1, \ldots, \varepsilon_m) \, d\varepsilon_1 \cdots d\varepsilon_m             (25)

If the state vector, x, and the observation noise, ε, are uncorrelated², then

    C_{x\varepsilon} = 0                                                                      (26)

and equations (20) for x̂ and (21) for C_e simplify. Equation (20) becomes

    \hat{x} = C_x H^T (H C_x H^T + C_\varepsilon)^{-1} y                                      (27)

Note that this still requires an m×m matrix inversion. This can be simplified by using Theorem 3 of the Matrix Inversion Theorems of Appendix B, i.e.,

    B C^T (A + C B C^T)^{-1} = (C^T A^{-1} C + B^{-1})^{-1} C^T A^{-1}                        (28)

Let

    B = C_x,    C = H,    A = C_\varepsilon                                                   (29)

Then

    C_x H^T (H C_x H^T + C_\varepsilon)^{-1} = (H^T C_\varepsilon^{-1} H + C_x^{-1})^{-1} H^T C_\varepsilon^{-1}     (30)

and

    \hat{x} = (H^T C_\varepsilon^{-1} H + C_x^{-1})^{-1} H^T C_\varepsilon^{-1} y              (31)

which is the minimum variance estimate derived in Section 4.4. Finally, the estimation error covariance matrix C_e is

    C_e = (H^T C_\varepsilon^{-1} H + C_x^{-1})^{-1}                                          (32)

Reference: Liebelt, Paul, "An Introduction to Optimal Estimation," Addison-Wesley, 1967.

¹ The mean square error is given by e^T e, where e is the estimation error, e = x̂ - x.
² This is the same assumption made for the minimum variance estimate.
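As a closing numerical check of the simplification in equations (27)-(32) (again a sketch, not part of Liebelt's development; H, C_x, and C_ε below are arbitrary illustrative values), the covariance form of the estimator, equation (27), and the information form, equation (31), produce the same x̂, and equation (32) agrees with the Gauss-Markov error matrix of equation (21) specialized to C_xε = 0.

import numpy as np

# Check that equations (27) and (31) agree, and that (32) matches (21) with Cxe = 0.
rng = np.random.default_rng(1)
n, m = 3, 5
H = rng.standard_normal((m, n))           # hypothetical observation matrix

Cx = 100.0 * np.eye(n)                    # "large diagonal" prior second moment for x
Ceps = 0.1 * np.eye(m)                    # known noise second moment matrix
y = rng.standard_normal(m)                # some data vector

inv = np.linalg.inv

# Covariance form, equation (27): requires an m x m inverse.
x_hat_27 = Cx @ H.T @ inv(H @ Cx @ H.T + Ceps) @ y

# Information form, equation (31): n x n inverse plus the (here diagonal) noise inverse.
x_hat_31 = inv(H.T @ inv(Ceps) @ H + inv(Cx)) @ H.T @ inv(Ceps) @ y
print(np.allclose(x_hat_27, x_hat_31))    # True, by the identity in equation (30)

# Error covariance, equation (32), versus equation (21) with Cxe = 0.
Ce_32 = inv(H.T @ inv(Ceps) @ H + inv(Cx))
Ce_21 = Cx - Cx @ H.T @ inv(H @ Cx @ H.T + Ceps) @ H @ Cx
print(np.allclose(Ce_32, Ce_21))          # True

The information form is attractive when m is much larger than n: the only full inverse needed is n×n, while C_ε is usually diagonal and trivially inverted.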