2. Linear Estimation
Most important sections (from preamble)
2.1, 2.4, 2.6 and 2.A
2.1 Normal Equations
2.1.1 Affine Estimation
Consider zero mean vector valued random variables x and y.
x is p x 1, y is q x 1 column vectors
A linear (affine) estimator would be formed as
\hat{x} = K y + b
K is p x q and b is p x 1
But if both x and y are zero mean
\hat{x} = K y
By assuming zero means, we move from an affine estimator to a linear one … remember state-space
linear systems, where we could estimate the state values based on observations.
2.1.2 Mean-Square-Error Criterion
Determine the MIMO solution for K as

\min_K E\{\tilde{x}^* \tilde{x}\} = \min_{k_i} \sum_{i} E|\tilde{x}(i)|^2

Noting that

E\{\tilde{x}^* \tilde{x}\} = E|\tilde{x}(0)|^2 + E|\tilde{x}(1)|^2 + \cdots + E|\tilde{x}(p-1)|^2

and

\tilde{x}(i) = x(i) - k_i^* y

where the size of k_i is q x 1, k_i^* is 1 x q, and K is p x q.
Then, each k_i minimizes one term of E|\tilde{x}(i)|^2, and minimizing the individual values for
k_i effectively minimizes with respect to the K matrix!
To form the "expected" minimization process (note that k_i is q x 1 and y is q x 1):
 


2
E|\tilde{x}(i)|^2 = E\{ (x(i) - k_i^H y)(x(i) - k_i^H y)^* \}

E|\tilde{x}(i)|^2 = E\{ x(i) x(i)^* - x(i) y^H k_i - k_i^H y x(i)^* + k_i^H y y^H k_i \}

E|\tilde{x}(i)|^2 = E|x(i)|^2 - E\{x(i) y^H\} k_i - k_i^H E\{y x(i)^*\} + k_i^H E\{y y^H\} k_i

E|\tilde{x}(i)|^2 = \sigma_{x(i)}^2 - R_{x(i)y} k_i - k_i^H R_{yx(i)} + k_i^H R_{yy} k_i
To state this as a cost function
J k i    x2i   R x i  y  k i  k iH  R yx i   k iH  R yy  k i
Note that
R xy  R yx
H
J k i    x2i   R x i  y  k i   R x i  y  k i   k iH  R yy  k i
H
J k i    x2i   2  ReR x i  y  k i   k iH  R yy  k i
Thus the ith cost function element is dependent upon the first and second moments of x(i) and y.
Notice that the resulting value will be purely real for complex input!
To find the global minimum for K, we must be concerned with

E\{\tilde{x}^* \tilde{x}\} = E|\tilde{x}(0)|^2 + E|\tilde{x}(1)|^2 + \cdots + E|\tilde{x}(p-1)|^2

But this is where matrix theory can help … if you use the trace of the matrix.

E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\left( E\{\tilde{x} \tilde{x}^*\} \right)
  = \mathrm{Tr}\left( R_{xx} - R_{xy} K^H - K R_{yx} + K R_{yy} K^H \right)
Now the sum of cost function elements is dependent upon the auto and cross-covariance matrices
of x and y.
Again, the resulting value will be purely real for complex input!
Now we have to form the minimization and solve for ki and K.
2.1.3 Minimization by Differentiation (gradient methods)
Using the gradient for complex values (taking a derivative with respect to each element of k_i):

\nabla_{k_i} J(k_i) = \nabla_{k_i} \left[ \sigma_{x(i)}^2 - R_{x(i)y} k_i - \left( R_{x(i)y} k_i \right)^H + k_i^H R_{yy} k_i \right]

\nabla_{k_i} J(k_i) = -R_{x(i)y} + k_i^H R_{yy} = 0

k_i^H R_{yy} = R_{x(i)y}

Using the gradient for real values (or scalars):

\nabla_{k_i} J(k_i) = \nabla_{k_i} \left[ \sigma_{x(i)}^2 - R_{x(i)y} k_i - \left( R_{x(i)y} k_i \right)^H + k_i^H R_{yy} k_i \right]

\nabla_{k_i} J(k_i) = -2 R_{x(i)y} + 2 k_i^H R_{yy} = 0

k_i^H R_{yy} = R_{x(i)y}
Note: For the math behind taking the gradient/derivative see Appendix 2.B.
If we collect all the minimized solutions for k_i, referred to as (k_i^{opt})^H, we can define the
optimal K_{opt} matrix:

K_{opt} R_{yy} = R_{xy}

which for R_{yy} invertible becomes

K_{opt} = R_{xy} R_{yy}^{-1}

K_{opt} = \begin{bmatrix} k_1^H \\ \vdots \\ k_p^H \end{bmatrix} \quad \text{for} \quad \hat{x} = K_{opt} y
Dr. Bazuin, and others, often referred to this as the “Wiener solution”, but the actual
Wiener-Hopf solution solves a “deeper”, more complex, and broader problem than just
this ….
see p. 64 Remark 1 and p. 167-169.
Dimensionality note: the size of x is p x 1, y and k_i are q x 1, R_{yy} is q x q, R_{xy} is p x q, and
K_{opt} is p x q.
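As a quick numerical illustration (not from the text), the sketch below draws correlated zero-mean samples, forms sample covariances, and solves the normal equations K_opt R_yy = R_xy with NumPy. The dimensions, data model, and noise level are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)
p, q, n = 2, 3, 100_000

# Correlated zero-mean data: y depends linearly on x plus noise.
A = rng.standard_normal((q, p))
x = rng.standard_normal((p, n))
y = A @ x + 0.5 * rng.standard_normal((q, n))

# Sample auto- and cross-covariances (zero means assumed).
R_yy = y @ y.T / n            # q x q
R_xy = x @ y.T / n            # p x q

# Solve K_opt R_yy = R_xy  (equivalently R_yy^T K_opt^T = R_xy^T).
K_opt = np.linalg.solve(R_yy.T, R_xy.T).T     # p x q

x_hat = K_opt @ y
mmse = np.trace((x - x_hat) @ (x - x_hat).T) / n
print(K_opt)
print(mmse)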
2.1.4 Minimization by Completion-of-Squares
 2
H
J k i    x2i   R x i  y  k i   R x i  y  k i   k iH  R yy  k i  1 k iH   x i 
 R yx i 

 R x i  y   1 

R yy  k i 

Math trick (again), expanding just the central matrix
\begin{bmatrix} A & B \\ B^H & C \end{bmatrix}
= \begin{bmatrix} I & D \\ 0 & I \end{bmatrix}
  \begin{bmatrix} \Delta & 0 \\ 0 & C \end{bmatrix}
  \begin{bmatrix} I & 0 \\ D^H & I \end{bmatrix}
\quad \text{for } D = B C^{-1} \text{ and } \Delta = A - B D^H
Then

\begin{bmatrix} \sigma_{x(i)}^2 & R_{x(i)y} \\ R_{yx(i)} & R_{yy} \end{bmatrix}
= \begin{bmatrix} 1 & R_{x(i)y} R_{yy}^{-1} \\ 0 & I \end{bmatrix}
  \begin{bmatrix} \sigma_{x(i)}^2 - R_{x(i)y} R_{yy}^{-1} R_{x(i)y}^H & 0 \\ 0 & R_{yy} \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ R_{yy}^{-1} R_{yx(i)} & I \end{bmatrix}

Assuming the correct solution

k_i^{opt} = R_{yy}^{-1} R_{yx(i)} \quad \text{from} \quad (k_i^{opt})^H R_{yy} = R_{x(i)y}
we can substitute in for the expanded matrix

\begin{bmatrix} \sigma_{x(i)}^2 & R_{x(i)y} \\ R_{yx(i)} & R_{yy} \end{bmatrix}
= \begin{bmatrix} 1 & (k_i^{opt})^H \\ 0 & I \end{bmatrix}
  \begin{bmatrix} \sigma_{x(i)}^2 - R_{x(i)y} k_i^{opt} & 0 \\ 0 & R_{yy} \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ k_i^{opt} & I \end{bmatrix}
The cost function then becomes

J(k_i) = \begin{bmatrix} 1 & -k_i^H \end{bmatrix}
         \begin{bmatrix} 1 & (k_i^{opt})^H \\ 0 & I \end{bmatrix}
         \begin{bmatrix} \sigma_{x(i)}^2 - R_{x(i)y} k_i^{opt} & 0 \\ 0 & R_{yy} \end{bmatrix}
         \begin{bmatrix} 1 & 0 \\ k_i^{opt} & I \end{bmatrix}
         \begin{bmatrix} 1 \\ -k_i \end{bmatrix}

J(k_i) = \begin{bmatrix} 1 & (k_i^{opt} - k_i)^H \end{bmatrix}
         \begin{bmatrix} \sigma_{x(i)}^2 - R_{x(i)y} k_i^{opt} & 0 \\ 0 & R_{yy} \end{bmatrix}
         \begin{bmatrix} 1 \\ k_i^{opt} - k_i \end{bmatrix}

J(k_i) = \sigma_{x(i)}^2 - R_{x(i)y} k_i^{opt} + (k_i - k_i^{opt})^H R_{yy} (k_i - k_i^{opt})

To minimize the cost, we must have k_i = k_i^{opt}.
The minimum cost functions are then in any of the forms …
J k i    x2i   k i
opt H
 R yy  k i
J k i    x2i   R x i  y  k i
J k i    x2i   k i
opt H
opt
opt
 R yx i 
The summed cost function for all weights can be described as

E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\left( R_{xx} - K_{opt} R_{yy} K_{opt}^H \right)
where Tr{} is a trace function defined as the sum of the diagonal elements.
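A small sketch (illustrative covariance values, not from the notes) can confirm the completion-of-squares identity numerically: the quadratic cost evaluated directly equals the minimum value plus the R_yy-weighted distance from k_i^opt.

import numpy as np

rng = np.random.default_rng(1)
q = 3
M = rng.standard_normal((q, q))
R_yy = M @ M.T + np.eye(q)            # positive-definite auto-covariance
R_yx = rng.standard_normal((q, 1))    # R_yx(i); for real data R_x(i)y = R_yx^T
R_xy = R_yx.T
sigma2 = 5.0                          # sigma_x(i)^2 (illustrative)

k_opt = np.linalg.solve(R_yy, R_yx)

def J(k):
    # Direct quadratic cost: sigma2 - R_xy k - k^T R_yx + k^T R_yy k
    return (sigma2 - R_xy @ k - k.T @ R_yx + k.T @ R_yy @ k).item()

k = rng.standard_normal((q, 1))
direct = J(k)
completed = (sigma2 - R_xy @ k_opt + (k - k_opt).T @ R_yy @ (k - k_opt)).item()
print(direct, completed)      # equal up to round-off
print(J(k_opt) <= direct)     # the optimum never does worse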
2.1.5 Minimization of the Error-Covariance Matrix
The same solution can be derived using matrix operations.
First defining a cost matrix (note that the order of the expected value changed). This cost matrix
is not a scalar …

J(K) = E\{\tilde{x} \tilde{x}^*\} = E\{ (x - K y)(x - K y)^H \}

J(K) = E\{\tilde{x} \tilde{x}^*\}
     = \begin{bmatrix} I & -K \end{bmatrix}
       \begin{bmatrix} R_{xx} & R_{xy} \\ R_{yx} & R_{yy} \end{bmatrix}
       \begin{bmatrix} I \\ -K^H \end{bmatrix}
Expanding the central matrix

\begin{bmatrix} R_{xx} & R_{xy} \\ R_{yx} & R_{yy} \end{bmatrix}
= \begin{bmatrix} I & R_{xy} R_{yy}^{-1} \\ 0 & I \end{bmatrix}
  \begin{bmatrix} R_{xx} - R_{xy} R_{yy}^{-1} R_{yx} & 0 \\ 0 & R_{yy} \end{bmatrix}
  \begin{bmatrix} I & 0 \\ R_{yy}^{-1} R_{yx} & I \end{bmatrix}

for

K_{opt} = R_{xy} R_{yy}^{-1} \quad \text{and} \quad K_{opt}^H = R_{yy}^{-1} R_{yx}

\begin{bmatrix} R_{xx} & R_{xy} \\ R_{yx} & R_{yy} \end{bmatrix}
= \begin{bmatrix} I & K_{opt} \\ 0 & I \end{bmatrix}
  \begin{bmatrix} R_{xx} - R_{xy} K_{opt}^H & 0 \\ 0 & R_{yy} \end{bmatrix}
  \begin{bmatrix} I & 0 \\ K_{opt}^H & I \end{bmatrix}
Substituting

J(K) = \begin{bmatrix} I & -K \end{bmatrix}
       \begin{bmatrix} I & K_{opt} \\ 0 & I \end{bmatrix}
       \begin{bmatrix} R_{xx} - R_{xy} K_{opt}^H & 0 \\ 0 & R_{yy} \end{bmatrix}
       \begin{bmatrix} I & 0 \\ K_{opt}^H & I \end{bmatrix}
       \begin{bmatrix} I \\ -K^H \end{bmatrix}

J(K) = E\{\tilde{x} \tilde{x}^*\} = R_{xx} - R_{xy} K_{opt}^H + (K - K_{opt}) R_{yy} (K - K_{opt})^H
with the minimum

J(K_{opt}) = E\{\tilde{x} \tilde{x}^*\} = R_{xx} - R_{xy} K_{opt}^H

J(K_{opt}) = E\{\tilde{x} \tilde{x}^*\} = R_{xx} - K_{opt} R_{yy} K_{opt}^H

J(K_{opt}) = E\{\tilde{x} \tilde{x}^*\} = R_{xx} - K_{opt} R_{yx}
From which the scalar mmse becomes

E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\{ J(K) \} = \mathrm{Tr}\left( E\{\tilde{x} \tilde{x}^*\} \right)
Theorem 2.1.1 Optimal Linear Estimation
For {y, x} zero mean r.v. the linear least-mean-square estimator (l.l.m.s.e.) of x given y is
\hat{x} = K_{opt} y
where Kopt is any solution to the linear system of equations
K_{opt} R_{yy} = R_{xy}
This estimator minimizes the following two error measures

\min_K E\{\tilde{x}^* \tilde{x}\} \quad \text{and} \quad \min_K E\{\tilde{x} \tilde{x}^*\}
The cost function on the left is the trace of the cost function on the right. The resulting minimum
mean-square errors are given by

\min_{k_i} E|\tilde{x}(i)|^2 = J(k_i^{opt}) = \sigma_{x(i)}^2 - (k_i^{opt})^H R_{yy} k_i^{opt}

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\{ J(K_{opt}) \} = \mathrm{Tr}\left( R_{xx} - K_{opt} R_{yy} K_{opt}^H \right)

\min_K E\{\tilde{x} \tilde{x}^*\} = R_{xx} - K_{opt} R_{yy} K_{opt}^H

where

K_{opt} = \begin{bmatrix} (k_1^{opt})^H \\ \vdots \\ (k_p^{opt})^H \end{bmatrix}
2.2 Design Examples
See on-line textbook Chapter 4 – Orthogonality Principles.
Note: these are optimal linear least mean square estimators (l.l.m.s.e)
Those in chapter 1 were optimal minimum mean square estimators (m.m.s.e.).
That is why the solutions differ (the previous examples are repeated: 2.2.1 and 2.2.2, with expansion to 2
and N observations, and 2.2.3).
Note the sequence … single observation, two observations and N observations.
Example 2.2.1 (Noisy measurement of a binary signal). [see Ex 1.3.1]
Binary values for transmission take on the values of +/- 1 with a desired probability of ½
(not always true, but usually desired).
If we make observation/estimate in a zero-mean, unit variance Gaussian environment:
y = x + v
where x and v are independent of each other.
Using Theorem 2.1.1
K_{opt} R_{yy} = R_{xy}

where

R_{yy} = \sigma_y^2 \quad \text{and} \quad R_{xy} = \sigma_{xy}

Since x and v are zero mean and independent

\sigma_y^2 = \sigma_x^2 + \sigma_v^2 = 1 + 1 = 2

Also

\sigma_{xy} = E\{x y\} = E\{x (x + v)\} = E\{x^2\} + E\{x v\} = 1 + 0 = 1

Then

K_{opt} \sigma_y^2 = \sigma_{xy}, \quad K_{opt} \cdot 2 = 1, \quad K_{opt} = \tfrac{1}{2}
\quad \text{and} \quad \hat{x} = \tfrac{1}{2} y

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\{ J(K_{opt}) \} = \mathrm{Tr}\left( R_{xx} - K_{opt} R_{yy} K_{opt}^H \right)

\min_K E\{\tilde{x}^* \tilde{x}\} = \sigma_x^2 - K_{opt} \sigma_y^2 K_{opt} = 1 - \tfrac{1}{2} \cdot 2 \cdot \tfrac{1}{2} = 1 - \tfrac{1}{2} = \tfrac{1}{2}
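A quick Monte Carlo sanity check of this example is sketched below, assuming (as above) ±1 symbols with probability 1/2 and unit-variance Gaussian noise independent of the symbol.

import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.choice([-1.0, 1.0], size=n)     # binary symbols, P = 1/2 each
v = rng.standard_normal(n)              # zero-mean, unit-variance noise
y = x + v

sigma_y2 = np.mean(y * y)               # ~ 2
sigma_xy = np.mean(x * y)               # ~ 1
K_opt = sigma_xy / sigma_y2             # ~ 0.5
mmse = np.mean((x - K_opt * y) ** 2)    # ~ 0.5
print(K_opt, mmse)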
Example 2.2.1b (Multiple measurements of a binary signal). [Ex. 1.4.3]

\begin{bmatrix} y(0) \\ y(1) \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \end{bmatrix} x + \begin{bmatrix} v(0) \\ v(1) \end{bmatrix}

\hat{x} = h(y) = h(y(0), y(1)), \qquad \tilde{x} = x - \hat{x}
The optimal linear least mean square estimators (l.l.m.s.e)
K_{opt} R_{yy} = R_{xy}
Defining the vectors and matrices

R_{yy} = E\{y y^H\} = \begin{bmatrix} E\{y(0) y(0)^H\} & E\{y(0) y(1)^H\} \\ E\{y(1) y(0)^H\} & E\{y(1) y(1)^H\} \end{bmatrix}

E\{y(0) y(0)^H\} = E\{(x + v(0))(x + v(0))^H\} = E|x|^2 + E|v(0)|^2 = 1 + 1 = 2

E\{y(0) y(1)^H\} = E\{(x + v(0))(x + v(1))^H\} = E|x|^2 + 0 = 1

R_{yy} = E\{y y^H\} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
Next

R_{xy} = E\{x y^H\} = E\left\{ \begin{bmatrix} x y(0)^H & x y(1)^H \end{bmatrix} \right\}
       = \begin{bmatrix} E\{x (x + v(0))^H\} & E\{x (x + v(1))^H\} \end{bmatrix}
       = \begin{bmatrix} 1 & 1 \end{bmatrix}

Then

K_{opt} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \end{bmatrix}

K_{opt} = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}^{-1}
        = \begin{bmatrix} 1 & 1 \end{bmatrix} \cdot \frac{1}{3} \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}
        = \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}

and

\hat{x} = \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} y(0) \\ y(1) \end{bmatrix}
        = \tfrac{1}{3} \left( y(0) + y(1) \right)
3

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\{ J(K_{opt}) \} = \mathrm{Tr}\left( R_{xx} - K_{opt} R_{yy} K_{opt}^H \right)

\min_K E\{\tilde{x}^* \tilde{x}\} = 1 - \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix}
   \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
   \begin{bmatrix} \tfrac{1}{3} \\ \tfrac{1}{3} \end{bmatrix}
 = 1 - \tfrac{2}{3} = \tfrac{1}{3}
Example 2.2.2 (Multiple measurements of a binary signal).
y i   x  vi  for i=0 to N-1
By inspection from previous answers …
\hat{x} = \frac{1}{N+1} \sum_{i=0}^{N-1} y(i)

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\{ J(K_{opt}) \} = \frac{1}{N+1}
Brief discussion … multiple sensors with independent noise …

R_{vv} = E\{v v^H\} = \sigma_v^2 I
Math Trick

a = \mathrm{col}\{1, 1, 1, \ldots, 1\}

\left( I + a a^T \right)^{-1} = I - \frac{a a^T}{1 + a^T a} = I - \frac{a a^T}{1 + \|a\|^2}
For the example problem above
R_{yy} = I + a a^T \quad \text{and} \quad R_{xy} = a^T

The optimal linear least mean square estimator (l.l.m.s.e.)

K_{opt} = R_{xy} R_{yy}^{-1}
        = a^T \left( I - \frac{a a^T}{1 + \|a\|^2} \right)
        = a^T - \frac{N}{1 + N} a^T
        = \left( 1 - \frac{N}{1 + N} \right) a^T
        = \frac{1}{1 + N} \, a^T

\hat{x} = \frac{1}{N + 1} \, a^T y = \frac{1}{N + 1} \sum_{i=0}^{N-1} y(i)

and

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\{ J(K_{opt}) \} = \mathrm{Tr}\left( R_{xx} - K_{opt} R_{yy} K_{opt}^H \right)

\min_K E\{\tilde{x}^* \tilde{x}\} = 1 - \frac{1}{1+N} \, a^T \left( I + a a^T \right) a \, \frac{1}{1+N}
 = 1 - \frac{a^T a + (a^T a)^2}{(1+N)^2}
 = 1 - \frac{N + N^2}{(1+N)^2}
 = 1 - \frac{N}{1+N}
 = \frac{1}{1+N}
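The sketch below (N chosen arbitrarily) verifies the rank-one inversion trick and the resulting 1/(N+1) gain and m.m.s.e.:

import numpy as np

N = 5                                       # number of measurements (arbitrary)
a = np.ones((N, 1))
R_yy = np.eye(N) + a @ a.T                  # I + a a^T
R_yy_inv_trick = np.eye(N) - (a @ a.T) / (1 + N)
print(np.allclose(np.linalg.inv(R_yy), R_yy_inv_trick))   # True

K_opt = a.T @ R_yy_inv_trick                # -> a^T / (N + 1)
mmse = 1.0 - (K_opt @ R_yy @ K_opt.T).item()
print(K_opt, mmse, 1.0 / (N + 1))           # mmse matches 1/(N+1)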
Example 2.2.3 (Transmission over a noisy channel). [Ex. 1.4.2]
This is the concept of a noisy communication channel, but with a very simple FIR channel
response.
Channel modified symbol zi
z i   si   0.5  si  1
Where
The measured sample symbol is
y i   z i   vi   si   0.5  si  1  vi 
We want to estimate two symbols s(0) and s(1) based on observed measurements y(0) and y(1).
y0  z 0  v0  s0  v0
y1  z 1  v1  s1  0.5  s0  v1
Comment: this assumes the start of a transmission such that s(-1) = 0
The optimal linear least mean square estimators (l.l.m.s.e)
K_{opt} R_{yy} = R_{xy}
Defining the vectors and matrices

R_{yy} = E\{y y^H\} = \begin{bmatrix} E\{y(0) y(0)^H\} & E\{y(0) y(1)^H\} \\ E\{y(1) y(0)^H\} & E\{y(1) y(1)^H\} \end{bmatrix}

E\{y(0) y(0)^H\} = E\{(s(0) + v(0))(s(0) + v(0))^H\} = \sigma_s^2 + \sigma_v^2

E\{y(0) y(1)^H\} = E\{(s(0) + v(0))(s(1) + 0.5 \, s(0) + v(1))^H\} = 0.5 \, \sigma_s^2

E\{y(1) y(1)^H\} = E\{(s(1) + 0.5 \, s(0) + v(1))(s(1) + 0.5 \, s(0) + v(1))^H\} = \sigma_s^2 + 0.25 \, \sigma_s^2 + \sigma_v^2

R_{yy} = E\{y y^H\} = \begin{bmatrix} \sigma_s^2 + \sigma_v^2 & 0.5 \, \sigma_s^2 \\ 0.5 \, \sigma_s^2 & 1.25 \, \sigma_s^2 + \sigma_v^2 \end{bmatrix}
       = \begin{bmatrix} 2 & 0.5 \\ 0.5 & 2.25 \end{bmatrix}

(with \sigma_s^2 = \sigma_v^2 = 1)
Next

R_{xy} = E\{x y^H\} = E\left\{ \begin{bmatrix} s(0) y(0)^H & s(0) y(1)^H \\ s(1) y(0)^H & s(1) y(1)^H \end{bmatrix} \right\}

E\{s(0) y(0)^H\} = E\{s(0)(s(0) + v(0))^H\} = \sigma_s^2

E\{s(0) y(1)^H\} = E\{s(0)(s(1) + 0.5 \, s(0) + v(1))^H\} = 0.5 \, \sigma_s^2

E\{s(1) y(0)^H\} = E\{s(1)(s(0) + v(0))^H\} = 0

E\{s(1) y(1)^H\} = E\{s(1)(s(1) + 0.5 \, s(0) + v(1))^H\} = \sigma_s^2

R_{xy} = \begin{bmatrix} \sigma_s^2 & 0.5 \, \sigma_s^2 \\ 0 & \sigma_s^2 \end{bmatrix}
       = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}
Then

K_{opt} \begin{bmatrix} 2 & 0.5 \\ 0.5 & 2.25 \end{bmatrix} = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}

K_{opt} = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix} \cdot \frac{1}{4.5 - 0.25} \begin{bmatrix} 2.25 & -0.5 \\ -0.5 & 2 \end{bmatrix}
        = \frac{1}{4.25} \begin{bmatrix} 2 & 0.5 \\ -0.5 & 2 \end{bmatrix}

\begin{bmatrix} \hat{s}(0) \\ \hat{s}(1) \end{bmatrix}
= K_{opt} \begin{bmatrix} y(0) \\ y(1) \end{bmatrix}
= \frac{1}{17} \begin{bmatrix} 8 & 2 \\ -2 & 8 \end{bmatrix} \begin{bmatrix} y(0) \\ y(1) \end{bmatrix}
= \frac{1}{17} \begin{bmatrix} 8 \, y(0) + 2 \, y(1) \\ -2 \, y(0) + 8 \, y(1) \end{bmatrix}
And

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\{ J(K_{opt}) \} = \mathrm{Tr}\left( R_{xx} - K_{opt} R_{yy} K_{opt}^H \right)

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
  - \frac{1}{4.25} \begin{bmatrix} 2 & 0.5 \\ -0.5 & 2 \end{bmatrix}
    \begin{bmatrix} 2 & 0.5 \\ 0.5 & 2.25 \end{bmatrix}
    \frac{1}{4.25} \begin{bmatrix} 2 & -0.5 \\ 0.5 & 2 \end{bmatrix} \right)

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
  - \frac{1}{4.25^2} \begin{bmatrix} 2 & 0.5 \\ -0.5 & 2 \end{bmatrix}
    \begin{bmatrix} 4.25 & 0 \\ 2.125 & 4.25 \end{bmatrix} \right)

\min_K E\{\tilde{x}^* \tilde{x}\} = \mathrm{Tr}\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
  - \frac{1}{18.0625} \begin{bmatrix} 9.5625 & 2.125 \\ 2.125 & 8.5 \end{bmatrix} \right)

\min_K E\{\tilde{x}^* \tilde{x}\} = \frac{8.5 + 9.5625}{18.0625} = 1
Example 2.2.4 (linear channel equalization) - 1st occurrence
This is indicative of: channel impairments, symbol cross-talk
All random processes must be assumed to be WSS (wide-sense stationary) … transients are
gone, statistical/probabilistic characteristics are valid.
Symbol set si
The measured sample symbol is
y i   z i   vi   si   0.5  si  1  vi 
Where v(i) is our normal random Gaussian noise term
An equalizer consisting of a fixed, three-tap FIR filter will be employed.
sˆi      0  y i    1  y i  1   2  y i  2 
For the moment we will have no time delay …
Solution

\hat{x} = \hat{s}(i) = k_i^H y

k_i^H R_{yy} = R_{x(i)y}
Determine the covariance and cross-covariance

R_{yy} = E\{y y^T\}
Building up from y(i)
y i   y i   s i   y i   0.5  s i  1  y i   vi   y i 
T

T

T

 
T
ryy 0   E s i   y i   0.5  E si  1  y i   E vi   y i 
T
T
T


 

E s i  1  y i    E s i  1  s i   0.5  s i  1  s i  1  s i  1  vi    0.5
E vi   y i    E vi   si   0.5  vi   si  1  vi   vi    E vi   vi      1
E s i   y i   E s i   s i   0.5  s i   s i  1  s i   vi   1
T
T
T
T
T
T
T
T
T
T
T
T
T
2
v
ryy 0  1  0.5  0.5   v2  2.25



 

ryy 1  E s i   y i  1  0.5  E s i  1  y i  1  E vi   y i  1
T

 
E s i  1  y i  1   E s i  1  s i  1
E v i   y i  1   E vi   s i  1
T
T

E s i   y i  1  E s i   s i  1  0.5  s i   s i  1  s i   vi  1  0
T
T
T
T

 0.5  s i  1  s i  2   s i  1  vi  1  1
T
T
T
T
T

 0.5  vi   s i  2   vi   vi  1  0
T
T
T
ryy 1  0  0.5  1  0  0.5



 

ryy 2  E s i   y i  2  0.5  E s i  1  y i  2  E vi   y i  2 

T
 
E s i  1  y i  2    E s i  1  s i  2 
E v i   y i  2    E vi   s i  2 
T
T

E s i   y i  2   E s i   s i  2   0.5  s i   s i  2   s i   vi  2   0
T
T
T
T
T
T
T
T

 0.5  s i  1  s i  3  s i  1  vi  2   0
T
T

 0.5  vi   s i  3  vi   vi  2   0
T
T
ryy 2   0  0.5  0  0  0

R_{yy} = E\{y y^T\} = \begin{bmatrix} 2.25 & 0.5 & 0 \\ 0.5 & 2.25 & 0.5 \\ 0 & 0.5 & 2.25 \end{bmatrix}
Next

R_{xy} = E\{x y^T\}

Building up

E\{x(i) y(i)^T\} = E\{s(i) s(i)^T\} + 0.5 \, E\{s(i) s(i-1)^T\} + E\{s(i) v(i)^T\} = 1

E\{x(i) y(i-1)^T\} = E\{s(i) s(i-1)^T\} + 0.5 \, E\{s(i) s(i-2)^T\} + E\{s(i) v(i-1)^T\} = 0

E\{x(i) y(i-2)^T\} = E\{s(i) s(i-2)^T\} + 0.5 \, E\{s(i) s(i-3)^T\} + E\{s(i) v(i-2)^T\} = 0

R_{x(i)y} = E\{x y^T\} = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}
Note
R_{yx(i)} = E\{y x^T\}

E\{y(i) x(i)^T\} = E\{s(i) s(i)^T\} + 0.5 \, E\{s(i-1) s(i)^T\} + E\{v(i) s(i)^T\} = 1

E\{y(i-1) x(i)^T\} = E\{s(i-1) s(i)^T\} + 0.5 \, E\{s(i-2) s(i)^T\} + E\{v(i-1) s(i)^T\} = 0

E\{y(i-2) x(i)^T\} = E\{s(i-2) s(i)^T\} + 0.5 \, E\{s(i-3) s(i)^T\} + E\{v(i-2) s(i)^T\} = 0
Then,

k_i^H R_{yy} = k_i^H \begin{bmatrix} 2.25 & 0.5 & 0 \\ 0.5 & 2.25 & 0.5 \\ 0 & 0.5 & 2.25 \end{bmatrix}
  = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} = R_{x(i)y}

k_i^H = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}
  \begin{bmatrix} 2.25 & 0.5 & 0 \\ 0.5 & 2.25 & 0.5 \\ 0 & 0.5 & 2.25 \end{bmatrix}^{-1}
  = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \cdot \frac{1}{10.2656}
    \begin{bmatrix} 4.8125 & -1.125 & 0.25 \\ -1.125 & 5.0625 & -1.125 \\ 0.25 & -1.125 & 4.8125 \end{bmatrix}
  = \begin{bmatrix} 0.4688 & -0.1096 & 0.0244 \end{bmatrix}

where the determinant of R_{yy} is 10.2656.
The mmse for the linear system is
J k i    x2i   R x i  y  k i
opt
 0.4688 
J k i   1  1 0 0   0.1096  1  0.4688  0.5312
 0.0244 
This performs better than the previous example!
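The three-tap equalizer numbers can be checked directly from R_yy and R_x(i)y (a sketch using the matrices derived above):

import numpy as np

R_yy = np.array([[2.25, 0.5, 0.0],
                 [0.5, 2.25, 0.5],
                 [0.0, 0.5, 2.25]])
R_xy = np.array([1.0, 0.0, 0.0])

k = np.linalg.solve(R_yy, R_xy)   # k^H R_yy = R_x(i)y (R_yy symmetric)
mmse = 1.0 - R_xy @ k             # sigma_x^2 - R_x(i)y k_opt
print(k)                          # ~ [0.4688, -0.1096, 0.0244]
print(mmse)                       # ~ 0.5312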
2.3 Existence of Solutions
The solution to the problem is shown to be
K_{opt} R_{yy} = R_{xy}
Under what conditions does a solution exist? And, is the solution unique?
This depends on the rank of the auto-covariance matrix of the sampled data or Ryy.
We already know that Ryy is non-negative definite. It will be non-singular when it is positive
definite!
Existence requires it to be a positive definite matrix.
The proof in the book is a “counter proof” demonstrating counter arguments to those above:
(1) If the matrix is singular, infinitely many solutions are possible.
(2) There are infinitely many solutions if and only if the matrix is singular.
As an odd addition to this process … if the matrix is singular …
There are infinitely many solutions … but
All the solutions result in the same value for the estimator and the same value of the cost function!
2.4 Orthogonality Principles
The linear least-mean-squares estimator admits an important geometric interpretation in the form of an orthogonality
condition.
K_{opt} R_{yy} = R_{xy}

K_{opt} E\{y y^H\} = E\{x y^H\}

or equivalently

E\{ (x - K_{opt} y) \, y^H \} = 0
but this is

\tilde{x} = x - \hat{x} = x - K_{opt} y

Therefore, we have the orthogonality

\tilde{x} \perp y
We thus conclude that for linear least-mean-squares estimation, the estimation error is
orthogonal to the data and, in fact, to any linear transformation of the data, say, A  y for any
matrix A. This fact means that no further linear transformation of y can extract additional
information about x in order to further reduce the error covariance matrix. Moreover, since the
estimator x̂ is itself a linear function of y, we obtain, as a special case, that
\tilde{x} \perp \hat{x}
2.5 Nonzero–Mean Variables
The discussion has focused so far on zero-mean random variables x and y. When the means are
nonzero, we should seek an unbiased estimator for x of the form
\hat{x} = K y + b
for some matrix K and some vector b. As before, the optimal values for {K, b} are determined
through the minimization of the mean-square error,

\min_{K, b} E\{\tilde{x}^* \tilde{x}\} \quad \text{where} \quad \tilde{x} = x - \hat{x}
To solve this problem, we start by noting that since the estimator should be unbiased we must
enforce E\{\hat{x}\} = \bar{x}. Taking expectations of both sides shows that the vector b must satisfy

E\{\hat{x}\} = E\{K y + b\}

\bar{x} = K \bar{y} + b \quad \Rightarrow \quad b = \bar{x} - K \bar{y}
Using this expression for b, we can substitute
\hat{x} = K y + (\bar{x} - K \bar{y})

or

\hat{x} - \bar{x} = K (y - \bar{y})
This expression shows that the desired gain matrix K should map the now zero-mean variable
(y - \bar{y}) to another zero-mean variable (\hat{x} - \bar{x}). In other words, we are reduced to solving the
problem of estimating the zero-mean random variable (x - \bar{x}) from the also zero-mean random
variable (y - \bar{y}).
We already know that the solution Kopt is found by solving
K_{opt} R_{yy} = R_{xy}
in terms of the covariance and cross-covariance matrices of the zero-mean random variables. Or
in this case,

R_{yy} = E\{ (y - \bar{y})(y - \bar{y})^H \} \quad \text{and} \quad R_{xy} = E\{ (x - \bar{x})(y - \bar{y})^H \}

The optimal solution is then given by

\hat{x} = \bar{x} + K_{opt} (y - \bar{y})
Comparing this equation to the zero-mean case from Thm. 2.1.1, we see that the solution to the
nonzero-mean case simply amounts to replacing x and y by the centered variables (x - \bar{x}) and
(y - \bar{y}), respectively, and then solving a linear estimation problem with these centered (zero-mean) variables.
For this reason, there is no loss of generality, for linear estimation purposes, to assume that all
random variables are zero-mean; the results for the nonzero-mean case can be deduced via
centering.
For computation of the least-mean-square estimate's minimum mean-square error it follows that

K_{opt} R_{yy} = K_{opt} E\{ (y - \bar{y})(y - \bar{y})^H \} = E\{ (x - \bar{x})(y - \bar{y})^H \} = R_{xy}

or equivalently

E\{ \left( (x - \bar{x}) - K_{opt} (y - \bar{y}) \right) (y - \bar{y})^H \} = 0

so that the orthogonality condition in the nonzero-mean case becomes

E\{ \tilde{x} (y - \bar{y})^H \} = 0 \quad \text{or} \quad \tilde{x} \perp (y - \bar{y})

where \tilde{x} = x - \hat{x} = (x - \bar{x}) - (\hat{x} - \bar{x}).
Moreover, the resulting m.m.s.e. matrix becomes

E\{ \tilde{x} \tilde{x}^H \} = R_{xx} - K_{opt} R_{yy} K_{opt}^H

with

R_{xx} = E\{ (x - \bar{x})(x - \bar{x})^H \}
Note: this is why we have referred to the matrices as auto- and cross-covariance instead of auto- and cross-correlation matrices.
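A short sketch of the centering recipe (the means, gains, and noise level below are synthetic, purely for illustration):

import numpy as np

rng = np.random.default_rng(4)
n = 100_000
x = 3.0 + rng.standard_normal((1, n))                 # nonzero-mean x
y = 2.0 * x + 1.0 + rng.standard_normal((1, n))       # nonzero-mean y

x_bar = x.mean(axis=1, keepdims=True)
y_bar = y.mean(axis=1, keepdims=True)
xc, yc = x - x_bar, y - y_bar                         # centered variables

R_yy = yc @ yc.T / n
R_xy = xc @ yc.T / n
K_opt = R_xy @ np.linalg.inv(R_yy)

x_hat = x_bar + K_opt @ (y - y_bar)                   # unbiased affine estimator
print(K_opt, np.mean(x - x_hat))                      # bias ~ 0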
2.6 Linear Models (on-line Chap. 5)
This is a subtle change from

y = x + v

to

y = H x + v
but when dealing with vectors and matrices, everything must be reviewed ….
Theorem 2.6.1 Linear estimator for linear models:
For {y, x, v} zero mean r.v. that are related linearly as
yH xv
For x and v uncorrelated, with invertible covariance matrices, the linear least mean-square (l.l.m.s.)
estimator can be evaluated as

\hat{x} = R_{xx} H^H \left( R_{vv} + H R_{xx} H^H \right)^{-1} y
\quad \text{or} \quad
K_{opt} = R_{xx} H^H \left( R_{vv} + H R_{xx} H^H \right)^{-1}

or equivalently

\hat{x} = \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} y
\quad \text{or} \quad
K_{opt} = \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1}

The resulting minimum mean-square error matrix J(K_{opt}) = R_{xx} - K_{opt} R_{yy} K_{opt}^H becomes

J(K_{opt}) = \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1}
Note:

R_{yy} = E\{ (H x + v)(H x + v)^H \}

R_{yy} = E\{ H x x^H H^H + v x^H H^H + H x v^H + v v^H \}

For independent x and v

R_{yy} = H \, E\{x x^H\} \, H^H + E\{v v^H\}

R_{yy} = H R_{xx} H^H + R_{vv}
And:

R_{xy} = E\{ x (H x + v)^H \}

R_{xy} = E\{x x^H\} H^H + E\{x v^H\}

R_{xy} = R_{xx} H^H
Proof of the 1st assertion

\hat{x} = K_{opt} y = R_{xy} R_{yy}^{-1} y

\hat{x} = R_{xx} H^H \left( R_{vv} + H R_{xx} H^H \right)^{-1} y
Another matrix identity (the matrix inversion lemma)

\left( A + B C D \right)^{-1} = A^{-1} - A^{-1} B \left( C^{-1} + D A^{-1} B \right)^{-1} D A^{-1}
Continuing

\hat{x} = R_{xx} H^H \left[ R_{vv}^{-1} - R_{vv}^{-1} H \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} \right] y

\hat{x} = \left[ R_{xx} H^H R_{vv}^{-1} - R_{xx} H^H R_{vv}^{-1} H \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} \right] y

\hat{x} = R_{xx} \left[ I - H^H R_{vv}^{-1} H \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} \right] H^H R_{vv}^{-1} y

\hat{x} = R_{xx} \left[ \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right) - H^H R_{vv}^{-1} H \right] \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} y

\hat{x} = R_{xx} R_{xx}^{-1} \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} y

\hat{x} = \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} y
Why?
If x is a sample (scalar), H a column vector and v a vector …
The original solution required a full matrix inversion … the latter "inversion" does not!
The error matrix similarly converts to forms involving R_{xx} and R_{vv} as
J K opt   Rxx  K opt  R yy  K opt
H
J K opt   Rxx  Rxy  R yy  R yy  R yy  Rxy
1
1
J K opt   Rxx  Rxy  R yy  Rxy
1
H
H

J K opt   Rxx  Rxx  H H  Rvv  H  Rxx  H H

1
 H  Rxx
Again recognizing the form of the mathematical conversion

A^{-1} - A^{-1} B \left( C^{-1} + D A^{-1} B \right)^{-1} D A^{-1} = \left( A + B C D \right)^{-1}

J(K_{opt}) = \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1}

Don't you love "simplifying the math"?!
Remark 2: Non Zero Mean
For {y, x, v} non-zero mean r.v. that are related linearly as
y = H x + v
For x and v uncorrelated, with invertible covariance matrices, the linear least mean-square (l.l.m.s.)
estimator can be evaluated as

\hat{x} = \bar{x} + R_{xx} H^H \left( R_{vv} + H R_{xx} H^H \right)^{-1} \left( y - H \bar{x} - \bar{v} \right)

or

\hat{x} = \bar{x} + \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} \left( y - H \bar{x} - \bar{v} \right)

The resulting minimum mean-square error matrix is

J(K_{opt}) = \left( R_{xx}^{-1} + H^H R_{vv}^{-1} H \right)^{-1}
2.7.1 Channel Estimation (Example Application)
There are some application where we want to “estimate the channel response” as compared to
equalizing the symbol output of the channel.
The observed data is the “receiver input”.
The channel model is an FIR filter of unknown coefficients. Therefore,
y = H c + v
In our case, the "known values" are the transmitted symbols in time samples. Therefore, H is built
from the known symbols and we want to form

\hat{c} = K_{opt} y = R_{cy} R_{yy}^{-1} y
\quad \text{or} \quad
\hat{c} = R_{cc} H^H \left( R_{vv} + H R_{cc} H^H \right)^{-1} y
The problems set-up that got us here is defined in the text as:
The channel is assumed initially at rest (i.e., no initial conditions in its delay elements) and a
known input sequence { s( i ) }, also called a training sequence, is applied to the channel. The
resulting output sequence { z ( i ) }is measured in the presence of additive noise, v( i ) ,as shown
in the figure.
The quantities {y, H} so defined are both available to the designer, in addition to the covariance
matrices { R_{cc} = E\{c c^H\}, R_{vv} = E\{v v^H\} } (by assumption). In particular, if the noise sequence
{ v(i) } is assumed white with variance \sigma_v^2, then R_{vv} = \sigma_v^2 I. With this information, we can
estimate the channel as follows. Since we have a linear model relating y to c, as indicated by the
equations, then according to Thm. 2.6.1, the optimal linear estimator for c can be obtained from
either expression:

\hat{c} = R_{cc} H^H \left( R_{vv} + H R_{cc} H^H \right)^{-1} y
        = \left( R_{cc}^{-1} + H^H R_{vv}^{-1} H \right)^{-1} H^H R_{vv}^{-1} y
Notes for communications:
(1) Almost every communication system transmits a preamble before the rest of the message is
sent. The preamble is composed of numerous "known" signal segments, if not all "known" signal
segments. Therefore, channel estimation can be performed and then used as part of a processing
algorithm as C(z)^{-1}.
(2) The length of the "channel estimation" FIR filter must be determined for a system in order to
perform the computations. Good news/bad news … you can always pick an FIR length longer
than required. This may mean you are performing more processing than required, but a longer
fixed-length FIR may be easier to deal with than one whose length must be estimated and then
computed.
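A hedged channel-estimation sketch follows: the training symbols, channel length, tap values, noise variance, and prior R_cc are all assumptions for illustration; only the estimator formula comes from this section.

import numpy as np

rng = np.random.default_rng(6)
M, L, sigma_v2 = 4, 200, 0.1                 # channel length, training length, noise variance (assumed)
c_true = np.array([1.0, 0.5, -0.2, 0.1])     # hypothetical channel taps
s = rng.choice([-1.0, 1.0], size=L)          # known training symbols

# Convolution (data) matrix: H[i, j] = s(i - j), channel initially at rest.
H = np.zeros((L, M))
for j in range(M):
    H[j:, j] = s[:L - j]

y = H @ c_true + np.sqrt(sigma_v2) * rng.standard_normal(L)

R_cc = np.eye(M)                             # assumed prior covariance of c
# c_hat = (R_cc^{-1} + H^H R_vv^{-1} H)^{-1} H^H R_vv^{-1} y  with R_vv = sigma_v2 * I
c_hat = np.linalg.solve(np.linalg.inv(R_cc) + (H.T @ H) / sigma_v2,
                        (H.T @ y) / sigma_v2)
print(c_hat)   # close to c_true for a long enough training sequence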
2.7.2 Block Data Estimation
A reformulation of Channel Estimation
2.7.3 Linear Channel Equalization
A reformulation of Example 2.2.4 Linear Channel Equalization
2.7.4 Multiple Antenna Receiver
A reformulation of Example 2.2.2 Multiple Measurements of a Binary Signal
App 2.A Range Space and Nullspaces of Matrices
See p. 103 of text
Null space vectors are orthogonal to range space vectors …..
Important in multiple antenna receiver applications, particularly for multiple input, multiple
output adaptive receivers (utilization of smart antennas).
App 2.C Kalman Filter
See p. 108 of text
This is Chapter 7 of the on-line textbook.
Problems 2.11, 2.20, 2.25
Problem 2.11 (Matched Filter Bound): The problem is set up as follows: we are basically
performing a one-shot transmission, where a zero-mean random symbol is transmitted over an
FIR channel with a given impulse-response (tap) vector. The received signal is the channel output
plus additive noise. Our objective is to design an FIR receiver, basically a matched filter, so as to
maximize the output SNR.
(a) The first task is to compute the variance of the signal component at the output of the
receiver. For a receiver with a given number of taps, the output signal has a signal and a
noise component obtained by convolving the receive filter with the channel and the
noise vectors. Using this result the variance of the signal component can be computed.
(b) Now we shift to the variance of the noise component at the output of the receiver. Again
using the equations obtained in (a), we can compute the variance. Since we know that
the channel noise vector has variance \sigma_v^2, the variance of the noise component will
be equal to \sigma_v^2 times the squared norm of the receive-filter taps.
(c) We are now required to verify the matched filter solution, i.e., identify the condition
that maximizes the SNR of the receiver at the sampling instant, where the SNR at the output of the
receiver is given by the ratio of the signal and noise variances from (a) and (b).
Then using the Cauchy-Schwarz inequality we can obtain the matched filter solution.
(d) Here we calculate the largest value that the SNR can attain; the matched filter
bound. We have already obtained it in (c).
(e) Finally, using the matched filter solution, we evaluate the output of the receiver at
the sampling instant.
Problem 2.20 (Frequency Domain Equalization): This problem explains the concept
behind frequency-domain equalization. The data is transmitted with cyclic prefixing
over an FIR channel with a given transfer function.
A block of noisy measurements is collected, its discrete Fourier transform (DFT)
is calculated, then after scaling the result by an appropriate factor the estimator in the
frequency domain can be determined. This problem uses the results of Prob. 2.19, more
specifically the fact that using cyclic prefixing results in a circulant channel matrix.
(a) The first section of the problem is to determine the noisy observation in the
frequency domain. Applying the DFT matrix to the circulant channel model gives a relation of the
form (transformed observation) = Λ · (transformed data) + (transformed noise). The trick here is to
prove that the scaled DFT matrix is unitary, so that the circulant channel matrix is transformed into
Λ, the diagonal matrix with the channel eigenvalues on its diagonal. This result is
obtained from the fact that a circulant matrix can be diagonalized by the DFT. The
textbook provides this result but fails to explain the concept. We are also required to
find the covariance matrices of the transformed data and noise, though this is relatively simple.
(b) We will now determine the least-mean-squares estimator in the frequency domain.
The covariance matrices of the transformed quantities can be evaluated easily and the estimator
determined. The last part of this problem is to prove that Λ*Λ is the diagonal matrix of squared
eigenvalue magnitudes, diag{|λ(0)|², |λ(1)|², …}, which as mentioned
before has never been derived.
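The key circulant-diagonalization fact can be checked numerically; the sketch below uses arbitrary example channel taps and block size.

import numpy as np

h = np.array([1.0, 0.5, 0.25])     # example channel taps
N1 = 8                             # block size (arbitrary)
col = np.r_[h, np.zeros(N1 - len(h))]
# Circulant matrix with first column col: C[i, j] = col[(i - j) mod N1]
C = np.array([np.roll(col, k) for k in range(N1)]).T

F = np.fft.fft(np.eye(N1)) / np.sqrt(N1)     # unitary DFT matrix
Lam = F @ C @ F.conj().T                     # should come out diagonal
print(np.allclose(Lam, np.diag(np.diag(Lam))))      # True
print(np.allclose(np.diag(Lam), np.fft.fft(col)))   # eigenvalues = DFT of the taps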
Problem 2.25 (Random Telegraph Signal): In this problem we model a random
telegraph signal driven by a Poisson process, evaluate various probabilities and covariance
quantities, and then establish the linear least-mean-squares estimator. x(t) is a random
process with initial value x(0) = ±1, each with probability 1/2. After each occurrence of an
event in the Poisson process, x(t) changes polarity. The distribution of the Poisson process is given
by

P\{N(t) = k\} = \frac{(\lambda t)^k e^{-\lambda t}}{k!}, \qquad k = 0, 1, 2, \ldots

where λ is the average number of events per second and k is the number of events occurring in the
interval (0, t].
(a) Calculate the probabilities P\{x(t) = 1 \mid x(0) = 1\} and P\{x(t) = -1 \mid x(0) = 1\}.
It is useful to recall the Taylor series expansion e^z = \sum_k z^k / k!. Since x(t) flips sign at every
event, we know that for every even number of events k the value of x(t) is 1 given x(0) = 1. So
P\{x(t) = 1 \mid x(0) = 1\} is the probability of an even number of events in (0, t]. A similar argument
can be made for P\{x(t) = -1 \mid x(0) = 1\}.
(b) Calculate the probabilities P\{x(t) = 1 \mid x(0) = -1\} and P\{x(t) = -1 \mid x(0) = -1\}.
We can use the results from (a) to compute these probabilities.
(c) Calculate the probabilities P\{x(t) = 1\} and P\{x(t) = -1\}. Using conditional
probabilities we can establish that

P\{x(t) = 1\} = P\{x(t) = 1 \mid x(0) = 1\} P\{x(0) = 1\} + P\{x(t) = 1 \mid x(0) = -1\} P\{x(0) = -1\}

and similarly for P\{x(t) = -1\}.
(d) We can easily show that x(t) is zero-mean. The auto-correlation can be
computed from the probabilities above. Since the auto-correlation is independent of the absolute time and the
mean is zero, x(t) is also stationary.
(e) Now let's compute the least-mean-squares estimator of x(0) given two later observations of the
process, and thereby the m.m.s.e. Using the results of (d) we can evaluate the required covariances.
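For part (a), a Monte Carlo sketch (arbitrary λ and t, chosen only for illustration) checks that the probability of keeping the same sign equals the probability of an even number of Poisson events, which works out to (1 + e^{-2λt})/2:

import numpy as np

rng = np.random.default_rng(7)
lam, t, n = 2.0, 0.4, 200_000
counts = rng.poisson(lam * t, size=n)          # number of events in (0, t]
p_even = np.mean(counts % 2 == 0)              # probability of no net sign change
print(p_even, (1 + np.exp(-2 * lam * t)) / 2)  # the two should agree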