Space/Time mapping – Part 3
The Simple Kriging limiting case
Space/time mapping using second order moments and hard data:
We will consider a special case of the general BME framework in which the general knowledge consists only of the mean trend and covariance function, and the site-specific data consist only of hard data. This case corresponds to the Simple Kriging method of classical Geostatistics.
1. The general knowledge base and prior pdf
The general knowledge base G of the S/TRF X(p) consists of:
- its mean trend mX(p) = E[X(p)]
- its covariance cX(p, p') = E[(X(p) - mX(p)) (X(p') - mX(p'))]
This general knowledge base is expressed using the stochastic moment constraints

h_α(p_map) = ∫ dx_map g_α(x_map) f_G(x_map; p_map),   α = 0, 1, …, N_c

where the chosen g_α(X_map) and corresponding h_α are as follows:

α = 0:                                  g_0(X_map) = 1,           h_0 = 1
α = i,  i = 1, …, n:                    g_i(X_map) = X_i,         h_i = mX(p_i)
α = (i,j),  i = 1, …, n; j = 1, …, n:   g_ij(X_map) = X_i X_j,    h_ij = cX(p_i,p_j) + mX(p_i) mX(p_j)
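For reference, the exponential form of the prior pdf quoted next follows from the standard maximum entropy argument, which these notes take as given. A minimal sketch of that argument (standard result, not derived in the original notes): f_G is chosen to maximize the Shannon entropy subject to the moment constraints above, introducing one Lagrange coefficient μ_α per constraint,

\max_{f_G} \; -\int d x_{map}\, f_G(x_{map}) \ln f_G(x_{map})
\quad \text{subject to} \quad
h_\alpha = \int d x_{map}\, g_\alpha(x_{map})\, f_G(x_{map}), \qquad \alpha = 0, 1, \ldots, N_c .

Setting the functional derivative of the Lagrangian to zero gives f_G(x_{map}) \propto \exp\{\sum_\alpha \mu_\alpha\, g_\alpha(x_{map})\}, i.e. an exponential in the constraint functions. With the g_α listed above this is exactly the quadratic exponential written next, with the μ_α relabeled λ_0, λ_i, λ_ij.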
Hence, the mathematical form of the maximum entropy pdf is given by

f_G(x_map) = exp{ λ_0 + Σ_{i=1}^n λ_i x_i + Σ_{i=1}^n Σ_{j=1}^n λ_ij x_i x_j }

where the 1+n+n² Lagrange coefficients λ_0, λ_i (i = 1, …, n) and λ_ij (i = 1, …, n; j = 1, …, n) are obtained by solving the following set of 1+n+n² equations:

α = 0:
1 = ∫ dx_map exp{ λ_0 + Σ_{i=1}^n λ_i x_i + Σ_{i=1}^n Σ_{j=1}^n λ_ij x_i x_j }

α = i,  i = 1, …, n:
mX(p_i) = ∫ dx_map x_i exp{ λ_0 + Σ_{i=1}^n λ_i x_i + Σ_{i=1}^n Σ_{j=1}^n λ_ij x_i x_j }

α = (i,j),  i = 1, …, n; j = 1, …, n:
cX(p_i,p_j) + mX(p_i) mX(p_j) = ∫ dx_map x_i x_j exp{ λ_0 + Σ_{i=1}^n λ_i x_i + Σ_{i=1}^n Σ_{j=1}^n λ_ij x_i x_j }
We can actually solve the above set of equations to obtain values for the 1+n+n² unknown parameters. Let us write the equation for the maximum entropy pdf in vector notation as follows:

f_G(x_map) = exp{ λ_0 + λ^T x + x^T Λ x }

where x = [x_1, …, x_n]^T, λ = [λ_1, …, λ_n]^T, and Λ is an n by n matrix with elements λ_ij. The problem is to find the values of λ and Λ. Let us define the n by 1 vector o = -Λ^-1 λ/2 and the n by n matrix D = -Λ^-1/2. Since o and D have the same sizes as λ and Λ, and since there is a one-to-one relationship between them, solving for o and D is the same as solving for λ and Λ. Rearranging these relationships gives Λ = -D^-1/2 and λ^T = o^T D^-1, which when substituted in the prior pdf equation leads to
f_G(x_map)
  = exp{ λ_0 + λ^T x + x^T Λ x }
  = exp{ λ_0 + (o^T D^-1) x + x^T (-D^-1/2) x }
  = exp(λ_0) exp{ 0.5 [ 2 o^T D^-1 x - x^T D^-1 x + (o^T D^-1 o - o^T D^-1 o) ] }
  = exp{ λ_0 + 0.5 o^T D^-1 o } exp{ 0.5 [ 2 o^T D^-1 x - x^T D^-1 x - o^T D^-1 o ] }
  = exp{ λ_0 + 0.5 o^T D^-1 o } exp{ 0.5 [ (o^T D^-1 x + x^T D^-1 o) - x^T D^-1 x - o^T D^-1 o ] }
  = exp{ λ_0 + 0.5 o^T D^-1 o } exp{ -0.5 [ x^T D^-1 x - o^T D^-1 x + o^T D^-1 o - x^T D^-1 o ] }
  = exp{ λ_0 + 0.5 o^T D^-1 o } exp{ -0.5 [ (x^T - o^T) D^-1 x + (o^T - x^T) D^-1 o ] }
  = exp{ λ_0 + 0.5 o^T D^-1 o } exp{ -0.5 [ (x^T - o^T) D^-1 x - (x^T - o^T) D^-1 o ] }
  = exp{ λ_0 + 0.5 o^T D^-1 o } exp{ -0.5 (x - o)^T D^-1 (x - o) }
Using this last expression for f_G(x_map) we can rewrite the constraint equations as

α = 0:
1 = ∫ dx_map exp{ λ_0 + 0.5 o^T D^-1 o } exp{ -0.5 (x - o)^T D^-1 (x - o) }

α = i,  i = 1, …, n:
mX(p_i) = ∫ dx_map x_i exp{ λ_0 + 0.5 o^T D^-1 o } exp{ -0.5 (x - o)^T D^-1 (x - o) }

α = (i,j),  i = 1, …, n; j = 1, …, n:
cX(p_i,p_j) + mX(p_i) mX(p_j) = ∫ dx_map x_i x_j exp{ λ_0 + 0.5 o^T D^-1 o } exp{ -0.5 (x - o)^T D^-1 (x - o) }
We now use the following property of the multivariate Gaussian pdf with mean o and covariance D:

∫ dx ( |D^-1|^(1/2) / (2π)^(n/2) ) exp{ -0.5 (x - o)^T D^-1 (x - o) } = 1

Using this property in the equation for α = 0 we get

exp{ λ_0 + 0.5 o^T D^-1 o } = |D^-1|^(1/2) / (2π)^(n/2)
Using that expression we can write the remaining equations as

α = i,  i = 1, …, n:
mX(p_i) = ∫ dx_map x_i ( |D^-1|^(1/2) / (2π)^(n/2) ) exp{ -0.5 (x - o)^T D^-1 (x - o) }

α = (i,j),  i = 1, …, n; j = 1, …, n:
cX(p_i,p_j) + mX(p_i) mX(p_j) = ∫ dx_map x_i x_j ( |D^-1|^(1/2) / (2π)^(n/2) ) exp{ -0.5 (x - o)^T D^-1 (x - o) }
Using properties of the multivariate Gaussian pdf we get

α = i,  i = 1, …, n:                    mX(p_i) = o_i
α = (i,j),  i = 1, …, n; j = 1, …, n:   cX(p_i,p_j) + mX(p_i) mX(p_j) = D_ij + o_i o_j

or equivalently

α = i,  i = 1, …, n:                    o_i = mX(p_i)
α = (i,j),  i = 1, …, n; j = 1, …, n:   D_ij = cX(p_i,p_j) + mX(p_i) mX(p_j) - o_i o_j = cX(p_i,p_j) + o_i o_j - o_i o_j = cX(p_i,p_j)

which can be written in vector form as o = m and D = C, where m_i = mX(p_i), i = 1, …, n, and C_ij = cX(p_i,p_j), i = 1, …, n; j = 1, …, n. From this it follows that Λ = -C^-1/2 and λ^T = m^T C^-1, which is the solution of the set of equations and provides numerical values for the Lagrange coefficients.
Hence, by way of summary, the maximum entropy pdf given a general knowledge base G consisting of the values of the mean trend mX(p) and covariance cX(p, p') at the points p_i, i = 1, …, n, is given by

f_G(x_map) = exp{ λ_0 + λ^T x + x^T Λ x }

where Λ = -C^-1/2, λ^T = m^T C^-1, and λ_0 = ln( |C^-1|^(1/2) / (2π)^(n/2) ) - 0.5 m^T C^-1 m, which can equivalently be written as the multivariate Gaussian pdf

f_G(x_map) = ( |C^-1|^(1/2) / (2π)^(n/2) ) exp{ -0.5 (x_map - m)^T C^-1 (x_map - m) }

where the elements of m and C are m_i = mX(p_i), i = 1, …, n, and C_ij = cX(p_i,p_j), i = 1, …, n; j = 1, …, n.
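As a numerical illustration of this result, the following minimal sketch (not part of the original notes; it assumes NumPy and SciPy, and the helper names prior_pdf and prior_pdf_from_lagrange are illustrative only) checks that evaluating exp{λ_0 + λ^T x + x^T Λ x} with the Lagrange coefficients above reproduces the multivariate Gaussian pdf:

import numpy as np
from scipy.stats import multivariate_normal

def prior_pdf(x, m, C):
    # Max-entropy prior written as a multivariate Gaussian with mean m and covariance C
    return multivariate_normal(mean=m, cov=C).pdf(x)

def prior_pdf_from_lagrange(x, m, C):
    # Same pdf, written directly in terms of the Lagrange coefficients
    Cinv = np.linalg.inv(C)
    Lam = -0.5 * Cinv                     # Lambda   = -C^-1 / 2
    lam = Cinv @ m                        # lambda^T = m^T C^-1  (i.e. lambda = C^-1 m)
    n = len(m)
    lam0 = (0.5 * np.log(np.linalg.det(Cinv)) - 0.5 * n * np.log(2 * np.pi)
            - 0.5 * m @ Cinv @ m)         # lambda_0 = ln(|C^-1|^(1/2)/(2 pi)^(n/2)) - 0.5 m^T C^-1 m
    return np.exp(lam0 + lam @ x + x @ Lam @ x)

# Quick check on an arbitrary 2-point configuration: both expressions give the same value
m = np.array([10.0, 10.0])
C = np.array([[1.0, 0.5], [0.5, 1.0]])
x = np.array([11.0, 9.5])
print(prior_pdf(x, m, C), prior_pdf_from_lagrange(x, m, C))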
2. Hard data and the posterior pdf
Let all the data be hard, x_data = x_hard = [x_1, …, x_n-1]^T, so that x_map = [x_hard^T, x_k]^T where x_k = x_n. Then the posterior pdf is given by

f_K(x_k) = f_G(x_hard, x_k) / f_G(x_hard)
         = exp{ λ_0 + Σ_{i=1}^n λ_i x_i + Σ_{i=1}^n Σ_{j=1}^n λ_ij x_i x_j } / ∫ dx_k exp{ λ_0 + Σ_{i=1}^n λ_i x_i + Σ_{i=1}^n Σ_{j=1}^n λ_ij x_i x_j }
Since the prior pdf f_G(x_map) is Gaussian with mean m_map = [m_hard^T, m_k]^T and covariance

C_map = [ C_hard,hard   C_hard,k ]
        [ C_k,hard      C_k,k    ]

and since the posterior pdf f_K(x_k) = f_G(x_k | x_hard) is its conditional pdf given the hard data, it follows from properties of Gaussian distributions that the posterior pdf is also Gaussian.
Hence when the knowledge base consists of the mean trend mX(p), the covariance cX(p, p'), and the hard data x_hard = [x_1, …, x_n-1]^T of a S/TRF X(p), the posterior pdf f_K(x_k) is univariate Gaussian, and from properties of Gaussian distributions we find that its mean m_k|hard and variance C_k|hard are

m_k|hard = m_k + C_k,hard C_hard,hard^-1 (x_hard - m_hard)

C_k|hard = C_k,k - C_k,hard C_hard,hard^-1 C_hard,k

where m_k = mX(p_k) is a scalar, C_k,hard is a row vector with the n-1 elements cX(p_k,p_j), j = 1, …, n-1, C_hard,hard is an (n-1) by (n-1) matrix with elements cX(p_i,p_j), i = 1, …, n-1; j = 1, …, n-1, x_hard is a column vector with the n-1 hard data, m_hard is a column vector with the n-1 elements mX(p_i), i = 1, …, n-1, C_k,k = cX(p_k,p_k) is a scalar, and C_hard,k is the transpose of C_k,hard.
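These two formulas translate directly into code. The sketch below (not part of the original notes; the helper name sk_posterior is illustrative) assumes NumPy and takes the mean values and covariance blocks as inputs:

import numpy as np

def sk_posterior(m_k, m_hard, x_hard, C_kk, C_k_hard, C_hard_hard):
    # Posterior (conditional Gaussian) mean and variance at the estimation point p_k:
    #   m_k|hard = m_k + C_k,hard C_hard,hard^-1 (x_hard - m_hard)
    #   C_k|hard = C_k,k - C_k,hard C_hard,hard^-1 C_hard,k
    w = np.linalg.solve(C_hard_hard, C_k_hard)   # C_hard,hard^-1 C_hard,k (uses symmetry of C_hard,hard)
    mean = m_k + w @ (x_hard - m_hard)
    var = C_kk - w @ C_k_hard
    return mean, var

Solving the linear system with np.linalg.solve avoids forming the explicit inverse of C_hard,hard.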
We note that:
- A good choice for the estimator x̂_k of X(p_k) is the posterior mean m_k|hard, i.e. use x̂_k = m_k|hard.
- This estimator is a linear combination of the hard data, i.e. x̂_k = λ_0 + λ_1 x_1 + … + λ_n-1 x_n-1.
- The posterior variance C_k|hard provides an assessment of the estimation error.
- The posterior variance C_k|hard is always smaller than the prior variance C_k,k.
3. Example
Let X(p) be a homogeneous random field representing river water quality along the scalar coordinate p (one-dimensional space along the river). The mean of X(p) is m = 10 g/m3, its covariance is cX(r) = exp(-r), and the value of X(p) at 3 monitoring locations p_hard = [p_1, p_2, p_3]^T = [0, 2.5, 3.1]^T (km) was exactly measured to be x_hard = [x_1, x_2, x_3]^T = [20, 14, 18]^T (g/m3). Estimate the water quality at the estimation location p_k = 1.7 km.
Solution:
The vector x_map = [x_1, x_2, x_3, x_k]^T representing X(p) at the mapping points p_map = [p_1, p_2, p_3, p_k]^T = [0, 2.5, 3.1, 1.7]^T is multivariate Gaussian, with the following mean vector m_map = E[x_map] and covariance matrix C_map = Cov[x_map] = cX(p_map, p_map) with elements C_ij = exp(-|p_i - p_j|):

m_map = [10, 10, 10, 10]^T

C_map = [ 1       e^-|0-2.5|    e^-|0-3.1|    e^-|0-1.7|   ]   [ 1      0.082   0.045   0.183 ]
        [         1             e^-|2.5-3.1|  e^-|2.5-1.7| ] = [        1       0.549   0.449 ]
        [                       1             e^-|3.1-1.7| ]   [                1       0.247 ]
        [ (sym)                               1            ]   [ (sym)                  1     ]
The posterior pdf of x_k is univariate Gaussian with the following mean and variance:

E[x_k | x_hard] = m_k + C_k,hard C_hard,hard^-1 (x_hard - m_hard)

                = 10 + [0.183  0.449  0.247] [ 1      0.082   0.045 ]^-1 [ 20 - 10 ]
                                             [        1       0.549 ]    [ 14 - 10 ]
                                             [ (sym)          1     ]    [ 18 - 10 ]
                ≈ 13.2

Var[x_k | x_hard] = C_k,k - C_k,hard C_hard,hard^-1 C_hard,k

                  = 1 - [0.183  0.449  0.247] [ 1      0.082   0.045 ]^-1 [ 0.183 ]
                                              [        1       0.549 ]    [ 0.449 ]
                                              [ (sym)          1     ]    [ 0.247 ]
                  ≈ 0.777
4. Deriving Simple Kriging as a Best Linear Unbiased Estimator (BLUE)
Another way to derive the formulas for Simple Kriging is by defining it as a Best Linear Unbiased Estimator (BLUE). Let X(p) be a random field with known mean mX(p) and covariance cX(p, p'). Let X_hard = [X_1, …, X_n-1]^T represent X(p) at the points p_hard = (p_1, …, p_n-1), and let the hard data x_hard = [x_1, …, x_n-1]^T be the exactly known values of X_hard. We start by defining the estimator X̂_k of X_k = X(p_k) as a linear combination of X_hard,

X̂_k = λ_0 + λ^T X_hard,

where λ_0 and λ^T = [λ_1, …, λ_n-1] are parameters to be determined. The unbiasedness condition imposes that E[X̂_k] = E[X_k], which leads to λ_0 = m_k - λ^T m_hard, so that the estimator can be rewritten as

X̂_k = m_k + λ^T (X_hard - m_hard)
The mean square error associated with the estimation error e_k = X_k - X̂_k is σ_e² = E[(X_k - X̂_k)²]. Substituting X̂_k = m_k + λ^T (X_hard - m_hard) in σ_e² leads to

σ_e² = E[ (X_k - m_k - λ^T (X_hard - m_hard))² ]
     = E[ (X_k - m_k)² - 2 (X_k - m_k) λ^T (X_hard - m_hard) + (λ^T (X_hard - m_hard)) (λ^T (X_hard - m_hard)) ]
     = E[ (X_k - m_k)² - 2 (X_k - m_k) (X_hard - m_hard)^T λ + (λ^T (X_hard - m_hard)) ((X_hard - m_hard)^T λ) ]
     = E[ (X_k - m_k)² - 2 (X_k - m_k) (X_hard - m_hard)^T λ + λ^T ( (X_hard - m_hard) (X_hard - m_hard)^T ) λ ]
     = E[ (X_k - m_k)² ] - 2 E[ (X_k - m_k) (X_hard - m_hard)^T ] λ + λ^T E[ (X_hard - m_hard) (X_hard - m_hard)^T ] λ
     = C_k,k - 2 C_k,hard λ + λ^T C_hard,hard λ

where C_k,hard = [ cX(p_k, p_1)  …  cX(p_k, p_n-1) ] and

C_hard,hard = [ cX(p_1, p_1)    …  cX(p_1, p_n-1)   ]
              [ …               …  …                ]
              [ cX(p_n-1, p_1)  …  cX(p_n-1, p_n-1) ]
The parameters λ^T are obtained by minimizing the mean square error σ_e²:

∂σ_e²/∂λ_i = 0,   i = 1, …, n-1

⟺   ∂σ_e²/∂λ^T = 0

⟺   ∂( C_k,k - 2 C_k,hard λ + λ^T C_hard,hard λ )/∂λ^T = 0

⟺   -2 C_k,hard + 2 λ^T C_hard,hard = 0

⟺   λ^T = C_k,hard C_hard,hard^-1

Substituting λ^T = C_k,hard C_hard,hard^-1, and using x̂_k and x_hard in place of X̂_k and X_hard, respectively, leads to

x̂_k = m_k + C_k,hard C_hard,hard^-1 (x_hard - m_hard)

σ_e² = C_k,k - C_k,hard C_hard,hard^-1 C_hard,k
These equations correspond to the mean and variance of the BME posterior pdf obtained when using the mean and covariance as general knowledge and hard data as site-specific knowledge. Hence, when the knowledge base consists of mean, covariance, and hard data, BME yields Simple Kriging as a special case.
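To make the BLUE view concrete, the following sketch (again illustrative, not from the original notes; NumPy assumed) computes the kriging weights λ^T = C_k,hard C_hard,hard^-1 and the intercept λ_0 = m_k - λ^T m_hard for the example of Section 3, and recovers the same estimate and error variance as the posterior pdf:

import numpy as np

p_hard = np.array([0.0, 2.5, 3.1])
x_hard = np.array([20.0, 14.0, 18.0])
p_k, m = 1.7, 10.0
m_hard = np.full_like(x_hard, m)

C_hard_hard = np.exp(-np.abs(p_hard[:, None] - p_hard[None, :]))
C_k_hard = np.exp(-np.abs(p_k - p_hard))

lam = np.linalg.solve(C_hard_hard, C_k_hard)    # lambda^T = C_k,hard C_hard,hard^-1
lam0 = m - lam @ m_hard                         # lambda_0 = m_k - lambda^T m_hard (unbiasedness)

x_k_hat = lam0 + lam @ x_hard                   # BLUE estimate, same as the posterior mean (≈ 13.2)
mse = 1.0 - lam @ C_k_hard                      # minimized MSE, same as the posterior variance (≈ 0.777)
print(lam, lam0, x_k_hat, mse)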