bme spatiotemporal epidemiology modeling of disease rate

Adjusting for sampling variability in disease mapping
using a Poisson soft data model
1. Simple co-kriging
Suppose that the random variable $x_k$ and the random vector $z_h = [z_{h1}, \ldots, z_{hn}]^T$ are correlated with
known covariance and cross-covariance. We seek the best linear unbiased estimate $\hat{x}_k$ of $x_k$ given
measured values of $z_h$.
In simple co-kriging, we assume that the expected values of $x_k$ and $z_h$ are known, i.e. $E[x_k] = m_k$ and
$E[z_h] = m_h$, where $m_k$ and $m_h$ are known values.
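We then apply the three properties of kriging.
Simple co-kriging is a linear estimator
$\hat{x}_k = \lambda_0 + \lambda^T z_h$, where $\lambda_0$ and $\lambda^T = [\lambda_1, \ldots, \lambda_n]$ are kriging weights
SK is unbiased
$E[\hat{x}_k] = E[x_k] \;\Rightarrow\; \lambda_0 = m_k - \lambda^T m_h \;\Rightarrow\; \hat{x}_k = m_k + \lambda^T (z_h - m_h)$
Proof: Same as simple kriging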
SK minimizes the estimation variance
Let $e_k = \hat{x}_k - x_k$ be the estimation error. Since the estimate $\hat{x}_k$ is unbiased, it follows that $E[e_k] = 0$, and
that the estimation error variance is $\hat{v}_k = \mathrm{var}(e_k) = E[(\hat{x}_k - x_k)^2]$. Substituting the expression for $\hat{x}_k$, we
get
$$e_k = \hat{x}_k - x_k = \begin{bmatrix} -1 & \lambda^T \end{bmatrix} \begin{bmatrix} x_k - m_k \\ z_h - m_h \end{bmatrix},$$
and
$$\hat{v}_k = \mathrm{var}(e_k) = C_{kk} + \lambda^T C_{z_h,z_h} \lambda - 2 \lambda^T C_{z_h,k},$$
where $C_{kk} = \mathrm{cov}(x_k, x_k) = \mathrm{var}(x_k)$, $C_{z_h,z_h} = \mathrm{cov}(z_h, z_h)$ and $C_{z_h,k} = \mathrm{cov}(z_h, x_k)$ are known covariances.
The kriging weights that minimize $\hat{v}_k$ are given by
$$\lambda = C_{z_h,z_h}^{-1} C_{z_h,k}$$
Proof: Same as simple kriging
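Finally, substituting the above expression for $\lambda$ in the equations for $\hat{x}_k$ and $\hat{v}_k$, we obtain
the Simple co-Kriging estimator:
$$\hat{x}_k = m_k + C_{k,z_h} C_{z_h,z_h}^{-1} (z_h - m_h)$$
$$\hat{v}_k = C_{kk} - C_{k,z_h} C_{z_h,z_h}^{-1} C_{z_h,k}$$
where $C_{k,z_h} = C_{z_h,k}^T$.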
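As a minimal sketch of the two formulas above, the following NumPy function evaluates the simple co-kriging estimate and its variance. The function name `simple_cokriging` and its argument names are illustrative assumptions, not part of these notes; the covariances are supplied by the caller, since the notes take them as known.

```python
import numpy as np


def simple_cokriging(m_k, m_h, C_kk, C_zz, C_zk, z_h):
    """Simple co-kriging estimate and variance (illustrative sketch).

    m_k  : known mean of x_k (scalar)
    m_h  : known means of z_h, shape (n,)
    C_kk : var(x_k) (scalar)
    C_zz : cov(z_h, z_h), shape (n, n)
    C_zk : cov(z_h, x_k), shape (n,)
    z_h  : measured values, shape (n,)
    """
    C_zz = np.asarray(C_zz, dtype=float)
    C_zk = np.asarray(C_zk, dtype=float)
    resid = np.asarray(z_h, dtype=float) - np.asarray(m_h, dtype=float)
    # Kriging weights: lambda = C_zz^{-1} C_zk (solve rather than invert).
    lam = np.linalg.solve(C_zz, C_zk)
    x_hat = m_k + lam @ resid      # m_k + lambda^T (z_h - m_h)
    v_hat = C_kk - lam @ C_zk      # C_kk - C_k,zh C_zz^{-1} C_zh,k
    return float(x_hat), float(v_hat), lam


# One datum with var(z) = 2 and cov(z, x_k) = 1, so the single weight is 1/2.
x_hat, v_hat, lam = simple_cokriging(1.0, [1.0], 2.0, [[2.0]], [1.0], [3.0])
```

Solving the linear system instead of forming the explicit inverse is a numerical design choice of this sketch; it yields the same weights as the formula.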
2. Simple kriging with measurement error
The Simple Kriging with Measurement Error (SKME) estimator is simply the Simple co-Kriging
estimator with the additional assumption that
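$z_h = x_h + \varepsilon_h$
where $\varepsilon_h = [\varepsilon_{h1}, \ldots, \varepsilon_{hn}]^T$ are mutually independent measurement errors that are also independent of
$x_h$ and $x_k$, i.e. $\varepsilon_h$ is such that
$\mathrm{var}(\varepsilon_h) = \sigma_h^2 = [\sigma_1^2, \ldots, \sigma_n^2]^T$, $\mathrm{cov}(\varepsilon_h, \varepsilon_h) = I\sigma_h^2$, $\mathrm{cov}(x_h, \varepsilon_h) = 0$ and $\mathrm{cov}(x_k, \varepsilon_h) = 0$
where $I$ is the identity matrix, so that $I\sigma_h^2$ denotes the diagonal matrix with diagonal entries $\sigma_1^2, \ldots, \sigma_n^2$. It follows from the above that
$C_{z_h,z_h} = \mathrm{cov}(z_h, z_h) = \mathrm{cov}(x_h, x_h) + I\sigma_h^2 = C_{hh} + I\sigma_h^2$ and
$C_{z_h,k} = \mathrm{cov}(z_h, x_k) = \mathrm{cov}(x_h, x_k) + 0 = C_{hk}$
Hence the SKME estimator is:
$$\hat{x}_k = m_k + C_{kh} (C_{hh} + I\sigma_h^2)^{-1} (z_h - m_h)$$
$$\hat{v}_k = C_{kk} - C_{kh} (C_{hh} + I\sigma_h^2)^{-1} C_{hk}$$
where $C_{kk} = \mathrm{var}(x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$, $C_{kh} = C_{hk}^T$ and $\sigma_h^2 = \mathrm{var}(\varepsilon_h)$.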
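A worked special case may help show what the measurement error term does. Assume, purely for illustration, a single datum $z_1$ measured at the estimation point itself, so that $x_h = x_k$, $C_{hh} = C_{hk} = C_{kk}$ and $m_h = m_k$. The SKME equations then reduce to:

$$\hat{x}_k = m_k + \frac{C_{kk}}{C_{kk} + \sigma_1^2}\,(z_1 - m_k), \qquad \hat{v}_k = C_{kk} - \frac{C_{kk}^2}{C_{kk} + \sigma_1^2} = \frac{C_{kk}\,\sigma_1^2}{C_{kk} + \sigma_1^2}$$

With $\sigma_1^2 = 0$ the estimate reproduces the datum exactly and $\hat{v}_k = 0$; as $\sigma_1^2$ grows, the estimate shrinks towards the mean $m_k$ and $\hat{v}_k$ approaches $C_{kk}$.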
3. Ordinary co-kriging
Similarly, in ordinary co-kriging we suppose that the random variable $x_k$ and the random vector $z_h = [z_{h1}, \ldots, z_{hn}]^T$
are correlated with known covariance, and we seek the best linear unbiased estimate $\hat{x}_k$
of $x_k$ given measured values of $z_h$, but this time we assume that the expected values of $x_k$ and $z_h$ are
unknown and constant, i.e. the ordinary co-kriging model assumption is $E[x_k] = m$ and $E[z_h] = \mathbf{1} m$, where $\mathbf{1}$ is a vector of ones.
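Ordinary co-kriging is obtained from ordinary kriging (OK) simply by changing $x_h$ to $z_h$; hence ordinary co-kriging is
the linear estimator
$$\hat{x}_k = \lambda^T z_h,$$
where the kriging weights $\lambda$ and the Lagrange multiplier $\mu$ are given by
$$\begin{bmatrix} C_{z_h,z_h} & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{z_h,k} \\ 1 \end{bmatrix}$$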
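Following the same procedure as for ordinary kriging, we obtain that
the Ordinary co-Kriging estimator is:
$$\hat{x}_k = \hat{m} + C_{k,z_h} C_{z_h,z_h}^{-1} (z_h - \mathbf{1}\hat{m})$$
$$\hat{v}_k = C_{kk} - \lambda^T C_{z_h,k} - \mu$$
where
$$\hat{m} = \frac{\mathbf{1}^T C_{z_h,z_h}^{-1} z_h}{\mathbf{1}^T C_{z_h,z_h}^{-1} \mathbf{1}}$$
$$\lambda = C_{z_h,z_h}^{-1} \left[ C_{z_h,k} + \mathbf{1}\, \frac{1 - \mathbf{1}^T C_{z_h,z_h}^{-1} C_{z_h,k}}{\mathbf{1}^T C_{z_h,z_h}^{-1} \mathbf{1}} \right]$$
$$\mu = \frac{\mathbf{1}^T C_{z_h,z_h}^{-1} C_{z_h,k} - 1}{\mathbf{1}^T C_{z_h,z_h}^{-1} \mathbf{1}}$$
and where $C_{kk} = \mathrm{cov}(x_k, x_k) = \mathrm{var}(x_k)$, $C_{z_h,z_h} = \mathrm{cov}(z_h, z_h)$ and $C_{z_h,k} = \mathrm{cov}(z_h, x_k)$.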
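The ordinary co-kriging system can be solved numerically by assembling the augmented matrix directly. The sketch below is illustrative only: the function name `ordinary_cokriging`, its arguments and the numerical values are assumptions made for this example. It also cross-checks the result against the closed-form expression written in terms of the estimated mean.

```python
import numpy as np


def ordinary_cokriging(C_kk, C_zz, C_zk, z_h):
    """Ordinary co-kriging estimate and variance (illustrative sketch)."""
    C_zz = np.asarray(C_zz, dtype=float)
    C_zk = np.asarray(C_zk, dtype=float)
    z_h = np.asarray(z_h, dtype=float)
    n = z_h.size
    # Augmented system [[C_zz, 1], [1^T, 0]] [lambda; mu] = [C_zk; 1].
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = C_zz
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    b = np.append(C_zk, 1.0)
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    x_hat = lam @ z_h                  # lambda^T z_h
    v_hat = C_kk - lam @ C_zk - mu     # C_kk - lambda^T C_zh,k - mu
    return float(x_hat), float(v_hat), lam, float(mu)


# Two data (made-up values); the first is more strongly correlated with x_k.
C_zz = np.array([[2.0, 1.0], [1.0, 2.0]])
C_zk = np.array([1.5, 0.5])
z_h = np.array([4.0, 6.0])
x_hat, v_hat, lam, mu = ordinary_cokriging(2.0, C_zz, C_zk, z_h)

# Cross-check with the closed form m_hat + C_k,zh C_zz^{-1} (z_h - 1 m_hat).
ones = np.ones(2)
Ci = np.linalg.inv(C_zz)
m_hat = (ones @ Ci @ z_h) / (ones @ Ci @ ones)
x_closed = m_hat + C_zk @ Ci @ (z_h - m_hat)
```

The weights returned by the augmented system sum to one by construction, which is what makes the estimator unbiased when the mean is unknown.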
4. Ordinary kriging with measurement error
The Ordinary Kriging with Measurement Error (OKME) estimator is simply the Ordinary co-Kriging estimator with the additional assumption that
$z_h = x_h + \varepsilon_h$
where $\varepsilon_h$ are measurement errors with the same properties as assumed above in SKME. It follows
that OKME is the linear estimator
$$\hat{x}_k = \lambda^T z_h,$$
where the kriging weights are given by
$$\begin{bmatrix} C_{hh} + I\sigma_h^2 & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{hk} \\ 1 \end{bmatrix}$$
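Following the same procedure as for ordinary kriging, we obtain that
the OKME estimator is:
$$\hat{x}_k = \hat{m} + C_{kh} (C_{hh} + I\sigma_h^2)^{-1} (z_h - \mathbf{1}\hat{m})$$
$$\hat{v}_k = C_{kk} - \lambda^T C_{hk} - \mu$$
where
$$\hat{m} = \frac{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} z_h}{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} \mathbf{1}}$$
$$\lambda = (C_{hh} + I\sigma_h^2)^{-1} \left[ C_{hk} + \mathbf{1}\, \frac{1 - \mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} C_{hk}}{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} \mathbf{1}} \right]$$
$$\mu = \frac{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} C_{hk} - 1}{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} \mathbf{1}}$$
and where $C_{kk} = \mathrm{var}(x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$ and $\sigma_h^2 = \mathrm{var}(\varepsilon_h)$.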
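As a quick sanity check of these formulas, consider the single-datum case $n = 1$ (an illustrative special case, not part of the original derivation). The constraint $\mathbf{1}^T \lambda = 1$ forces $\lambda_1 = 1$, and the kriging system gives:

$$\hat{x}_k = z_1, \qquad \mu = C_{hk} - C_{hh} - \sigma_1^2, \qquad \hat{v}_k = C_{kk} - C_{hk} - \mu = C_{kk} + C_{hh} + \sigma_1^2 - 2 C_{hk} = \mathrm{var}(z_1 - x_k)$$

The measurement error variance $\sigma_1^2$ adds directly to the estimation variance, as expected when the only available datum is noisy.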
5. Poisson kriging
Let $d_h = [d_1, \ldots, d_n]^T$ be a vector of the numbers of positive cases of a disease (or some other outcome
of a natural process) observed at the hard data points $p_h = [p_1, \ldots, p_n]^T$, let $n_h = [n_1, \ldots, n_n]^T$ be the
corresponding numbers of individuals at risk for these positive cases, and let $z_h = [d_1/n_1, \ldots, d_n/n_n]^T$
be the corresponding observed rates.
For example, $n_i$ might be the number of blood tests performed at a clinic located at point $p_i$, $d_i$ might
be the number of these tests found to be positive for HIV, and $z_i = d_i/n_i$ would then be the
corresponding observed rate of positive HIV tests.
The model assumption of Poisson kriging is that there exists a vector $x_h = [x_{h1}, \ldots, x_{hn}]^T$ (with components written $x_i$ for short) such that
$d_i \mid x_i \sim \mathrm{Poisson}(n_i x_i)$ and $\mathrm{cov}(d_i, d_j \mid x_h) = 0$ for $i \neq j$,
which means that we assume that $x_h$ exists such that $d_i$ is conditionally Poisson distributed with a
mean and a variance both equal to $n_i x_i$, and that $d_i$ and $d_j$ are conditionally uncorrelated. Following this
assumption, we define $x_i$ as the unobserved "Poisson mean" of the observed rate $z_i = d_i/n_i$; hence
we will refer to $x_i$ as the "Poisson mean rate".
Poisson kriging then simply corresponds to ordinary co-kriging in which we seek the estimate
of the Poisson mean rate $x_k$ at the estimation point $p_k$ given the observed rates $z_h$.
As in ordinary co-kriging, we assume that the expected values of $x_k$ and $x_h$ are unknown and
constant, i.e. we assume that $E[x_k] = m$ and $E[x_h] = \mathbf{1} m$.
Next, we derive formulas for $\mathrm{cov}(z_i, z_j)$ and $\mathrm{cov}(z_i, x_k)$ given the Poisson model assumption, and we
use these formulas in the equations for ordinary co-kriging.
The expected value $E[z_i]$ is
$E[z_i] = m$
Proof:
First we calculate the conditional mean
$E[z_i \mid x_h] = E[d_i/n_i \mid x_i] = E[d_i \mid x_i]/n_i = n_i x_i/n_i = x_i$
Then we remove the conditioning
$E[z_i] = E[\,E[z_i \mid x_h]\,] = E[x_i] = m$
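The covariance $\mathrm{cov}(z_i, z_j)$ is
$\mathrm{cov}(z_i, z_j) = \delta_{ij}\, m/n_i + \mathrm{cov}(x_i, x_j)$, where $\delta_{ij}$ is the Kronecker delta
Proof:
$$\begin{aligned}
E[z_i z_j \mid x_h] &= \mathrm{cov}(z_i, z_j \mid x_h) + E[z_i \mid x_h]\, E[z_j \mid x_h] = \frac{\mathrm{cov}(d_i, d_j \mid x_h)}{n_i n_j} + E[d_i/n_i \mid x_i]\, E[d_j/n_j \mid x_j] \\
&= \delta_{ij} \frac{\mathrm{var}(d_i \mid x_i)}{n_i^2} + \frac{E[d_i \mid x_i]\, E[d_j \mid x_j]}{n_i n_j} = \delta_{ij} \frac{n_i x_i}{n_i^2} + \frac{n_i x_i\, n_j x_j}{n_i n_j} = \delta_{ij} \frac{x_i}{n_i} + x_i x_j \\
E[z_i z_j] &= E[\,E[z_i z_j \mid x_h]\,] = E[\delta_{ij}\, x_i/n_i + x_i x_j] = \delta_{ij}\, m/n_i + \mathrm{cov}(x_i, x_j) + m^2 \\
\mathrm{cov}(z_i, z_j) &= E[z_i z_j] - m^2 = \delta_{ij}\, m/n_i + \mathrm{cov}(x_i, x_j)
\end{aligned}$$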
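The covariance $\mathrm{cov}(z_i, x_k)$ is
$\mathrm{cov}(z_i, x_k) = \mathrm{cov}(x_i, x_k)$
Proof:
Conditioning on both $x_h$ and $x_k$, and assuming that $d_i$ depends on $x_k$ only through $x_i$,
$E[z_i x_k \mid x_h, x_k] = E[(d_i/n_i)\, x_k \mid x_h, x_k] = (x_k/n_i)\, E[d_i \mid x_i] = x_k\, n_i x_i/n_i = x_k x_i$
$E[z_i x_k] = E[\,E[z_i x_k \mid x_h, x_k]\,] = E[x_i x_k] = \mathrm{cov}(x_i, x_k) + m^2$
$\mathrm{cov}(z_i, x_k) = E[z_i x_k] - m^2 = \mathrm{cov}(x_i, x_k)$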
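The moment formulas derived above can be checked by simulation. The sketch below is only an illustration under stated assumptions: the latent rate $x_i$ is drawn from a gamma distribution (an arbitrary choice, since the notes do not specify a distribution for $x_i$), the values of $n_i$ and the gamma parameters are made up, and the estimation point is taken to coincide with the data point so that $\mathrm{cov}(x_i, x_k) = \mathrm{var}(x_i)$.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 200_000                  # number of simulated replicates
n_i = 50                     # individuals at risk (assumed value)
shape, scale = 4.0, 0.025    # gamma parameters (assumed): mean 0.1, variance 0.0025

# Latent Poisson mean rate x_i, then counts d_i | x_i ~ Poisson(n_i x_i).
x = rng.gamma(shape, scale, size=N)
d = rng.poisson(n_i * x)
z = d / n_i                  # observed rates

m = shape * scale            # E[x_i]
var_x = shape * scale**2     # var(x_i)

# Theory: E[z_i] = m, var(z_i) = m/n_i + var(x_i), cov(z_i, x_i) = var(x_i).
var_z_theory = m / n_i + var_x
mean_z_sim = z.mean()
var_z_sim = z.var()
cov_zx_sim = np.cov(z, x)[0, 1]
```

Comparing `var_z_sim` with `var_z_theory` shows the extra term $m/n_i$ that sampling variability adds to the variance of the observed rate, while `cov_zx_sim` stays close to `var_x`.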
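Finally, we incorporate the above formulas in the equations for ordinary co-kriging, and we get that
Poisson kriging is the linear estimator
$$\hat{x}_k = \lambda^T z_h = \lambda^T [d_1/n_1, \ldots, d_n/n_n]^T$$
where the kriging weights are given by
$$\begin{bmatrix} C_{z_h,z_h} & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{z_h,k} \\ 1 \end{bmatrix}
\;\Leftrightarrow\;
\begin{cases} \sum_{j=1}^{n} \left( C_{ij} + \delta_{ij} \dfrac{m}{n_i} \right) \lambda_j + \mu = C_{ik} & \text{for } i = 1, \ldots, n \\ \sum_{j=1}^{n} \lambda_j = 1 \end{cases}
\;\Leftrightarrow\;
\begin{bmatrix} C_{hh} + I\sigma_h^2 & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{hk} \\ 1 \end{bmatrix}$$
where $C_{ij} = \mathrm{cov}(x_i, x_j)$, $C_{ik} = \mathrm{cov}(x_i, x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$ and $\sigma_h^2 = [m/n_1, \ldots, m/n_n]^T$
We note that $\sigma_h^2$ corresponds to the variance of the errors $\varepsilon_h$ defined by
$z_h = x_h + \varepsilon_h$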
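Hence, using the above errors, Poisson kriging can further be reduced to OKME, so that
the Poisson Kriging estimator is:
$$\hat{x}_k = \hat{m} + C_{kh} (C_{hh} + I\sigma_h^2)^{-1} (z_h - \mathbf{1}\hat{m})$$
$$\hat{v}_k = C_{kk} - \lambda^T C_{hk} - \mu$$
where
$$\hat{m} = \frac{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} z_h}{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} \mathbf{1}}$$
$$\lambda = (C_{hh} + I\sigma_h^2)^{-1} \left[ C_{hk} + \mathbf{1}\, \frac{1 - \mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} C_{hk}}{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} \mathbf{1}} \right]$$
$$\mu = \frac{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} C_{hk} - 1}{\mathbf{1}^T (C_{hh} + I\sigma_h^2)^{-1} \mathbf{1}}$$
and where $C_{kk} = \mathrm{var}(x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$ and $\sigma_h^2 = [m/n_1, \ldots, m/n_n]^T$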
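Putting the pieces together, Poisson kriging amounts to solving the OKME system with error variances $m/n_i$. The sketch below is a hedged illustration: the function name `poisson_kriging`, its arguments and the numbers are assumptions for this example, and when `m` is not supplied it is replaced by the population-weighted mean of the observed rates, $\sum_i d_i / \sum_i n_i$, which is a plug-in choice rather than something the derivation above prescribes.

```python
import numpy as np


def poisson_kriging(d, n, C_kk, C_hh, C_hk, m=None):
    """Poisson kriging of the Poisson mean rate x_k (illustrative sketch)."""
    d = np.asarray(d, dtype=float)
    n = np.asarray(n, dtype=float)
    C_hh = np.asarray(C_hh, dtype=float)
    C_hk = np.asarray(C_hk, dtype=float)
    z = d / n                          # observed rates z_i = d_i / n_i
    if m is None:
        # Assumption: plug in the population-weighted mean of observed rates.
        m = d.sum() / n.sum()
    sigma2 = m / n                     # error variances m / n_i
    k = z.size
    # OKME system [[C_hh + diag(sigma2), 1], [1^T, 0]] [lambda; mu] = [C_hk; 1].
    A = np.zeros((k + 1, k + 1))
    A[:k, :k] = C_hh + np.diag(sigma2)
    A[:k, k] = 1.0
    A[k, :k] = 1.0
    sol = np.linalg.solve(A, np.append(C_hk, 1.0))
    lam, mu = sol[:k], sol[k]
    x_hat = lam @ z
    v_hat = C_kk - lam @ C_hk - mu
    return float(x_hat), float(v_hat), lam


# Two clinics placed symmetrically about the estimation point (made-up numbers):
# observed rates 0.3 (from 10 tests) and 0.1 (from 1000 tests).
C_hh = [[0.010, 0.005], [0.005, 0.010]]
C_hk = [0.005, 0.005]
x_hat, v_hat, lam = poisson_kriging([3, 100], [10, 1000], 0.010, C_hh, C_hk)
```

Although both clinics have the same covariance with the estimation point, the clinic with only 10 tests receives the smaller weight, so the estimate is pulled towards the rate observed from the larger population.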
6. BME with a Poisson soft data model
We note that OKME is obtained as a special case of BME when using Gaussian soft data. Hence
Poisson kriging is obtained as a special case of BME using the following pdf for the soft data
describing the rate $x_i$ at point $p_i$
$$f_S(x_i) = \frac{1}{\sqrt{2\pi\, m/n_i}} \exp\left( -\frac{(x_i - d_i/n_i)^2}{2\, m/n_i} \right)$$
where $d_i$ is the observed number of positive cases, $n_i$ is the number of individuals at risk, $d_i/n_i$ is the
observed rate, and $m$ is the population-weighted mean of the observed rates. We see that a higher
variance $m/n_i$, and therefore less weight, is given to uncertain observed rates (i.e. rates observed from a
low number of individuals at risk).
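The effect of $n_i$ on the soft data can be seen by evaluating the Gaussian pdf with mean $d_i/n_i$ and variance $m/n_i$ directly. The function name `poisson_soft_pdf` and the numerical values below are assumptions chosen for illustration.

```python
import math


def poisson_soft_pdf(x, d_i, n_i, m):
    """Gaussian soft-data pdf with mean d_i/n_i and variance m/n_i."""
    var = m / n_i
    return math.exp(-(x - d_i / n_i) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


m = 0.1  # assumed mean rate, for illustration only

# Same observed rate 0.1, but from 10 versus 1000 individuals at risk.
peak_small = poisson_soft_pdf(0.1, 1, 10, m)
peak_large = poisson_soft_pdf(0.1, 100, 1000, m)
```

The pdf for $n_i = 1000$ is much more sharply peaked at the observed rate than the pdf for $n_i = 10$, which is how the soft data model expresses greater confidence in rates observed from larger populations.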