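Adjusting for sampling variability in disease mapping using a Poisson soft data model

1. Simple co-kriging

Suppose that the random variable $x_k$ and the random vector $z_h = [z_{h1}, \dots, z_{hn}]^T$ are correlated, with known covariance and cross-covariance. We seek the best linear unbiased estimate $\hat{x}_k$ of $x_k$ given measured values of $z_h$. In simple co-kriging, we assume that the expected values of $x_k$ and $z_h$ are known, i.e. $E[x_k] = m_k$ and $E[z_h] = m_h$, where $m_k$ and $m_h$ are known values. We then apply the three properties of kriging.

Simple co-kriging is a linear estimator:

$$\hat{x}_k = \lambda_0 + \lambda^T z_h ,$$

where $\lambda_0$ and $\lambda^T = [\lambda_1, \dots, \lambda_n]$ are kriging weights.

Simple co-kriging is unbiased:

$$E[\hat{x}_k] = E[x_k] \;\Rightarrow\; \hat{x}_k = m_k + \lambda^T (z_h - m_h)$$

Proof: Same as simple kriging. Since $E[\hat{x}_k] = \lambda_0 + \lambda^T m_h$, unbiasedness requires $\lambda_0 = m_k - \lambda^T m_h$.

Simple co-kriging minimizes the estimation variance. Let $e_k = \hat{x}_k - x_k$ be the estimation error. Since the estimate $\hat{x}_k$ is unbiased, it follows that $E[e_k] = 0$, and that the estimation error variance is $\hat{v}_k = \mathrm{var}(e_k) = E[(\hat{x}_k - x_k)^2]$. Substituting $\hat{x}_k$ with its expression, we get

$$e_k = \hat{x}_k - x_k = \begin{bmatrix} -1 & \lambda^T \end{bmatrix} \begin{bmatrix} x_k - m_k \\ z_h - m_h \end{bmatrix}$$

and

$$\hat{v}_k = \mathrm{var}(e_k) = C_{kk} + \lambda^T C_{z_h,z_h} \lambda - 2 \lambda^T C_{z_h,k} ,$$

where $C_{kk} = \mathrm{cov}(x_k, x_k) = \mathrm{var}(x_k)$, $C_{z_h,z_h} = \mathrm{cov}(z_h, z_h)$ and $C_{z_h,k} = \mathrm{cov}(z_h, x_k)$ are known covariances. The kriging weights that minimize $\hat{v}_k$ are given by

$$\lambda = C_{z_h,z_h}^{-1} C_{z_h,k}$$

Proof: Same as simple kriging.

Finally, substituting the above expression for $\lambda$ in the equations for $\hat{x}_k$ and $\hat{v}_k$, we obtain the simple co-kriging estimator:

$$\hat{x}_k = m_k + C_{k,z_h} C_{z_h,z_h}^{-1} (z_h - m_h)$$

$$\hat{v}_k = C_{kk} - C_{k,z_h} C_{z_h,z_h}^{-1} C_{z_h,k}$$

where $C_{k,z_h} = C_{z_h,k}^T$.

2. Simple kriging with measurement error

The Simple Kriging with Measurement Error (SKME) estimator is simply the simple co-kriging estimator with the additional assumption that $z_h = x_h + \varepsilon_h$, where $\varepsilon_h = [\varepsilon_{h1}, \dots, \varepsilon_{hn}]^T$ are mutually independent measurement errors that are also independent of $x_h$ and $x_k$, i.e. $\varepsilon_h$ is such that

$$\mathrm{var}(\varepsilon_h) = \sigma_h^2 = [\sigma_1^2, \dots, \sigma_n^2]^T, \quad \mathrm{cov}(\varepsilon_h, \varepsilon_h) = I \sigma_h^2, \quad \mathrm{cov}(x_h, \varepsilon_h) = 0, \quad \mathrm{cov}(x_k, \varepsilon_h) = 0 ,$$

where $I$ is the identity matrix, so that $I \sigma_h^2$ denotes the diagonal matrix with diagonal entries $\sigma_1^2, \dots, \sigma_n^2$.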
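The error model above can be made concrete with a small simulation. The sketch below is only an illustration and not part of the original derivation: the point layout, the exponential covariance model, the error variances and the use of Gaussian draws are all assumptions chosen for convenience (the model itself only specifies second moments). It draws correlated values of $x_h$ and $x_k$, adds independent errors, and compares the sample covariances of $z_h$ with the covariances of the error-free process.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D layout: three data points p_h and one estimation point p_k.
p_h = np.array([0.0, 1.0, 2.0])
p_k = 1.5
pts = np.append(p_h, p_k)

# Assumed exponential covariance model for the error-free process x.
C = np.exp(-np.abs(pts[:, None] - pts[None, :]) / 2.0)
C_hh, C_hk = C[:3, :3], C[:3, 3]

# Assumed error variances sigma_i^2, one per data point.
sigma2 = np.array([0.1, 0.5, 1.0])

n_rep = 200_000
x = rng.multivariate_normal(np.zeros(4), C, size=n_rep)   # columns: x_h (3), x_k (1)
eps = rng.normal(0.0, np.sqrt(sigma2), size=(n_rep, 3))   # independent errors
z_h = x[:, :3] + eps                                      # z_h = x_h + eps_h

# Sample covariance of [z_h, x_k].
S = np.cov(np.column_stack([z_h, x[:, 3]]), rowvar=False)
C_zz_hat, C_zk_hat = S[:3, :3], S[:3, 3]

print(np.round(C_zz_hat - C_hh, 2))   # should be close to diag(sigma2)
print(np.round(C_zk_hat - C_hk, 2))   # should be close to zero
```

In this sketch the errors only inflate the diagonal of the covariance of $z_h$, while the cross-covariance with $x_k$ is left unchanged up to sampling noise.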
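It follows from these assumptions that

$$C_{z_h,z_h} = \mathrm{cov}(z_h, z_h) = \mathrm{cov}(x_h, x_h) + I \sigma_h^2 = C_{hh} + I \sigma_h^2$$

and

$$C_{z_h,k} = \mathrm{cov}(z_h, x_k) = \mathrm{cov}(x_h, x_k) + 0 = C_{hk}$$

Hence the SKME estimator is:

$$\hat{x}_k = m_k + C_{kh} (C_{hh} + I \sigma_h^2)^{-1} (z_h - m_h)$$

$$\hat{v}_k = C_{kk} - C_{kh} (C_{hh} + I \sigma_h^2)^{-1} C_{hk}$$

where $C_{kk} = \mathrm{var}(x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$, $C_{kh} = C_{hk}^T$ and $\sigma_h^2 = \mathrm{var}(\varepsilon_h)$.

3. Ordinary co-kriging

Similarly, in ordinary co-kriging we suppose that the random variable $x_k$ and the random vector $z_h = [z_{h1}, \dots, z_{hn}]^T$ are correlated with known covariance, and we seek the best linear unbiased estimate $\hat{x}_k$ of $x_k$ given measured values of $z_h$. This time, however, we assume that the expected value of $x_k$ and $z_h$ is unknown and constant, i.e. the ordinary co-kriging model assumption is $E[x_k] = m$ and $E[z_h] = \mathbf{1} m$, where $\mathbf{1}$ is the $n \times 1$ vector of ones.

Ordinary co-kriging is obtained from ordinary kriging (OK) by replacing $x_h$ with $z_h$. Hence ordinary co-kriging is the linear estimator $\hat{x}_k = \lambda^T z_h$, where the kriging weights are given by

$$\begin{bmatrix} C_{z_h,z_h} & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{z_h,k} \\ 1 \end{bmatrix}$$

and $\mu$ is the Lagrange multiplier associated with the unbiasedness constraint $\mathbf{1}^T \lambda = 1$.

Following the same procedure as for ordinary kriging, we obtain that the ordinary co-kriging estimator is:

$$\hat{x}_k = \hat{m} + C_{k,z_h} C_{z_h,z_h}^{-1} (z_h - \mathbf{1} \hat{m})$$

$$\hat{v}_k = C_{kk} - \lambda^T C_{z_h,k} - \mu$$

where

$$\hat{m} = \frac{\mathbf{1}^T C_{z_h,z_h}^{-1} z_h}{\mathbf{1}^T C_{z_h,z_h}^{-1} \mathbf{1}}, \qquad \lambda = C_{z_h,z_h}^{-1} \left[ C_{z_h,k} + \mathbf{1} \, \frac{1 - \mathbf{1}^T C_{z_h,z_h}^{-1} C_{z_h,k}}{\mathbf{1}^T C_{z_h,z_h}^{-1} \mathbf{1}} \right], \qquad \mu = \frac{\mathbf{1}^T C_{z_h,z_h}^{-1} C_{z_h,k} - 1}{\mathbf{1}^T C_{z_h,z_h}^{-1} \mathbf{1}}$$

and where $C_{kk} = \mathrm{cov}(x_k, x_k) = \mathrm{var}(x_k)$, $C_{z_h,z_h} = \mathrm{cov}(z_h, z_h)$ and $C_{z_h,k} = \mathrm{cov}(z_h, x_k)$.

4. Ordinary kriging with measurement error

The Ordinary Kriging with Measurement Error (OKME) estimator is simply the ordinary co-kriging estimator with the additional assumption that $z_h = x_h + \varepsilon_h$, where $\varepsilon_h$ are measurement errors with the same properties as assumed above for SKME.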
It follows that OKME is the linear estimator $\hat{x}_k = \lambda^T z_h$, where the kriging weights are given by

$$\begin{bmatrix} C_{hh} + I \sigma_h^2 & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{hk} \\ 1 \end{bmatrix}$$

Following the same procedure as for ordinary kriging, we obtain that the OKME estimator is:

$$\hat{x}_k = \hat{m} + C_{kh} (C_{hh} + I \sigma_h^2)^{-1} (z_h - \mathbf{1} \hat{m})$$

$$\hat{v}_k = C_{kk} - \lambda^T C_{hk} - \mu$$

where

$$\hat{m} = \frac{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} z_h}{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} \mathbf{1}}, \qquad \lambda = (C_{hh} + I \sigma_h^2)^{-1} \left[ C_{hk} + \mathbf{1} \, \frac{1 - \mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} C_{hk}}{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} \mathbf{1}} \right], \qquad \mu = \frac{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} C_{hk} - 1}{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} \mathbf{1}}$$

and where $C_{kk} = \mathrm{var}(x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$ and $\sigma_h^2 = \mathrm{var}(\varepsilon_h)$.

5. Poisson kriging

Let $d_h = [d_1, \dots, d_n]^T$ be a vector of the number of positive cases of a disease (or some other outcome of a natural process) observed at the hard data points $p_h = [p_1, \dots, p_n]^T$, let $n_h = [n_1, \dots, n_n]^T$ be the corresponding numbers of individuals at risk for these positive cases, and let $z_h = [d_1/n_1, \dots, d_n/n_n]^T$ be the corresponding observed rates. For example, $n_i$ might be the number of blood tests performed at a clinic located at point $p_i$, $d_i$ might be the number of these tests found to be positive for HIV, and $z_i = d_i/n_i$ would then be the corresponding observed rate of positive HIV tests.

The model assumption of Poisson kriging is that there exists a vector $x_h = [x_{h1}, \dots, x_{hn}]^T$, with elements written $x_i = x_{hi}$ for short, such that

$$d_i \mid x_i \sim \mathrm{Poisson}(n_i x_i) \quad \text{and} \quad \mathrm{cov}(d_i, d_j \mid x_h) = 0 \ \text{for} \ i \neq j ,$$

which means that $d_i$ is conditionally Poisson distributed with a mean and a variance both equal to $n_i x_i$, and that $d_i$ and $d_j$ are conditionally uncorrelated. Following this assumption, we define $x_i$ as the unobserved "Poisson mean" of the observed rate $z_i = d_i/n_i$, and we refer to $x_i$ as the "Poisson mean rate".

Poisson kriging then simply corresponds to ordinary co-kriging where we seek the estimate of the Poisson mean rate $x_k$ at the estimation point $p_k$ given the observed rates $z_h$. As in ordinary co-kriging, we assume that the expected value of $x_k$ and $x_h$ is unknown and constant, i.e. we assume that $E[x_k] = m$ and $E[x_h] = \mathbf{1} m$.
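The Poisson model assumption can be checked with a short simulation. Given $x_i$, the count $d_i$ has mean and variance $n_i x_i$, so the observed rate $z_i = d_i/n_i$ has conditional mean $x_i$ and conditional variance $n_i x_i / n_i^2 = x_i / n_i$. The sketch below is an illustration only; the rate `x_i` and the numbers at risk `n_i` are hypothetical values, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

x_i = 0.05                          # hypothetical Poisson mean rate (held fixed)
n_i = np.array([20, 200, 2000])     # hypothetical numbers of individuals at risk
n_rep = 200_000                     # number of simulated counts per site

# d_i | x_i ~ Poisson(n_i * x_i); one column per value of n_i.
d = rng.poisson(n_i * x_i, size=(n_rep, n_i.size))
z = d / n_i                         # observed rates z_i = d_i / n_i

mean_z = z.mean(axis=0)             # sample estimate of E[z_i | x_i]
var_z = z.var(axis=0)               # sample estimate of var(z_i | x_i)

for n, mz, vz in zip(n_i, mean_z, var_z):
    print(f"n_i={n:5d}  mean(z_i)={mz:.4f}  var(z_i)={vz:.2e}  x_i/n_i={x_i / n:.2e}")
```

The observed rate is centered on the Poisson mean rate in every case, but its spread shrinks as the number of individuals at risk grows, which is the sampling variability that Poisson kriging adjusts for.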
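Next, we derive formulae for $\mathrm{cov}(z_i, z_j)$ and $\mathrm{cov}(z_i, x_k)$ given the Poisson model assumption, and we use these formulae in the equations for ordinary co-kriging.

The expected value $E[z_i]$ is

$$E[z_i] = m$$

Proof: First we calculate the conditional mean

$$E[z_i \mid x_h] = E[d_i/n_i \mid x_i] = E[d_i \mid x_i] / n_i = n_i x_i / n_i = x_i$$

Then we remove the conditioning:

$$E[z_i] = E[\, E[z_i \mid x_h] \,] = E[x_i] = m$$

The covariance $\mathrm{cov}(z_i, z_j)$ is

$$\mathrm{cov}(z_i, z_j) = \delta_{ij} \, m / n_i + \mathrm{cov}(x_i, x_j)$$

Proof:

$$\begin{aligned}
E[z_i z_j \mid x_h] &= \mathrm{cov}(z_i, z_j \mid x_h) + E[z_i \mid x_h] \, E[z_j \mid x_h] \\
&= \mathrm{cov}(d_i, d_j \mid x_h) / (n_i n_j) + E[d_i/n_i \mid x_i] \, E[d_j/n_j \mid x_j] \\
&= \delta_{ij} \, \mathrm{var}(d_i \mid x_i) / n_i^2 + E[d_i \mid x_i] \, E[d_j \mid x_j] / (n_i n_j) \\
&= \delta_{ij} \, n_i x_i / n_i^2 + n_i x_i \, n_j x_j / (n_i n_j) \\
&= \delta_{ij} \, x_i / n_i + x_i x_j
\end{aligned}$$

$$E[z_i z_j] = E[\, E[z_i z_j \mid x_h] \,] = E[\delta_{ij} \, x_i / n_i + x_i x_j] = \delta_{ij} \, m / n_i + \mathrm{cov}(x_i, x_j) + m^2$$

$$\mathrm{cov}(z_i, z_j) = E[z_i z_j] - m^2 = \delta_{ij} \, m / n_i + \mathrm{cov}(x_i, x_j)$$

The covariance $\mathrm{cov}(z_i, x_k)$ is

$$\mathrm{cov}(z_i, x_k) = \mathrm{cov}(x_i, x_k)$$

Proof:

$$E[z_i x_k \mid x_h, x_k] = E[(d_i/n_i) \, x_k \mid x_h, x_k] = (x_k / n_i) \, E[d_i \mid x_i] = x_k \, n_i x_i / n_i = x_k x_i$$

$$E[z_i x_k] = E[\, E[z_i x_k \mid x_h, x_k] \,] = E[x_i x_k] = \mathrm{cov}(x_i, x_k) + m^2$$

$$\mathrm{cov}(z_i, x_k) = E[z_i x_k] - m^2 = \mathrm{cov}(x_i, x_k)$$

Finally, we incorporate the above formulae in the equations for ordinary co-kriging, and we get that Poisson kriging is the linear estimator

$$\hat{x}_k = \lambda^T z_h = \lambda^T [d_1/n_1, \dots, d_n/n_n]^T ,$$

where the kriging weights are given by

$$\begin{bmatrix} C_{z_h,z_h} & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{z_h,k} \\ 1 \end{bmatrix}$$

$$\Leftrightarrow \quad \sum_{j=1}^{n} \left( C_{ij} + \delta_{ij} \frac{m}{n_i} \right) \lambda_j + \mu = C_{ik} \ \ \text{for} \ i = 1, \dots, n, \qquad \sum_{j=1}^{n} \lambda_j = 1$$

$$\Leftrightarrow \quad \begin{bmatrix} C_{hh} + I \sigma_h^2 & \mathbf{1} \\ \mathbf{1}^T & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} C_{hk} \\ 1 \end{bmatrix}$$

where $C_{ij} = \mathrm{cov}(x_i, x_j)$, $C_{ik} = \mathrm{cov}(x_i, x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$ and $\sigma_h^2 = [m/n_1, \dots, m/n_n]^T$.

We note that $\sigma_h^2$ corresponds to the variance of the errors $\varepsilon_h$ defined by $z_h = x_h + \varepsilon_h$. Hence, using these errors, Poisson kriging reduces to OKME, so that the Poisson kriging estimator is:

$$\hat{x}_k = \hat{m} + C_{kh} (C_{hh} + I \sigma_h^2)^{-1} (z_h - \mathbf{1} \hat{m})$$

$$\hat{v}_k = C_{kk} - \lambda^T C_{hk} - \mu$$

where

$$\hat{m} = \frac{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} z_h}{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} \mathbf{1}}, \qquad \lambda = (C_{hh} + I \sigma_h^2)^{-1} \left[ C_{hk} + \mathbf{1} \, \frac{1 - \mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} C_{hk}}{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} \mathbf{1}} \right], \qquad \mu = \frac{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} C_{hk} - 1}{\mathbf{1}^T (C_{hh} + I \sigma_h^2)^{-1} \mathbf{1}}$$

and where $C_{kk} = \mathrm{var}(x_k)$, $C_{hh} = \mathrm{cov}(x_h, x_h)$, $C_{hk} = \mathrm{cov}(x_h, x_k)$ and $\sigma_h^2 = [m/n_1, \dots, m/n_n]^T$.

6. BME with a Poisson soft data model

We note that OKME is obtained as a special case of BME when using Gaussian soft data. Hence Poisson kriging is obtained as a special case of BME using the following pdf for the soft data describing the rate $x_i$ at point $p_i$:

$$f_S(x_i) = \frac{1}{\sqrt{2 \pi \, m / n_i}} \exp\left( - \frac{(x_i - d_i/n_i)^2}{2 \, m / n_i} \right)$$

where $d_i$ is the observed number of positive cases, $n_i$ is the number of individuals at risk, $d_i/n_i$ is the observed rate, and $m$ is the population-weighted mean of the observed rates. This is a Gaussian pdf with mean $d_i/n_i$ and variance $m/n_i$. We see that a higher variance, and therefore less weight, is given to uncertain observed rates (i.e. rates observed from a low number of individuals at risk).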
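The effect of the error variance $m/n_i$ on the kriging weights can be seen in a small numerical sketch. The code below treats Poisson kriging as ordinary kriging with measurement error variances $m/n_i$ and solves for the weights, the Lagrange multiplier, the estimate and its variance. It is a minimal illustration under stated assumptions, not a reference implementation: the function names `poisson_kriging` and `exp_cov`, the exponential covariance model, its parameters, and the counts are all hypothetical.

```python
import numpy as np

def poisson_kriging(C_hh, C_hk, C_kk, d, n, m):
    """Ordinary kriging with error variances m / n_i (illustrative helper)."""
    d = np.asarray(d, dtype=float)
    n = np.asarray(n, dtype=float)
    z = d / n                                # observed rates z_i = d_i / n_i
    A = C_hh + np.diag(m / n)                # C_hh + I sigma_h^2
    ones = np.ones(z.size)
    A_inv_c = np.linalg.solve(A, C_hk)       # A^{-1} C_hk
    A_inv_1 = np.linalg.solve(A, ones)       # A^{-1} 1
    denom = ones @ A_inv_1                   # 1^T A^{-1} 1
    mu = (ones @ A_inv_c - 1.0) / denom      # Lagrange multiplier
    lam = A_inv_c - A_inv_1 * mu             # kriging weights (sum to 1)
    m_hat = (A_inv_1 @ z) / denom            # estimate of the constant mean
    x_hat = lam @ z                          # estimated Poisson mean rate
    v_hat = C_kk - lam @ C_hk - mu           # estimation variance
    return x_hat, v_hat, lam, m_hat

def exp_cov(h, sill=4e-4, a=2.0):
    """Assumed exponential covariance model (parameters are hypothetical)."""
    return sill * np.exp(-np.abs(h) / a)

# Two sites equidistant from the estimation point: one with many
# individuals at risk, one with few (hypothetical counts).
p_h = np.array([-1.0, 1.0])
p_k = 0.0
C_hh = exp_cov(p_h[:, None] - p_h[None, :])
C_hk = exp_cov(p_h - p_k)
C_kk = exp_cov(0.0)

d = np.array([52, 1])
n = np.array([1000, 10])
m = d.sum() / n.sum()                        # population-weighted mean rate

x_hat, v_hat, lam, m_hat = poisson_kriging(C_hh, C_hk, C_kk, d, n, m)
print("weights:", lam)
print("estimate:", x_hat, "variance:", v_hat)
```

Both sites are equally distant from the estimation point, so any difference between the two weights comes from the error variances alone: the site with only 10 individuals at risk receives the smaller weight.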