Kriging with anisotropy Geometric anisotropy: 0 5000 Gamma(h) 10000 15000 20000 Range longer in one direction than another Nugget and (partial) sill same in all directions Fix by rotating / rescaling coordinate system krige() does automatically when you’ve fit an anisotropic SV 20000 40000 60000 Distance c Philip M. Dixon (Iowa State Univ.) 80000 Spatial Data Analysis - Part 5 Spring 2016 1 / 25 Kriging with anisotropy Zonal anisotropy variogram sills vary with direction The direction with the shorter range also has the shorter sill. model by a combining an isotropic model and a model which depends “only on the lag-distance in the direction θ of the greater sill” (Schabenberger and Gotway, 2005, p 152). γ(h ) = γ1 (|| h ||) + γ2 (hθ ) Chiles and Delfiner (1999, p. 96) warn against axis-specific models, e.g.: γ(h ) = γ1 (hx ) + γ2 (hy ) Under certain circumstances they can lead to Var Z (s) = 0, which not good. “fake it” by setting up a geometrically anisotropic component with a major axis much much longer than the minor axis Only distances only along the major axis contribute to γ(h ) c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 2 / 25 Other types of Kriging Measurement error kriging Observations have measurement error Different interpretation of nugget Local prediction Speed up / make possible computations for large problems Block kriging Predict average for an area Log Normal or Trans-Gaussian kriging Observations have skewed distributions Indicator kriging / Disjunctive kriging Predict probability of an event Cokriging exploit correlations between two variables c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 3 / 25 Measurement error kriging Remember the effect of the nugget: At the location of an observed value, Ẑ (s) = Z (s). and Var Ẑ (s) = 0 at that location 2 but any small distance away from that s, Var Z (s + h) = σnugget Kriging “honors the data” Assumes that a hypothetical repeat observation at s will be exactly the same number What if there is measurement error in Z (s)? A repeat measurement at same location will not be the same value. c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 4 / 25 Measurement error kriging Now, do not want to honor the data (because what we observe includes non-repeatable measurement error) Identification problem: have only one obs. per location. Can not separate nugget from measurement error Need outside information / guess about the magnitude of the measurement error Or information about the magnitude of small-scale variation e.g., have closely spaced pairs can believe that that apparent nugget is just measurement error c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 5 / 25 Measurement error kriging 2 If you “know” σmeas. error , then can account for this when kriging Consequence is that prediction at observed locations is a “smoothed” version of the observations. Consider a pure nugget process: Simple example that demonstrates the difference between kriging with a nugget and kriging with measurement error no spatial correlation at the spatial scale of the observations SV is flat for all non-zero distances: γ(d) = σ 2 , d > 0. nugget / meas. error have different γ(0) Consider three alternatives: 100% nugget: γ(0) = 0, γ() = σ 2 50% nugget, 50% meas. error: γ(0) = σ 2 /2, γ() = σ 2 100% meas. error: γ(0) = σ 2 , γ() = σ 2 c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 6 / 25 Measurement error kriging All 3 cases: prediction is the mean value except at the observed locations because no spatial structure (pure nugget) Different predictions at the observed locations Example: Z (s i ) = 1, µ̂ = 3 Nugget Meas. error. 100 0 50 50 0 100 Pictures on next three slides c Philip M. Dixon (Iowa State Univ.) Ẑ (s i ) 1 2 3 Spatial Data Analysis - Part 5 Var Ẑ (s i ) 0 0.31 0.25 Spring 2016 7 / 25 5.0 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 8 / 25 5.0 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 9 / 25 5.0 4.5 4.0 3.5 3.0 2.5 2.0 1.5 1.0 c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 10 / 25 Local prediction Currently, using all observations to make predictions 0 Ẑ (s i ) = µ̂ + σ Σ−1 (Z (s ) − µ̂) Only have to compute Σ−1 once What if you: have many observations (e.g. 10,000): Σ likely too large believe µ varies (and you don’t want to model that change) use only nearby obs. to predict at a location This is called “local prediction” Either use some max. # obs., or all obs. within some specified distance of prediction location. Need to compute Σ−1 separately for each prediction because different “data” being used each time but Σ much much smaller each time c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 11 / 25 Block Kriging To now, we’ve focused on predicting Z (s ) for individual locations locations assumed to be points w/o area (physical/mathematical simplification, not reality) Sometimes called Punctual Kriging (for points) In many applications, want to predict mean or total over some area total and mean interconvertable: total = mean*area Areas are not undividable units experiment on people. A person is a clearly defined, undividable unit experiment in a field. You choose the plot size - no clearly defined undividable unit Called the Modifiable Areal Unit Problem both size and shape of area matter inferences depend on both e.g. Variance between “replicate” field plots depends on size and shape c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 12 / 25 Block Kriging MAUP is one example of a “change of support” problem (COSP) Support: size, shape, and orientation of a unit associated with a measurement Changing support, e.g. by averaging or aggregating, creates new random variables (for the new plots) related to original r.v’s, but spatial and statistical properties are different e.g. semivariogram parameters will change Block Kriging is a second ex. of a COSP Define block average, Z (B), for block B with area | B |: Z 1 Z (B) = Z (s) ds |B| B Want to predict Ẑ (B) not Ẑ (s) at the center of the area 0 because weights, σ Σ−1 , are non-linear functions of location c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 13 / 25 Block Kriging In practice: predict at a grid of point locations within B, and average: Z (B) = Σλi Z (s i ) choose λi to minimize MSEP like OK, but based on “point-to-block” covariances Z 1 Cov (Z (B), Z (s i )) = Cov (Z (u), Z (s i )) du |B| B Again, approx. by setting up a grid of pts in B: Cov (Z (B), Z (s i )) = 1X Cov (Z (uj ), Z (s i )) J j Software takes care of all these details c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 14 / 25 Spring 2016 15 / 25 Spring 2016 16 / 25 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Block Kriging Prediction variance for block kriging Block average, Z (B), is an average! estimated more precisely than is a single point. Variance is not σ 2 / | B | (i.e. analogous to σ 2 /n ) because of spatial correlation within the block but still smaller, especially for: large blocks small range not far from observed locations: Ẑ (s) ≈ µ̂, so very highly correlated within block. My thoughts: What, specifically, do you want to predict / map? Block kriging not as often used as it should be c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Non-linear Kriging reminder: Best predictor (smallest MSEP) is E [Z (s 0 ) | Z (s )] Gaussian: linear function of observations: Σλi Z (s i ) Other distributions: still want E [Z (s 0 ) | Z (s )] no longer a linear fn of obs. Often see skewed ecological / environmental data. few really large values, other values close to mean Most common non-Gaussian distribution: log normal very appropriate for positively skewed data log Z (s i ) ∼ Gaussian, i.e. log Z (s i ) ∼ N(µl , σl2 ) parameters estimated by “usual” applied to log Z c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 17 / 25 Spring 2016 18 / 25 Non-linear Kriging Log Normal kriging log transform Z (s ) [ predict P(s 0 ) = log Z (s 0 ) (on log scale) back transform: Ẑ (s 0 ) = exp P(s 0 ) P(s 0 ) is an unbiased prediction of log Z (s 0 ) but exp P(s 0 ) is a biased predictor of Z (s 0 ). (Jensen’s inequality) c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Log normal Kriging Solutions: 1) Ignore the problem, focus on medians exp P(s 0 ) is an asymptotically unbiased estimate of median Z (s 0 ). 2) Use properties of logN distribution when log Z (s 0 ) ∼ N(µl , σl2 ), E Z = exp(µl + σl2 /2) Apply to exp P̂(s 0 ): h i E Ẑ (s 0 ) = E exp P̂(s 0 ) = exp P̂(s 0 ) + Var P̂(s 0 )/2 Log normal kriging procedure: use OK on log transformed values back-transform prediction apply correction c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 19 / 25 Trans-Gaussian Kriging log is one member of the Box-Cox family of transformations. For these ( Z (s λ −1 λ 6= 0 λ Z ∗ (s ) = log Z (s ) λ = 0 λ = 1 ⇒ no transformationp λ = 0.5 ⇒ proportional to Z (s 0 ) transformation λ = −1 ⇒ proportional to 1/Z (s 0 ) transformation purpose of −1/λ is to make function continuous λ limit of Z λ−1 as λ → 0 is log Z Why transform? Intent is that Z ∗ (s 0 ) are Gaussian, or at least symmetric Optimal properties of kriging for Gaussian data MSEP makes most sense for symmetric distribution of values Fitting covariates (UK) makes most sense for symmetric distribution of residuals c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 20 / 25 Indicator Kriging What if you want to predict an exceedance probability, P[Z (s 0 ) > threshold ] e.g. legal limit on concentration of mercury in fish Kriging predicts E [Z (s 0 ) | Z (s )] Define Z ∗ (s 0 ) = I (Z (s 0 ) > threshold ) Z ∗ (s 0 ) = 1 if condition is true (Z (s 0 ) > threshold ) Z ∗ (s 0 ) = 0 if condition is false (Z (s 0 ) ≤ threshold ) E Z ∗ (s 0 ) = P[Z (s 0 ) > threshold ] Apply indicator transformation to all obs, estimate semivariogram from indicator variables (can be hard) then krige. Note: Data used are indicator values so IK predictions are E Z ∗ (s 0 ) | Z ∗ (s) not the same as E Z ∗ (s 0 ) | Z (s) because the indicator transformation “throws away” information. c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 21 / 25 Indicator Kriging Issues: no guarantee that 0 ≤ p ≤ 1 remember, Ẑ (s 0 ) can exceed range of data variety of ad-hoc fixes there are more complicated methods my general sense is they don’t work markedly better Extension: if you have P[Z > 1], P[Z > 2], . . ., you have an approximation the cdf of Z A bit more detail: Define Z1 (s) to be I (Z (s) < k1 ), and Z2 (s) to be I (Z (s) < k2 ), for many values of k gives you predictions of F̂ (k1 ), F̂ (k2 ), . . . c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 22 / 25 Disjunctive Kriging Define indicators for non-overlapping regions e.g. Z (s 0 ) < 10, 10 ≥ Z (s 0 ) < 20, 20 ≥ Z (s 0 ) < 30, . . . not Z (s 0 ) < 10, Z (s 0 ) < 20, Z (s 0 ) < 30, . . . Knowing 10 ≥ Z (s 0 ) < 20 is more informative than knowing only that Z (s 0 ) < 20 (IK) very elegant math (which we’ll ignore) → predict any function g (Z (s 0 )), including the set of indicator functions Examples: Richard Webster has published a lot of ag-related studies Only implementation in R (2016) is in the geostat library Note: can combine block kriging ideas with any of the non-linear kriging methods e.g. define 1km x 1km areas and estimate P[soil N > threshold] c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 23 / 25 Cokriging Two (or more) spatial variables Correlated when measured at same location: Cov Z1 (s ), Z2 (s ) Each is spatially correlated: Cov Z1 (s ), Z1 (s + h ) Also spatial cross-correlation: Cov Z1 (s ), Z2 (s + h ) Concepts: estimate that cross correlation by cross semivariogram use to improve predictions of Z1 and/or Z2 Thoughts based on my limited experience: do not need to measure Z1 and Z2 at same locations use fine grid to predict a sparsely observed, but correlated, value Picture on next slide If Z2 measured at each prediction point, use UK instead of co-kriging c Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 5 Spring 2016 24 / 25 1.0 Cokriging + Y coordinate 0.4 0.6 0.8 + + + + 0.2 + + + 0.0 + 0.0 c Philip M. Dixon (Iowa State Univ.) + 0.2 0.4 0.6 X coordinate Spatial Data Analysis - Part 5 0.8 1.0 Spring 2016 25 / 25