Areal data
Reminder about types of data
Geostatistical data: $Z(s)$ exists everywhere, varies continuously
  Can accommodate sudden changes by a model for the mean
  E.g., soil pH, two soil types with different pH
    model as mean depends on soil type
    error = small scale variation, assumed continuous
Areal data
  Spatial data measured and reported by regions
  Can be abrupt change at region boundaries
  Unlike geostat data, “location” is arbitrary within region
© Philip M. Dixon (Iowa State Univ.) Spatial Data Analysis - Part 8 Spring 2016

Areal data
Areal data often arises by aggregation
  number of disease cases by county
  but doesn’t have to: US states, coded by party of state Governor
Distinction between Geostat and Areal can be blurred
  SST data: given to you as geostatistical data
  related information: only available as single value per spatial grid cell (e.g., 1° x 1° area)
  To me: relative scale, size of measurement to span of study area
Distinction between Areal and Point pattern can be blurred
  Disease cases: will treat as areal data
  but point pattern if have individual locations (household address)
  Again, relative scale matters. What is the location of an individual?

Areal Data
Goals:
  Show data
  Describe spatial dependence
  Predict values
    Less common to predict at new locations
    Assume observations are “noisy”, i.e., measurement error at each location
    Want to smooth (reduce amount of noise), i.e., predict “true” values at observed locations
Some examples in pictures

[Figure: Infant mortality, Auckland NZ districts]

[Figure: Number of plant species in 20 cm x 20 cm patches of alpine tundra]
[Figure: Wheat uniformity trial]

[Figure: Incidence of flu in Iowa]

Spatial smoothing
Made up map of # cases of flu in central Iowa in January 2013
  expressed as # cases / person
Rate in February expected to be similar to that observed in January
Q: Where do you expect February rate to be the highest?
A: One of the light areas in the middle!

Spatial Smoothing
Why?
  an unusually large value on edge of map may be noise, not signal
  could be biological: negative correlation in subsequent months
  often a consequence of population
    each square in the middle is ca 10,000 people. Those on the edges are ca 100
    e.g. big city in the middle of the mapped area
    sd of estimated rate 10x higher in edge squares
  if assume rates are spatially correlated, nearby areas inform about poorly estimated areas, especially if nearby areas have large N
Another difference between areal and geostatistical data
  Geostat: Often reasonable to assume constant variance
  Areal: Often not. Var / sd depends on something else about each region (e.g. N)
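The "sd 10x higher in edge squares" claim can be checked with a short sketch. It assumes the number of cases in a square is binomial, so the sd of the observed rate is $\sqrt{p(1-p)/N}$; the rate `p_true` is a made-up value, not one from the slides, and the ratio does not depend on it.

```python
import math


def rate_sd(p: float, n: int) -> float:
    """Standard deviation of an observed rate (cases / person) under a binomial model."""
    return math.sqrt(p * (1.0 - p) / n)


p_true = 0.05                         # hypothetical flu rate, not from the slides
sd_middle = rate_sd(p_true, 10_000)   # square in the middle, ca 10,000 people
sd_edge = rate_sd(p_true, 100)        # square on the edge, ca 100 people
sd_ratio = sd_edge / sd_middle        # equals sqrt(10_000 / 100) for any p_true

print(f"sd middle = {sd_middle:.4f}, sd edge = {sd_edge:.4f}, ratio = {sd_ratio:.1f}")
```

A 100-fold difference in population gives a 10-fold difference in sd, which is why an extreme rate in a small edge square is weak evidence of a truly high rate.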
Areal data: what we’ll do
describe spatial dependence and smoothing for normally distributed observations
  Relatively straightforward computations
Talk a bit about analyzing counts
  Full treatment requires hierarchical Bayes methods

Describing association
Spatial correlation between what?
Reminder: in geostats, we assumed that spatial dependence
  1) was constant over the map (2nd order stationarity)
  2) was same in all directions (isotropy), but could relax
  3) depended on distance between two observations
  described by semivariance: $E[Z(s) - Z(s+h)]^2$
  or cross-variance: $E\,[Z_i(s) - Z_i(s+h)]\,[Z_j(s) - Z_j(s+h)]$
Consider data on a grid (species diversity picture on next slide)

[Figure: species diversity data on a grid]

Describing association
Could describe in terms of distance
  1 = up/down or across
  $\sqrt{2}$ = diagonal
  2 = up/down or across 2, and so on
  But approximate, since distances between areas, not points
What about irregular regions (e.g. Auckland, on next slide)?
  Distance hard to define consistently (centroids of regions are very irregular)
Areal data analysis has traditionally focused on pairwise “connectivity”: how well connected are two regions?

[Figure: Auckland districts]
Connectivity
On a grid: two common definitions of connectivity
  rook’s move: a square in the middle connects to the two on either side and the two above/below
  queen’s move: a square in the middle connects to its 8 neighbors (rook’s + 4 diagonals)
  squares on edge have 3/5 neighbors; those in the corners have only 2/3
Connections define the “spatial proximity” matrix, also called “spatial weight” or “spatial connectivity” matrix
  # rows = # columns = # regions
  1 if a pair of regions connect
  0 otherwise
  always 0 down the diagonal

Connectivity
Irregular regions: two approaches
1 if two regions share a boundary
  very irregular neighbor count. Auckland: from 1 to 10 neighbors, median 4
  Could also use % boundary shared
    very useful for modeling transport among regions
    economic flows, disease propagation, invasive spp
Can also use distance
  all regions with centroids within d of target
  find the k nearest neighbors (k smallest distances between centroids)
  make weight a function of distance, $d^{-\alpha}$
Can use values as is, or row-standardize so row sum = 1
  if have 2 neighbors, each has connectivity 1/2
  if have 4 neighbors, each has connectivity 1/4

Connectivity
No standard way. Most common is:
  shared boundary = 0/1, perhaps with row standardization
best if connectedness measure informed by subject-matter knowledge
choice does have statistical consequences, especially when predicting/smoothing
Some weight matrices are symmetrical
  0/1 shared boundary
  $d^{-\alpha}$
others are not
  % shared boundary
  row-standardized matrices

Spatial dependence
One very common measure, one less common
Moran’s I, dates to 1950
  $I = \frac{1}{s^2} \, \frac{\sum_{i,j} w_{ij} (Y_i - \bar{Y})(Y_j - \bar{Y})}{\sum_{i,j} w_{ij}}$
  $s^2 = \frac{1}{N} \sum_i (Y_i - \bar{Y})^2$, i.e., mle, not usual unbiased est.
  $w_{ij}$ is the ij’th element of the spatial weight matrix
I ranges roughly from -1 to +1
  0: no spatial correlation
  1: perfect positive correlation

Moran’s I
Tests of correlation = 0:
1) If sufficient # regions, $I \sim N$ with mean and variance that can be computed
  $E\,\hat{I} = \frac{-1}{N-1}$, variance formula not insightful
  not clear what is “sufficient”, depends on W, but would like 20 or more regions
2) permutation: randomly shuffle observed values over the regions, compute I each time
  enumerate all permutations: $p = \frac{\#\text{more extreme}}{\#\text{permutations}}$
  sample (randomization test): $p = \frac{\#\text{more extreme} + 1}{\#\text{permutations} + 1}$
  +1 accounts for the observed data (already included in all permutations)
Usually one-sided test, only interested in positive spatial dependence

Moran’s I example
Example: species diversity data using Queen’s move neighbors
$\hat{I} = 0.751$
randomization test: 0.751 exceeds all 9,999 randomizations, p = 1/10,000 = 0.0001
normal approximation: $E\,\hat{I} = -0.0157$, $\mathrm{Var}\,\hat{I} = 0.00451$
  $Z = \frac{\hat{I} - E\,\hat{I}}{\sqrt{\mathrm{Var}\,\hat{I}}} = 11.4$, p < 0.0001
Next slide shows distribution of Moran’s I values for randomized values

[Figure: density of Moran’s I values for randomized values]

[Figure: species diversity map]
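The connectivity definitions, Moran's I and the randomization test can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind the slides: the function names, the 6 x 6 grid and the gradient data are all made up, and only 999 permutations are used to keep it fast.

```python
import random


def grid_weights(nrow, ncol, queen=True):
    """0/1 spatial weight matrix for an nrow x ncol grid, cells in row-major order."""
    n = nrow * ncol
    w = [[0] * n for _ in range(n)]
    for r in range(nrow):
        for c in range(ncol):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a region is not its own neighbor: diagonal stays 0
                    if not queen and dr != 0 and dc != 0:
                        continue  # rook's move excludes the 4 diagonal cells
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < nrow and 0 <= cc < ncol:
                        w[r * ncol + c][rr * ncol + cc] = 1
    return w


def morans_i(y, w):
    """Moran's I with s^2 computed as the mle (divide by N)."""
    n = len(y)
    ybar = sum(y) / n
    d = [v - ybar for v in y]
    s2 = sum(v * v for v in d) / n
    num = sum(w[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    return num / (s2 * sum(map(sum, w)))


def permutation_test(y, w, n_perm=999, seed=1):
    """One-sided randomization test: p = (#more extreme + 1) / (#permutations + 1)."""
    rng = random.Random(seed)
    observed = morans_i(y, w)
    shuffled = list(y)
    more_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign the observed values to regions at random
        if morans_i(shuffled, w) >= observed:
            more_extreme += 1
    return observed, (more_extreme + 1) / (n_perm + 1)


# Hypothetical data: a smooth gradient across a 6 x 6 grid (strong positive dependence)
nrow = ncol = 6
y = [float(r + c) for r in range(nrow) for c in range(ncol)]
w_queen = grid_weights(nrow, ncol, queen=True)
i_obs, p_value = permutation_test(y, w_queen)
print(f"Moran's I = {i_obs:.3f}, randomization p = {p_value:.3f}")
```

Row-standardizing `w` (dividing each row by its sum) before calling `morans_i` would be a drop-in variation; as noted above, that choice changes the result.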
Geary’s c
Based on squared differences, not covariance
  $c = \frac{N-1}{2 \sum_{i,j} w_{ij}} \, \frac{\sum_{i,j} w_{ij} (Y_i - Y_j)^2}{\sum_i (Y_i - \bar{Y})^2}$
similar in spirit to semivariance
denominator scales the statistic (c near 1 when no spatial dependence, c < 1 for positive dependence)
usually similar to but not same as Moran’s I
when there is a difference
  I is a more global indicator, because calculate $\bar{Y}$
  c is more sensitive to differences in small neighborhoods
Test Ho: no spatial dependence using permutations or normal approximation
Notice that both I and c sum over all pairs of points. One number for entire region

Local indicators of spatial association
rewrite I as:
  $I = \frac{1}{s^2} \, \frac{\sum_i (Y_i - \bar{Y}) \sum_j w_{ij} (Y_j - \bar{Y})}{\sum_{i,j} w_{ij}}$
Calculate second sum separately for each region
  $I_i = \frac{1}{s^2} (Y_i - \bar{Y}) \sum_j w_{ij} (Y_j - \bar{Y})$
global statistic is then $I = \sum_i I_i / \sum_{i,j} w_{ij}$
illustrate using species diversity data
  First local Moran’s $I_i$, then Moran’s $I_i$ marking $Z_i > 2$
  Blue areas are where that region is similar to neighboring regions

[Figures: species diversity (div); local Moran’s $I_i$; local Moran’s $I_i$ marking $Z_i > 2$]

Moran correlogram
Define different weight matrices using different distance classes
  (0, 1.5) using queen’s move neighbors
  (2, 3) → regions slightly further apart
  (3, 5), and so on
  Or, use 1st nearest neighbor, 2nd NN, 3rd NN, ...
Calculate Moran’s I for each weight matrix
Plot I vs distance

[Figure: Moran correlogram, Moran’s I vs. distance class]
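Geary's c and the local Moran's $I_i$ are easy to compute directly from their definitions. The sketch below is illustrative only: the five regions in a line, their values and the function names are invented so the arithmetic can be followed by hand.

```python
def gearys_c(y, w):
    """Geary's c: (N - 1) * sum w_ij (Y_i - Y_j)^2 / (2 * sum w_ij * sum (Y_i - Ybar)^2)."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((v - ybar) ** 2 for v in y)
    num = sum(w[i][j] * (y[i] - y[j]) ** 2 for i in range(n) for j in range(n))
    return (n - 1) * num / (2 * sum(map(sum, w)) * ss)


def local_morans(y, w):
    """Local Moran's I_i = (Y_i - Ybar) * sum_j w_ij (Y_j - Ybar) / s^2, s^2 the mle."""
    n = len(y)
    ybar = sum(y) / n
    d = [v - ybar for v in y]
    s2 = sum(v * v for v in d) / n
    return [d[i] * sum(w[i][j] * d[j] for j in range(n)) / s2 for i in range(n)]


# Hypothetical example: 5 regions in a line, each connected to its immediate neighbors
y = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(y)
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

c = gearys_c(y, w)
local_i = local_morans(y, w)
global_i = sum(local_i) / sum(map(sum, w))  # global I recovered from the local values

print(f"Geary's c = {c:.3f}")
print("local I_i =", [round(v, 3) for v in local_i])
print(f"global Moran's I = {global_i:.3f}")
```

In this toy example neighbors always differ by 1, so c is well below 1, and the middle region (equal to the mean) contributes a local $I_i$ of 0.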
Join count statistics
Moran’s I and Geary’s c are for continuous observations
  c: “similar” because $(Y_i - Y_j)^2$ is small
What about categorical data?
  E.g., US states, record whether governor is Republican or Democrat
  Is there spatial correlation? i.e., if your state is Republican, are neighboring states more likely to be Republican?
Usual approach is the Black-black (BB) join count statistic
  $BB = \frac{1}{2} \sum_{i,j} w_{ij} I_i I_j$
  $I_i$ is 1 if the “event” happens in region i
  If $w_{ij}$ is 1 if neighbors, 0 otherwise, BB is the number of pairs where both region and neighbor are events

Join count statistics
Compare BB to expected value if no spatial dependence
Either a Z-test (assume normal distribution)
  $Z = \frac{BB - E\,BB}{\sqrt{\mathrm{Var}\,BB}}$
or permutation:
  keep same number of “event” and non-“event” regions, keep neighbor structure,
  assign “event” / non-“event” randomly to regions.
  compute BB for each randomization
Example: Species diversity plot, define “rich” as 6 or more species
  Rich-rich, rook’s neighbors: BB = 53, E BB = 35, Var BB = 8.495
  Normal approximation: $Z = (53 - 35)/\sqrt{8.495} = 6.176$, p < 0.0001
  Permutation: 53 is larger than any of 9999 randomizations, p = 0.0001

[Figure: species diversity plot, rich patches marked TRUE / FALSE]

[Figure: histogram of BB for random assignments (Frequency vs. BB)]

Combining multiple sites
Imagine a study of spatial dependence in Iowa that is repeated in Indiana.
Methods above will describe that spatial dependence in each state
What if you believed the spatial dependence was similar in the two, and wanted one result?
How can you combine information from both states?
No problem, but should think about a few issues.

Combining multiple sites
1) One mean for both states, or separate means?
  the issue is the scale of spatial dependence (within state only or including between state)
  not an issue if the means are similar, but they often are not
  Separate means → evaluates pooled within state spatial dependence
  Separate means is most common
2) Include only pairs of regions within states, or pairs crossing states?
  Same spatial scale issues.
  Only pairs within states is most common.

Combining multiple sites
3) Do the two states contribute equal amounts of information?
  Here, same within state sampling, same # regions, same weighting scheme (in my story)
  So, equal amounts of information
  use a simple average of both states (A and B)
  $\hat{I}$ for both states: $\hat{I} = (\hat{I}_A + \hat{I}_B)/2$
    $E\,\hat{I} = (E\,\hat{I}_A + E\,\hat{I}_B)/2$
    $\mathrm{Var}\,\hat{I} = (\mathrm{Var}\,\hat{I}_A + \mathrm{Var}\,\hat{I}_B)/4$
If unequal amounts of information, use a weighted average
  Many possible ways to weight, depends on study specifics

Combining multiple sites
Hard part is getting estimate and its variance
Test Ho: no within state spatial dependence
  Z score for overall study, or permuting observations within state.
BTW, same ideas can be used for semivariograms for multiple sites
  Computing trick: artificially separate sites, make sure min distance between IA (Iowa) and IN (Indiana) is larger than max within state distance
  specify max semivariogram distance so that all pairs are within a site.
  weights each region by number of pairs
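The simple-average rule can be sketched as below, using the Iowa and Indiana values from the multiple-sites example in these slides. The function names are made up, the variance rule assumes the two states' estimates are independent, and the p-values are one-sided from the normal approximation.

```python
import math
from statistics import NormalDist


def z_and_p(i_hat, e_i, var_i):
    """Z score and one-sided p-value (positive dependence) for a Moran's I estimate."""
    z = (i_hat - e_i) / math.sqrt(var_i)
    return z, 1.0 - NormalDist().cdf(z)


def combine_equal(sites):
    """Simple average over sites given as (I_hat, E_I, Var_I); assumes independent sites."""
    k = len(sites)
    i_hat = sum(s[0] for s in sites) / k
    e_i = sum(s[1] for s in sites) / k
    var_i = sum(s[2] for s in sites) / k ** 2  # Var of a mean of k independent estimates
    return i_hat, e_i, var_i


iowa = (0.35, -0.0159, 0.085)
indiana = (0.42, -0.0159, 0.085)

z_ia, p_ia = z_and_p(*iowa)
z_in, p_in = z_and_p(*indiana)
i_comb, e_comb, var_comb = combine_equal([iowa, indiana])
z_comb, p_comb = z_and_p(i_comb, e_comb, var_comb)

print(f"Iowa:     Z = {z_ia:.2f}, p = {p_ia:.3f}")
print(f"Indiana:  Z = {z_in:.2f}, p = {p_in:.3f}")
print(f"Together: I = {i_comb:.3f}, Var = {var_comb:.4f}, Z = {z_comb:.2f}, p = {p_comb:.3f}")
```

With unequal amounts of information, `combine_equal` would be replaced by a weighted average; the weights depend on the study, so none are suggested here.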
Multiple sites: example
Iowa: $\hat{I} = 0.35$, $E\,\hat{I} = -0.0159$, $\mathrm{Var}\,\hat{I} = 0.085$
Indiana: $\hat{I} = 0.42$, $E\,\hat{I} = -0.0159$, $\mathrm{Var}\,\hat{I} = 0.085$
Individually:
  Iowa: $Z = \frac{0.35 - (-0.0159)}{\sqrt{0.085}} = 1.25$, p = 0.10
  Indiana: $Z = \frac{0.42 - (-0.0159)}{\sqrt{0.085}} = 1.49$, p = 0.067
Together:
  $\hat{I} = (0.35 + 0.42)/2 = 0.385$, $E\,\hat{I} = -0.0159$,
  $\mathrm{Var}\,\hat{I} = (0.085 + 0.085)/4 = 0.0425$
  $Z = \frac{0.385 - (-0.0159)}{\sqrt{0.0425}} = 1.94$, p = 0.026
Similar patterns in both areas, aggregate the two ⇒ stronger evidence

Smoothing areal data
observations are measured with error
if obs. are spatially correlated, nearby obs. contain information relevant to the true value / prediction for this obs.
⇒ predict $Y_i$ incorporating nearby obs.
  Nearby often means “all”, i.e. global mean
Will talk about concepts with a few details
  Naturally leads to Bayesian inference, won’t go that far in 406
Bivand’s example is disease mapping (Chapter 8)
  applies to count data for many outcomes
We focus on normally distributed outcomes

Smoothing areal data
Will keep things simple to emphasize concepts
Observe $Y_i$ in a region, have multiple regions
Believe that each $Y_i$ observed with some random error
Want to predict “true” $\mu_i$ for each region
Normally distributed observations $Y_i \sim N(\mu_i, \sigma^2)$
  assume (to keep things simple) that $\sigma^2$ known
  $\sigma^2$ may be different for each region
Statistical problem: Given $Y_i$ predict $\mu_i$
Two common ways to solve:
  mixed model: $Y_i = \mu + \alpha_i + \varepsilon_i$, $\alpha_i \sim N(0, \sigma_a^2)$, $\varepsilon_i \sim N(0, \sigma_e^2)$
    Find BLUP of $\alpha_i$
  Bayes: $Y_i \sim N(\mu_i, \sigma^2)$, find posterior distribution of $\mu_i \mid Y_i$

Bayes in one slide
Distribution: probability density for a response given parameters
  Normal distribution, known variance.
  $Y \sim N(\mu, \sigma^2)$, $f(Y \mid \mu) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(\frac{-(Y-\mu)^2}{2\sigma^2}\right)$
Likelihood: what can we say about a parameter given data
  $l(\mu \mid Y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(\frac{-(Y-\mu)^2}{2\sigma^2}\right)$
  NOT a probability density
Prior: distribution of parameter before collecting data: $f(\mu)$
Posterior: distribution of parameter after collecting data: $f(\mu \mid Y)$
Bayes Theorem: $f(\mu \mid Y) \propto f(Y \mid \mu) f(\mu)$
  posterior proportional to likelihood times the prior

Bayes for normal data
Equations simpler if use precision = 1/variance, not variance
  $\tau = 1/\sigma^2$
Prior: What do I believe about $\mu$ before observing Y?
  convenient: $\mu \sim N(\mu_0, \tau_0)$
Likelihood: $Y \mid \mu \sim N(\mu, \tau)$
Posterior: $\mu \mid Y \sim N(\mu_1, \tau_1)$
  $\mu_1 = \frac{\mu_0 \tau_0 + Y \tau}{\tau_0 + \tau}$, $\tau_1 = \tau_0 + \tau$
Can rewrite $\mu_1$ as $\mu_0 \frac{\tau_0}{\tau_0 + \tau} + Y \frac{\tau}{\tau_0 + \tau}$
Or $\mu_1 = \mu_0 \frac{\sigma^2}{\sigma_0^2 + \sigma^2} + Y \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}$
i.e., posterior mean is a weighted average of data value and prior mean
  weights are the relative precision of each

Bayes for normal data
Comparison:
Frequentist: $\hat{\mu} = Y$
  maximum likelihood estimator: max $l(\mu \mid Y)$
Bayesian: Have the posterior distribution,
  use mode, median or mean as the estimate of $\mu$
  Here, all the same
  $\hat{\mu} = \mu_0 \frac{\tau_0}{\tau_0 + \tau} + Y \frac{\tau}{\tau_0 + \tau}$
Conceptually, two sources of information about $\mu$
  the prior distribution
  the data

Bayes for normal data
Some numbers:
  $\mu_0$   $Y$   $\sigma_0$   $\sigma$   $\tau_0$   $\tau$   $\mu_1$
  5         9     10           1          0.01       1        8.96
  5         9     1            1          1          1        7
  5         9     1            10         1          0.01     5.04
  5         1     1            1          1          1        3
Notice that Bayes estimate has smoothed the data
  Towards the prior mean, $\mu_0$
Amount of smoothing depends on $\sigma$ for data and $\sigma_0$ for prior
Often, not clear how to choose prior
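The numbers in the table follow directly from the precision-weighted formula. A minimal sketch (the function name is made up) that reproduces the $\mu_1$ column from $\mu_0$, $Y$, $\sigma_0$ and $\sigma$:

```python
def normal_posterior(mu0, sigma0, y, sigma):
    """Posterior mean and precision of mu for one normal observation with known sigma."""
    tau0 = 1.0 / sigma0 ** 2     # prior precision
    tau = 1.0 / sigma ** 2       # data precision
    tau1 = tau0 + tau            # posterior precision
    mu1 = (mu0 * tau0 + y * tau) / tau1
    return mu1, tau1


# Rows of the table: (mu0, Y, sigma0, sigma)
rows = [
    (5.0, 9.0, 10.0, 1.0),
    (5.0, 9.0, 1.0, 1.0),
    (5.0, 9.0, 1.0, 10.0),
    (5.0, 1.0, 1.0, 1.0),
]

posterior_means = []
for mu0, y, sigma0, sigma in rows:
    mu1, tau1 = normal_posterior(mu0, sigma0, y, sigma)
    posterior_means.append(mu1)
    print(f"mu0={mu0:g} Y={y:g} sigma0={sigma0:g} sigma={sigma:g} -> mu1={mu1:.2f}")
```

A vague prior (large $\sigma_0$) leaves the estimate near Y; noisy data (large $\sigma$) pull it almost all the way to $\mu_0$.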
Empirical Bayes for spatial data
Want to smooth observations towards global average (or average of neighbors)
Bayes smooths towards the prior mean
So use global average as the prior mean
  Variations allow for local averages
Approach called Empirical Bayes
Still have to choose $\sigma_0$: controls amount of smoothing
  Fit mixed model: $Y_i = \mu + \alpha_i + \varepsilon_i$, $\alpha_i \sim N(0, \sigma_a^2)$, $\varepsilon_i \sim N(0, \sigma_e^2)$
  Use random effect variance, $\sigma_a^2$, as $\sigma_0^2$

[Figure: Empirical Bayes: example]

[Figure: Empirical Bayes: Var Y]

[Figure: Empirical Bayes: smoothed estimates]

[Figure: Empirical Bayes: smoothed estimates, phat vs. psmooth]

Empirical Bayes: smoothed estimates
Fit mixed model:
  estimated overall mean = 0.201
  estimated sd $\alpha_i$ = 0.00674
Range of sd $Y_i$: 0.00516 to 0.103
Examples of smoothing:
  Var $Y_i$     Weight for $Y_i$   $Y_i$    $\hat{\mu}_i$
  0.0107        0.004              0.667    0.203
  0.00034       0.118              0.223    0.204
  0.0000266     0.631              0.213    0.209
Example calculation: Var $Y_i$ = 0.00034, $Y_i$ = 0.2229
  $\hat{\mu}_i = 0.2229 \times \frac{0.00674^2}{0.00034 + 0.00674^2} + 0.2015 \times \frac{0.00034}{0.00034 + 0.00674^2}$
  $= 0.2229 \times 0.1178 + 0.2015 \times 0.8821 = 0.2040$

[Figure: Empirical Bayes: Weights on global average]
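The example calculation can be reproduced with a small sketch. It takes the mixed-model estimates as given (overall mean 0.2015 as used in the worked calculation, sd of $\alpha_i$ = 0.00674) rather than fitting the model, and the function name is made up. The $Y_i$ values in the table are rounded, so the first and third rows match only to the precision shown.

```python
def eb_smooth(y, var_y, mu, var_a):
    """Empirical Bayes estimate: weight on Y_i is var_a / (var_a + Var Y_i)."""
    weight_y = var_a / (var_a + var_y)
    return weight_y, weight_y * y + (1.0 - weight_y) * mu


mu_hat = 0.2015          # estimated overall mean, as used in the worked calculation
var_a = 0.00674 ** 2     # estimated random effect variance (sd alpha_i = 0.00674)

# (Var Y_i, Y_i) for the three example regions
regions = [(0.0107, 0.667), (0.00034, 0.2229), (0.0000266, 0.213)]

results = []
for var_y, y in regions:
    weight_y, smoothed = eb_smooth(y, var_y, mu_hat, var_a)
    results.append((weight_y, smoothed))
    print(f"Var Y = {var_y:g}, Y = {y:g}: weight = {weight_y:.3f}, smoothed = {smoothed:.4f}")
```

The region with the largest variance gets almost no weight on its own observation, so its extreme value of 0.667 is pulled nearly all the way to the overall mean.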
Empirical Bayes for spatial count data
Many aspects easier when data are counts
  number of disease cases, number of deaths per region
  models for counts do not have $\sigma$ or $\sigma_0$
huge area of research and application
  Bivand, chapter 8, describes various models
  Take home: different models give very similar smoothed estimates
  Big difference from raw (unsmoothed) estimates
Empirical Bayes does use data twice
  General sense that resulting maps are “too smooth”
  does not fully account for all sources of variability
  Requires Bayes with more complicated models
Applications to count data usually require numeric approximation of the posterior
  MCMC: Markov Chain Monte Carlo