GEOG 60 – Introduction to Geographic Information Systems
Professor: Dr. Jean-Paul Rodrigue

Topic 4 – Geographical Data Analysis
A – The Nature of Spatial Analysis
B – Basic Spatial Analysis

A The Nature of Spatial Analysis
■ 1. Spatial Analysis and its Purpose
■ 2. Spatial Location and Reference
■ 3. Spatial Patterns
■ 4. Topological Relationships

1 Spatial Analysis and its Purpose
■ Conceptual framework
• Search for order amid disorder.
• Organize information into categories.
■ Method
• Inducing or deducing conclusions from spatially related information.
• Deduction: deriving a conclusion from a model or a rule.
• Induction: learning new concepts from examples.
• Spatial analysis as a decision-making tool.
• Helps the user make better decisions.
• Often involves the allocation of resources.
■ Requirements
[Figure: Information, Encoding, Media, Methods, Message]
• 1) Information to be analyzed must be encoded in some way.
• 2) Encoding implicitly requires a spatial language.
• 3) Some media must support the encoded information.
• 4) Qualitative and/or quantitative methods are needed to perform operations on the encoded information.
• 5) Ways to present the results in an explicit message.

[Figure: Spatial Analysis and GIS among related fields: Remote sensing, Geomorphology, Climatology, Quantitative methods, Biogeography, Cartography, Soils, Human Geography, Historical, Political, Economic, Behavioral, Population]

[Figure: Mapping Deaths from Cholera, London, 1854 (Snow Study)]

■ Data Retrieval
• Browsing; windowing (zoom-in & zoom-out).
• Query window generation (retrieval of selected features).
• Multiple map sheets observation.
• Boolean logic functions (meeting specific rules).
■ Map Generalization
• Line coordinate thinning of nodes.
• Polygon coordinate thinning of nodes.
• Edge-matching.
■ Map Abstraction
• Calculation of centroids.
• Visual editing & checking.
• Automatic contouring from randomly spaced points.
• Generation of Thiessen / proximity polygons.
• Reclassification of polygons.
• Raster to vector / vector to raster conversion.
■ Map Sheet Manipulation
• Changing scales.
• Distortion removal/rectification.
• Changing projections.
• Rotation of coordinates.
■ Buffer Generation
• Generation of zones around certain objects.
■ Geoprocessing
• Polygon overlay.
• Polygon dissolve.
• “Cookie cutting”.
■ Measurements
• Points: total number, or number within an area.
• Lines: distance along a straight or curvilinear line.
• Polygons: area or perimeter.
■ Raster / Grid Analysis
• Grid cell overlay.
• Area calculation.
• Search radius.
• Distance calculations.
■ Digital Terrain Analysis
• Visibility analysis of viewing points.
• Insolation intensity.
• Grid interpolation.
• Cross-sectional viewing.
• Slope/aspect analysis.
• Watershed calculation.
• Contour generation.

3 Spatial Patterns
■ Relativity of objects
[Figure: Size, Form, Orientation, Scale, Proximity]
• An object is defined in relation to another.
• These relations create spatial patterns.
■ Main patterns
• Size.
• Distribution/spacing: uniform, random and clustered.
• Proximity.
• Density: dense and dispersed.
• Shape.
• Orientation.
• Scale.
■ Spatial autocorrelation
• Set of objects that are spatially associated.
• Relationship in the process affecting the object.
• Negative autocorrelation.
• Positive autocorrelation.
[Figure: point patterns labeled Uniform, Clustered, Positive autocorrelation, Random]

4 Topological Relations
■ Proximity
• Qualitative expression of distance.
• Links spatial objects by their mutual locations.
• Nearest neighbors.
• Buffer around a point or a line.
■ Directionality
■ Adjacency
• Links contiguous entities.
• Entities share at least one common boundary.
■ Intersection
■ Containment
• Links entities to a higher order set.
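The proximity relations listed above (nearest neighbors, a buffer around a point) can be sketched in a few lines of Python. This is only an illustration: the place names and coordinates are hypothetical, and distance is taken as straight-line Euclidean distance on a flat plane, which is an assumption the slides do not state.

```python
import math

# Hypothetical point features (name -> planar x, y coordinates).
places = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (10.0, 0.0)}


def nearest_neighbor(name, points):
    """Return the name of the point closest to the point called `name`."""
    origin = points[name]
    others = {k: p for k, p in points.items() if k != name}
    return min(others, key=lambda k: math.dist(origin, others[k]))


def within_buffer(center, radius, points):
    """Return the sorted names of the points inside a circular buffer."""
    return sorted(k for k, p in points.items() if math.dist(center, p) <= radius)


print(nearest_neighbor("A", places))            # B
print(within_buffer((0.0, 0.0), 6.0, places))   # ['A', 'B']
```

A GIS would normally run these queries on projected coordinates with a spatial index; the brute-force comparison here only shows the logic.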
[Figure: City A, City B]
■ Connectivity
[Figure: network with nodes 1 to 6]
• Adjacency applied to a network.
• Must follow a path, which is a set of linked nodes.
• Shortest path.
• All possible paths.
■ Intersection
• What two geographical objects have in common.
■ Union
• Summation of two geographical objects.
■ Complementarity
• What is outside of the geographical object.
[Figure: Arable land, Flat land, Suitable for agriculture, Land, Non arable land]

B Elementary Spatial Analysis
■ 1. Statistical Generalization
■ 2. Data Distribution
■ 3. Spatial Inference

1 Statistical Generalization
■ Maps and statistical information
• It is important to display the underlying distribution of the data accurately.
• Data is generalized to search for a spatial pattern.
• If the data is not properly generalized, the message may be obscured.
• A balance is needed between remaining true to the data and generalizing enough to reveal spatial patterns.
• Thematic maps are a good example of the issue of statistical generalization.
[Figure: Data (15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21); Classification (0-30, 31-65, 65-); Spatial Pattern]
■ Number of classes
• Too few classes: the contours of the data distribution are obscured.
• Too many classes: the map becomes confusing.
• Most thematic maps have between 3 and 7 classes.
• Eight shades of gray are generally the most that can be told apart.
■ Classification methods
• Thematic maps developed from the same data and with the same number of classes will convey a different message if the ranging method is different.
• Each ranging method suits a particular data distribution.

2 Data Distribution
■ Histogram
• The first step in producing a thematic map.
• Shows how the data is distributed.
• Uses basic statistics such as the mean and standard deviation.
• A histogram plots frequency against value.
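As a rough illustration of the points above, the sketch below takes the twelve sample values from the Statistical Generalization figure, computes the basic statistics mentioned for the histogram step, and counts how many values fall in each class under three ranging methods. The function names are illustrative. Two assumptions are made: the figure's class limits are read as 30 and 65, and the quantile rule (equal-count groups of the sorted values) is only one of several possible conventions.

```python
from statistics import mean, pstdev

# Sample values shown in the Statistical Generalization figure.
values = [15, 25, 88, 34, 56, 7, 92, 61, 45, 77, 39, 21]


def equal_interval_breaks(data, n_classes):
    """Upper limits of all but the last class, each class (H - L) / C wide."""
    low, high = min(data), max(data)
    width = (high - low) / n_classes
    return [low + width * i for i in range(1, n_classes)]


def quantile_breaks(data, n_classes):
    """Upper limits chosen so each class holds the same number of values.

    Assumes len(data) is divisible by n_classes, as it is here.
    """
    ordered = sorted(data)
    n = len(ordered)
    return [ordered[(n * i) // n_classes - 1] for i in range(1, n_classes)]


def count_per_class(data, breaks):
    """Frequency of values per class; a value equal to a break stays below it."""
    counts = [0] * (len(breaks) + 1)
    for v in data:
        counts[sum(v > b for b in breaks)] += 1
    return counts


print("mean:", round(mean(values), 2), "std:", round(pstdev(values), 2))
print("figure classes:", count_per_class(values, [30, 65]))                       # [4, 5, 3]
print("equal interval:", count_per_class(values, equal_interval_breaks(values, 3)))  # [5, 4, 3]
print("quantiles:     ", count_per_class(values, quantile_breaks(values, 3)))        # [4, 4, 4]
```

With three classes in each case, the same data yields a different number of features per class, which is why the choice of ranging method changes the message a thematic map conveys.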
[Figure: histograms of Uniform, Normal and Exponential distributions; axes: Value, Frequency]
■ Equal interval
[Figure: classes C1 to C4 over the range of values from L to H; axes: Value, Frequency]
• Each class has an equal range of values.
• Class width is the difference between the highest and the lowest value divided by the number of categories: (H - L) / C.
• Easy to interpret.
• Good for uniform distributions and continuous data.
• Inappropriate if data is clustered around a few values.
■ Quantiles
[Figure: classes C1 to C4 with counts n(C1) to n(C4); axes: Value, Frequency]
• Equal number of observations in each category.
• n(C1) = n(C2) = n(C3) = n(C4).
• Relevant for evenly distributed data.
• Features with similar values may end up in different categories.
■ Equal area
• Classes are divided to have a similar area per class.
• Similar to quantiles if the size of the units is the same.
■ Standard deviation
[Figure: classes C1 to C4 with cutpoints at -1STD, X, +1STD; axes: Value, Frequency]
• The mean (X) and standard deviation (STD) are used to set cutpoints.
• Good when the distribution is normal.
• Displays features that are above and below average.
• Very different (abnormal) elements are shown.
• Does not show the values of the features, only their distance from the average.
■ Arithmetic and geometric progressions
[Figure: classes C1 to C4; axes: Value, Frequency]
• The width of the class intervals increases at a non-linear rate.
• Good for J-shaped distributions.
■ Natural breaks
[Figure: classes C1 to C4; axes: Value, Frequency]
• Complex optimization method.
• Minimizes the sum of the variance within each class.
• Good for data that is not evenly distributed.
• Statistically sound.
• Difficult to compare with other classifications.
• Difficult to choose the appropriate number of classes.
■ User defined
• The user is free to select the class intervals that best fit the data distribution.
• A last-resort method, because the choice of intervals is conceptually difficult to justify.
• Analysts with experience are able to make a good choice.
• Also used to get round numbers after using another type of classification method.
• For example, $5,000 - $10,000 instead of $4,982 - $10,123.
■ Using classification
• Classification can be used to deliberately confuse or hide a message.
[Figure: “no problems” - Equal steps; “there is a problem” - Quantiles]
[Figure: “everything is within standards” - Standard deviation]

3 Spatial Inference
■ Filling the gaps
• Sampling shortens the time necessary to collect data.
• Requires methods to “fill the gaps”.
■ Interpolation and extrapolation
• Data at non-sampled locations can be predicted from sampled locations.
• Interpolation: predict missing values when the bounding values are known.
• Extrapolation: predict missing values outside the bounding area, where only one side is known.
[Figure: Spatial Inference: Interpolation and Extrapolation. Panel 1: Height against Location, with samples and an interpolation line. Panel 2: Delay at the traffic light against Number of vehicles, with samples, an interpolation line and an extrapolation line]
[Figure: Spatial Inference: Best Fit. Sex Ratio against Longitude; y = 0.1408x + 116.69, R² = 0.6779]
■ Aggregation
• Data within a boundary can be aggregated.
• Often to form a new class.
■ Conversion
• Data from a sample set can be converted for a different sample set.
• Changing the scale of the geographical unit.
• Switching from one set of geographical units to another.
[Figure: Spatial Inference: Aggregation and Conversion. Labels: Pine Trees, Poplar Trees, Boreal Forest; District A, District B, District B1, District B2]
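The difference between interpolation and extrapolation described in this section can be sketched as follows. This is a minimal illustration, not the method used to produce the figures: the sample locations and heights are hypothetical, linear interpolation and an ordinary least-squares line are assumed as the inference techniques, and only the coefficients in `slide_best_fit` come from the Best Fit figure.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation: x must lie between the two bounding samples."""
    if not min(x0, x1) <= x <= max(x0, x1):
        raise ValueError("x lies outside the bounding samples (extrapolation)")
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical samples: height measured at three locations.
locations = [0.0, 10.0, 20.0]
heights = [100.0, 120.0, 110.0]

# Interpolation: location 5 is bounded by the samples at 0 and 10.
mid_height = interpolate(5.0, locations[0], heights[0], locations[1], heights[1])

# Extrapolation: location 30 is beyond the last sample, so only one side is
# known and the estimate relies on the fitted trend.
slope, intercept = fit_line(locations, heights)
beyond_height = slope * 30.0 + intercept


def slide_best_fit(longitude):
    """Best-fit line reported in the figure: y = 0.1408x + 116.69."""
    return 0.1408 * longitude + 116.69


print(mid_height, beyond_height, round(slide_best_fit(-100.0), 2))
```

The interpolated value is constrained by known values on both sides, while the extrapolated one depends entirely on the trend continuing, so it should be treated with more caution.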