Introduction to GIS
Lecture 2:
Part 1. Understanding Spatial Data Structures
Part 2. Legend editing, choropleth mapping and layouts
All lecture material by Austin Troy (c) 2003 except where noted

Part 1. Understanding Spatial Data Structures

Perception, Semantics, and Space
• How do we deal with representing semantic constructions of spatial objects, like “mountain,” “river,” “street,” “city”?
• How about representing more conceptual semantic constructions like “temperature,” “migration pattern,” “traditional homeland,” “habitat,” “geographic range,” etc.?
• Answer: we have various data models which use different abstractions of reality

Entities and Fields
• There are two general approaches for representing things in space:
– Entities/objects: precise location and dimensions and discrete boundaries (remember, points are abstractions).
– Fields, or phenomena: a Cartesian coordinate system where values vary continuously and smoothly; these values exist everywhere but change over space

Entities and Boundaries
• There are two general types of boundaries, bona fide and fiat (D. Mark, B. Smith, A. Varzi)
• Pure bona fide boundaries represent real discontinuities in the world, like roads, faults, coastlines, power lines, rivers, islands, etc.
• Pure fiat boundaries are a human cognitive or legal construction, based on a categorization, such as administrative unit, nation state, hemisphere
• Some have elements of both, like soil type areas

Two major data models
• Entity approach roughly corresponds with the vector model
• Field approach roughly corresponds with the raster model
• Any geographic phenomenon can be represented with both, but one approach is usually better for a particular circumstance

Raster
• Spatial features modeled with grids, or pixels
• Cartesian grid whose cell size is constant
• Grid cells identified by row and column number
• Grid cells are usually square in shape
• Area of each cell defines the resolution
• Raster files store only one attribute, in the form of a “z” value, or grid code.
• Consider the contrary…

Vector
• Vector layers either represent:
– Points (no dimensions)
– Lines, or “arcs” (1 dimension) or
– Areas, or “polygons” (2 or 3 dimensions)
• Points are used to define lines and lines are used to scribe polygons
• Each point, line or polygon is a “feature,” with its own record and its own attributes

Raster and Vector representations of the same terrain
Raster: great for surfaces
Vector: limited with surfaces

Raster and Vector representations of the same land use

Raster and Vector representations of the same land use: closer in

Vector vs. Raster: bounding
Raster: bad with bounding
Vector: boundary precision

Vector vs. Raster: Sample points
Cancer rates across space

Moving between vector and raster
• In ArcView and ArcGIS, we can convert vector layers to grids, based on an attribute, or grids to vector layers
• The disadvantage of vector to raster is that boundaries can be imprecise because of cell shape
• Each time you convert, you introduce more error too

WHEN TO USE RASTER OR VECTOR???

Raster data analysis is better for representing phenomena:
• where boundaries are not precise
• that occur everywhere within a frame and can be expressed as continuous numeric values
• where change is gradual across space
• where the attribute of a cell is a function of the attributes of surrounding cells

Raster technical advantages:
• Simple file structure
• Simple overlay operations
• Small, uniform unit of analysis
Raster technical disadvantages:
• Big file size, especially for fine-grained data
• Difficult and error-prone reprojections
• Square pixels are unrealistic

Vector analysis is better:
• Where there are definable regions
• Where the relative position of objects is important
• Where precise boundary definition is needed
• Where multiple attributes are being analyzed for a given spatial object
• For modeling of routes and networks
• For modeling regions where multiple overlapping attributes are involved
• E.g.: units with man-made boundaries (cities, zip codes, blocks), roads, rivers

Vector technical advantages:
• Smaller file size (in general)
• More graphically interpretable
• Allows for topology (see further on)
Vector technical disadvantages:
• Complicated file structure
• Minimum mapping units are inconsistent between overlapping layers

Specific Vector Usages
• All legal and administrative boundaries (zip codes, states, property lines, land ownership)
• Building footprints and 3-D models
• Roads
• Bedrock geology
• Pipelines, power lines, sewer lines
• Flight paths and transportation routes
• Coastlines

Specific Raster Usages
• Terrain modeling where micro-locational variability is present and matters
• Groundwater modeling, where surface flow outside of channels is important
• Representation of slope and aspect
• Representations of distance and proximity to features
• Spatial representation of probabilities (logit)
• Modeling phenomena in nature with continuous spatial variability and numeric attributes, like soil moisture, depth to bedrock, percent canopy cover, vegetative greenness index, species richness index

Tossups
• In many cases, though, the choice between raster and vector may not be so clear.
• Often it depends on the application
• The following are some examples where you could go either way:

Terrain
• Vector-based models used for terrain, including contours and TIN
– Problem: creates distinct terrain entities that distort reality: terraces and triangular facets
• Raster-based grids are more commonly used
– They are optimal for showing spatial microvariation in elevation, although they still have the problem of being like miniature “steps”
– Lattices deal with this through interpolation

Soil
• Soil type: vector
– Soil types are meant to represent discrete and homogeneous areas and are qualitative. There is no “slight gradation” between soil types like with pH
• Soil pH: raster
– pH is numeric, not categorical, and that number may vary slightly within a single soil type polygon
– If pH were turned into categories, like High, Medium and Low, vector might be better

Weather
• Weather station data: vector, coded with points
• Average precipitation surface: raster interpolation of points
• Average precipitation contours: vector lines
• Both are interpolations, but one may be more accurate in a given situation
• Downside of contours: terrace effect, fewer intervals, more categorical

Rivers
• Most people think of a river as a discretely bounded entity, hence vector
• What about where the river size fluctuates seasonally, e.g. desert rivers?
• Or where the location of the river bed changes slowly and gradually over the years?
• Or where the river becomes a delta, and the distinction between “river” and “swamp” becomes fuzzy?
• Or where the river has a certain probability of flowing or being dry at any given location and time?

Rivers
• Depends on the type of analysis being done
• With vector, we can do network modeling of a stream and river system, but only in the arcs
– Vector stream model can take advantage of topologically enabled analysis tools
• With raster, we can do surface flow modeling
– More realistic, because when it rains water flows everywhere, not just in channels; shows accumulation
– Think of every piece of land as a mini stream channel

Metropolitan Areas
• No official administrative boundary for this
• Where does one metro area begin and another end? Look at the New York New Jersey area.
• For a precise bounding, say for administrative purposes, use vector
• Can also include “fuzzy boundaries”
• To represent a gradual change from one urban area to another, use raster

Vegetation Mapping
• Vector works well for modeling vegetation stand type where categories are broad, e.g. mixed conifer, deciduous hardwood
• Raster works better where there is micro-locational heterogeneity in species distribution
• Raster also works better for representing ecotones, or edges between two stands
• The more specific and variable the classification, the more likely the raster approach will be needed

Part 2.
Legend editing, choropleth mapping, and layouts

Visual Analysis
• The most intuitive form of vector analysis is visual analysis, where we code features with colors or symbols to deliver information
• Frequently, we code features by an attribute value and let the color or symbol express the attribute value
• Understanding legend editing and map classification is critical to making maps that effectively deliver information

Mapping of Attribute Data
In GIS, each feature can have a number of attributes attached to it (e.g. land parcel >> property ID, assessed value, square footage)
We can map out these attribute values by their corresponding geography
Two basic approaches for classifying the data:
1. Quantities approach
2. Category approach

Mapping of Attribute Data
Quantity approach: applies to numeric attributes that are ordinal (have order to them); this means one value is greater than or less than another; good for continuous data.
Category approach: applies to categorical data, where the categories can have, but don’t need to have, order.
If they do have order, the category approach ignores that order
The same layer can have some quantitative and some categorical attributes

Mapping of Attribute Data
Category approach, example: vegetation type

Mapping of Attribute Data
Quantity approach, example: population

Mapping Categories
This is the simplest type of mapping: we are simply assigning a different color or symbol to each feature with a given category value
Examples: vegetation types, land use, soil types, geology types, forest types, party voting maps, land management agency, recategorizations of numeric data (“bad, good, best” or “low, medium, high”). Can you think of any others?

Mapping Categories
To map categories in ArcGIS, we simply double click on the layer in the TOC and, in “layer properties,” click on the “symbology” tab
Generally, we will choose “Categories>> Unique values”
Then we choose our values field that contains the attribute and then click the “Add all values” button

Mapping Categories
The symbology in the last slide gives us conservation lands, categorized by type of ownership

Mapping Categories
Often categories must be aggregated and redefined: this land use map had over 110 categories that were condensed to 12

Mapping Categories
To do this, we must use the “group values” function in the symbology properties window
We can then give that grouping a label
In this case 1262, 1263, 1264, 1265, etc. refer to different subcategories of commercial land use
This classification is saved when I save my ArcMap Document

Quantity Mapping
This is more complex, because there are so many ways to map out quantities
Mapping options depend on the feature type:
• For points, lines and polygons, we can darken or lighten the color to express magnitude: this is called graduated color, or color ramping
• For lines and points we can increase symbol size to express greater magnitude: this is called graduated symbol; we can do this because points and lines have fewer than 2 dimensions

Choropleth Mapping
A thematic mapping technique that displays a quantitative attribute using ordinal classes applied as uniform symbolism over a whole areal feature. Sometimes extended to include any thematic map based on symbolism applied to areal objects.
-Nick Chrisman
A map that shows numerical data (but not simply "counts") for a group of regions by (i) classifying the data into classes and (ii) shading each class on the map.
-Keith Clarke

Graduated Color
In ArcGIS layer properties>>symbology, we choose Quantities>>graduated color
We then choose a value to represent
In this case we choose median house value
It automatically chooses five classes for the data

Graduated Color
The resulting map shows high housing value areas with dark colors and low with light

Graduated Color
In that case we used 5 classes.
Changing the number of classes changes the information delivered; more classes: more info, but harder to see differences
3 classes for median value

Graduated Color
In that case we used 5 classes. Changing the number of classes changes the information delivered; more classes: more info, but harder to see differences
15 classes for median value

Graduated Color
The Classification Method also affects how the mapped attributes will look. ArcGIS normally defaults to the Jenks, or natural breaks, method
These are the breaks it makes, based on the distribution of the data
(Figure axis: small to large)

Graduated Color
Now, here’s an equal interval approach. Notice how all the breaks are evenly spaced. With a fairly normal distribution of data, this is usually OK

Graduated Color
Here’s what the same distribution looks like with only 5 equal intervals.

Graduated Color
However, when the distribution is skewed, or there are significant outliers, then equal interval is problematic because most intervals have no data in them.
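The empty-class problem with equal intervals can be sketched in a few lines of Python. This is only an illustration, not how ArcGIS computes its breaks: the function names are hypothetical and the vacancy counts are made up.

```python
def equal_interval_breaks(values, n_classes):
    """Upper bounds of n_classes equally wide classes spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * (i + 1) for i in range(n_classes)]


def class_index(value, breaks):
    """Index of the first class whose upper bound is >= value."""
    for i, upper in enumerate(breaks):
        if value <= upper:
            return i
    return len(breaks) - 1  # guard against rounding at the top break


def class_counts(values, breaks):
    """Number of observations that fall into each class."""
    counts = [0] * len(breaks)
    for v in values:
        counts[class_index(v, breaks)] += 1
    return counts


# Made-up vacancy counts per tract: most near zero, one large outlier.
vacant = [0, 0, 1, 1, 2, 2, 3, 3, 4, 100]
breaks = equal_interval_breaks(vacant, 5)
counts = class_counts(vacant, breaks)
print(breaks)  # class upper bounds
print(counts)  # observations per class
```

With these numbers, nine of the ten tracts land in the first class and three of the five classes are empty, so a map shaded this way would show almost no variation.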
Here’s an example, with the number of vacant houses per tract: most have nearly none, but a very few have a lot

Graduated Color
This map of vacant properties tells us almost nothing, because almost all the records fall into the first class

Graduated Color
Notice how with natural breaks there are now more classes on the left side, where most of the data are

Graduated Color
This map, made with Natural Breaks, is more intelligible

Graduated Color
There is a similar approach to Natural Breaks called Quantile. This method sets class boundaries so each class has equal numbers of observations in it

Graduated Color
This essentially sets the class boundaries so as to maximize the perceived variation in the map, as we see here
Natural Breaks is similar, but does not necessarily result in an equal number of data points in each class; rather it uses Jenks' Goodness of Variance Fit (GVF) statistic

Graduated Color
Graduated color can also be applied to points. Here are houses displayed by sales price
(Figure panels: Natural breaks; Equal interval)

Graduated Symbol
Since points and lines are not dimensionally realistic, the symbols representing them can also be graduated.
Here the size of the dot represents the house price

Graduated Symbol
The same thing can also be done with lines. For instance, the width of a line feature showing rivers can be made to represent the flow of that river segment. For many line features, like streets, ArcGIS comes preloaded with symbol palettes that recognize the attribute codes and apply the appropriate symbol

Symbol Styles
We can also choose to “match to symbols in a palette” and then apply the “transportation.style” palette to the CFCC, or road category, attribute in our roads layer
(Figure callouts: Choose your style palette here; Must click here to match; Results in this map)

Symbol Styles
One could also manually create symbol styles for each street type. Clicking on each symbol in either the TOC or properties windows brings up a manual symbol selector. You can assign a separate one to each category. Includes many more classes of symbols that are industry standard

Symbol Styles
There is also a huge variety of industry-specific point symbols that can be assigned either by matching symbols to a predefined style or by manually assigning those symbols

Charts displayed geographically
Attributes for point, line or polygon features can also be displayed as charts on the map

Normalization
With graduated color or symbol, we can also show an attribute normalized by another attribute or expressed as a percentage of total. Here we have number of vacancies per tract as a percentage of total households.
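The arithmetic behind normalization is a simple ratio. A minimal sketch, assuming hypothetical tract names and made-up counts (the helper name is also invented for illustration):

```python
def percent_of_total(numerator, denominator):
    """Return numerator as a percentage of denominator, or None if denominator is 0."""
    if denominator == 0:
        return None  # avoid dividing by zero for empty tracts
    return 100.0 * numerator / denominator


# Made-up tracts: (name, vacant units, total households)
tracts = [("A", 50, 1000), ("B", 50, 200), ("C", 0, 0)]
rates = {name: percent_of_total(v, h) for name, v, h in tracts}
print(rates)
```

Tracts A and B have the same raw count of vacancies, but their normalized rates differ by a factor of five, which is the difference the normalized map brings out.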
Without normalization, we are only tracking the total number.
(Figure callouts: numerator, denominator)

Layouts
• You can very simply create a map for layout in ArcGIS by clicking View>>Layout view.
• Layouts are designed to be cartographically acceptable, which means they must have the key elements of a printed map, such as scale bars, north arrows, legends and titles. These can be added from the Insert menu

Layouts
• Example layout (from lab 6)
(Figure callouts: title, north arrow, scale bar, legend)

Layouts
• Legends are edited in the Legend properties window, which can be accessed by double clicking the legend. The best way to learn about it is to try it out
Legends can show the layer name as well as intervals for quantitative data and category names for categorical data

Layouts
• You can change the names of the layers for the sake of your layout legend (since most layers have pretty unintuitive names) in the layer properties window

Layouts
• In layouts you can have detailed and highly formatted labeling and annotation. You can use an attribute field to label; this is specified in layer properties

MXD Files
• You can save your layout, along with all other preferences and settings, by saving an ArcMap Document (MXD) file.
• However, this is not saving your data, only the settings, including the layout. If you move the MXD, you must move the layers with it.
This is one reason why a geodatabase is easier than multiple shapefiles
To save, just go to File>>Save As

Layer Files
• Layer (.lyr) files save all your settings and preferences for one single file. It is primarily for saving legend settings. So, for instance, if I have a layer with 300 land use categories, and I create a legend classification that regroups them into 30 categories, each with a special color or hatching, I can save that as a layer file.
• Once created, opening a layer file will open the data layer with all the preferences saved. You can move the data around without moving the layer file as long as both are on the same system.

Layer Files
• This is done in ArcCatalog, by right clicking and clicking “create layer.” Then I can create the legend preferences in ArcCatalog. Then, double clicking in ArcCatalog will give me the layer properties, which can be changed