GI Systems and Science
February 6, 2012

Points to Cover
  The concept of a data stream
  Data encoding
  Database Management System (DBMS)
  Data editing
  Finding and correcting errors

Concept of Data Stream
  Formulating research question
  Collecting data
  Creating data model
  Entering data into a GIS
  The process of data encoding and editing is also known as the ‘data stream’

Concept of Data Stream
  Figure 4.4 (Source: Heywood et al., 2011)
  Figure 5.1 (Source: Heywood et al., 2011)

Concept of Data Stream
  The specific steps of this process and the methods used depend on:
    Source of spatial data
      Analogue data
      Digital data
    Type, format, scale or resolution of spatial data
  The need for and importance of universal data standards

Data Encoding
  Data encoding is the process of getting data into the computer
  Various methods of data input exist, depending on the data source, project requirements and available resources

  Source type    Data source         Applicable encoding methods  Resulting data type
  Analogue data  Tabular data        Keyboard entry               Attribute data
                                     Text scanning                Attribute data
                                     Geocoding                    Vector data
                                     Geolocating                  Vector data
                 Map data            Manual digitizing            Vector data
                                     Automatic digitizing         Vector data
                                     Scanning                     Raster data
                 Aerial photographs  Manual digitizing            Vector data
                                     Automatic digitizing         Vector data
                                     Scanning                     Raster data
                 Field survey data   Keyboard entry               Vector data

Data Encoding
  All data in analogue format need to be converted to digital form
  Digital data do not need to be encoded but, more often than not, must be converted into a suitable format

  Source type    Data source         Applicable encoding methods
  Digital data   Tabular data        Digital data transfer, which may include data conversion
                 Map data
                 Aerial photographs
                 Field survey data

Data Encoding
  Manual digitizing
    Most common method of encoding spatial features from paper maps and hard-copy aerial photos
    Box 5.1, page 138: Using a manual digitizing table
    Key step: registration of the map using control points
  Figure 5.2 (Source: Heywood et al., 2011)

Data Encoding
  Manual digitizing
    Two modes of digitizing
      Point mode
      Stream mode
    The accuracy of data generated by this method depends on many factors, including ‘hand-wobble’
    Quite time-consuming and expensive
    ArcGIS (ArcInfo version) has ‘on-screen’ digitizing capabilities
      Consult the Editing and data compilation section of the ArcGIS Help files

Data Encoding
  Scanning
    One of the automatic digitizing methods
    Produces raster data
    A useful way to create background images for on-screen digitizing
    Box 5.3, page 143: Using a scanner
    ArcGIS does not have scanning capabilities
  Figure 5.7 (Source: Heywood et al., 2011)

Data Encoding
  Electronic data transfer
    Includes downloading data from GPS, survey and monitoring equipment
      These data may require geolocation
    More often than not involves converting the data to a format understood by your GIS
      Check the ArcGIS Help files to find which data formats are supported
    Finding spatial data on-line
      Box 5.8 on page 154 of the text
      U of R Library

Data Editing (Cleaning)
  Once entered, data almost always need to be corrected and manipulated to ensure that their structure is consistent with your GIS requirements or capabilities
  Issues that may have to be addressed at this stage of the GIS project
    Correcting errors in the data
    Re-projecting data from different sources to a common projection
    Generalizing complex data to provide a simpler dataset
    Matching and joining adjacent map sheets once the data are in digital form

Data Editing (Cleaning)
  Finding and correcting errors
    Errors in input data may derive from three main sources
      In the source data
      Introduced during encoding
      Propagated during data transfer and conversion
    Ways to check for errors in attribute data
      Checking for outliers
      Checking internal consistency
      Constructing trend surfaces
    Box 5.9 on page 156 of the text

Data Editing (Cleaning)
  Finding and correcting errors
    Possible errors in spatial data
      Vary depending on the data model and the method of data encoding
    Possible errors in vector data
      Created in the process of digitizing
      ArcGIS (ArcInfo version) has a suite of editing tools for removing errors in vector data
    Possible errors in raster data
      Missing entities
      Noise
      Usually corrected by filtering

Data Editing (Cleaning)
  Re-projection and transformation
    Data derived from maps drawn on different projections need to be converted to a common projection system before they can be combined or analyzed
    Data from different sources referenced using different coordinate systems need to be transformed to a common coordinate system
    Project tool in ArcGIS

Data Editing (Cleaning)
  Generalization
    Data derived from larger-scale maps should be generalized to be compatible with data derived from smaller-scale maps
    ○ Vector data
        Weeding out superfluous points from lines so that the general shape of the lines is preserved
    ○ Raster data
        Aggregation of cells with the same attribute values
        Filtering
  Reflection Box on page 171
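
As an illustration of the vector generalization step (weeding superfluous points from a line while preserving its general shape), here is a minimal sketch in Python. It uses the Douglas-Peucker algorithm, one common line-simplification method; the slides do not name a specific algorithm, so this choice is an assumption. The function names `point_segment_distance` and `weed_line`, the `tolerance` parameter, and the sample coordinates are hypothetical and are not taken from ArcGIS or from the text.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def weed_line(points, tolerance):
    """Douglas-Peucker sketch: keep only vertices that deviate from the
    simplified line by more than `tolerance` (in map units)."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior vertex farthest from the chord joining the end points.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        # Every interior vertex is superfluous at this tolerance.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = weed_line(points[:index + 1], tolerance)
    right = weed_line(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical digitized line with small wobbles around two straight runs.
line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
simplified = weed_line(line, tolerance=0.5)
print(simplified)
```

With a tolerance of 0.5 map units, the eight digitized vertices reduce to four: `(0, 0)`, `(2, -0.1)`, `(3, 5)` and `(7, 9)`. A larger tolerance removes more vertices, which is the trade-off to consider when matching data from a larger-scale map to a smaller-scale one.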