This file was created by scanning the printed publication. Errors identified by the software have been corrected; however, some errors may remain.

A New Method for Evaluating Positional Map Accuracy

Michal Lodin¹ and David Skea²

Abstract.--Typically, positional map accuracy is determined from a small number of 'well defined points'. If a sufficient number of such points exist, it is assumed that they are evenly distributed over the map and represent the overall map accuracy. In this study, a method of direct examination of the geometry representing point, linear, and areal features is described. Pattern recognition procedures are used to automatically extract and match all identifiable nodes, arcs, and polygons of an accurate benchmark map and a test map. Different measures of "similarity" are computed for corresponding map elements (cut blocks and logging roads). Due to the automated approach, large datasets containing virtually all matching features are compiled. Thus, distribution, correlation, and spatial variation tests, typically not done in standard accuracy assessments, can be carried out. Various mapping methods, operationally used in maintaining the inventory map database of the British Columbia Forest Service, were tested using the developed methodology.

INTRODUCTION

The British Columbia Forest Service manages well over six thousand 1:20 000 mapsheets covering an area of approximately 83 million hectares. Forest openings resulting from timber harvesting and road construction are updated on a two-year cycle. The map update process is carried out both by Forest Service staff and through contractors. The objective of this study was to develop and test a suitable methodology to evaluate positional accuracy of map update procedures used in the Forest Inventory program.
Maps obtained by operationally available methods were compared to a highly accurate benchmark through a set of objective procedures which provide a quantitative measure of their mutual "dissimilarity".

Positional mapping accuracy standards, as adopted by most modern mapping programs, state that a percentage of all well-defined planimetric features are located within some specified distance of their geographic position with reference to a prescribed datum (Committee for Specifications and Standards, 1985). To test if a given map meets a stated standard, the locations of a number of well-defined points are sampled and compared with their true locations, as determined by some highly accurate positioning method (e.g., differential GPS). The differences between the two positions are then computed and summarized by statistics such as RMS and CSE (Circular Map Error). An outline of this methodology can be found in Maling (1989) or Merchant (1987). Another example, applied to forest mapping, is given in DMG (1991).

Unbiased selection of well-defined points is another problem. Dozier and Strahler (1983) have suggested that all possible well-defined points be extracted from the map and that a random selection of these points be used as the sample. Lack of resources to collect sufficient ground information is common (Fenstermaker, 1991).

¹ Remote Sensing Specialist, Ministry of Forests, Victoria, British Columbia, Canada.
² Systems Analyst, Ministry of Environment, Lands and Parks, Victoria, British Columbia, Canada.

METHODOLOGY

Both test sites were mapped using the most accurate method available. The resulting benchmark map is of a higher order of accuracy than that produced by the tested methodologies. The same site is then remapped using the test methodologies, and these new maps are compared to the reference map. From the observed differences, a statistical picture of the similarity of the new map compared to the reference map is derived.
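The conventional well-defined-point test outlined in the introduction comes down to simple arithmetic on coordinate differences. The following is a minimal sketch, assuming paired map and surveyed coordinates in meters; the function name and the sample values are hypothetical, and only the RMS part is shown because the text does not state how CSE is computed.

```python
import math


def rms_errors(map_pts, true_pts):
    """Return (rms_x, rms_y, rms_radial) for paired (x, y) points in meters."""
    if len(map_pts) != len(true_pts) or not map_pts:
        raise ValueError("need two non-empty point lists of equal length")
    n = len(map_pts)
    # Coordinate differences between mapped and surveyed positions.
    dx = [m[0] - t[0] for m, t in zip(map_pts, true_pts)]
    dy = [m[1] - t[1] for m, t in zip(map_pts, true_pts)]
    rms_x = math.sqrt(sum(d * d for d in dx) / n)
    rms_y = math.sqrt(sum(d * d for d in dy) / n)
    # Radial RMS combines both axes.
    rms_radial = math.sqrt(rms_x ** 2 + rms_y ** 2)
    return rms_x, rms_y, rms_radial


# Hypothetical check points: mapped positions versus GPS positions.
mapped = [(3.0, 4.0), (0.0, 0.0)]
surveyed = [(0.0, 0.0), (0.0, 0.0)]
print(rms_errors(mapped, surveyed))
```

With only a handful of check points, as in the conventional test, these statistics rest on a very small sample, which is the limitation the automated method is meant to address.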
If the reference map is considered error-free, this statistical picture would reflect the accuracy of the new maps. All tests are empirical; no effort has been made to separate different error components. In terms of GIS error taxonomy (Collins and Smith, 1994), most errors described here can be assumed to be category E1,2,5,6,[remaining subscripts illegible in the scan] (systematic and random positional errors due to data collection, input, output and interpretation). A diagram showing the developed procedure is given in Figure 1.

Figure 1.--Process diagram for automated positional map accuracy assessment. [Diagram stages: feature extraction; feature matching; difference observations (polygon boundary offsets, polygon area differences, polygon slivers, arc shape differences, polygon shape differences).]

Benchmark map compilation

Two test sites were selected, both with significant relief and both containing a large number of features to be mapped. High accuracy (as compared to the tested methods) benchmark maps were compiled for both sites using GPS ground control points and an analytical plotter. Positional accuracy of one of the benchmarks was evaluated using seventeen 'well defined points' and found to be within RMS 7.35 meters (X, Y) of their GPS field surveyed positions.

Feature extraction

Benchmark and test maps were compiled in an unstructured format (the noded and cleaned result of 'spaghetti digitizing'). Basic geometric objects were formed: a) polygons representing forest openings, b) arcs constituting roads and partial polygon boundaries, and c) nodes identifying road-road and road-forest opening intersections. Subsequently, an internal data structure, the kd-tree, was constructed (Ooi, 1990). Each node of the tree contains one point and divides its subdivision of the plane into two parts (Fig. 2, 3). End points of each line segment contain links defining the line segment and the left-to-right sorting of the nodes.
Each time an end point is found to lie within a given tolerance of an existing tree node, the order of the tree node is increased. Hence, nodes of order one represent line ends not connected to another feature. Nodes of order two are simple two-line connections. Nodes of order three or higher represent line intersections. The built segment tree is scanned to extract nodes, arcs and polygons.

Figure 2.--Points on the plane.

Figure 3.--The resulting kd-tree.

Feature matching

Automated feature extraction, matching, and differencing procedures have been developed on the Microstation platform as MDL commands. The matching process was partially hampered by the interpretive nature of the features involved. Polygon boundaries interpreted from LANDSAT imagery did not, in some cases, resemble the 'true' extent of that polygon on the benchmark map (Fig. 4), and some logging roads were simply not discernible.

Matching of extracted geometric objects is facilitated by computing specific attributes for each object and defining some type of 'distance' measure (see Haralick and Shapiro, 1992). As an example, nodes were matched based on position, the number of lines making up the intersection, and the relative angle between the lines. For each node on the test map, a range rectangle is defined. Each node on the benchmark which lies within this range rectangle is tested. If the benchmark node has the wrong number of lines, it is rejected. For the remaining nodes, a weighted measure based on the position difference and relative angles is computed, and the 'closest' node is accepted as the match. Arcs and polygons were matched using similar techniques.

Polygon matching was close to 100% effective. Node and arc matching was in the order of 80% effective. To correct mismatches, procedures to interactively view and correct node, arc, and polygon matches have been developed.
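The node-matching steps described above (range rectangle, line-count check, weighted position and angle measure) can be sketched as follows. This is an illustration under stated assumptions, not the authors' MDL implementation: the `Node` class, the rectangle half-width, the weights, and the way relative angles are compared (sorted angular gaps between lines) are all hypothetical choices, and the search is assumed to run from a test-map node over benchmark nodes.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    x: float          # easting in meters
    y: float          # northing in meters
    bearings: tuple   # direction (degrees) of each line meeting at the node


def relative_angles(node):
    """Sorted angular gaps between consecutive lines (rotation invariant)."""
    b = sorted(a % 360.0 for a in node.bearings)
    gaps = [(b[(i + 1) % len(b)] - b[i]) % 360.0 for i in range(len(b))]
    return sorted(gaps)


def match_node(test_node, benchmark_nodes, half_width=50.0, w_pos=1.0, w_ang=0.5):
    """Return the 'closest' benchmark node, or None if no candidate survives."""
    best, best_score = None, math.inf
    test_gaps = relative_angles(test_node)
    for cand in benchmark_nodes:
        dx, dy = cand.x - test_node.x, cand.y - test_node.y
        # Step 1: keep only candidates inside the range rectangle.
        if abs(dx) > half_width or abs(dy) > half_width:
            continue
        # Step 2: reject candidates with a different number of lines.
        if len(cand.bearings) != len(test_node.bearings):
            continue
        # Step 3: weighted measure of position and relative-angle difference.
        cand_gaps = relative_angles(cand)
        ang = sum(abs(a - b) for a, b in zip(test_gaps, cand_gaps))
        ang /= max(len(test_gaps), 1)
        score = w_pos * math.hypot(dx, dy) + w_ang * ang
        if score < best_score:
            best, best_score = cand, score
    return best


# Hypothetical T-shaped test node and four benchmark candidates.
test = Node(0.0, 0.0, (0.0, 90.0, 180.0))
a = Node(5.0, 5.0, (0.0, 90.0, 180.0))     # same shape, small offset
b = Node(1.0, 1.0, (0.0, 180.0))           # wrong number of lines
c = Node(200.0, 0.0, (0.0, 90.0, 180.0))   # outside the range rectangle
d = Node(3.0, 0.0, (0.0, 120.0, 240.0))    # nearer, but a different shape
print(match_node(test, [a, b, c, d]) is a)
```

The weights trade position against shape: here the nearer candidate `d` loses to `a` because its angles differ. A linear scan is used for brevity; the kd-tree described earlier would serve to restrict the candidates to the range rectangle efficiently.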
Computed difference measures were:

Node offsets - (ΔX, ΔY) in meters.

Arc and polygon closest point offsets - Offset of each arc/polygon vertex from the closest vertex of the matching arc/polygon.

Arc and polygon perpendicular bisectors - Perpendicular bisector from each line connecting two vertices to the matching line segment of the corresponding arc/polygon.

Polygon area differences - Area and perimeter differences between matched polygons.

Polygon slivers - Two corresponding boundaries of matching polygons typically cross each other a number of times. These crossings create sliver polygons. For each matched polygon, slivers are extracted and their centroid position, area and perimeter are computed.

Figure 4.--Example of matched polygons (70) and polygons rejected due to interpretation differences (73).

Tested methods and compiled datasets

Table 1 shows the fourteen mapping systems tested. The result of all mapping was a digital file in IGDS format. For analog systems, such as PROCOM II, this required an additional manual digitization step (consistent with normal practice).

Table 1.--Tested methodologies.

Source data                          Format                                            Mapping method                           Contract or in-house production
LANDSAT TM*, 1:1 000 000, 1:500 000  Analogue, transparency*                           PROCOM II                                Contract
SPOT PAN**, 1:500 000, 1:250 000     Analogue, transparency*                           PROCOM II                                Contract
LANDSAT TM (Thematic Mapper)         Digital, various levels of geometric corrections  Digital update                           In-house
SPOT Panchromatic                    Digital                                           Digital update                           In-house
SPOT Multispectral                   Digital                                           Digital update                           In-house
Aerial photography, 1:40 000         Analogue                                          Mono restitution                         Contract
Aerial photography, 1:40 000         Scanned @ 853 dpi, 1.2 m pixel size               DIAP (Digital Image Analytical Plotter)  Contract
Aerial photography, 1:40 000         Scanned, resampled to 4.8 m pixels                DVP (Digital Video Plotter)              In-house

The number of "correctly" mapped and matched features varies from one method to another.
Typically, lower order methods, such as optical projection with PROCOM II, represent the lower bound, while higher order softcopy photogrammetric methods (e.g., DIAP) produced results at the upper limit of the range. For some methodologies, the number of matched nodes ('well defined points') was as low as 10. Large data sets were collected for arc and polygon boundary offsets as well as for polygon slivers. For more detail and test results, see Skea and Lodin (1995).

Table 2.--Size of compiled datasets.

Feature type matched and evaluated                                     Dataset compiled for one test area
Node offsets (road-road and road-polygon)                              10-300 matched
Arc and polygon boundary shortest-distance and perpendicular offsets   2,000-5,000 observations in each dataset
Polygon area, position, shape                                          50-90 matched
Sliver polygons ("in" and "out" relative to benchmark)                 400-1,800 in each dataset

Various distributions of polygon boundary error observations have been reported. Errors associated with manual digitizing (Bolstad et al., 1990) were found to conform to a chi-squared distribution. Automated observation techniques used in this study produce data sets of sufficient size to directly test for distribution (Fig. 5). These observations suggest that the distribution of offsets cannot be assumed to be normal. Thus, estimates of epsilon bandwidth (Chrisman, 1989), used for polygon and arc boundary error estimates, should be based on distribution-free statistics.

Figure 5.--Quantile-quantile distribution plot of polygon offsets collected by two different methodologies: a softcopy photogrammetric device using 1:40 000 photography (L), and an optical projection PROCOM II system using a LANDSAT TM 1:500 000 transparency (R). Note that, in a QQ plot, clustering along a straight line indicates a normal distribution.

On the epsilon bandwidth plot (Fig. 6), the median value and epsilon bandwidth (EBW) of displacements of the polygon/arc boundary are given.
The median value shows any skew in the data (ideal data should be centered around zero) and the EBW is computed as $\mathrm{EBW} = 1.4826 \times \mathrm{median}(|y - \mathrm{center}|)$. This value corresponds to one sigma (the standard deviation) for Gaussian data but is robust with respect to distribution. The perimeter of each polygon is normalised between zero and 2π.

Figure 6.--Epsilon band plot of polygon displacements for the Mono restitution method.

RESULTS

The focus of the statistical analysis was on assessing the form of the data and the relative accuracy, rather than just producing descriptive statistics. This included testing for distribution, computing variance of polygon area differences, and estimating epsilon bandwidth for polygon boundaries and arcs. Standard descriptive measures such as circular map errors were also computed. Results were presented in a comprehensive report (Skea and Lodin, 1995) at different levels of detail in the following categories:

Detection limitations. The number of map features potentially omitted due to data resolution, interpretation limitations or simply due to the "fuzzy" nature of forest openings.

Positional error of intersections (nodes). A statistical measure used for the Provincial base mapping program (Circular Map Standard Error) was found well suited for node accuracy assessment. Surprisingly, positional errors of intersections (nodes) were found, systematically and for all tested methods, not to be indicative of the overall map accuracy (Fig. 7).

Figure 7.--Example of a node offset which is not representative of overall map accuracy. This 'T' shaped node formed by a road-road intersection captured from a LANDSAT satellite image (dashed line) is compared to the benchmark node, which appears skewed. Clearly, the resulting offset is not indicative of the overall accuracy of the map.

Positional error of roads and cutblock boundaries.
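The epsilon bandwidth used for line features is a scaled median absolute deviation, 1.4826 times the median of the absolute offsets from a center value. A minimal sketch of that computation follows. It assumes the garbled printed formula is read as the median of absolute deviations and that the center defaults to the sample median; the function name and the sample offsets are hypothetical.

```python
from statistics import median


def epsilon_bandwidth(offsets, center=None):
    """Robust one-sigma spread: 1.4826 * median(|y - center|)."""
    if not offsets:
        raise ValueError("offsets must not be empty")
    if center is None:
        # Assumption: use the sample median as the center.
        center = median(offsets)
    deviations = [abs(y - center) for y in offsets]
    return 1.4826 * median(deviations)


# Hypothetical boundary offsets in meters, including one gross outlier.
offsets = [-2.0, -1.0, 0.0, 1.0, 2.0, 50.0]
print(median(offsets), epsilon_bandwidth(offsets))
```

The single 50 m outlier barely moves the result, whereas it would dominate an ordinary standard deviation. This is the robustness that makes the measure usable when offsets are not normally distributed.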
Epsilon bandwidth was found useful as a means of expressing line (arc and polygon) accuracy. In the context of this study, it was used to compare different methodologies, as it is less dependent on the distribution of the data than other types of statistical measures.

Area error of cutblocks. Overlays of benchmark and test map polygons produce sliver polygons. Areas of sliver polygons lying "in" and "out" relative to the benchmark were used as measures of the systematic cutblock area under- or over-estimate associated with a given mapping method. Comparing polygon areas only was found to be biased, as it would favor methods capable of mapping only large cutblock polygons (e.g., LANDSAT).

CONCLUSIONS

Automated pattern matching techniques can be applied successfully to the problem of positional map accuracy assessment. Their use leads to larger and more diverse sets of observations on which to judge map accuracy. Most importantly, positional accuracy of different types of map features cannot be assumed to be the same for a given method. This conclusion may be affected by the nature of the mapped features in this study (forest openings). Although the use of these observations clearly requires the development of new statistical analysis procedures, they also allow for the development of broader statistical descriptions of the data.

REFERENCES

Bolstad, P.V., Gessler, P. and Lillesand, T.M., 1990. Positional Uncertainty in Manually Digitized Map Data. International Journal of Geographical Information Systems, 4 (4), 399-412.

Chrisman, N.R., 1989. Modeling Error in Overlays of Categorical Maps. In, Accuracy of Spatial Databases, edited by Goodchild, M. and Gopal, S., Taylor and Francis.

Collins, F.C. and Smith, L.J., 1994. Taxonomy for Error in GIS. In, Proceedings, International Symposium on the Spatial Accuracy of Natural Resource Data Bases, 1-7. ASPRS.

Committee for Specifications and Standards, 1985. Accuracy Specification for Large-Scale Line Maps. Photogrammetric Engineering and Remote Sensing, Vol. 51, No. 2, 195-199.

DMG (Digital Mapping Group Ltd.), 1991. Project to Evaluate Photogrammetric Accuracy of Several Plotting Instruments to Transfer Forest Inventory Information from Aerial Photographs to TRIM Base Maps. Technical Report, British Columbia Ministry of Forests, Inventory Branch.

Dozier, J. and Strahler, A.H., 1983. Ground Investigations in Support of Remote Sensing. Chapter 23, Manual of Remote Sensing, second edition, Simonett, D.S. and Ulaby, F.T. editors, American Society of Photogrammetry and Remote Sensing.

Fenstermaker, L.K., 1991. A Proposed Approach for National to Global Scale Error Assessments. In, Proceedings, GIS/LIS '91, Vol. 1, 293-300.

Haralick, R.M. and Shapiro, L.G., 1992. Computer and Robot Vision, Volumes I & II. Addison-Wesley Publishing Company.

Maling, D.H., 1989. Measurements from Maps: Principles and Methods of Cartometry. Pergamon Press.

Merchant, D.C., 1987. Spatial Accuracy Specification for Large Scale Topographic Maps. Photogrammetric Engineering and Remote Sensing, 53 (7), 958-61.

Ooi, B.C., 1990. Efficient Query Processing in Geographic Information Systems. Goos, G. and Hartmanis, J. (ed.), Lecture Notes in Computer Science, No. 471, Springer-Verlag.

Skea, D. and Lodin, M., 1995. Computer-Assisted Methods to Assess the Positional Accuracy of Map Overlays. Ministry of Forests, Resources Inventory Branch, Technical report.

BIOGRAPHICAL SKETCH

Michal Lodin holds Masters degrees in Geology from Charles University, Prague and Surveying Engineering from the University of New Brunswick. After 8 years of consulting in Geomatics applications in Canada, the U.S. and Europe, he now works in the Inventory Technical Support Section, Resources Inventory Branch, Ministry of Forests.

David Skea's photogrammetric training includes diplomas from the British Columbia Institute of Technology and the International Institute for Aerial Survey and Earth Sciences in Holland.
His research interests are in the areas of pattern recognition, computer vision and computational geometry. He is employed with Geographic Data B.C., heading a project to develop an automated watershed and stream network mapping system.