Plot Collocation Error: Impacts on Area Estimation

Willem W. S. van Hees¹

Abstract--Results of a study conducted to examine area estimation error caused by improper collocation of aerial photo plots and associated ground plots are presented. Analyses considered the complexity of the land cover pattern on the plot, cardinal direction of collocation error, and percentage of area by land cover class on control plots. During the 1980's, the Forest Inventory and Analysis (FIA)-Anchorage unit of the Pacific Northwest Research Station conducted renewable resource inventories employing a three-phase-with-subsampling inventory design. There were three remotely sensed samples and one ground sample. Regression estimators were developed for several resource quantities, particularly area by land cover class. Results from two inventories showed low correlations between most covariates on the different sampling layers. Statistical consultation indicated that inaccurate collocation of sample plots could contribute to poor correlations. This study focused on the effects of improper collocation between low-altitude aerial photo plots and ground plots on estimation of productive forestland area. Results indicate there may be a north-south directional bias and that land cover complexity, as estimated by fractal dimension, is an important interaction component. Area estimation errors ranged from zero for homogeneous plots to 10.9 percent for plots with more complex land cover patterns.

INTRODUCTION

During the 1980's, the Anchorage Forest Inventory and Analysis (FIA) unit of the Pacific Northwest Research Station conducted experimental multiresource inventories employing a three-phase sampling with subsampling design (Schreuder, Gregoire, and Wood 1993).
Landsat multispectral imagery (LS), high-altitude color infrared photography (HAP), low-altitude color infrared photography (LAP), and ground (G) plots were the sampling layers. A grid of 8 ha plots on the LS layer was sampled with successively more extensive subgrids on HAP, LAP, and G. The subgrid of LAP plots was subsampled on the ground. Grid spacing was 5 km at the LS layer, 10 km at the HAP layer, 20 km at the LAP layer, and 40 km on the ground (Figure 1). At each sample location land cover class was evaluated. Land cover polygons were mapped for area estimation at remotely sensed locations and were point sampled at ground locations.

¹ Research forester, Pacific Northwest Research Station, Anchorage, AK.

Figure 1--Three-phase sampling with subsampling grid design used for multi-resource inventories in Alaska during the 1980's.

Population totals were estimated using several estimators, including regression estimation. Regression results were disappointing in that correlations between covariates and independent variables were not satisfactory. A number of factors, including unrecognized sources of variation and changing objectives, were deemed responsible. A possible source of uncontrolled variation was plot collocation. For regression estimators in particular, plots at various sample stages must be accurately collocated. This study was designed to investigate the hypothesis that apparently minor plot collocation errors can significantly contribute to area estimation error.

Two factors that could contribute to relative error magnitude were considered: the amount of a given land cover class on the plot and the complexity of the land cover pattern on the plot. If a plot is homogeneous with respect to land cover class, then small errors in collocation would, at most, result in small relative errors in area estimation.
On the other hand, if the plot has small amounts of a given land cover class, then small collocation errors could result in larger relative area estimation errors. To represent land cover class complexity, the fractal dimension of the mapped land cover pattern within the plot was estimated. The hypothesis, then, was that area estimation error for a given land cover class (expressed as the difference in percent of plot area in that land cover class on a control plot versus a mislocated plot) would be a function of the total plot area in that land cover class (expressed as a percent of total plot area) and of the overall fractal dimension of the land cover pattern on the plot.

METHODS

Study Area

Field data for this study did not exist; this study is a simulation insofar as production of collocation errors is concerned. Aerial photos from inventories of the central interior, east-central interior, and south-central coastal regions of Alaska (Figure 2) were used as the data source.

Figure 2--Study plots were selected from within the shaded area above.

Data Collection

Color infrared aerial photos of 31 different locations were subjectively selected to represent a variety of topographies, latitudes, longitudes, and land cover classes. Aerial photo scales ranged from 1:3,000 to 1:6,000. An aerial photo interpreter classified and delineated, on transparent overlays, up to 4 land cover classes on each photo. The entire photo was classified. Classes were chosen to simplify the classification process and thereby reduce interpretation error. The classes were productive forestland (forestland capable of producing 1.4 m³·ha⁻¹·yr⁻¹), other forestland, nonforest, and water. Each of the resulting 31 land cover class map overlays was digitized. On the digitized images 8 ha control plots were drawn. These plots served as the source of the true, or ground, area. For each control plot, area by land cover class was measured and converted to a relative value (Pc).
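The conversion of measured area to a relative value can be illustrated with a minimal Python sketch. It assumes the digitized overlay has been rasterized to a grid of integer class codes and that the plot is a square window of cells; the paper specifies neither, and the class codes, the function name `relative_area`, and the toy map are hypothetical.

```python
import numpy as np

# Hypothetical class codes; the paper names the four classes but not a coding.
CLASSES = {
    1: "productive forestland",
    2: "other forestland",
    3: "nonforest",
    4: "water",
}

def relative_area(class_map, row, col, size):
    """Return {class name: fraction of plot area} for a square plot window.

    class_map : 2-D integer array of class codes (assumed rasterized overlay)
    row, col  : upper-left cell of the plot window
    size      : side length of the window in cells
    """
    if row < 0 or col < 0:
        raise ValueError("plot window must start inside the map")
    window = class_map[row:row + size, col:col + size]
    if window.shape != (size, size):
        raise ValueError("plot window extends beyond the map")
    # The mean of a boolean mask is the share of cells in that class.
    return {name: float(np.mean(window == code)) for code, name in CLASSES.items()}

# Toy 6 x 6 map: left half productive forestland, right half nonforest,
# with a single water cell in the upper-right corner.
demo = np.full((6, 6), 3)
demo[:, :3] = 1
demo[0, 5] = 4
pc = relative_area(demo, 0, 0, 6)
```

Multiplying the returned fractions by 100 gives percent of plot area, the unit used for the error figures reported in this paper.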
Improper collocation between these control (or ground) plots and their associated aerial photo plots was modeled by overlaying new plots that were off-center. For each control plot, each simulated, improperly collocated aerial photo plot was created by moving the new plot in one of eight cardinal directions. Directions of movement were randomly chosen without replacement. The amount of collocation error was approximately 0.65 cm on 1:6,000 scale photos (approximately 40 m on the ground). The magnitude of this error derives from field experience gained during the course of the inventories. Five such mislocated plots were established for each control plot. For each mislocated plot, area by land cover class was also measured and converted to a relative value (Pm).

Sets of control and moved plots were created until there were at least 5 observations in each cell of frequency tables for each land cover class. The frequency tables listed number of plots by complexity class (simple vs. complex) and percentage of the control plot in the land cover class. The percentages of the control plot in a given land cover class were grouped into quarters. To place each plot in a complexity class, estimates of the fractal dimension of the land cover pattern on control plots were made using the dividers method (Sugihara and May 1990). Complexity class was established according to a vegetation complexity model for Alaska (van Hees 1994). Ultimately, 195 sets of control plots and associated moved plots were used for this study.

Data Analysis

For each set of control and moved plots, the differences in percent of area (or area estimation error), by land cover class, between the moved and control plots were found. The mean of the absolute values of the five differences in percent of area (Pe) was then calculated as:

$$P_e = \frac{1}{5}\sum_{i=1}^{5}\left|P_{m,i} - P_c\right| \qquad (1)$$

where Pc and Pm are as defined above and i indexes the five mislocated plots.
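The displacement arithmetic and the mean absolute difference described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the study: the direction labels, the (east, north) coordinate convention, all function names, and the example percentages are hypothetical.

```python
import math
import random

# Unit vectors (east, north) for the eight compass directions.
# The labels and the coordinate convention are assumptions.
_D = math.sqrt(0.5)
DIRECTIONS = {
    "N": (0.0, 1.0), "NE": (_D, _D), "E": (1.0, 0.0), "SE": (_D, -_D),
    "S": (0.0, -1.0), "SW": (-_D, -_D), "W": (-1.0, 0.0), "NW": (-_D, _D),
}

def ground_offset_m(photo_offset_cm, scale_denominator):
    """Convert an offset measured on the photo (cm) to ground distance (m)."""
    return photo_offset_cm * scale_denominator / 100.0

def pick_directions(k=5, rng=None):
    """Draw k distinct directions at random (sampling without replacement)."""
    rng = rng or random.Random()
    return rng.sample(sorted(DIRECTIONS), k)

def displaced_centers(center, distance_m, directions):
    """Shift a plot center (east, north) by distance_m in each direction."""
    east, north = center
    return [(east + distance_m * DIRECTIONS[d][0],
             north + distance_m * DIRECTIONS[d][1]) for d in directions]

def mean_abs_difference(pc, pm_values):
    """Mean of |Pm_i - Pc| over the mislocated plots of one control plot."""
    if not pm_values:
        raise ValueError("need at least one mislocated plot")
    return sum(abs(pm - pc) for pm in pm_values) / len(pm_values)

offset = ground_offset_m(0.65, 6000)          # about 39 m on the ground
dirs = pick_directions(5, random.Random(0))   # five of the eight directions
centers = displaced_centers((0.0, 0.0), offset, dirs)
# Made-up percentages for one control plot and its five moved plots.
pe = mean_abs_difference(40.0, [38.0, 44.0, 40.0, 35.0, 43.0])
```

With the stated 0.65 cm offset on a 1:6,000 photo, the conversion gives roughly 39 m, in line with the "approximately 40 m" quoted in the text.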
For preliminary analysis, the GLM (General Linear Models) procedure (SAS 1990a) was used to fit a linear regression model, by land cover class, to the data. The model used was:

$$P_e = \beta_0 + \beta_1 P_c + \beta_2 F_d + \beta_3 (P_c \cdot F_d) \qquad (2)$$

where:
Pe = mean absolute area difference,
β0 = intercept parameter,
β1, β2, β3 = regression coefficients,
Fd = fractal dimension (1 < Fd < 2).

To investigate the possibility that directional bias existed, the same general linear model as in (2) was used, except that the individual absolute difference for the particular direction of mislocation was the dependent variable rather than the mean absolute difference over the five mislocated plots.

Subsequent to linear analyses, cluster analyses were conducted using the FASTCLUS procedure (SAS 1990b). Using FASTCLUS, observations are divided into clusters on the basis of Euclidean distances computed from one or more quantitative variables. For this study the variables used for clustering were Pe, Pc, Fd, and (Pc * Fd).

RESULTS

Initial screening by linear regression showed significant coefficients for the productive forestland class only. T-values and analysis of variance statistics are presented in Tables 1 and 2. The predictive value of the regression relation, however, as measured by r², was negligible. The r² value for this relationship was 0.129.

Table 1--T-tests of significance of individual regression coefficients for multiple linear regression analysis of Pe against Pc, fractal dimension (Fd), and Pc*Fd.

Variable     Coefficient    Standard error    T        P (two-tail)
Intercept     0.2244931     0.0579952          3.87    0.0002
Pc           -0.3542443     0.1019977         -3.47    0.0008
Fd           -0.1525531     0.0481601         -3.17    0.0021
Pc*Fd         0.3058221     0.0886174          3.45    0.0009

Table 2--Analysis of variance for multiple linear regression analysis of Pe against Pc, fractal dimension (Fd), and Pc*Fd.
Source    DF    Sum of squares    Mean square    F value    Pr > F
Model      3    0.00942545        0.00314182     4.15       0.0085
Error     84    0.06352763        0.00075628
Total     87    0.07295308

Adding a squared interaction term so that the model became

$$P_e = \beta_0 + \beta_1 P_c + \beta_2 F_d + \beta_3 (P_c \cdot F_d) + \beta_4 (P_c \cdot F_d)^2 \qquad (3)$$

improved predictive power. The r² value for the expanded model was 0.375. Again, the productive forestland class was the only class for which all regression coefficients were significant.

The mean difference (Pe) between the percentages on the control plots versus the mislocated plots ranged from zero to 10.9 percent with a mean of 4.3 percent and standard deviation of 2.9 percent. In the case where there was zero difference between the control and the mislocated plots, all plots were homogeneous. Land class pattern fractal dimension (Fd) ranged from 1.000 for homogeneous plots to 1.530 for the most complex patterns. Mean Fd was 1.165 with a standard deviation of 0.109.

Model (3) was used to examine the data for possible directional bias. That is, could the direction in which mislocation of the aerial photo plot occurred affect the magnitude of relative area estimation error? Table 3 presents the results of this analysis. Noticeable is the strength of the estimated regressions for those observations where the direction of mislocation was along the north/northeast by south/southwest axis (highlighted figures in Table 3) when compared with the estimates for other directions.

Table 3--F-values and T-scores for estimated coefficients of model (3) by direction of movement of improperly collocated plots for productive forestland.
T for H0 = 0 (columns Pc through (Pc*Fd)²)

Direction    Pc       Fd       Pc*Fd    (Pc*Fd)²    r²
North        -2.99    -2.70    3.70     -4.37       .3011
Northeast    -2.59    -2.61    3.30     -1.83       .2767
East         -1.92    -1.32    2.53     -3.29       .1706
Southeast    -1.81    -1.21    2.28     -1.68       .1388
South        -3.31    -2.53    4.04     -3.96       .2898
Southwest    -1.74    -1.80    2.83     -2.87       .3526
West         -1.77    -1.50    2.01     -2.08       .0824
Northwest    -1.12    -0.77    1.08     -1.16       .1359

Clustering of the data was undertaken to examine the possibility that the predictive power of the model (using equation (1)) could be improved. FASTCLUS was run setting the number of clusters at 3 and 4. Overall r² for the 4-cluster grouping was higher than for the 3-cluster grouping (0.886 vs. 0.829). Clusters were separated, by FASTCLUS, into the following ranges of Pc with almost no overlap: 0 - 15% (n=18), 15% - 45% (n=20), 45% - 75% (n=31), and 75% - 100% (n=19). Model (2) was then used to examine the within-cluster regression relationships. In the first and last clusters the regression relationships were quite strong (r² = 0.729 and 0.841, respectively). The relationship for the second cluster was much weaker (r² = 0.385), and for the third cluster the regression relationship was essentially nonexistent (r² = 0.026).

DISCUSSION

With regard to the productive forestland component of the study area, the hypothesis that area estimation error would be a function of the percentage of the control plot (Pc) in a land cover class and the fractal dimension of the land cover pattern is supported by the results of this study. Although this study does not provide substantive clues as to why the model was more informative for productive forestland than for other land cover classes, a possible explanation lies in the physiography of the study area and the discovery of possible directional bias. Patterns of productive forestland tend to be linear in the study area.
In interior Alaska productive forestland is found along rivers on south-facing slopes, whereas in coastal regions it is "marginal," that is, along the margins next to water and up to about 600 m above sea level. Rivers and water bodies in the study area have a predominantly east/west orientation. Thus, plot mislocations to the north or south would likely have more impact on error magnitude than mislocations to the east or west.

Clustering of the observations did improve predictive ability for certain ranges of the data, particularly the upper and lower ends of the control plot percentage scale. It would be possible, then, to more accurately assess impacts of mislocation knowing the percentage of the plot in a land cover class along with an estimate of fractal dimension of the mapped land cover classification.

Although the results of this study do not provide strong modeling capabilities, they do provide usable guidelines. It is apparent that small mislocations between aerial photo and ground plots can produce significant area estimation errors (up to 10.9 percent in this study) and that inventory designers and data analysts must be aware of physiography in order to consider possible introduction of directional bias. Also, there may be some predictive capability for extremes of land cover percentages.

LITERATURE CITED

SAS. 1990a. SAS/STAT User's guide, Version 6, Fourth ed., vol. 2. SAS Institute, Inc., Cary, NC, USA.

SAS. 1990b. SAS/STAT User's guide, Version 6, Fourth ed., vol. 1. SAS Institute, Inc., Cary, NC, USA.

Schreuder, Hans T., T. G. Gregoire, and G. B. Wood. 1993. Sampling methods for multi-resource forest inventory. John Wiley and Sons, Inc., 605 Third Ave., New York, New York 10158-0012. 446 p.

Sugihara, George, and Robert M. May. 1990. Applications of fractals in ecology. Trends in Ecology and Evolution, Elsevier Science Publishers Ltd. (UK), 5(3): 79-86.

van Hees, Willem W. S. 1994. A fractal model of vegetation complexity in Alaska. Landscape Ecology 9(4): 271-278.