This file was created by scanning the printed publication. Errors identified by the software have been corrected; however, some errors may remain.

Large Area Forest Cover Assessment: Effects of Misregistration in a Double Sampling Approach with Coarse and High Resolution Satellite Images

Christoph Kleinn¹, Berthold Traub¹, Matthias Dees¹

Abstract

When combining satellite data of considerably different spatial resolution, the effect of misregistration is one of the technical issues to be addressed. This simulation study is based upon subcontract research carried out in the framework of the TREES project (Tropical Ecosystem Environment Observation by Satellites, Joint Research Centre, Ispra, Italy). The overall objective is a global tropical forest cover estimate and the production of a forest map. In a first phase of the inventory approach, a complete coverage by the coarsely resolving NOAA AVHRR satellite images is provided. To improve the forest area estimates derived from this image set, a sample of Landsat TM scenes was selected in a second phase: in corresponding frames/blocks of the same geographical area, forest cover percent in AVHRR and TM was recorded. These pairs of values formed the input variables for a calibration regression. To assume perfect geographic registration is certainly not realistic. Some effects of misregistration between the coarse and high resolution image frames/pixel blocks on the resulting regression are studied in this paper in the form of a simulation study. The effects can be considerable, particularly when small blocks of coarse resolution pixels are to be registered.

1. INTRODUCTION AND OBJECTIVES

Global monitoring of forest cover is a current issue in the context of the destruction of tropical forests and in the discussion about global climatic changes. Forest cover monitoring systems are to provide sound information on the state of and changes in forest cover.
Satellite remote sensing plays an important role there: NOAA AVHRR has proved to have favourable characteristics for discriminating vegetation from other land cover and is frequently used in environmental monitoring (Ehrlich et al. 1994). In the 1980s LANDSAT TM developed into the standard high resolution sensor in many forestry applications. These two systems have quite different characteristics with respect to temporal, spatial and spectral resolution, which are described in detail in standard textbooks. With its high temporal frequency and low cost, the coarsely resolving AVHRR (about 1 km x 1 km) is well suited to provide, in a first phase, a more or less complete coverage of the region of interest. This results in an equally 'coarse' estimation of forest cover. In a second phase the estimate might then be improved through a sample of the much higher resolving LANDSAT TM, which is much more detailed but also much more expensive.

Regression technique is used to predict 'true' forest cover from the coarse information delivered by AVHRR. 'True' forest cover is assumed to be represented by the TM forest cover. In terms of sampling theory this is a double sampling for regression approach (Cochran 1977). From an image interpretation and classification point of view the resulting regression can be regarded as a calibration function.

¹ Abteilung für Forstliche Biometrie, Universität Freiburg, Werderring 6, D-79085 Freiburg, Germany (kleinn@forst.uni-freiburg.de)
The acronym fcp is used throughout the paper for forest cover percent.

To calculate the regression it is necessary to obtain geographically matching pairs of map frames, one of the coarse resolution map and one of the high resolution map. This procedure is also called coregistration. Two approaches to calculating the regression are discussed in the literature:

(1) Coregistration of forest percentages from both data sources; an example is found in Nelson (1989).
This approach is also used in a global tropical forest assessment in the TREES project, a general description of which can be found in Malingreau (1993). Units to be coregistered can be areas corresponding to the size of one single coarse resolution pixel, with only two values possible (0 and 1), or blocks of pixels.

(2) Coregistration of forest percentages in the high resolution image with spectral values of the coarse resolution image. Examples are found in Iverson et al. (1989), Päivinen and Pitkänen (1992) and Zhu and Evans (1992). In this approach the units to be coregistered are areas of the size of one single coarse resolution pixel.

Like all types of measurement, coregistration is subject to error: the problem of misregistration arises. This paper investigates, in the form of a simulation study, possible effects of misregistration on the calibration procedure. It is an extension of a former study presented in Kleinn et al. (1995). Approach (1) is pursued here, using blocks of pixels as the registration unit.

2. GENERAL DESCRIPTION OF MISREGISTRATION

Few publications so far deal with the impact of misregistration in detail. One of the few articles focuses on change detection: Townshend et al. (1992) state that change assessments using satellite images of two different acquisition dates are affected by the level of registration, depending, of course, on several factors. They finally state that "high levels of registration must be achieved by operational monitoring systems if there is to be reliable monitoring of global change".

Assuming perfect classification procedures, the two data sources under consideration here (coarse and high resolution) would yield identical forest cover estimates. A simple linear regression between the data pairs would result in the one-to-one line. In reality, there are more or less significant deviations: sensors of different spectral and spatial resolution 'see' things in a different way.
Additionally, geometric inaccuracies lead to misregistration: the block of high resolution pixels does not exactly match the corresponding block of coarse resolution pixels. Perfect registration would mean that (1) the centres of the two blocks match and that (2) their shape and (3) their alignment are the same. In Figure 1 these factors are depicted schematically.

Misregistration means that not only the high resolution pixels in the true matching block have a chance to be registered, but also the pixels around them. If the amount and type of misregistration were known, one could calculate the probability of a pixel being included in the registration process of one specific block. For perfect coregistration this probability would be 1 for the pixels in the matching block and 0 elsewhere. In the presence of misregistration the probability is high in the matching block but greater than 0 in a certain area around it. It is 0 only beyond the maximum misregistration distance. Under the assumption that misregistration is an isotropic process, this area is drawn in Figure 1 as a circular envelope.

Figure 1: Large circle: Area in which the registration probability of the pixels of the high resolution images is greater than 0. Small circle: Area of pixels having an inclusion probability greater than 0 for a 'rotating' displaced block with fixed distance d and shift

Assuming that the actual registration of individual sample blocks in two different images is a random process, misregistration leads to a bias when estimating fcp: the expectation of the attribute (forest cover) measured in the high resolution image is not the value of the true matching block. This adds another source of variability (error) to the target regression between forest cover estimates of high and coarse resolution pixel blocks.

3. MATERIAL

Computer generated forest/non-forest (1/0) maps of size 9000x9000 pixels were used to investigate the effects of misregistration.
A set of 'homogeneous' maps was created by randomly locating clusters of dots. These dots have diameters of 3, 30 and 100 pixels for the map sets 1a-1e, 2a-2e and 3a-3e, respectively. The total fcp ranges from 10% (in maps 1a, 2a, 3a) through 30%, 50% and 70% to 90% (in maps 1e, 2e, 3e). Three inhomogeneous maps (maps 4 to 6) consist of regions with different structures. Examples of these maps at original, high resolution are given in Figure 2, left hand side.

The high resolution maps were gradually degraded to produce coarse resolution maps. Square pixel blocks of size n x n pixels of the original image were collapsed to one new coarse resolution pixel, with n taking on the values 10, 25 and 50. Depending on the forest cover within the n x n high resolution pixels, the attribute forest or non-forest was assigned to the new coarse resolution pixel. If fcp exceeded 30%, the attribute 'forest' was assigned. In Figure 2, right hand side, results of the degradation are shown. For the degraded maps only the 'core region' of 8000x8000 pixels is shown in Figure 2; this is the region to which the analysis is limited. This leaves a buffer frame surrounding the analysed region, which facilitates the treatment of edge effects in the simulations.

[Figure 2 panels: Map 1a (10% cover), Map 3d (70% cover) and Map 4 (44.4% cover), each shown at original resolution and degraded (level 50, see text)]

Figure 2: Sample maps as used in the simulation study. Left hand side: Original maps of size 9000x9000 pixels. For the degraded maps (right hand side) only the core region corresponding to 8000x8000 pixels in original resolution is depicted.

Table 1: 'Forest' cover percent in the original and degraded maps used in this study. The fcp values of the maps shown in Figure 2 are framed.
(Forest cover percent of the total map; n/r = value not recoverable from the scan)

Map       Original      Degraded   Degraded   Degraded
          resolution    10 x 10    25 x 25    50 x 50
Map 1a    10.0          13.6       12.9       5.4
Map 3a    10.0          10.4       10.7       11.0
Map 3b    30.0          30.2       30.8       31.6
Map 3c    50.0          51.0       51.6       53.1
Map 3d    70.0          70.6       71.5       n/r
Map 3e    90.0          90.6       91.1       92.0
Map 4     44.4          48.1       50.1       n/r
Map 5     n/r           n/r        n/r        n/r
Map 6     n/r           n/r        n/r        n/r

The image degradation technique leads to a change in total forest cover, as illustrated in Table 1. It is a very simple technique, which probably does not mirror the properties of two real map sets very realistically. It was felt, however, that it was sufficiently realistic to investigate the general properties of misregistration. With real data sets the differences in fcp values would be due to differences in spatial resolution, acquisition date, atmospheric conditions etc. and, of course, differences in image interpretation and classification.

4. METHODS

When modelling misregistration one has to make assumptions about the distance and/or direction distribution of the deviations relative to the true location. The registration process is modelled here as a purely random process. The factors which were included in this simulation and which were varied at several levels are listed in Table 2. Non-matching shapes of the two blocks to be coregistered are not taken into account in this study.

For each combination of the listed factor levels a systematic grid of coarse resolution sample blocks was superimposed onto the original, high resolution maps. For all blocks the fcp of the true matching high resolution block was determined first. Then the high resolution block was displaced 100 times according to the probability assumptions given in Table 2. For each block, the mean and standard deviation of the fcp values resulting from the 100 replications were recorded.
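The degradation rule and the displacement procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the small uncorrelated test map and the omission of block rotation are all choices made here for brevity. The shift distance follows the linearly decreasing pdf listed in Table 2 and is sampled by inverting its cumulative distribution function, F(d) = 1 - (1 - d/D)^2.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable


def degrade(hi, n, threshold=0.30):
    """Collapse n x n blocks of a 0/1 map into one coarse pixel each.

    A coarse pixel is 'forest' (1) if the block fcp exceeds the threshold.
    """
    rows, cols = hi.shape[0] // n, hi.shape[1] // n
    blocks = hi[:rows * n, :cols * n].reshape(rows, n, cols, n)
    return (blocks.mean(axis=(1, 3)) > threshold).astype(np.uint8)


def sample_distance(max_dist, size):
    """Draw distances from f(d) = (2/D)(1 - d/D) on [0, D] by inverse CDF."""
    u = rng.random(size)
    return max_dist * (1.0 - np.sqrt(1.0 - u))


def misregistered_fcp(hi, row, col, side, max_shift, reps=100):
    """Return the fcp of the matching block plus the mean and standard
    deviation of the fcp over `reps` randomly shifted copies.

    Simplification (assumption): blocks are only shifted, never rotated.
    """
    true_fcp = hi[row:row + side, col:col + side].mean()
    dist = sample_distance(max_shift, reps)
    angle = rng.uniform(0.0, 2.0 * np.pi, reps)   # isotropic shift direction
    d_row = np.rint(dist * np.sin(angle)).astype(int)
    d_col = np.rint(dist * np.cos(angle)).astype(int)
    shifted = np.array([
        hi[row + a:row + a + side, col + b:col + b + side].mean()
        for a, b in zip(d_row, d_col)
    ])
    return true_fcp, shifted.mean(), shifted.std(ddof=1)


# Hypothetical test map: uncorrelated pixels with about 50% cover, far
# simpler than the clustered-dot maps of the study.
hi_res = (rng.random((400, 400)) < 0.5).astype(np.uint8)
n = 10                                   # degradation level
coarse = degrade(hi_res, n)

# One block of 2 x 2 coarse pixels, shifted by at most 4 coarse pixels.
# The block sits far enough from the map edge to stay inside the buffer.
true_fcp, mean_fcp, sd_fcp = misregistered_fcp(
    hi_res, row=100, col=100, side=2 * n, max_shift=4 * n)
print(coarse.shape, round(true_fcp, 3), round(mean_fcp, 3), round(sd_fcp, 3))
```

With a maximum distance of D = 4 the sampled distances should have a mean close to D/3 = 1.333 and a standard deviation close to sqrt(D^2/18) = 0.943, the values quoted in Table 2. A fuller version would also rotate each displaced block by an angle drawn from the triangular distribution on ±10 degrees.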
5. RESULTS

Statistics of the differences between the truly matching pixel block and the mean (expectation) of the 100 simulated misregistered pixel blocks: In Figure 1 it was illustrated that a bias is suspected when estimating the fcp as an expectation of misregistered pixel blocks. In Table 3 some results are given for the maps shown in Figure 2, for degradation level 50. The bias (mean of the differences for all blocks analysed throughout the whole map) is negligible for all maps investigated.

Table 2: List of factors included in the model study (factor, followed by description and factor levels)

Area structure: Artificially generated maps, some produced with a homogeneous generation process, some with several processes overlaid (heterogeneous).
Total forest cover: Total forest cover is 10%, 30%, 50%, 70% and 90% for the homogeneous maps and 32% to 44% for the heterogeneous ones.
Spatial resolution of coarse resolution image: Side lengths of the square pixels of the coarse resolution maps correspond to 10, 25 and 50 pixels of the high resolution image.
Forest/non-forest rule to be used for image degradation: Minimum 'crown' cover percent in the degraded image for a pixel to be assigned to the class 'forest' is fixed at 30%.
Block size: Coarse resolution square pixel blocks of 2, 5, 10 and 20 pixels side length.
Parameters of misregistration: Misregistration as described in Figure 1. Maximum distance is 4 pixels of coarse resolution, following a linearly decreasing pdf (mean distance = 4/3 = 1.333 pixels, standard deviation = sqrt(16/18) = 0.943 pixels). Isotropy is assumed. The coregistered blocks may rotate up to ±10 degrees, following a triangular probability density function symmetric around 0 degrees.

For the single sample blocks - i.e. not using the mean of the 100 replications per block - the difference between the truly matching pixel block and the misregistered pixel block can be considerable.
This can be seen from the range and standard deviation of the differences as given in Table 3: for Map 3d and block size 2 x 2 the standard deviation is about 10% fcp, with differences ranging from -41% to +38%, meaning that one has to be aware of large variation when making blockwise evaluations. For a block size of 10 x 10 or even 20 x 20 this variation is much smaller.

This variability of the differences is also a function of total forest cover, which can be observed when analysing, for example, the sequence of maps 1a to 1e. These maps were generated with the same algorithm of random clustering of dots with a diameter of 3 pixels. The difference is that Map 1a was only filled up to 10%, Map 1b up to 30% and so on. Map 1e has a 90% cover. In Figure 3 the standard deviation of the differences is shown over total forest cover: the highest variability is reached at the intermediate fcp of around 50%. With a forest cover of 10% or 90% variability is much smaller. This relationship is quite obvious when considering the extreme cases: in the complete presence (or absence) of forest, misregistration has no effect at all.

Calibration regression: The regression is used to predict, block by block, the high resolution fcp from the fcp estimate of the coarse resolution pixel blocks. Here, simple linear regression is used. Though it would be justified for practical reasons, the intercept was not suppressed. Given the situation illustrated in Table 1 it is clear that one gets a slope coefficient different from 1 (in most cases smaller than 1) when predicting fcp of the high resolution map (dependent variable) using the fcp of the coarse resolution map as independent variable. This holds even for the situation of perfect coregistration. In analysing the regression results we are interested in two relationships: Regression 1 under perfect coregistration and Regression 2 in the presence of misregistration.
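The two calibration regressions just described can be sketched as ordinary least squares fits with the intercept kept. The function name, the small data arrays and the rmse convention (no degrees-of-freedom correction) are assumptions made for illustration; the numbers are invented and are not results of the study.

```python
import numpy as np


def calibration_regression(fcp_coarse, fcp_high):
    """Ordinary least squares fit fcp_high = intercept + slope * fcp_coarse."""
    x = np.asarray(fcp_coarse, dtype=float)
    y = np.asarray(fcp_high, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)      # degree-1 fit, intercept kept
    residuals = y - (intercept + slope * x)
    rmse = np.sqrt(np.mean(residuals ** 2))     # no degrees-of-freedom correction
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return {"intercept": intercept, "slope": slope, "rmse": rmse, "r2": r2}


# Hypothetical blockwise fcp pairs (as fractions), for illustration only.
x = np.array([0.00, 0.25, 0.50, 0.75, 1.00])
y_matching = np.array([0.05, 0.27, 0.48, 0.71, 0.93])       # perfect coregistration
y_misregistered = np.array([0.12, 0.30, 0.47, 0.66, 0.84])  # mean of displaced blocks

print(calibration_regression(x, y_matching))        # Regression 1
print(calibration_regression(x, y_misregistered))   # Regression 2
```

In practice `x` would hold the fcp of each coarse resolution sample block and the two `y` arrays the fcp of the matching high resolution block and the mean fcp of its 100 displaced copies.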
In Table 4 general characteristics of the regressions are listed for the three sample maps addressed throughout this paper. Given the same map and the same degradation level, there are considerable differences between the coefficients of Regression 1 and Regression 2 for the small block size 2, but the differences level out quite markedly for the largest block size 20. Intercept and slope coefficients of Regressions 1 and 2 are quite close for this large block size. As Regression 1 is the 'true' regression under perfect coregistration, one sees that small block sizes lead to an incorrect calibration function and that only for very large block sizes does Regression 2 approach the coefficients of the true Regression 1.

Figure 3: Standard deviation of differences between fcp of truly matching pixel block and the mean of the 100 misregistered pixel blocks over fcp
[Figure 3 horizontal axis: Total forest cover (fcp), 0 to 100]

Table 3: Descriptive statistics of the differences between the fcp of the true matching high resolution block and the fcp results of the 100 simulated misregistered blocks (degradation level 50):

Map      Block size   Mean       Std Dev   Minimum   Maximum
Map 1a   2            0.000246   0.0356    -0.0931   0.1679
         5            0.000281   0.0103    -0.0326   0.0442
         10           0.000255   0.0036    -0.0098   0.0096
         20           0.000256   0.0011    -0.0029   0.0031
Map 3d   2            0.003002   0.1017    -0.4104   0.3791
         5            0.002945   0.0429    -0.1661   0.1418
         10           0.002647   0.0156    -0.0417   0.0568
         20           0.003288   0.0061    -0.0149   0.0177
Map 4    2            0.001279   0.0821    -0.4125   0.4526
         5            0.001563   0.0195    -0.1277   0.0690
         10           0.001645   0.0080    -0.0338   0.0274
         20           0.000914   0.0038    -0.0111   0.0125

Differences between the coefficients of Regression 1 and Regression 2 are clearest for map set 1, which consists of forest 'patches' that are very small compared to the spatial resolution of the degraded images. This holds - as mentioned in the preceding paragraph - particularly for small block sizes and low total forest cover.
For larger patches and higher total fcp the effect is much smaller, though still there.

Table 4: Basic statistics of the simple linear calibration regression for three sample maps (rmse = root mean square error, r2 = coefficient of determination).
[Table 4 panels: Map 3d (70% fcp in original map); Map 4 (44% fcp in original map)]

For Map 3d, consisting of large forest patches (diameter 100 original pixels each) and having a total forest cover of 70%, this approximation is quite good even for the small block sizes. For degradation level 10, for example, the slope coefficient is 0.9592 for block size 2 ('true' coefficient: 0.9978) and 0.9976 for block size 20, being almost equal to the 'true' 1.0044. For Map 1a, consisting of very small patches with a total cover of only about 10%, there is a considerable difference between the slope coefficient for block size 2 (0.4413) and the 'true' value of 0.7028, while for block size 20 the difference between the 'misregistered' slope (0.7416) and the 'true' slope (0.7611) is small.

The differences in slope, however, do not lead to an error in the estimation of overall fcp. This is illustrated in Table 3. The differences are compensated by opposite differences in intercept: lower slope coefficients go together with higher intercepts. In Map 1a the differences in intercept are relatively small. But one has to consider that the vast majority of observations in this map is in the class fcp = 0, as total forest cover is only 10%. So small differences in intercept have a large effect.

When the calibration regressions based upon misregistered blocks are used for map production (calibrating the fcp of each coarse resolution pixel block) a clear error is introduced. For larger fcp values the true value will be underestimated, for very small values overestimated.
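The compensation between slope and intercept is consistent with a general property of least squares with an intercept: the fitted line passes through the sample means, so the mean of the calibrated values over the fitted blocks equals the mean of the observed high resolution fcp, whatever the slope. The sketch below illustrates this with a toy model that is an assumption made here, not the simulation of the study: the misregistered fcp is taken as a blend of the matching block and an unrelated block with the same overall mean. All data and weights are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: coarse fcp x and high resolution fcp of the matching block.
x = rng.uniform(0.0, 1.0, 2000)
y_true = 0.05 + 0.90 * x + rng.normal(0.0, 0.02, x.size)

# Toy misregistration: 60% matching block, 40% a randomly chosen other block.
# Permuting y_true keeps the overall mean and removes the link to x.
y_mis = 0.6 * y_true + 0.4 * rng.permutation(y_true)

slope_1, intercept_1 = np.polyfit(x, y_true, 1)   # Regression 1
slope_2, intercept_2 = np.polyfit(x, y_mis, 1)    # Regression 2

# Overall fcp: mean of the calibrated values over the blocks used for fitting.
overall_1 = np.mean(intercept_1 + slope_1 * x)
overall_2 = np.mean(intercept_2 + slope_2 * x)

print(f"Regression 1: slope={slope_1:.3f} intercept={intercept_1:.3f}")
print(f"Regression 2: slope={slope_2:.3f} intercept={intercept_2:.3f}")
print(f"overall fcp: {overall_1:.4f} vs {overall_2:.4f}")

# Blockwise calibration error of Regression 2 at the two extremes of x.
error_at_0 = intercept_2 - intercept_1
error_at_1 = (intercept_2 + slope_2) - (intercept_1 + slope_1)
print(f"error at x=0: {error_at_0:+.3f}, error at x=1: {error_at_1:+.3f}")
```

In this toy setting the flatter Regression 2 leaves the overall mean unchanged but overestimates blocks with low coarse fcp and underestimates blocks with high coarse fcp, which is the pattern described for map production.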
Negative intercepts as they occur in Table 4 are normally not acceptable in a calibration regression: it would mean that a forest-free coarse resolution pixel block is calibrated from an fcp of 0.0 to a negative fcp. In our simulation study, however, the calibration regression was intentionally not forced to be non-negative.

The root mean square error (rmse) is in most cases higher for Regression 2, which would be expected as misregistration introduces more variability. The rmse, however, is not an adequate measure here, as calculations were made using, per block, the mean of the fcp of 100 simulations. Thus, if one is interested in the variability for one single set of misregistered blocks, one would additionally have to take into account the variability within the individual blocks.

7. CONCLUSION

According to this simulation study, three major factors can be identified which determine the magnitude of the misregistration effect:

- Magnitude of misregistration. The bigger the dislocation distance, the bigger the difference in the fcp values registered. This obvious relation, however, must also be seen in interaction with the two other factors.
- Size of pixel blocks to be registered. The bigger the block size, the smaller the effect of misregistration. For small pixel blocks the effect can be considerable. This relation interacts very much with the third important factor.
- Spatial structure and fcp of the region of interest. For very high and very low fcp the misregistration effect is smaller than for intermediate fcp. If the forest patches are large in comparison to the size of the pixel blocks to be coregistered, the effect of misregistration on fcp estimation is smaller.

The bias of the total forest cover estimate that was suspected on the basis of theoretical considerations could not be confirmed to be significant in this