Linear Mixture Modeling with Autocorrelated Errors

Jayantha Ediriwickrema¹, Siamak Khorram², Marcia Gumpertz³ and John Brockhaus⁴

Abstract.-The linear mixture model assumes that model errors are spatially uncorrelated. However, spatial continuity exists in most geographical data: data values close to each other are likely to have similar spectral characteristics. In remotely sensed data especially, spatial autocorrelation is present among pixels. Residuals of the Advanced Very High Resolution Radiometer data are examined for any deviation from the randomness assumption. The correlation among the residuals of neighboring pixels and Moran's I statistic demonstrate that the non-random distribution of the residuals is significant. This study develops an autocorrelated error model to calibrate the linear mixture model. Seven different autocorrelation patterns are considered, and for each pattern the linear mixture model is calibrated and land use/land cover class fractions are estimated from the Advanced Very High Resolution Radiometer data.

INTRODUCTION

The effects of spatial autocorrelation on unmixing mixed pixels by the linear mixture model (LMM) are not clearly understood. Autocorrelated errors are common in remotely sensed data (Campbell and Kiiveri 1988). Campbell and Kiiveri (1988) described the point spread function of the sensor and atmospheric effects as causes of the spatial autocorrelation among neighboring pixels. The radiance values of neighboring pixels may show similar spectral characteristics due to scattering and similar land use/land cover (LU/LC) class patterns. These effects become more apparent as the spatial resolution becomes coarser. The autocorrelated structure of remotely sensed data is widely used as additional knowledge in image classification.
Belward (1992) used spatial attributes, namely contextual information and autocorrelated errors, to evaluate the Advanced Very High Resolution Radiometer (AVHRR) data in environmental monitoring. Campbell and Kiiveri (1988) discussed the advantages of using the spatial relations of neighboring pixels in image classification, but they did not observe a significant advantage from spatial autocorrelation when the spectral class separations were high. Furthermore, no significant improvement was noticed in overall accuracy when spatial autocorrelations were included in the classification process.

In the empirical method of unmixing mixed pixels by the LMM, calibration coefficients are estimated by multivariate regression analysis. The calibration coefficients are usually estimated assuming the model errors are spatially uncorrelated with constant variance. Iverson et al. (1989) used multivariate regression analysis to develop an empirical relationship between the AVHRR data and the forest cover. Pech et al. (1986) examined multivariate calibration methods to estimate vegetation cover. In all these studies, the model errors were assumed to be spatially uncorrelated with constant variance. Multivariate regression models with autocorrelated errors are common in statistics. Nevertheless, the spatial autocorrelation effects are usually assumed insignificant in unmixing mixed AVHRR pixels.

¹ Graduate Assistant, Computer Graphics Center, NCSU, Raleigh, NC.
² Professor and Director, Computer Graphics Center, NCSU, Raleigh, NC, and Dean of International Space University, Parc d'Innovation, Communaute Urbaine de Strasbourg, Blvd. Gonthier d'Andernach, 67400 Illkirch, France.
³ Associate Professor, Department of Statistics, NCSU, Raleigh, NC.
⁴ Assistant Professor, Mapping, Charting and Geodesy, Department of Geography and Environmental Engineering, United States Military Academy, West Point, NY.
Autocorrelated error models are widely used for spatially correlated data to account for the interaction between neighboring locations. Upton and Fingleton (1985) described a generalized least squares (GLS) regression method for spatially autocorrelated models. Models with normally distributed, spatially autoregressive errors fall into two schemes: simultaneous and conditional autoregressive schemes. The autocorrelation parameters in this study were estimated by a maximum likelihood method described by Upton and Fingleton (1985), based on a simultaneous autoregressive scheme. Seven spatial autocorrelation patterns were compared with the model that assumed errors were spatially uncorrelated.

METHOD

The study area, the North Carolina Piedmont, showed a wide variation of LU/LC patterns. The LU/LC class patterns ranged from developed urban areas to various distributions of residential, agricultural, and forest land. The frequency of local variation was high within this area. A significant amount of autocorrelation was anticipated among the AVHRR pixels, especially due to mixed pixel effects.

Data

Classified Landsat Thematic Mapper (TM) data and NOAA AVHRR data corresponding to the North Carolina Piedmont data set used by Khorram et al. (1994) were used in this study. Two sets of ASCII data files were developed in this process. One set was developed from the classified Landsat TM data and the other set was derived from the AVHRR data. The files created from the Landsat TM data described the LU/LC class fractions within each one square km area corresponding to the AVHRR pixels. The ASCII files created from the AVHRR data contained digital number values in 14 spectral bands (Ediriwickrema 1995).

The LMM with Autocorrelated Errors

The LMM assumed that there were no interactions between the six LU/LC classes. It also assumed that the resultant radiation of each pixel could vary only due to random noise.
Correlation between AVHRR pixel values and atmospheric errors was also considered insignificant. Under these assumptions, the composite radiation of mixed pixels was related to the component LU/LC class fractions and to the spectral values of the LU/LC class pure pixels by Eq. 1. A pure pixel was defined as a pixel that was totally covered by only one defined LU/LC class.

$$A_{jk} = \sum_{i=1}^{c} F_{ji} R_{ik} + \varepsilon_{jk} \qquad (1)$$

where $A_{jk}$ = pixel value of the jth AVHRR pixel in band k, $F_{ji}$ = fraction of the ith LU/LC class within the jth AVHRR pixel, $R_{ik}$ = calibration coefficient of the ith LU/LC class in band k, $\varepsilon_{jk}$ = random noise of the jth AVHRR pixel in band k, and c = the number of LU/LC classes.

The error term $\varepsilon_{jk}$ was assumed to be spatially correlated. The Gauss-Markov conditions of OLS regression are not satisfied when the residuals are spatially correlated. One model for spatially correlated errors is the simultaneous autoregressive error model, in which the error for one pixel depends upon the errors of neighboring pixels. The error vector $\varepsilon$ for one band having n pixels is

$$\varepsilon = \rho W \varepsilon + u$$

(Upton and Fingleton 1985), which gives

$$u_{n\times 1} = (I_{n\times n} - \rho W_{n\times n})\,\varepsilon_{n\times 1}$$

where $\rho$ is a constant autocorrelation parameter, W is a proximity matrix in which each row indicates which pixels are neighbors in the autocorrelation scheme, u is an error vector that satisfies the Gauss-Markov conditions, and I is the identity matrix. Since the elements of u are uncorrelated, the simultaneous autoregressive model can be fitted by OLS after premultiplying both sides by $(I - \rho W)$. In other words, $A^{*}_{n\times b} = F^{*}_{n\times c} R_{c\times b} + u_{n\times b}$, where $A^{*}_{n\times b} = (I_{n\times n} - \rho W_{n\times n}) A_{n\times b}$ and $F^{*}_{n\times c} = (I_{n\times n} - \rho W_{n\times n}) F_{n\times c}$. The estimates of the calibration coefficients ($\hat{R}_{c\times b}$) are then the OLS estimates for the transformed model:

$$\hat{R}_{c\times b} = (F^{*T} F^{*})^{-1} F^{*T} A^{*}$$

A reasonable $\rho$ value for each spectral band was required to estimate $R_{c\times b}$. In the simultaneous autoregression, $\rho$ was chosen such that it maximized the log likelihood function ln(L) in Eq. 4,

$$\ln(L) = \text{constant} - \frac{n}{2}\ln(\hat{\sigma}^{2}) + \ln\lvert I - \rho W\rvert \qquad (4)$$

where $\hat{\sigma}^{2} = \hat{u}^{T}\hat{u}/n$ and $\hat{u} = (I_{n\times n} - \rho W_{n\times n}) A_{n\times 1} - (I_{n\times n} - \rho W_{n\times n}) F_{n\times c}\hat{R}_{c\times 1}$ (Upton and Fingleton 1985).
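The premultiplication step described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fit_sar_ols`, the one-dimensional chain of 12 pixels, the synthetic fractions, and the value 0.4 for the autocorrelation parameter are all hypothetical, chosen only to show that OLS on the transformed data recovers the calibration coefficients.

```python
import numpy as np

def fit_sar_ols(A, F, W, rho):
    """OLS on data premultiplied by (I - rho*W); returns coefficients and residuals."""
    n = W.shape[0]
    T = np.eye(n) - rho * W          # transformation matrix (I - rho*W)
    A_star = T @ A                   # transformed pixel values, n x b
    F_star = T @ F                   # transformed class fractions, n x c
    R_hat, *_ = np.linalg.lstsq(F_star, A_star, rcond=None)
    U_hat = A_star - F_star @ R_hat  # residuals of the transformed model
    return R_hat, U_hat

# Hypothetical toy setup: 12 pixels in a single row, 3 classes, 2 bands.
n, c = 12, 3
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = 1.0                # right-hand neighbor
W[idx + 1, idx] = 1.0                # left-hand neighbor
W /= W.sum(axis=1, keepdims=True)    # scale each row to sum to one

rng = np.random.default_rng(0)
F = rng.dirichlet(np.ones(c), size=n)                    # fractions sum to one per pixel
R_true = np.array([[10.0, 40.0], [20.0, 50.0], [30.0, 60.0]])
A = F @ R_true                                           # noise-free pixel values

R_hat, U_hat = fit_sar_ols(A, F, W, rho=0.4)
print(np.round(R_hat, 6))
```

Because the toy data contain no noise, the fitted coefficients match `R_true` for any admissible value of the autocorrelation parameter; with real data the estimate depends on the value chosen for each band.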
In Eq. 4 the determinant $\lvert I - \rho W\rvert$ is not easy to compute for large data sets. Upton and Fingleton (1985) describe formulas that avoid the computational complications as well as a way to compute $\lvert I - \rho W\rvert$ for large data sets. These formulas are applicable to only one type of proximity matrix, the Rook's case (Upton and Fingleton 1985). Therefore, to reduce the computational burden within the available system resources, a subset of 20 columns by 20 rows was selected from the upper half of the image. This subset was used to calibrate the LMM. A subset of similar size was also selected from the lower half of the image to estimate LU/LC class fractions and thereby examine the effects of autocorrelated errors in the LMM.

Weightings

The proximity matrix (W) is constructed according to the autocorrelation pattern. Upton and Fingleton (1985) described three weightings named Rook, Bishop and Queen. The Rook weighting considers only adjacent pixels within a row or column to be neighbors. The Bishop weighting considers only the cells that touch at the corners to be neighbors, while the Queen weighting considers all surrounding cells, i.e., those touching at corners and sides, to be neighbors. In this study, seven different weighting structures were considered (Ediriwickrema 1995).

Proximity matrix

In the proximity matrix, the cells that account for contiguity are assigned one, and the remaining cells are assigned zero. Each row in the proximity matrix is divided by its total so that the row totals are all equal to one. For example, a lattice of 3 x 3 pixels results in a 9 x 9 proximity matrix. The proximity matrix for the Rook's case weighting is similar to the one shown in Ediriwickrema (1995). In that proximity matrix, the boundary pixels are weighted differently from the other pixels. Only the available surrounding pixels are accounted for; there are many other possibilities for weighting pixels on the edges.
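The construction just described can be illustrated with a short Python sketch for the Rook's case on the 3 x 3 lattice. The function name `rook_proximity` and the row-major pixel ordering are assumptions made for this example; the edge handling follows the text (only the available neighbors are counted), which is one of several possible choices.

```python
import numpy as np

def rook_proximity(n_rows, n_cols):
    """Row-standardized Rook's-case proximity matrix for an n_rows x n_cols lattice.

    Pixels are numbered row by row (an assumption of this sketch). Neighbors
    get a one, all other cells a zero, and each row is divided by its total.
    """
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for col in range(n_cols):
            i = r * n_cols + col
            # Rook moves: up, down, left, right
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, col + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    W[i, rr * n_cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

W = rook_proximity(3, 3)
print(W.shape)            # (9, 9)
print(W[0, 1], W[1, 0])   # corner pixel weight vs. edge pixel weight
```

In this sketch a corner pixel has two neighbors (weights of 1/2), an edge pixel has three (weights of 1/3), and the center pixel has four (weights of 1/4), so the scaled matrix is not symmetric.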
When the pixels are not all weighted with the same number of neighbors, the scaling results in an asymmetric W matrix.

Analysis of Spatial Continuity in the Model Residuals

The spatial autocorrelation structure of model residuals is commonly examined by semivariogram plots of the data, h-scatter plots of the errors (Isaaks and Srivastava 1989), and Moran's I statistic (Upton and Fingleton 1985). The residuals from the regression of AVHRR data on the LU/LC class fractions using ordinary least squares (OLS) were used to analyze the spatial continuity in the model. Ediriwickrema (1995) describes seven different spatial autocorrelation patterns. For each spectral band and each weighting, the correlation coefficient between the pixel residuals from an OLS fit of the LMM and the residuals of their neighboring pixels (as defined by that weighting) was calculated. The significance of the spatial autocorrelation of the OLS residuals was tested using Moran's I statistic.

Autocorrelation parameters

The magnitude and the sign of the autocorrelation parameters measure the scale and the direction of spatial continuity. The proximity matrices are commonly designed with row totals equal to one, which restrains $\rho$ to lie between -1 and 1. Haining (1990) pointed out that $I - \rho W$ is singular when $\rho$ is an eigenvalue of $W^{-1}$, so $\rho$ must lie between $1/\lambda_{min}$ and $1/\lambda_{max}$, where $\lambda_{max}$ is the largest eigenvalue of W and $\lambda_{min}$ is the smallest eigenvalue of W. As described in the section "The LMM with Autocorrelated Errors," the $\rho$ value for each band was calculated such that it maximized ln(L). In this study, the log likelihood function ln(L) was evaluated for $\rho$ values ranging from -1 to 1 with an increment of 0.01. The $\rho$ value corresponding to the maximum ln(L) was selected as the autocorrelation parameter estimate of the respective spectral band.

Calibration of the LMM

Calibration coefficients were estimated by Eq. 5 for each spectral band. For each spectral band, six coefficients were estimated, one for each LU/LC class.
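The grid search over the autocorrelation parameter described above can be sketched in Python as follows. Several things here are assumptions for illustration only: the function names, the one-dimensional chain of 100 pixels standing in for the 20 x 20 subset, the simulated data with a true parameter of 0.6, and the concentrated form of the log likelihood (additive constants dropped). The endpoints -1 and 1 are left out of the grid because I - rho*W is singular at rho = 1 for a row-standardized W, so the sketch searches from -0.99 to 0.99.

```python
import numpy as np

def chain_proximity(n):
    """Row-standardized proximity matrix for n pixels in a single row (illustrative)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def concentrated_loglik(a, F, W, rho):
    """ln(L) up to a constant: -(n/2) ln(sigma^2) + ln|I - rho*W| for one band."""
    n = a.shape[0]
    T = np.eye(n) - rho * W
    a_star, F_star = T @ a, T @ F
    r_hat, *_ = np.linalg.lstsq(F_star, a_star, rcond=None)
    u_hat = a_star - F_star @ r_hat
    sigma2 = u_hat @ u_hat / n
    _, logdet = np.linalg.slogdet(T)   # log of |det(I - rho*W)|, numerically stable
    return -0.5 * n * np.log(sigma2) + logdet

def estimate_rho(a, F, W):
    """Evaluate ln(L) on a 0.01 grid and return the rho with the largest value."""
    grid = np.linspace(-0.99, 0.99, 199)
    values = [concentrated_loglik(a, F, W, rho) for rho in grid]
    return float(grid[int(np.argmax(values))])

# Simulated single-band data with spatially autocorrelated errors (hypothetical).
rng = np.random.default_rng(0)
n, c = 100, 3
W = chain_proximity(n)
F = rng.dirichlet(np.ones(c), size=n)
r_true = np.array([10.0, 20.0, 30.0])
u = rng.normal(scale=1.0, size=n)
eps = np.linalg.solve(np.eye(n) - 0.6 * W, u)   # eps = (I - 0.6 W)^{-1} u
a = F @ r_true + eps

rho_hat = estimate_rho(a, F, W)
print(rho_hat)
```

A direct determinant through `slogdet` is workable at this size; for much larger lattices the shortcut formulas cited from Upton and Fingleton (1985) would be the relevant alternative.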
Each time the calibration coefficients were estimated for a band, the residual errors ($\hat{u}_{n\times 1}$) were also calculated from the same data. Finally, the overall calibration coefficient matrix ($\hat{R}_{c\times b}$) and the residual error matrix ($\hat{u}_{n\times b}$) were assembled. The variance-covariance matrix of the residual errors was calculated by $\hat{\Sigma}_{b\times b} = \hat{u}^{T}_{b\times n}\hat{u}_{n\times b}/n$. In order to compare the autocorrelated models with the model that assumed errors are spatially uncorrelated, calibration coefficients were also estimated separately under the assumption that the model errors were spatially uncorrelated.

Estimation of the LU/LC Class Fractions from the AVHRR Data

The North Carolina Piedmont data set was divided into two portions. From the upper portion, the subset ($A_{n\times b}$) was selected to calibrate the LMM. The subset ($B_{m\times b}$) was selected from the second portion to estimate the LU/LC fractions ($G_{m\times c}$) and assess the autocorrelated errors in the LMM:

$$B_{m\times b} = G_{m\times c} Q_{c\times b} + w_{m\times b}$$

where $B_{m\times b}$ = digital values of m AVHRR pixels in b spectral bands, $Q_{c\times b}$ = calibration coefficients of c LU/LC classes in b spectral bands, and $w_{m\times b}$ = residual error matrix. Letting $Q_{c\times b} = \hat{R}_{c\times b}$ and assuming the measurement errors associated with $\hat{R}_{c\times b}$ are insignificant, the regression model can be written as $B_{m\times b} = G_{m\times c}\hat{R}_{c\times b} + w_{m\times b}$, which can also be rewritten by taking the transpose as $B^{T}_{b\times m} = \hat{R}^{T}_{b\times c} G^{T}_{c\times m} + w^{T}_{b\times m}$.

We propose a simple model incorporating spatial correlation within and between spectral bands. Let $b = \mathrm{vec}(B^{T}) = [b_1^{T}\; b_2^{T}\; b_3^{T} \ldots b_{14}^{T}]^{T}$, where $b_k$ is an m x 1 vector of AVHRR values for the kth spectral band. Let $\Sigma = \{\sigma_{kl}\}$ denote the covariance matrix among bands within a pixel. Let $C_k = (I - \rho_k W)^{-1}$ denote the square root matrix of the spatial autocorrelation for band k.
Then let the covariance matrix for all bands and all pixels be

$$\mathrm{Var}(b) = \left[\sigma_{kl}\, C_k C_l^{T}\right]_{k,l = 1,\ldots,14}$$

This covariance model says that the covariance between two bands in two different pixels is the product of the spatial variance components for each of the two bands and the within-pixel covariance $\sigma_{kl}$ between bands k and l. The estimated GLS estimator of the LU/LC class fractions for the jth pixel is then

$$\hat{g}_j = (\hat{R}\,\hat{\Sigma}_j^{-1}\hat{R}^{T})^{-1}\hat{R}\,\hat{\Sigma}_j^{-1} b_j$$

where $\hat{\Sigma}_j$ is a 14 x 14 matrix of estimates of Var(b) corresponding to the jth pixel, with the elements of $\Sigma$ estimated by $\hat{\Sigma}_{b\times b} = \hat{u}^{T}_{b\times n}\hat{u}_{n\times b}/n$, and $b_j$ is a 14 x 1 vector of AVHRR values for the jth pixel. The LU/LC class fractions were found to be best estimated by the constrained least squares regression method (Ediriwickrema 1995). The same method was used in this study to estimate the LU/LC class fractions from the AVHRR data. The LU/LC class fractions were estimated separately for each autocorrelation pattern and under the assumption that errors were spatially uncorrelated (Ediriwickrema 1995).

Validation

To compare the autocorrelated error model with the OLS model, the residual sum of squares was calculated for four sample areas. From the upper left of each quadrant, a 20 x 20 sample area was selected.

RESULTS

Correlation of Model Residuals

The correlation of the model residuals among the neighboring pixels for each weighting is shown in Figure 1. Except for bands 11 and 12, all bands generally show significant correlation among the neighboring pixels in all seven spatial autocorrelation patterns. The horizontal and vertical spatial autocorrelation patterns resulted in higher correlation than the diagonal spatial patterns.

Figure 1.-Correlation of model residuals among neighboring pixels in each weighting and each spectral band (horizontal axis: weighting).
Significance of Autocorrelated Errors

Except for the z-statistics of bands 11 and 12 in correlation patterns "b", "d", "e" and "f" (Ediriwickrema 1995), all z-statistics are greater than the upper one-tail critical value at the 95% confidence level, 1.645 (see Figure 2). The z-statistic values of all other spectral bands in all autocorrelation patterns are clearly greater than 1.645.

Figure 2.-z-statistics of each AVHRR band in each autocorrelated pattern (horizontal axis: band, B1 to B14).

Validation of the Autocorrelated Error Model

The difference in the sum of squares of the residuals between the OLS and the autocorrelated error model is shown in Figure 3. The residual sum of squares was reduced in the autocorrelated error model for most of the bands in all four quadrants. The spectral bands acquired in late June and in mid July show remarkable improvement. Except for bands five and six in the upper right quadrant, all bands in all quadrants show an improvement with the autocorrelated error model. The magnitude of the improvement depends on the spectral band and the distribution of the LU/LC class patterns. The data acquired in the infrared band showed greater improvement than the data acquired in the visible band.

Figure 3.-Difference of sum of squares of the residuals between the OLS and the autocorrelated error model.

CONCLUSION

The AVHRR data have autocorrelated errors. Moran's I statistics and the correlation coefficients among neighboring pixels in different spatial patterns clearly demonstrated the spatial association of the AVHRR data. The spatial autocorrelation parameters were evaluated, and the errors were corrected, separately for each spectral band. The magnitude and the sign of the spatial autocorrelation parameters depended on many factors: the size of the pixel, the spectral band of the image, the direction of the autocorrelation pattern, and the shape and size of the region.
Wilson (1992) observed lower autocorrelation with the adjacent diagonal pixels than with the adjacent vertical and horizontal pixels. In this study also, models using diagonal pixels for the autocorrelation patterns showed the least autocorrelation. The Rook and Bishop models further confirmed this observation. The decrease in the sum of squares of residuals in the autocorrelated error model demonstrated the advantage of accounting for the autocorrelated errors. The magnitude of the improvement depends on the spectral band, the acquisition period, and the distribution of the LU/LC class patterns. This study identified the advantage of analyzing the effects of autocorrelated errors in unmixing mixed pixels using simulated data with adequate background knowledge. The background knowledge is important for understanding and interpreting the effects of autocorrelated errors in unmixing mixed pixels.

REFERENCES

Belward, A.S. (1992). Spatial Attributes of AVHRR Imagery for Environmental Monitoring. International Journal of Remote Sensing, 13(2): 193-208.

Campbell, N.A., and H.T. Kiiveri. (1988). Neighbor Relations and Remotely Sensed Data, Internal Report. Wembley 6014, Western Australia: Division of Mathematics and Statistics, CSIRO, pp. 23.

Ediriwickrema, D.J. (1995). Modeling and Analysis of AVHRR Data for Biogenic Emission Inventory System (BEIS). Ph.D. Dissertation. Raleigh: North Carolina State University, Raleigh, NC 27695.

Haining, R. (1990). Spatial Data Analysis in the Social and Environmental Sciences. First ed. New York: Cambridge University Press.

Isaaks, E.H., and R.M. Srivastava. (1989). An Introduction to Applied Geostatistics. New York: Oxford University Press.

Iverson, L.R., E.A. Cook, and R.L. Graham. (1989). A Technique for Extrapolating and Validating Forest Cover Across Large Regions. International Journal of Remote Sensing, 10(11): 1805-1812.

Khorram, S., L. Allen, J. Aanstoos, J. Brockhaus, A. Sampson, H.M. Cheshire, and D.J. Ediriwickrema. (1994). Vegetation Canopy and Land Use Characterization, Report No. #68D20009: Final Report Submitted to U.S. Environmental Protection Agency, Research Triangle Park, NC 27711. Raleigh, NC 27695-7106: Computer Graphics Center, North Carolina State University, pp. 54.

Pech, R.P., A.W. Davis, R.R. Lamacraft, and R.D. Graetz. (1986). Calibration of LANDSAT Data for Sparsely Vegetated Semi-Arid Rangelands. International Journal of Remote Sensing, 7(12): 1729-1750.

Upton, G.J.G., and B. Fingleton. (1985). Spatial Data Analysis by Example. Vol. 1. Chichester, New York: John Wiley & Sons, 2 vols.

Wilson, J.D. (1992). A Comparison of Procedures for Classifying Remotely-Sensed Data Using Simulated Data Sets Incorporating Autocorrelations Between Spectral Responses. International Journal of Remote Sensing, 13(14): 2701-2725.