A COMPARISON OF ESTIMATING FOREST CANOPY HEIGHT INTEGRATING MULTI-SENSOR DATA SYNERGY: A CASE STUDY IN MOUNTAIN AREA OF THREE GORGES

Lixin Dong, Bingfang Wu *

Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing 100101, P. R. China
Datun Road No. 3, P. O. Box 9718, 100101, Beijing, P. R. China, dlx_water@163.com

KEY WORDS: Forest Canopy Height, Lidar, Multisensor Integration, Three Gorges, Spatial Models

ABSTRACT:

Forest canopy height is an important input for ecosystem models and is highly correlated with aboveground biomass at the landscape scale. In this paper, we extract the maximum canopy height from GLAS waveforms in combination with a terrain index in sloped areas where lidar data are present. Where lidar data are not present, optical remote sensing data are used to estimate canopy height over broad regions. We compared four aspatial and spatial methods for estimating canopy height by integrating a large-footprint lidar system (GLAS) and Landsat ETM+: ordinary least squares regression, ordinary kriging, cokriging, and cokriging of regression residuals. The results show that (1) the terrain index helps to extract forest canopy height over a range of slopes, with regression models explaining 51.0% and 84.0% of variance for broadleaf and needle forest respectively; and (2) some improvements were achieved by adding additional remote sensing data sets. The integrated models that cokriged regression residuals were preferable to either the aspatial or the spatial models alone. The integrated modelling strategy is most suitable for estimating forest canopy height at locations unsampled by lidar.

1. INTRODUCTION

Measurements of forest structure are critical for biomass estimation (Sun et al., 2008), biodiversity studies (North et al., 1999), fire modelling (Finney, 1998), carbon stock estimation (Skole & Tucker, 1993), and other applications.
Meanwhile, forest canopy height is an important input for ecosystem models and is highly correlated with biomass (Lefsky et al., 2005). Traditionally, these attributes have been measured in the field using handheld equipment, which is time-consuming and limited in scope when mapping at the landscape scale (Hyde et al. 2006). Passive optical sensors (Landsat TM/ETM+) provide useful structural information in the horizontal plane (Cohen & Spies, 1992) but have difficulty penetrating beyond the upper canopy layers (Weishampel et al., 2000). Full-waveform digitizing, large-footprint lidar provides highly accurate measurements of forest canopy structure in the vertical plane (Nilsson, 1996; Lefsky et al. 1999). However, current lidar sensors have limited coverage in the horizontal plane (Lefsky et al. 2002; Hyde et al. 2006). Because no single technology is currently capable of providing all broad-scale information on vertical structure, there have been several calls for improving the applicability of remote sensing data through multisensor integration (Hudak et al. 2002). Combining information from multiple sensors is a promising way to improve the accuracy of canopy height estimation at landscape scales (Slatton et al., 2001; Wulder et al. 2004).

Lidar-Landsat TM/ETM+ integration has immediate relevance due to the anticipated launches of the Ice, Cloud, and Land Elevation Satellite (ICESat) (Hudak et al. 2002). In this study, our main objective was to estimate canopy height at broad scales by integrating lidar and Landsat TM/ETM+ data at the lidar sample locations. The basic data from GLAS (maximum canopy height) and biophysical attributes from Landsat TM/ETM+ (LAI, forest cover and vegetation indices) were used for the estimation of canopy height. We compared and tested four widely used empirical estimation methods: ordinary least squares (OLS) regression, ordinary kriging (OK), ordinary cokriging (OCK), and the mixed model of cokriging of regression residuals. Our study area lies in the mountain area of the Three Gorges, which is representative of the age and structural classes common in the region.

For forests on level ground, the waveform peaks of the canopy surface and the underlying ground within the footprint are easily separated. Over sloped areas, however, the extent of the waveform increases, which induces errors. Harding and Carabajal (2005) point out that the vertical extent of each waveform increases as a function of the product of the slope and the footprint size, and that returns from both canopy and ground surfaces can occur at the same elevation. In order to extract canopy height effectively in sloped areas, a new technique based on the terrain index algorithm was applied to GLAS data in the Three Gorges.

* Corresponding author. Email: wubf@irsa.ac.cn.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B1. Beijing 2008

2. STUDY AREAS AND DATA PROCESSING

2.1 Study Areas

The Three Gorges area is a key area of the natural protective regions in China. It is located at 106º00′~111º50′E, 29º16′~31º25′N and covers an area of approximately 5.8×10⁴ km². It lies in the lower part of the upper reaches of the Yangtze. To the north is the Daba Mountain and to the south is the Yunnan-Guizhou Plateau. River valleys and flatland account for 4.3% of the reservoir area, hilly areas for 21.7%, and mountainous areas for 74%.

The reservoir area has a typical subtropical monsoon climate, which is rainy, humid, and foggy in fall, warm in winter, and hot and dry in summer. The average annual temperature is about 15ºC~19ºC. The average annual precipitation is 1140 mm~1450 mm and the multi-year average run-off of the local river is 405.6×10⁸ m³. The main vegetation types are evergreen needleleaf forests, deciduous needleleaf forests, evergreen broadleaf forests, mixed forests, brushwood and croplands. Due to agricultural development and human activities, the hilly vegetation and natural vegetation will gradually be replaced by agricultural crops. Forest cover is low in the reservoir area: only 16~17% in the eastern Sichuan section and 25~38% in the western Hubei section. The standing timber structure is simple, consisting mostly of pure forests, with Masson pine occupying 70%, mostly young trees. Statistical data from 2001 show that the total population in the reservoir area was 19,621,200, including an agricultural population of 14,389,300 and a non-agricultural population of 5,231,900.

2.2 Lidar Data and Processing

ICESat is a spaceborne, waveform-sampling lidar system used for measuring and monitoring ice sheet and land topography as well as cloud, atmospheric, and vegetation properties. The Geoscience Laser Altimeter System (GLAS) instrument aboard the ICESat satellite was launched on 12 January 2003. GLAS received waveforms record 1064 nm wavelength laser energy, reflected from an ellipsoidal footprint, as a function of time. The system has 70 m spot footprints spaced at 175 m intervals (http://icesat.gsfc.nasa.gov/intro.html). In this study, GLAS data from L2A (October to November 19, 2003), L3A (October 2004), L3C (February-March 2005), L3D (October to November 2005) and L3F (May-June 2006) were used.

GLAS has many products (GLA01-GLA16). The GLA01 product provides the waveform for each shot. The GLA14 product provides global land surface and canopy elevation; it does not contain the waveform itself, only some parameters derived from it. First, the GLAS waveforms were smoothed using Gaussian filters with a width similar to that of the transmitted laser pulse. The noise levels before the signal beginning and after the signal ending were estimated using a histogram method (Sun et al. 2008). The signal beginning and end were then identified using a noise threshold, which in this paper was set to the noise level plus 4 times the standard deviation. The waveform extent is defined as the vertical distance between the first and last elevations at which the waveform energy exceeds a threshold level (Harding and Carabajal, 2005). Since canopy height is referenced to the ground surface, not to the signal ending, the ground peak in the waveform was found by comparing each bin's value with those of its two neighbouring bins. If the distance between the first peak and the signal ending is greater than the half width of the transmitted laser pulse, the first significant peak found is the ground peak (Sun et al. 2008). The canopy height is defined as the distance between the signal beginning and the ground peak.

Where the surface is flat, the distance between the signal beginning and the ground peak was extracted by the above methods. Over sloped areas, we used the terrain index at the GLAS footprint location to estimate canopy height. The terrain index is defined as the range of ground surface elevations within an n × n sample window applied to the DEM at the footprint location (Lefsky et al., 2005). The following equation was used to estimate forest canopy height in sloped areas:

$$h = b_0\,(w - b_1 g + b_2 l) \qquad \text{(Lefsky et al., 2005)} \tag{1}$$

where $h$ is the measured canopy height, $w$ is the waveform extent in meters, $g$ is the terrain index in meters, $l$ is the extent of the leading edge in meters, $b_0$ is the coefficient applied to the waveform when corrected for the scaled terrain index, $b_1$ is the coefficient applied to the terrain index, and $b_2$ is the coefficient applied to the leading edge. The equation was then fit using the Levenberg-Marquardt algorithm (Craig Markwardt).

2.3 Landsat TM/ETM+

In this study, five Landsat (Enhanced) Thematic Mapper (TM/ETM+) scenes (path/row: 125/39, 126/39, 127/39, 127/38 and 128/39) from 2002 served as the primary data source for estimating several spectral vegetation indices (SVIs). First, geometric correction and atmospheric correction were performed using the Image Geometric Correction and ATCOR modules of the ERDAS image processing software respectively. The images were then rectified to the Gauss-Kruger projection (spheroid: Krasovsky; central meridian: 111º E; central latitude: 0; false easting: 500000 meters; false northing: 0) and resampled using the nearest neighbour algorithm with a pixel size of 30 m × 30 m for all bands. The resultant RMS (root mean square) error was less than 0.5 pixels. Several SVIs were then calculated, including EVI, NDVI, ARVI, MSAVI, SARVI, SAVI and the following vegetation indices:

$$\mathrm{VI1} = \frac{\rho_{b2} - \rho_{b5} - \rho_{b1}}{\rho_{b2} + \rho_{b5} + \rho_{b1}} \tag{2}$$

$$\mathrm{VI2} = \frac{\rho_{b2} - \rho_{b3} - \rho_{b7}}{\rho_{b2} + \rho_{b3} + \rho_{b7}} \tag{3}$$

$$\mathrm{VI3} = \frac{\rho_{b2} \times \rho_{b4}}{\rho_{b7}} \tag{4}$$

$$\mathrm{VI4} = \frac{\rho_{b2} - \rho_{b7} - \rho_{b1}}{\rho_{b2} + \rho_{b7} + \rho_{b1}} \tag{5}$$

$$\mathrm{VI5} = \frac{\rho_{b7}}{\rho_{b2} + 1} \tag{6}$$

where $\rho_{bi}$ is the reflectance of band $bi$.

Ecologically relevant structural attributes such as LAI and forest cover have been estimated from these SVIs. Prior research has shown that the widespread availability, grain, extent, and multispectral features of Landsat imagery make it suitable for a variety of environmental applications at landscape to regional scales.

3. ESTIMATION METHODS

In the kriging estimators used in this section, $Z^*(\mu)$ is the estimate of the primary variable at location $\mu$, and $\lambda_\alpha$ and $\mu_\alpha$ are the weights and locations of the $n$ neighbouring samples respectively. The kriging weights were forced to sum to one (Eq. 11).

First, the canopy height datasets were normalized with a square root transformation (SQRTHT); afterwards, all estimated SQRTHT values were back-transformed before being compared with measured height values (Hudak et al. 2002).
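As a rough illustration of the height retrieval described in Section 2.2 and of the square-root normalisation above, the following Python sketch strings the steps together: threshold-based detection of the signal beginning and end, waveform extent, terrain index, Eq. (1), and the forward and backward transformation. It is a minimal sketch under stated assumptions, not the processing code used in this study. The function names are hypothetical, the noise statistics are taken from the leading bins as a stand-in for the histogram method, the 0.15 m bin size assumes 1 ns waveform sampling, and the coefficients b0, b1 and b2 in the usage example are made-up placeholders, not fitted values.

```python
import math
import statistics


def signal_bounds(waveform, noise_bins, k=4.0):
    """Return (first, last) bin indices whose energy exceeds the noise threshold.

    Assumption: the noise level is estimated from the first `noise_bins`
    samples (a simple stand-in for the histogram method). The threshold is
    the noise mean plus k times the noise standard deviation (k = 4 here).
    """
    noise = waveform[:noise_bins]
    threshold = statistics.mean(noise) + k * statistics.stdev(noise)
    above = [i for i, value in enumerate(waveform) if value > threshold]
    if not above:
        raise ValueError("no signal above the noise threshold")
    return above[0], above[-1]


def waveform_extent(first, last, bin_size_m=0.15):
    """Vertical distance in meters between signal beginning and signal end.

    Assumption: one waveform bin spans 0.15 m (1 ns sampling).
    """
    return (last - first) * bin_size_m


def terrain_index(dem_window):
    """Range of ground elevations (m) within an n x n DEM window."""
    cells = [z for row in dem_window for z in row]
    return max(cells) - min(cells)


def canopy_height(w, g, l, b0, b1, b2):
    """Eq. (1): h = b0 * (w - b1 * g + b2 * l)."""
    return b0 * (w - b1 * g + b2 * l)


def sqrt_transform(heights):
    """Square-root normalisation of canopy heights (SQRTHT)."""
    return [math.sqrt(h) for h in heights]


def back_transform(values):
    """Invert the square-root transformation before comparing with field data."""
    return [v * v for v in values]


if __name__ == "__main__":
    # Toy waveform: flat noise floor, a canopy return, and a ground return.
    waveform = ([1.0, 1.2, 0.8, 1.0, 1.1, 0.9] + [1.0] * 4 + [5.0, 9.0, 6.0]
                + [1.0] * 94 + [4.0, 12.0, 5.0] + [1.0] * 5)
    first, last = signal_bounds(waveform, noise_bins=6)
    w = waveform_extent(first, last)

    # Hypothetical 3 x 3 DEM window (elevations in meters) at the footprint.
    g = terrain_index([[310, 312, 315], [308, 311, 316], [305, 309, 314]])

    # Placeholder leading-edge extent and coefficients, for illustration only.
    h = canopy_height(w, g, l=2.0, b0=0.9, b1=0.5, b2=0.1)
    print("extent (m):", w, "terrain index (m):", g, "height (m):", h)
    print("round trip:", back_transform(sqrt_transform([h])))
```

In practice the coefficients would come from a nonlinear least-squares fit of Eq. (1) to field-measured heights, and the leading-edge extent would be derived from the waveform itself; both are left as inputs here because the text does not specify them further.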
3.1 Ordinary Least Squares (OLS) regression

The OLS multiple regression model takes the general form:

$$Z = \alpha + \sum_{i=1}^{n} \beta_i X_i + \varepsilon \tag{7}$$

where $Z$ is the forest canopy height, $X_i$ is the $i$-th explanatory variable (SVIs, LAI, forest cover, and X and Y location), $\beta_i$ is the linear slope coefficient corresponding to $X_i$, and $\varepsilon$ is the residual error (Kleinbaum et al. 1998).

3.2 Ordinary Kriging (OK)

Kriging interpolates the sample data to estimate values at unsampled locations, based solely on a linear model of regionalization. The linear model of regionalization is essentially a weighting function required to krig and can be graphically represented by a semivariogram. The semivariance $\gamma(h)$ is given by:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{\alpha=1}^{N(h)} \left( z(\mu_\alpha) - z(\mu_\alpha + h) \right)^2 \tag{8}$$

where $\gamma(h)$ is the semivariance as a function of lag distance $h$, $N(h)$ is the number of pairs of data locations separated by $h$, and $z$ is the data value at locations $\mu_\alpha$ and $\mu_\alpha + h$ (Goovaerts, 1997).

In this study, an exponential model was used to describe the nugget, sill, range and shape of the sample semivariogram:

$$\gamma(h) = c \left[ 1 - \exp\left( \frac{-3h}{a} \right) \right] \tag{9}$$

where $a$ is the practical range of the semivariogram and $c$ is the sill.

The OK model estimates a value $Z^*$ at each location $\mu$ and takes the general form:

$$Z^*(\mu) = \sum_{\alpha=1}^{n(\mu)} \lambda_\alpha(\mu)\, Z(\mu_\alpha) \tag{10}$$

with the weights constrained by:

$$\sum_{\alpha=1}^{n(\mu)} \lambda_\alpha(\mu) = 1 \tag{11}$$

3.3 Ordinary CoKriging (OCK)

Cokriging is a multivariate extension of kriging and relies on a linear model of co-regionalization that exploits not only the autocorrelation in the primary variable but also the cross-correlation between the primary variable and a secondary variable. Cokriging can be graphically represented by the cross-semivariogram, defined as:

$$\gamma_{ij}(h) = \frac{1}{2N(h)} \sum_{\alpha=1}^{N(h)} \left( z_i(\mu_\alpha) - z_i(\mu_\alpha + h) \right) \times \left( z_j(\mu_\alpha) - z_j(\mu_\alpha + h) \right) \tag{12}$$

where $\gamma_{ij}(h)$ is the cross-semivariance between variables $i$ and $j$, and $z_i$ and $z_j$ are the data values of variables $i$ and $j$ at locations $\mu_\alpha$ and $\mu_\alpha + h$ respectively. The OCK estimator of $Z^*$ at location $\mu$ takes the form:

$$Z^*(\mu) = \sum_{\alpha_1=1}^{n_1(\mu)} \lambda_{\alpha_1}(\mu)\, Z_1(\mu_{\alpha_1}) + \sum_{\alpha_2=1}^{n_2(\mu)} \lambda_{\alpha_2}(\mu)\, Z_2(\mu_{\alpha_2}) \tag{13}$$

where the subscripts 1 and 2 denote the primary and secondary variables. The traditional OCK operates under two nonbias constraints:

$$\sum_{\alpha_1=1}^{n_1(\mu)} \lambda_{\alpha_1}(\mu) = 1, \qquad \sum_{\alpha_2=1}^{n_2(\mu)} \lambda_{\alpha_2}(\mu) = 0 \tag{14}$$

3.4 Integrated

Residuals from the OLS regression models were exported and imported into ArcGIS software for kriging/cokriging. The same rules and procedures were followed for modelling the residuals as for modelling the SQRTHT data.

4. RESULTS AND DISCUSSION

In this paper, needle forest and broadleaf forest were classified separately for the estimation of canopy height. The accuracy of the classification was validated by our fieldwork.

4.1 Estimation of Plot Maximum Canopy Height

Regression was used to estimate maximum canopy height as a function of waveform extent and the 3×3 terrain index. The regression models for each forest type are:

$$H_{Needle} = -1.379 + 0.702\,(w + 0.0028 g + 0.104 l) \tag{15}$$

$$H_{Broadleaf} = 14.716 + 0.316\,(w - 0.0127 g - 3.386 l) \tag{16}$$

Forest type   R2     Std.    F        Sig.   Counts
Needle        .840   2.635   17.517   .000   14
Broadleaf     .510   4.264   2.772    .111   12
All           .530   3.913   9.762    .000   30

Table 1. Regressions relating Waveform Extent and Terrain Index to field measured maximum canopy height

[Figure 1 panels: Needle Forest, terrain index model (R Sq Linear = 0.675); Broadleaf Forest, terrain index model (R Sq Linear = 0.673). Axes: Model Predicted Canopy Height (m) vs. Field Measured Canopy Height (m).]

Figure 1. Observed maximum canopy height Vs. estimates of the same, for needle and broadleaf forest

When all forest sites were considered in a single regression, the resulting equation explained 53.0% of variance with a Std. of 3.913. However, individual sites had clear biases. Regression equations explained between 51.0% and 84.0% of variance for each forest type (Fig. 1 and Table 1). Comparing the observed maximum canopy height with the estimates, the R2 of needle forest is 67.5%, which is greater than that of broadleaf forest (67.3%).

4.2 Results of OLS

In this study, multiple regression models were developed for needle forest and broadleaf forest respectively. The sample size of Landsat TM/ETM+ is 60 m, which is similar to the GLAS footprint size. The regression equations are:

$$\begin{aligned}
H_{needle} ={} & 39.118 - 7.0\times10^{-7}\,X(GK) - 1\times10^{-5}\,Y(GK) + 2.65\,\mathrm{LAI} + 48.482\,\mathrm{ARVI} - 0.001\,\mathrm{EVI} \\
& + 1.532\,\mathrm{MSAVI} + 0.032\,\mathrm{NDVI} - 20.311\,\mathrm{SARVI} + 8.771\,\mathrm{SAVI} + 35.685\,\mathrm{VI1} + 5.619\,\mathrm{VI2} \\
& + 0.029\,\mathrm{VI3} - 15.311\,\mathrm{VI4} + 5.701\,\mathrm{VI5} - 3.85\,\mathrm{FC}
\end{aligned} \tag{17}$$

$$\begin{aligned}
H_{broadleaf} ={} & -81.368 - 2\times10^{-7}\,X(GK) + 5.7\times10^{-5}\,Y(GK) + 2.875\,\mathrm{LAI} + 125.038\,\mathrm{ARVI} + 0.001\,\mathrm{EVI} \\
& - 263.005\,\mathrm{MSAVI} + 0.083\,\mathrm{NDVI} - 113.512\,\mathrm{SARVI} + 151.085\,\mathrm{SAVI} - 11.133\,\mathrm{VI1} + 157.214\,\mathrm{VI2} \\
& - 0.205\,\mathrm{VI3} - 52.567\,\mathrm{VI4} - 22.557\,\mathrm{VI5} + 1.258\,\mathrm{FC}
\end{aligned} \tag{18}$$

where FC is the forest cover, and X(GK) and Y(GK) are the x and y locations in the Gauss-Kruger projection. These models explained between 55.8% and 63.4% of the variance in the study area (Table 2).

Forest type   R2     Std.    F       Sig.   Counts
Needle        .558   3.323   5.556   .000   82
Broadleaf     .634   3.053   4.973   .000   59

Table 2. Test of OLS Regression Equations

The map of forest canopy height based on the OLS model is shown in Figure 6(a). The canopy height predicted by OLS was compared with the canopy height measured in the field (Fig. 2).

[Figure 2 panels: Needle Forest (R2 = 0.692); Broadleaf Forest (R2 = 0.5062). Axes: Field Measured Height vs. Model Predicted Height.]

Figure 2. Measured canopy height Vs. predicted canopy height

To compare the results for the two forest types, we fitted a linear model to the predicted and measured heights. For needle forest, the R2 is 69.2%, which is greater than that of broadleaf forest (50.62%). The regression model results preserved the actual vegetation pattern but underestimated taller canopies and overestimated shorter canopies.

4.3 Results of OK/COK

First, the canopy height data were checked (Fig. 3). The figure shows that the data follow a normal distribution. Hence, we applied the OK and OCK models in the ArcGIS software. The results of the OK and OCK models are illustrated in Figure 4.

[Figure 3 panels: (a) Data Histogram; (b) QQPlot]

Figure 3. normal distribution of canopy height data

[Figure 4 panels: (a) Kriging; (b) CoKriging]

Figure 4. canopy height of OK and OCK models

[Figure 5 panels: Needle Forest + Kriging; Needle Forest + CoKriging; Broadleaf Forest + Kriging; Broadleaf Forest + CoKriging. R2 values shown in the panels: 0.4738, 0.4923, 0.5008, 0.5955. Axes: Field Measured Height vs. Model Predicted Height.]

Figure 5. Compare of the results of OK and OCK models

Figures 4 and 5 show that the results of OK are similar to those of the OCK model. However, when the predicted canopy height is compared with the field-measured canopy height, the precision of the OCK model is better than that of OK. The R2 of needle forest is nevertheless less than that of broadleaf forest. Cokriging proved slightly more accurate than kriging. The spatial models, kriging and cokriging, produced more biased results than regression and reproduced the vegetation pattern poorly. This may be related to the distribution of lidar sampling points.

4.4 Results of integration model

Because the precision of OCK is greater than that of OK, only the integration model 'OCK+regression' is discussed here. Residuals from the OLS regression were imported into cokriging. The result is illustrated in Figure 6(b).

[Figure 6 panels: (a) Regression; (b) CoKriging+Regression]

Figure 6. canopy height of integration model

[Figure 7 panels: Needle Forest+CoKriging+Regression (R2 = 0.6444); Broadleaf Forest+CoKriging+Regression (R2 = 0.6095). Axes: Field Measured Height vs. Model Predicted Height.]

Figure 7. Measured canopy height Vs. predicted canopy height

Comparing the field-measured canopy height with the predicted canopy height, the R2 of needle forest is 64.44%, which is greater than that of broadleaf forest (60.95%). These results are better than those of OK/OCK. The R2 of broadleaf forest is also greater than that of OLS (50.62%). However, the R2 of needle forest is less than that of OLS (69.2%). This is related to the distribution of lidar sampling points. An equitable distribution of lidar sampling points proved critical for efficient lidar-Landsat TM/ETM+ integration (Hudak et al. 2002).

4.5 Discussion

The results from this study confirm that forest height can be estimated in sloped areas using GLAS waveforms in combination with the terrain index. Regression equations explained 51.0% and 84.0% of variance for broadleaf and needle forest respectively. These results indicate that the terrain index helps to extract forest canopy height over a range of slopes.

Integration of GLAS and Landsat TM/ETM+ data using empirical modelling procedures can improve the utility of both datasets for forestry applications. In this study, four integration techniques (OLS, OK, OCK and OCK+OLS models) were compared. Overall, the integrated technique of ordinary cokriging of the height residuals from an OLS regression model proved the best method for estimating forest canopy height. Future work will aim to improve the accuracy of the canopy height estimates and to test the integration models in sloped areas once lidar sample data become readily available.

REFERENCES

Craig Markwardt, http://cow.physics.wisc.edu/~craigm/idl/idl.html.

Cohen, W.B., Spies, T.A., 1992. Estimating structure attributes of Douglas-fir/western hemlock forest stands from Landsat and SPOT imagery. Remote Sensing of Environment, 41, pp.1-17.

Finney, M.A., 1998. FARSITE: Fire area simulator - model development and evaluation. (RP-4):1-+.

Goovaerts, P., 1997. Geostatistics for natural resources evaluation. New York: Oxford University Press, pp.483-484.

Hudak, A.T., Lefsky, M.A., Cohen, W.B., Berterretche, M., 2002. Integration of lidar and Landsat ETM+ for estimating and mapping forest canopy height. Remote Sensing of Environment, 82, pp.397-416.

Hyde, P., Dubayah, R., Walker, W., Blair, J.B., Hofton, M., Hunsaker, C., 2006. Mapping forest structure for wildlife habitat analysis using multi-sensor (LiDAR, SAR/INSAR, ETM+, Quickbird) synergy. Remote Sensing of Environment, 102, pp.63-73.
Kleinbaum, D.G., Kupper, L.L., Muller, K.E., Nizam, A., 1998. Applied regression analysis and other multivariable methods (3rd ed.). Pacific Grove, CA: Duxbury Press.

Lefsky, M.A., Harding, D., Cohen, W.B., Parker, G.G., Shugart, H.H., 1999. Surface lidar remote sensing of basal area and biomass in deciduous forests of eastern Maryland, USA. Remote Sensing of Environment, 67, pp.83-98.

Lefsky, M.A., Harding, D.J., Keller, M., Cohen, W.B., 2005. Estimations of forest canopy height and aboveground biomass using ICESat. Geophysical Research Letters, 32, L22S02, doi:10.1029/2005GL023971.

Nilsson, M., 1996. Estimation of tree heights and stand volume using an airborne lidar system. Remote Sensing of Environment, 56, pp.1-7.

North, M.P., Franklin, J.F., Carey, A.B., Forsman, E.D., Hamer, T., 1999. Forest stand structure of the northern spotted owl's foraging habitat. Forest Science, 45(4), pp.520-527.

Skole, D., Tucker, C., 1993. Tropical deforestation and habitat fragmentation in the Amazon: Satellite data from 1978 to 1988. Science, 260, pp.1905-1909.

Slatton, K.C., Crawford, M.M., Evans, B.L., 2001. Fusing interferometric radar and laser altimeter data to estimate surface topography and vegetation height. IEEE Transactions on Geoscience and Remote Sensing, 37(5), pp.2620-2624.

Sun, G., Ranson, K.J., Masek, A., Fu, A., Wang, D., 2008. Predicting tree height and biomass from GLAS data. Science, 260, pp.1905-1909.

Weishampel, J.F., Blair, J.B., Knox, R.G., Dubayah, R., Clark, D.B., 2000. Volumetric lidar return patterns from an old-growth tropical rainforest canopy. International Journal of Remote Sensing, 21(2), pp.409-415.

Wulder, M., Hall, R., Coops, N.C., Franklin, S., 2004. High spatial resolution remotely sensed data for ecosystem characterization. Bioscience, 54(6), pp.511-521.

http://icesat.gsfc.nasa.gov/intro.html

ACKNOWLEDGEMENTS

This work was supported by the Three Gorges Project Construction Committee of the State Council Project 'The Dynamic and Real-time Monitoring of Ecology and Environment in Three Gorges Project' (SX 2002-004). The authors thank the Distributed Active Archive Center of the National Snow and Ice Data Center (NSIDC) for providing the GLAS data and for much help.