Tree height estimation using a stochastic height-diameter relationship

M. Barrio¹, U. Diéguez-Aranda¹, F. Castedo², J.G. Álvarez-González¹ and A. Rojo¹

Unidade de Xestión Forestal Sostible. Web: http://www.lugo.usc.es/uxfs/
¹ Departamento de Ingeniería Agroforestal, Universidad de Santiago de Compostela. Escuela Politécnica Superior, Campus universitario, 27002 Lugo, Spain.
² Departamento de Ingeniería Agraria, Universidad de León. Escuela Superior y Técnica de Ingeniería Agraria, Campus de Ponferrada, 24400 Ponferrada, Spain.

Abstract

Measuring the height of a tree takes longer than measuring its diameter at breast height, so forest inventories often measure the heights of only a subset of the trees of known diameter. Since trees with the same diameter are not usually of the same height, even within the same stand, a stochastic component was added to the deterministic height function. This approach mimics the natural variability of heights and therefore provides more realistic height predictions, as demonstrated by the results of the Kolmogorov-Smirnov test and by visual analysis of box plots.

Introduction

Individual tree heights and diameters are essential measurements in forest inventories, and are used for estimating timber volume, site index and other important variables related to forest growth and yield, succession and carbon budget models (Peng, 2001). Measuring tree height takes longer than measuring diameter at breast height. For this reason, often only the heights of a subset of trees of known diameter are measured, and accurate height-diameter equations must be used to predict the heights of the remaining trees, which reduces the cost of data acquisition.
If stand conditions vary greatly within a forest, a height regression may be derived separately for each stand, or a generalized function that includes stand variables to account for the variability may be developed (Curtis, 1967; Zhang et al., 1997; Sharma & Zhang, 2004). Two trees of the same diameter within the same stand are not necessarily of the same height; a deterministic model therefore does not seem appropriate for mimicking the real natural variability in height (Parresol & Lloyd, 2004).

The objective of the present study was to evaluate the performance of a stochastic height-diameter approach in mimicking the observed natural variability in tree height.

Material and methods

We used data from two thinning trials installed in Pinus pinaster and Pinus sylvestris even-aged stands located in northwestern Spain. The stand conditions within each thinning trial were similar, and we therefore treated the data as belonging to just two different stands.

Diameter at breast height (d) was measured to the nearest 0.1 cm in all the trees. Total tree height (h) was measured to the nearest 0.1 m in a sample of one third of the trees in each of the 12 plots belonging to each thinning trial. Summary statistics, including the mean, minimum, maximum and standard deviation (SD) of these variables, are shown in Table 1.

Table 1. Characteristics of the tree samples used for model fitting.

Thinning trial   Number of trees   d (cm)                       h (m)
                                   Mean   Min.   Max.   SD      Mean   Min.   Max.   SD
P. pinaster      1062              11.2   2.9    19.8   3.3     9.2    4.8    12.4   1.2
P. sylvestris    488               22.4   7.0    34.8   7.2     19.6   9.6    25.4   2.8

A large number of local height-diameter equations have been reported in the forestry literature (e.g. Curtis, 1967; García, 1974; Wykoff et al., 1982; Huang et al., 1992, 2000; Fang & Bailey, 1998; Peng, 1999; Soares & Tomé, 2002), most of which exhibit S-shaped or concave-shaped patterns.
From among all these models we selected the Schnute growth model (Schnute, 1981) for the purposes of our study. This function is easy to fit and converges quickly regardless of the data set (Bredenkamp & Gregoire, 1988; Lei & Parresol, 2001), even with small data sets (Castedo et al., 2005). The expression of Schnute's function is as follows:

$$h = \left[ 1.3^{b} + \left( h_2^{b} - 1.3^{b} \right) \frac{1 - e^{-a d}}{1 - e^{-a d_2}} \right]^{1/b} + \varepsilon \qquad (1)$$

where h is the total height of the tree; d is the tree diameter at breast height; d2 is the diameter at breast height of a large tree (upper range of the data); h2 is a parameter representing the mean tree height at d2; a and b are the model parameters to be estimated; and ε is the residual error.

Since the scatter plots of total tree height against diameter showed no evidence of unequal error variance in either data set, the parameters of equation (1) were estimated by nonlinear least squares, using the NLIN procedure of SAS/STAT® (SAS Institute Inc., 2004). The goodness-of-fit of the model was evaluated by means of the coefficient of determination for non-linear regression (R²) (see Ryan, 1997) and the root mean squared error (RMSE).

The stochastic approach uses the standard error of prediction in much the same way as it is used to obtain the prediction interval for a new individual in a regression model. Instead of the t value corresponding to a fixed limit for all the trees, a value randomly generated from the inverse of the normal distribution function is used for each individual.
Thus, the expression used to assign a height stochastically to each tree in a sample is:

$$\hat{y}_{i(\mathrm{stoch})} = \hat{y}_i + F_U^{-1}\, s_{\hat{y}_{i(\mathrm{new})}} \qquad (2)$$

where $\hat{y}_{i(\mathrm{stoch})}$ is the stochastic height estimate; $\hat{y}_i$ is the deterministic height obtained from equation (1) (the fitted model without the stochastic component); $F_U^{-1}$ is the inverse of the standard normal distribution function evaluated at U, a uniform random variable in the interval (0,1); and $s_{\hat{y}_{i(\mathrm{new})}}$ is the standard error of prediction of a new observation, expressed as:

$$s_{\hat{y}_{i(\mathrm{new})}} = \sqrt{\operatorname{var}\left(\hat{y}_{i(\mathrm{new})}\right)} = \sqrt{\mathrm{RMSE}^{2} + z(b)_i'\, S^{2}(b)\, z(b)_i} \qquad (3)$$

where the scalar RMSE² is the regression mean squared error, $z(b)_i$ is the vector of partial derivatives of the model with respect to the parameters for the i-th observation, and $S^{2}(b)$ is the estimated covariance matrix of the least squares estimator b of the parameter vector β.

The performance of the proposed stochastic approach was evaluated for diameter classes of 5 cm width and for each thinning trial. The observed height frequencies were compared with the variability produced by the deterministic and stochastic approaches. The Kolmogorov-Smirnov test was used to compare the observed height frequencies with those estimated by each of the two approaches. Box plots were used to compare visually the means, medians, and 25th and 75th percentiles of the height distributions.

Results and discussion

The Schnute function provided a reasonable fit, taking into account the high variability in the height-diameter relationship in the data sets (see Table 1). All the parameters were significant at the 5% level (95% confidence) and showed reasonable values. The low proportion of variability explained by the local model (R² of about 0.51 for both data sets; Table 2) indicates that stand variables must be taken into account to improve this relationship.

Deterministic values of the height $\hat{y}_i$ were obtained using equation (1) with the parameter estimates obtained for each data set (Table 2).
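Equations (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the SAS procedure used in the study. It uses the P. pinaster estimates from Table 2, and it relies on assumptions that are not stated in the text: d2 is taken to be the largest sampled diameter (19.8 cm, Table 1), the function names (`schnute_height`, `prediction_se`, `stochastic_height`) are hypothetical, and, because the covariance matrix S²(b) is not reported, the parameter-uncertainty term of equation (3) is dropped unless a matrix is supplied by the user.

```python
import math
import random
from statistics import NormalDist

# Table 2 estimates for the P. pinaster trial: a, b, h2 and RMSE.
A, B, H2, RMSE = 0.0896, 1.9051, 10.9793, 0.828
D2 = 19.8  # ASSUMPTION: d2 set to the largest sampled diameter (Table 1)


def schnute_height(d, a=A, b=B, h2=H2, d2=D2):
    """Deterministic height (m) from equation (1), without the error term."""
    ratio = (1.0 - math.exp(-a * d)) / (1.0 - math.exp(-a * d2))
    return (1.3 ** b + (h2 ** b - 1.3 ** b) * ratio) ** (1.0 / b)


def prediction_se(d, rmse=RMSE, cov=None, params=(A, B, H2), d2=D2, step=1e-6):
    """Standard error of prediction of a new observation, equation (3).

    cov is the 3x3 covariance matrix S^2(b) as nested lists, ordered (a, b, h2).
    ASSUMPTION: when cov is None the parameter-uncertainty term is ignored
    and the function simply returns RMSE.
    """
    if cov is None:
        return rmse
    # z(b)_i: central finite-difference derivatives w.r.t. each parameter.
    z = []
    for k in range(len(params)):
        up, down = list(params), list(params)
        up[k] += step
        down[k] -= step
        diff = schnute_height(d, *up, d2=d2) - schnute_height(d, *down, d2=d2)
        z.append(diff / (2.0 * step))
    n = len(params)
    quad = sum(z[j] * cov[j][k] * z[k] for j in range(n) for k in range(n))
    return math.sqrt(rmse ** 2 + quad)


def stochastic_height(d, rng, cov=None):
    """Equation (2): deterministic height plus a random normal deviate."""
    u = rng.random()
    while u == 0.0:  # inv_cdf is undefined at 0, so redraw in that rare case
        u = rng.random()
    deviate = NormalDist().inv_cdf(u)  # inverse standard normal CDF at U
    return schnute_height(d) + deviate * prediction_se(d, cov=cov)


if __name__ == "__main__":
    rng = random.Random(2005)  # fixed seed only to make the demo repeatable
    d = 11.2  # mean diameter of the P. pinaster sample (Table 1)
    print("deterministic height:", round(schnute_height(d), 2))
    print("five stochastic heights:",
          [round(stochastic_height(d, rng), 2) for _ in range(5)])
```

Repeated calls to `stochastic_height` for the same diameter return different heights scattered around the deterministic value, which is the behaviour the stochastic approach is designed to reproduce. Passing a covariance matrix to `prediction_se` adds the second term of equation (3).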
The prediction of the stochastic values $\hat{y}_{i(\mathrm{stoch})}$ involved the following steps:
a) calculation of the standard error of prediction $s_{\hat{y}_{i(\mathrm{new})}}$ by means of equation (3);
b) calculation of the inverse of the standard normal distribution function for one pseudo-random number in the interval (0,1); and
c) estimation of $\hat{y}_{i(\mathrm{stoch})}$ from equation (2) by substituting the values obtained in the previous steps.

Comparison of the observed, deterministic and one random stochastic height distribution per diameter class showed substantial differences between the two approaches, even though the mean height values per diameter class obtained by each approach were very similar (Figure 1, Table 3).

Table 2. Parameter estimates (standard errors in brackets) and goodness-of-fit statistics for the Schnute (1981) model fitted to the two data sets under analysis.

Thinning trial   a                 b                 h2                 R²       RMSE
P. pinaster      0.0896 (0.0246)   1.9051 (0.3197)   10.9793 (0.1536)   0.5091   0.828
P. sylvestris    0.1069 (0.0176)   0.9516 (0.2575)   21.9859 (0.2805)   0.5188   1.918

The observed height distribution generally followed a normal distribution within diameter classes, with a high degree of variability in both data sets. The deterministic estimate provided a height distribution with low variability located around the observed mean value, whereas the random stochastic approach provided greater variability, consistent with the observed distribution. This is evident from the results of the Kolmogorov-Smirnov test for each diameter class (Table 3) and from the interquartile and maximum ranges of the distributions (Figure 1).

Figure 1. Box plots of height distributions (Y-axis) against diameter classes (X-axis) for the Pinus sylvestris (left) and Pinus pinaster (right) data sets. The triangles represent the means of the height estimates. The boxes represent the interquartile ranges.
The maximum and minimum height estimates are represented by the upper and lower small horizontal lines crossing the vertical bars, respectively. In white: observed distribution; in light grey: deterministic estimation; in dark grey: stochastic estimation.

Table 3. Results of the Kolmogorov-Smirnov test for diameter classes in the two data sets.

Diameter class (cm)   Pinus sylvestris               Pinus pinaster
                      n     Determ.   Stoch.         n     Determ.   Stoch.
2.5                   -     -         -              15    0.334     0.134**
7.5                   15    0.334     0.267*         390   0.297     0.049*
12.5                  64    0.344     0.219          509   0.261     0.080
17.5                  111   0.324     0.189          148   0.263     0.169
22.5                  121   0.256     0.074**        -     -         -
27.5                  96    0.229     0.062**        -     -         -
32.5                  81    0.432     0.062**        -     -         -

** Indicates significance at P = 0.05. * Indicates significance at P = 0.01.

Conclusions

The relationship between tree height and diameter is one of the most important elements in the dynamics and structure of natural and managed forests. However, a deterministic model cannot completely describe the nature of the height-diameter relationship. Thus, if real variability is required, a stochastic component should be added to the deterministic estimate. The suggested approach improves the realism and accuracy of height predictions at stand level. This feature is considered very important because the stand-level growth models used for these species in northwestern Spain are disaggregated by diameter classes to estimate merchantable volume and biomass using taper functions and individual tree biomass equations.

The proposed stochastic approach may also be used in most cases where different predictions are required for the same values of the independent variables. In each situation, the process should involve the study of the specific distribution function to be used for generating variability in the simulated data (Parresol & Lloyd, 2004).
Acknowledgments

The fieldwork involved in the collection of the data was financed by research project PGIDIT02RFO29103PR, entitled “Estudio de las cortas de mejora en las masas de coníferas de Galicia”. We thank Dr. Christine Francis for correcting the English grammar of the text.

References

Bredenkamp, B.V. and Gregoire, T.G. (1988) A forestry application of Schnute's generalized growth function. For. Sci. 34: 790-797.
Castedo, F.; Barrio, M.; Parresol, B.R. and Álvarez-González, J.G. (2005) A stochastic height-diameter model for maritime pine ecoregions in Galicia (northwestern Spain). Ann. For. Sci. 62 (5) (in press).
Curtis, R.O. (1967) Height-diameter and height-diameter-age equations for second-growth Douglas-fir. For. Sci. 13: 365-375.
Fang, Z. and Bailey, R.L. (1998) Height-diameter models for tropical forests on Hainan Island in Southern China. For. Ecol. Manage. 110: 315-327.
García, O. (1974) Ecuaciones altura-diámetro para pino insigne. Instituto Forestal, Chile. Nota Técnica 19.
Huang, S.; Titus, S.J. and Wiens, D.P. (1992) Comparison of nonlinear height-diameter functions for major Alberta tree species. Can. J. For. Res. 22: 1297-1304.
Huang, S.; Price, D. and Titus, S.J. (2000) Development of ecoregion-based height-diameter models for white spruce in boreal forests. For. Ecol. Manage. 129: 125-141.
Lei, Y. and Parresol, B.R. (2001) Remarks on height-diameter modelling. USDA Forest Service Res. Note SRS-10.
Parresol, B.R. and Lloyd, F.T. (2004) The stochastic tree modelling approach used to derive tree lists for the GIS/CISC identified stands at the Savannah River Site. Internal Report, USDA Forest Service, Southern Research Station, Asheville, North Carolina.
Peng, C. (1999) Nonlinear height-diameter models for nine boreal forest tree species in Ontario. Ministry of Natur. Resour., Ontario For. Res. Inst. OFRI-Rep. 155.
Peng, C. (2001) Developing and validating nonlinear height-diameter models for major tree species of Ontario's boreal forest. North. J. Appl. For. 18: 87-94.
Ryan, T.P. (1997) Modern regression methods. John Wiley & Sons, New York.
SAS Institute Inc. (2004) SAS/STAT® 9.1 User's Guide. Cary, NC: SAS Institute Inc.
Schnute, J. (1981) A versatile growth model with statistically stable parameters. Can. J. Fish. Aquat. Sci. 38: 1128-1140.
Sharma, M. and Zhang, S.Y. (2004) Height-diameter models using stand characteristics for Pinus banksiana and Picea mariana. Scand. J. For. Res. 19: 442-451.
Soares, P. and Tomé, M. (2002) Height-diameter equation for first rotation eucalypt plantations in Portugal. For. Ecol. Manage. 166: 99-109.
Wykoff, W.R.; Crookston, N.L. and Stage, A.R. (1982) User's guide to the Stand Prognosis model. USDA For. Serv. Gen. Tech. Rep. GTR-INT-133.
Zhang, S.; Amateis, R. and Burkhart, H.E. (1997) The influence of thinning on tree height and diameter relationships in loblolly pine plantations. South. J. Appl. For. 21: 199-205.