Monte Carlo Simulations of Nonlinear Size-Age Relationships

Ronald E. McRoberts¹

Abstract. Monte Carlo simulations are an accepted and useful procedure for determining the effects of uncertainty in input variables and parameters on the uncertainty of derived output variables. Monte Carlo procedures are particularly useful for growth model applications when the model is nonlinear in the parameters. For example, a nonlinear growth model may be fit to variables derived from size-age data, and Monte Carlo simulations may then be used to estimate the parameter covariance matrix and a confidence interval for the predictions obtained by repeatedly applying the growth model beginning with a set of initial conditions. However, when simulated values of size variables contain randomly generated components, negative growth may occur. An example using height-age data and the Weibull growth model is used to illustrate and compare approaches to this problem and to justify an assertion that it may be difficult to precisely predict future height.

INTRODUCTION

Monte Carlo simulation techniques have been recognized as a valid method for determining the effects of uncertainty in input variables and parameters on the uncertainty of derived output variables (McRoberts 1996, 1995, 1994; Mowrer 1994, 1992; Makela 1988; Gertner and Dzialowy 1984). They are particularly useful for estimating parameter covariance structures and for propagating uncertainty through nonlinear growth models when linear approximations perform poorly, when the mathematical formulation is complex, and when predictor variables depend on predicted variables at a previous stage. The underlying objective of this study is to estimate the bias and uncertainty in size predictions obtained from repeatedly applying an annual growth model beginning with an observed initial condition.
Calibration data for the growth model consist of a series of size-age observations for a typical tree whose future size is to be predicted. Nonlinear regression is used to estimate the model parameters, and Monte Carlo simulations are used to estimate the covariance structure of the parameter estimates and confidence intervals for future size predictions. Two difficulties arise when applying simulations to this problem: (1) accommodating the repeated measures nature of the data when simulating observations and when calibrating the growth model; and (2) dealing with the negative growth that occurs when, because of randomly generated residuals, a simulated size observation is less than the simulated size observation at a previous age.

¹ Mathematical Statistician, North Central Forest Experiment Station, 1992 Folwell Ave., St. Paul, MN 55108.

DATA

The calibration data consist of either 13 or 14 height-age observations for each of six co-dominant yellow birch trees from the same forest stand. The data for this study are a subset of the yellow birch stem analysis data collected by Carmean (1978) and used to construct northern hardwoods site index curves. They were collected from a plot in a fully stocked, even-aged stand on the Chequamegon National Forest in northern Wisconsin. Annual rings were counted on disks cut at each section point. For Carmean's analyses, a height-age graph was constructed for each tree and was examined for signs of early suppression, top breakage, or dieback. Such disturbed trees were discarded, as were those from even-aged stands whose age range for dominant and co-dominant trees exceeded 10 years.
These height-age data are characterized as longitudinal, repeated measures data: (1) they were obtained by repeatedly observing the ages of the same trees at different heights; (2) they were necessarily ordered in time and position and allowed no randomization; and (3) the trees from which they were obtained were assumed to constitute a representative sample from the population of dominant and co-dominant trees. The serial correlation that longitudinal, repeated measures data exhibit because of the lack of randomization must be accommodated in parameter estimation routines if unbiased confidence intervals are to be obtained.

METHODS

Height-age model

McRoberts (1996, 1995) has shown that the height-age relationship for these data is adequately described using a Weibull model (Eq. [1]), where Y is height, x is age, and β = (β₁, β₂, β₃) is a parameter vector to be estimated. The residuals, ε, are distributed N(0, σ²Λ), with the elements, λᵢⱼ, of the correlation matrix, Λ, structured according to Eq. [2], where ρ is the annual serial correlation and σ² is the residual variance.

To obtain the height-age relationship for a typical tree, population parameters for Eq. [1] were estimated using a nonlinear mixed effects modeling approach. Models representing the height-age relationship for a population of trees contain both fixed effects (population parameters) and random effects (individual parameters) and are described as mixed effects models. To simultaneously accommodate the mixed effects and the longitudinal, repeated measures nature of the data, the first-order linearization procedure for nonlinear mixed effects models described by Davidian and Giltinan (1995) was used. Two assumptions underlie this procedure: (1) the same model form, but with possibly different parameter values, is adequate to describe the height-age relationship for all trees; and (2) the same residual variance, correlation structure, and correlation parameter value apply for all trees.
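The height-age model and its residual structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the displayed forms of Eq. [1] and Eq. [2] are not reproduced in this text, so the sketch assumes a three-parameter Weibull curve, Y = β₁[1 - exp(β₂x^β₃)], and a correlation matrix with elements ρ raised to the absolute age difference. The function names and the evenly spaced ages are hypothetical; the numerical values are the estimates reported in the Results.

```python
import numpy as np

def weibull_height(age, beta):
    """Expected height under the ASSUMED form b1 * (1 - exp(b2 * age**b3))."""
    b1, b2, b3 = beta
    return b1 * (1.0 - np.exp(b2 * np.asarray(age, dtype=float) ** b3))

def serial_correlation(ages, rho):
    """ASSUMED correlation matrix with elements rho ** |age_i - age_j|."""
    ages = np.asarray(ages, dtype=float)
    return rho ** np.abs(ages[:, None] - ages[None, :])

def simulate_heights(ages, beta, sigma2, rho, rng):
    """Expected heights plus one draw of N(0, sigma2 * Lambda) residuals."""
    cov = sigma2 * serial_correlation(ages, rho)
    resid = rng.multivariate_normal(np.zeros(len(ages)), cov)
    return weibull_height(ages, beta) + resid

beta_hat = (82.0107, -0.0123, 1.1409)  # population estimates from the Results
ages = np.arange(10, 71, 5)            # hypothetical ages, for illustration only
rng = np.random.default_rng(0)
heights = simulate_heights(ages, beta_hat, sigma2=1.92, rho=0.67, rng=rng)
```

Because the residuals are drawn independently of the curve's shape, nothing in this construction forces consecutive simulated heights to increase, which is the source of the negative growth problem discussed later.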
The procedure uses least squares to estimate the fixed and random model parameters and maximum likelihood to estimate the common residual variance, σ², and annual serial correlation, ρ. The residuals for estimated individual tree height-age curves exhibited homogeneity of variance. This result is consistent with the assumption that the standard deviation of growth is proportional to the expectation of growth, because height increments between adjacent observations were always approximately four feet. An estimate, c₀, of the proportion was obtained by a second application of the nonlinear mixed effects model routine using a covariance structure based on this assumption and an assumption of annual height growth correlation, ρ.

Height growth model

Three variables were derived from the original height-age observations: (1) average growth, the difference in height between adjacent sections divided by the corresponding difference in age; (2) average height of adjacent sections; and (3) average age of adjacent sections. These variables were used to develop a model for predicting average annual growth as a function of average age and average height. Because the Weibull model has been shown to adequately describe the height-age relationship for these data, height growth was modeled using a differential form of Eq. [1] (Eq. [3]), where Y is average height growth, x₁ is average age, x₂ is average height, and α = (α₁, α₂, α₃) is a parameter vector to be estimated. The parameters, α, of Eq. [3] correspond to the parameters, β, of Eq. [1] as follows: α₁ = β₂β₃, α₂ = β₃ - 1, and α₃ = β₁. The residuals, r, are distributed N(0, V), where the elements of V are calculated on the basis of the following assumptions: (1) variance is proportional to the square of expected growth; (2) variance is inversely proportional to the number of years over which average growth is calculated; and (3) the correlation among observations is derived from Eq. [2].
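The link between the height-age model and the growth model can be checked with a short derivation. The displayed equations are not reproduced in this text, so the starting form of Eq. [1] below is an assumption, as is the sign convention of the final line. Both were chosen to agree with the stated correspondences α₁ = β₂β₃, α₂ = β₃ - 1, α₃ = β₁ and with the negative estimate of β₂ reported in the Results.

```latex
% Y denotes expected height at age x; the first line is the ASSUMED form of Eq. [1].
\begin{align*}
Y &= \beta_1\left[1 - \exp\!\left(\beta_2 x^{\beta_3}\right)\right] \\
% Differentiate with respect to age:
\frac{dY}{dx} &= -\beta_1\beta_2\beta_3\, x^{\beta_3 - 1}\exp\!\left(\beta_2 x^{\beta_3}\right) \\
% Substitute \exp(\beta_2 x^{\beta_3}) = (\beta_1 - Y)/\beta_1 from the first line:
&= \beta_2\beta_3\, x^{\beta_3 - 1}\left(Y - \beta_1\right) \\
% Rename: \alpha_1 = \beta_2\beta_3,\ \alpha_2 = \beta_3 - 1,\ \alpha_3 = \beta_1,
% with x_1 = age and x_2 = height:
&= \alpha_1\, x_1^{\alpha_2}\left(x_2 - \alpha_3\right)
\end{align*}
```

With β₂ < 0 and height below the asymptote β₁, both factors α₁ and (x₂ - α₃) are negative, so expected growth is positive, as the monotonically increasing expected curve requires.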
Confidence Intervals

A 6-step Monte Carlo procedure was used to estimate an approximate 95% confidence interval around the estimated height-age curve obtained from repeated, annual applications of the growth model. To estimate these confidence intervals, random numbers were generated to simulate uncertainty from two sources: (1) variability in the estimates of model parameters, and (2) residual variability around the curves resulting from the parameter estimates.

Step 1. Selection of a typical tree

Estimates of the population parameters, β, and the common, individual tree estimates of σ², c₀, and ρ were used to generate data for a typical tree.

Step 2. Simulation of height-age observations

Four-foot increments, beginning at 8 feet and ending at 64 feet, were selected, and Eq. [1] was solved using the population parameter estimates selected in Step 1 to determine the corresponding expected ages. A multivariate normal vector of residuals, ε, was randomly generated using the common values of σ² and ρ and was added to the selected heights to simulate a height-age series. Random selection of residuals may cause simulated height at a subsequent age to be less than simulated height at a previous age. Three approaches to this negative height growth problem were implemented: (Approach A) ignore the problem; (Approach B) reject simulated height-age series exhibiting negative height growth; and (Approach C) reject simulated height-age series whenever the absolute value of a residual is greater than expected height growth. Average height growth, average age, and average height were calculated from these simulated data.

Step 3. Estimation of growth model parameters

The parameters for Eq. [3] were estimated from the simulated averages calculated in Step 2 using weighted nonlinear least squares (WNLS). The values of c₀ and ρ selected in Step 1 were used to calculate a weight matrix that was constant for all simulations.

Step 4. Simulation of annual height predictions

Separate 50-year series of annual height-age observations were simulated beginning with the first age calculated in Step 2, and beginning with ages in ten-year increments thereafter. For each year in each 50-year series, height was calculated as the sum of previous height, predicted annual height growth, and a randomly generated residual. Predicted height growth was obtained from Eq. [3] using the parameters estimated in Step 3. The residuals were multivariate normal with serial correlation, ρ, and standard deviations calculated as the product of c₀ and predicted height growth. The simulated height-age series were saved for calculation of confidence intervals in Step 6. Additional values of c₀ (c₀ = 0.50, c₀ = 0.60) were also used to determine the effect of this parameter on bias and uncertainty. The same three approaches to negative height growth described in Step 2 were implemented in the simulation of these annual height-age observations.

Step 5. Replication

Steps 2-4 were replicated 1000 times for each approach to the negative height growth problem and each value of c₀.

Step 6. Construction of confidence intervals

For each year in the 50-year series, the 1000 annual height predictions were ordered from smallest to largest, and those corresponding to the 2.5, 50.0, and 97.5 percentiles were selected. The curve resulting from the 50th percentile predictions across all 50 ages approximates a median curve, while the other two percentiles jointly approximate a 95% confidence interval. Finally, an expected height-age curve with no random variability added was calculated using Eq. [3] with parameters derived directly from the population parameters selected in Step 1.

Strictly for comparison purposes, 1000 simulations were also conducted using an ordinary nonlinear least squares (ONLS) approach.
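Steps 4 through 6 can be illustrated with a simplified Python sketch. It is not the procedure used in the study: it holds the growth parameters fixed, omitting the re-estimation in Steps 2 and 3, and it implements only Approaches A and B. The growth equation is an assumed form of Eq. [3], growth = α₁x₁^α₂(x₂ - α₃), and every function name, the AR(1) construction of the residuals, and the initial height of 7.4 ft are hypothetical.

```python
import numpy as np

def growth(age, height, alpha):
    """ASSUMED differential Weibull form: a1 * age**a2 * (height - a3)."""
    a1, a2, a3 = alpha
    return a1 * age ** a2 * (height - a3)

def simulate_series(age0, height0, alpha, c0, rho, years, rng):
    """One annual series; heights[t] is the simulated height at age0 + t + 1."""
    heights = np.empty(years)
    height, negative = height0, False
    e = rng.standard_normal()  # standardized residual for the first year
    for t in range(years):
        g = growth(age0 + t, height, alpha)
        if t > 0:
            # AR(1) update: unit variance, annual serial correlation rho
            e = rho * e + np.sqrt(1.0 - rho ** 2) * rng.standard_normal()
        increment = g + c0 * g * e  # residual sd = c0 * predicted growth
        negative = negative or increment < 0.0
        height += increment
        heights[t] = height
    return heights, negative

def percentile_bands(age0, height0, alpha, c0, rho, years=50, reps=1000,
                     reject_negative=False, seed=0):
    """Rows: 2.5th, 50th, and 97.5th percentiles of height for each year."""
    rng = np.random.default_rng(seed)
    kept = []
    while len(kept) < reps:
        heights, negative = simulate_series(age0, height0, alpha, c0, rho,
                                            years, rng)
        if reject_negative and negative:
            continue  # Approach B: discard the series and draw another
        kept.append(heights)
    return np.percentile(np.array(kept), [2.5, 50.0, 97.5], axis=0)

beta = (82.0107, -0.0123, 1.1409)                    # estimates from the Results
alpha = (beta[1] * beta[2], beta[2] - 1.0, beta[0])  # stated correspondences
lower, median, upper = percentile_bands(6.0, 7.4, alpha, c0=0.39, rho=0.67)
```

Calling `percentile_bands` with `reject_negative=True` and a larger `c0` shows how rejection shifts the percentiles, since discarding series with negative increments removes draws from the lower tail only.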
The ONLS simulations were similar to those previously described, with the following exceptions: (1) independent, normal residuals with variance, σ², were used in Steps 2 and 4; (2) the negative height growth problem was completely ignored in Steps 2 and 4; and (3) ONLS was used for parameter estimation in Step 3.

RESULTS AND CONCLUSIONS

Parameter estimates for the height-age population curve (Fig. 1) were obtained using the nonlinear mixed effects model routine: β = (82.0107, -0.0123, 1.1409); σ² = 1.92; c₀ = 0.39; and ρ = 0.67. The cumulative distribution of standardized residuals indicated that the assumption of normal residuals was justified. For each simulation of 15 height-age observations in Step 2, there were approximately 0.15 rejections for Approach B and approximately 0.50 rejections for Approach C. In Step 4, rejections occurred approximately 0.20 times for each simulation of 50 annual observations for Approach B and approximately 0.50 times for Approach C.

Relative to Approach A, confidence intervals for both Approaches B and C exhibited bias. For Approach B, both the median curve and lower confidence limit exhibited positive bias, while for Approach C the median curve was unbiased, but the confidence interval was too narrow. Although the magnitude of the bias was not pronounced for c₀ = 0.39, it increased with larger values of c₀ (Fig. 2). The median curve for the ONLS method was unbiased, but the 95% confidence interval was too wide. This result confirms the necessity of accommodating serial correlation and heterogeneity of variance when using Monte Carlo simulations and when estimating parameters from longitudinal, repeated measures data.

The results suggest that for c₀ ≥ 0.40, an unpleasant dilemma may arise: either (1) accept statistical bias as the price to pay for biological realism, or (2) tolerate biologically impossible conditions in order to avoid statistical bias.
From a strictly statistical perspective, however, negative height growth poses no inherent conceptual difficulties. Even though the expected curve is monotonically increasing, the underlying statistical assumptions do not require subsequent observations to be larger than previous observations; the only requirement is that the residuals satisfy the distributional assumptions. Thus, procedures for constructing confidence intervals that properly reflect residual uncertainty and uncertainty in parameter estimates must ignore negative height growth.

Figure 1. Height-age relationships (observations and individual tree curves; horizontal axis: age, yr).

Figure 2. Median curves and 95% confidence intervals for c₀ = 0.50 (horizontal axis: age, yr).

Overall, by the fiftieth year of predicted height, confidence intervals increased to a width of about 15-20 feet, depending on the initial condition (Fig. 3). Coefficients of variation were estimated by dividing one-fourth the width of the confidence interval by predicted height. The estimates were generally greater when the initial age and the length of the prediction interval were small (Table 1). The general guideline that coefficients of variation ought not exceed about 0.10 (Gertner 1994) suggests that for some situations it may be extremely difficult to precisely predict future height using an annual growth model. In addition, the results obtained in this study must be considered nearly optimal, because the calibration data represented only the tree to which the growth model was applied. For an actual application, the calibration data would include variation among a large sample of trees as an additional source of uncertainty and would produce much larger coefficients of variation.

Figure 3. 95% confidence intervals (horizontal axis: age, yr).

Table 1. Coefficients of variation.

                    Length of prediction (yr)
Initial age (yr)     10       20       30       40       50
        6           0.17     0.13     0.11     0.08     0.07
       16           0.09     0.08     0.07     0.07     0.06*
       26           0.05     0.05     0.06     0.05*    0.06*
       36           0.04     0.05     0.05*    0.07*    0.08*
       46           0.03     0.04*    0.06*    0.08*    0.09*

* Beyond range of calibration data.

REFERENCES

Carmean, W.H. 1978. Site index curves for northern hardwoods in Northern Wisconsin and Upper Michigan. USDA For. Serv. Res. Pap. NC-160.

Gertner, G.Z. 1994. A quality assessment of a Weibull based growth projection system. In: Growth and Yield Estimation from Successive Forest Inventories. Proc. IUFRO Conf., Copenhagen, Denmark, June 14-17, 1993.

Gertner, G.Z., and Dzialowy, P.J. 1984. Effects of measurement errors on an individual tree-based growth projection system. Can. J. For. Res. 14:311-316.

Makela, A. 1988. Performance analysis of a process-based stand growth model using Monte Carlo techniques. Scand. J. For. Res. 3:315-331.

McRoberts, R.E. 1996. Estimating variation in field crew estimates of site index. Can. J. For. Res. (in press).

McRoberts, R.E. 1995. Assumptions underlying estimation methods for nonlinear mixed effects models. In: Am. Stat. Assn. 1995 Proc. Biom. Section, August 13-17, 1995, Orlando, FL. (in press).

McRoberts, R.E. 1994. Variation in forest inventory field measurements. Can. J. For. Res. 24:1766-1770.

Mowrer, H.T. 1994. Monte Carlo techniques for propagating uncertainty through simulation models and raster-based GIS. In: Proceedings of the International Symposium on the Spatial Accuracy of Natural Resource Data Bases, May 16-20, 1994, Williamsburg, VA, R.G. Congalton, ed. Am. Soc. Photogrammetry and Rem. Sens. pp. 179-188.

Mowrer, H.T. 1992. Open-architecture Monte Carlo uncertainty analysis for forest stand dynamics models. In: Research on Growth and Yield with Emphasis on Mixed Stands, IUFRO Centennial Meeting Session of Section 4.01, Berlin/Eberswalde, Germany, August 31-September 4, 1992. pp. 75-84.

BIOGRAPHICAL SKETCH

Ron McRoberts is a member of the Forest Inventory and Analysis unit of the North Central Forest Experiment Station, USDA Forest Service, where he develops individual tree growth models.
His interests include nonlinear regression modeling, repeated measures, and propagation of variance.