This file was created by scanning the printed publication.
Errors identified by the software have been corrected;
however, some errors may remain.
Monte Carlo Simulations of Nonlinear
Size-Age Relationships
Ronald E. McRoberts¹
Abstract. Monte Carlo simulations are an accepted and useful procedure
for determining the effects of uncertainty in input variables and
parameters on the uncertainty of derived output variables. Monte Carlo
procedures are particularly useful for growth model applications when the
model is nonlinear in the parameters. For example, a nonlinear growth
model may be fit to variables derived from size-age data, and then Monte
Carlo simulations may be used to estimate the parameter covariance
matrix and a confidence interval resulting from repeatedly applying the
growth model beginning with a set of initial conditions. However, when
simulated values of size variables contain randomly generated
components, negative growth may occur. An example using height-age
data and the Weibull growth model is used to illustrate and compare
approaches to this problem and to justify an assertion that it may be
difficult to precisely predict future height.
INTRODUCTION
Monte Carlo simulation techniques have been recognized as a valid method for
determining the effects of uncertainty in input variables and parameters on the
uncertainty of derived output variables (McRoberts 1996, 1995, 1994, Mowrer
1994, 1992, Makela 1988, Gertner and Dzialowy 1984). They are particularly
useful for estimating parameter covariance structures and for propagating
uncertainty through nonlinear growth models when linear approximations perform
poorly, when the mathematical formulation is complex, and when predictor
variables depend on predicted variables at a previous stage.
The underlying objective is to estimate the bias and uncertainty in size
predictions obtained from repeatedly applying an annual growth model beginning
with an observed initial condition. Calibration data for the growth model consist
of a series of size-age observations for a typical tree whose future size is to be
predicted. Nonlinear regression is used to estimate the model parameters, and
Monte Carlo simulations are used to estimate the covariance structure of the
parameter estimates and confidence intervals for future size predictions. Two
difficulties arise when applying simulations to this problem: (1) accommodating
the repeated measures nature of the data when simulating observations and when
calibrating the growth model; and (2) dealing with negative growth that occurs
when a simulated size observation is less than a simulated size observation at a
previous age due to randomly generated residuals.

¹ Mathematical Statistician, North Central Forest Experiment Station, 1992 Folwell Ave., St. Paul, MN 55108.
DATA
The calibration data consist of either 13 or 14 height-age observations for each
of six co-dominant yellow birch trees from the same forest stand. The data for
this study are a subset of the yellow birch stem analysis data collected by
Carmean (1978) and used to construct northern hardwoods site index curves.
They were collected from a plot in a fully stocked, even-aged stand on the
Chequamegon National Forest in northern Wisconsin. Annual rings were counted
on disks cut at each section point. For Carmean's analyses, a height-age graph
was constructed for each tree and was examined for signs of early suppression, top
breakage, or dieback. Such disturbed trees were discarded as were those from
even-aged stands whose age range for dominant and codominant trees exceeded
10 years.
These height-age data are characterized as longitudinal, repeated measures data:
(1) they were obtained by repeatedly observing the ages of the same trees at
different heights; (2) they were necessarily ordered in time and position and
allowed no randomization; and (3) the trees from which they were obtained were
assumed to constitute a representative sample from the population of dominant and
co-dominant trees. The serial correlation exhibited by longitudinal, repeated
measures data due to the lack of randomization must be accommodated in
parameter estimation routines if unbiased confidence intervals are to be obtained.
METHODS
Height-age model
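McRoberts (1996, 1995) has shown that the height-age relationship for these
data is adequately described using a Weibull model,

$$Y = \beta_1\left[1 - \exp\left(\beta_2 x^{\beta_3}\right)\right] + \varepsilon, \qquad [1]$$

where $Y$ is height, $x$ is age, and $\beta = (\beta_1, \beta_2, \beta_3)$ is a parameter vector to be estimated.
The residuals, $\varepsilon$, are distributed $N(0, \sigma^2\Lambda)$ with elements, $\lambda_{ij}$, of the correlation
matrix, $\Lambda$, structured as

$$\lambda_{ij} = \rho^{|x_i - x_j|}, \qquad [2]$$

where $\rho$ is the annual serial correlation and $\sigma^2$ is the residual variance.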
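As a minimal sketch of the model just described, the following Python fragment evaluates the expected height curve and builds the residual correlation matrix. It assumes the Weibull curve has the form $\beta_1[1 - \exp(\beta_2 x^{\beta_3})]$ and that the correlation between residuals at ages $x_i$ and $x_j$ is $\rho^{|x_i - x_j|}$. The function names and the four illustrative ages are hypothetical; the parameter values are the estimates reported in the Results section.

```python
import numpy as np

def weibull_height(age, beta):
    """Expected height under the assumed form b1 * (1 - exp(b2 * age**b3))."""
    b1, b2, b3 = beta
    return b1 * (1.0 - np.exp(b2 * np.asarray(age, dtype=float) ** b3))

def serial_correlation_matrix(ages, rho):
    """Assumed structure: correlation rho**|x_i - x_j| for ages x_i and x_j."""
    ages = np.asarray(ages, dtype=float)
    return rho ** np.abs(ages[:, None] - ages[None, :])

beta = (82.0107, -0.0123, 1.1409)          # estimates reported in the paper
ages = np.array([6.0, 16.0, 26.0, 36.0])   # illustrative ages only
heights = weibull_height(ages, beta)
corr = serial_correlation_matrix(ages, rho=0.67)
residual_cov = 1.92 * corr                 # sigma^2 * Lambda
```

Because the exponent in the correlation is an age difference rather than an index difference, unequally spaced observations are handled without any special treatment.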
To obtain the height-age relationship for a typical tree, population parameters
for Eq. [1] were estimated using a nonlinear mixed effects modeling approach.
Models representing the height-age relationship for a population of trees contain
both fixed effects (population parameters) and random effects (individual
parameters) and are described as mixed effects models. To simultaneously
accommodate the mixed effects and longitudinal, repeated measures nature of the
data, the first-order linearization procedure for nonlinear mixed effects models
described by Davidian and Giltinan (1995) was used. Two assumptions underlie
this procedure: (1) the same model form, but with possibly different parameter
values, is adequate to describe the height-age relationship for all trees; and (2) the
same residual variance, correlation structure, and correlation parameter value apply
for all trees. The procedure applies least squares procedures to estimate the fixed
and random model parameters and maximum likelihood procedures to estimate the
common residual variance, $\sigma^2$, and annual serial correlation, $\rho$.
The residuals for estimated individual tree height-age curves exhibited
homogeneity of variance. This result is consistent with the assumption that the
standard deviation of growth is proportional to the expectation of growth, because
height increments between adjacent observations were always approximately four
feet. An estimate, $c_0$, of the proportion was obtained by a second application of
the nonlinear mixed effects model routine using a covariance structure based on
this assumption and an assumption of annual height growth correlation, $\rho$.
Height growth model
Three variables were derived from the original height-age observations: (1)
average growth, the difference in height between adjacent sections divided by the
corresponding difference in age; (2) average height of adjacent sections; and (3)
average age of adjacent sections. These variables were used to develop a model
for predicting average annual growth as a function of average age and average
height. Because the Weibull model has been shown to adequately describe the
height-age relationship for these data, height growth was modeled using a
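differential form of Eq. [1],

$$Y = \alpha_1 x_1^{\alpha_2}\left(x_2 - \alpha_3\right) + r, \qquad [3]$$

where $Y$ is average height growth, $x_1$ is average age, $x_2$ is average height, and
$\alpha = (\alpha_1, \alpha_2, \alpha_3)$ is a parameter vector to be estimated. The parameters, $\alpha$, of Eq. [3]
correspond to the parameters, $\beta$, of Eq. [1] as follows:
$\alpha_1 = \beta_2\beta_3$, $\alpha_2 = \beta_3 - 1$, $\alpha_3 = \beta_1$.
The residuals, $r$, are distributed $N(0, V)$ where the elements of $V$ are calculated on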
the basis of the following assumptions: (1) variance is proportional to the square
of expected growth; (2) variance is inversely proportional to the number of years
over which average growth is calculated; and (3) the correlation among
observations is derived from Eq. [2].
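The derived variables and the link between the two parameterizations can be checked numerically. The sketch below assumes the growth model has the form $\alpha_1 x_1^{\alpha_2}(x_2 - \alpha_3)$ with $\alpha_1 = \beta_2\beta_3$, $\alpha_2 = \beta_3 - 1$, and $\alpha_3 = \beta_1$, which is what differentiating the assumed Weibull curve $\beta_1[1 - \exp(\beta_2 x^{\beta_3})]$ with respect to age gives. All function names are hypothetical.

```python
import numpy as np

BETA = (82.0107, -0.0123, 1.1409)  # estimates reported in the paper

def weibull_height(age, beta):
    """Assumed height-age curve: b1 * (1 - exp(b2 * age**b3))."""
    b1, b2, b3 = beta
    return b1 * (1.0 - np.exp(b2 * np.asarray(age, dtype=float) ** b3))

def derive_growth_variables(ages, heights):
    """Average growth, average age, and average height of adjacent sections."""
    ages = np.asarray(ages, dtype=float)
    heights = np.asarray(heights, dtype=float)
    growth = np.diff(heights) / np.diff(ages)
    mean_age = 0.5 * (ages[:-1] + ages[1:])
    mean_height = 0.5 * (heights[:-1] + heights[1:])
    return growth, mean_age, mean_height

def alpha_from_beta(beta):
    """Map height-age parameters to growth-model parameters (assumed mapping)."""
    b1, b2, b3 = beta
    return (b2 * b3, b3 - 1.0, b1)

def expected_growth(age, height, alpha):
    """Assumed differential form: a1 * age**a2 * (height - a3)."""
    a1, a2, a3 = alpha
    return a1 * age ** a2 * (height - a3)

# Compare the differential form with a numerical derivative of the curve.
alpha = alpha_from_beta(BETA)
age, h = 20.0, 1e-4
numeric_slope = (weibull_height(age + h, BETA) - weibull_height(age - h, BETA)) / (2 * h)
model_slope = expected_growth(age, weibull_height(age, BETA), alpha)
```

Under these assumptions the two slopes agree to within the accuracy of the finite difference, which confirms that the growth model and the height-age curve describe the same trajectory.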
Confidence Intervals
A 6-step Monte Carlo procedure was used to estimate an approximate 95%
confidence interval around the estimated height-age curve obtained from repeated,
annual applications of the growth model. To estimate these confidence intervals,
random numbers were generated to simulate uncertainty from two sources: (1)
variability in the estimates of model parameters, and (2) residual variability around
the curves resulting from the parameter estimates.
Step 1. Selection of a typical tree
Estimates of the population parameters, $\beta$, and the common, individual tree
estimates of $\sigma^2$, $c_0$, and $\rho$ were used to generate data for a typical tree.
Step 2. Simulation of height-age observations
Four-foot increments, beginning at 8 feet and ending at 64 feet, were selected,
and Eq. [1] was solved using the population parameter estimates selected in Step
1 to determine the corresponding expected ages. A multivariate normal vector of
residuals, $\varepsilon$, was randomly generated using the common values of $\sigma^2$ and $\rho$ and
was added to the selected heights to simulate a height-age series. Random
selection of residuals may cause simulated height at a subsequent age to be less
than simulated height at a previous age. Three approaches to this negative height
growth problem were implemented: (Approach A) ignore the problem; (Approach
B) reject simulated height-age series exhibiting negative height growth; and
(Approach C) reject simulated height-age series whenever the absolute value of
a residual is greater than expected height growth. Average height growth, average
age, and average height were calculated from these simulated data.
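Step 2 and its three approaches to negative growth might be sketched as follows. This is an illustration under stated assumptions, not the author's code: the Weibull curve is taken as $\beta_1[1 - \exp(\beta_2 x^{\beta_3})]$, the residual covariance as $\sigma^2\rho^{|x_i - x_j|}$, and for Approach C "expected height growth" is interpreted as the 4-foot expected increment between adjacent observations. The names `expected_age` and `simulate_series` are hypothetical.

```python
import numpy as np

BETA = (82.0107, -0.0123, 1.1409)   # estimates reported in the paper
SIGMA2, RHO = 1.92, 0.67            # estimates reported in the paper
INCREMENT = 4.0                     # feet between selected heights

def expected_age(height, beta):
    """Invert the assumed Weibull curve to get the age at a given height."""
    b1, b2, b3 = beta
    return (np.log(1.0 - height / b1) / b2) ** (1.0 / b3)

def simulate_series(rng, approach="A", max_tries=10_000):
    """Return (ages, simulated heights) for one accepted height-age series."""
    heights = np.arange(8.0, 65.0, INCREMENT)        # 8, 12, ..., 64 ft
    ages = expected_age(heights, BETA)
    cov = SIGMA2 * RHO ** np.abs(ages[:, None] - ages[None, :])
    for _ in range(max_tries):
        resid = rng.multivariate_normal(np.zeros(heights.size), cov)
        simulated = heights + resid
        if approach == "B" and np.any(np.diff(simulated) < 0.0):
            continue   # Approach B: reject series with negative growth
        if approach == "C" and np.any(np.abs(resid) > INCREMENT):
            continue   # Approach C: reject series with an oversized residual
        return ages, simulated   # Approach A accepts every series
    raise RuntimeError("no acceptable series found")

rng = np.random.default_rng(0)
ages, simulated_heights = simulate_series(rng, approach="B")
```

Rejecting whole series, as in Approaches B and C, conditions the simulated residuals on an event, so the accepted residuals no longer follow the multivariate normal distribution from which they were drawn.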
Step 3. Estimation of growth model parameters
The parameters for Eq. [3] were estimated from the simulated average annual
data calculated in Step 2 with weighted nonlinear least squares (WNLS). The
values of $c_0$ and $\rho$ selected in Step 1 were used to calculate a weight matrix that
was constant for all simulations.
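The paper does not give the estimation algorithm, so the following is only one way weighted nonlinear least squares could be carried out: a Gauss-Newton iteration with step halving applied to residuals whitened by the Cholesky factor of the covariance matrix. The growth model form $\alpha_1 x_1^{\alpha_2}(x_2 - \alpha_3)$ is assumed, all function names are hypothetical, and the diagonal covariance in the illustration is a simplification that ignores serial correlation.

```python
import numpy as np

def growth_model(theta, age, height):
    """Assumed growth model: a1 * age**a2 * (height - a3)."""
    a1, a2, a3 = theta
    return a1 * age ** a2 * (height - a3)

def jacobian(theta, age, height):
    """Partial derivatives of the growth model with respect to a1, a2, a3."""
    a1, a2, a3 = theta
    power = age ** a2
    return np.column_stack([
        power * (height - a3),
        a1 * power * np.log(age) * (height - a3),
        -a1 * power,
    ])

def fit_wnls(age, height, growth, cov, start, n_iter=50):
    """Gauss-Newton with step halving on Cholesky-whitened residuals."""
    chol = np.linalg.cholesky(cov)
    theta = np.asarray(start, dtype=float)

    def whitened_resid(params):
        return np.linalg.solve(chol, growth - growth_model(params, age, height))

    for _ in range(n_iter):
        z = whitened_resid(theta)
        jac = np.linalg.solve(chol, jacobian(theta, age, height))
        step, *_ = np.linalg.lstsq(jac, z, rcond=None)
        scale = 1.0
        # Halve the step until the weighted sum of squares does not increase.
        while scale > 1e-8 and np.sum(whitened_resid(theta + scale * step) ** 2) > np.sum(z ** 2):
            scale *= 0.5
        theta = theta + scale * step
        if np.linalg.norm(scale * step) <= 1e-12 * (1.0 + np.linalg.norm(theta)):
            break
    return theta

# Noise-free illustration: recover known parameters from a perturbed start.
true_alpha = np.array([-0.0123 * 1.1409, 1.1409 - 1.0, 82.0107])
height = np.arange(10.0, 63.0, 4.0)                     # illustrative heights
age = (np.log(1.0 - height / 82.0107) / -0.0123) ** (1.0 / 1.1409)
growth = growth_model(true_alpha, age, height)
cov = np.diag((0.39 * growth) ** 2)                     # simplified weights
estimate = fit_wnls(age, height, growth, cov,
                    start=true_alpha * np.array([1.1, 0.9, 1.05]))
```

Because the weight matrix is held constant across simulations, its Cholesky factor could be computed once and reused in every replication.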
Step 4. Simulation of annual height predictions
Separate 50-year series of annual height-age observations were simulated
beginning with the first age calculated in Step 2, and beginning with ages in ten
year increments thereafter. For each year in each 50-year series, height was
calculated as the sum of previous height, predicted annual height growth, and a
randomly generated residual. Predicted height growth was obtained from Eq. [3] using
the parameters estimated in Step 3. The residuals were multivariate normal with
serial correlation, $\rho$, and standard deviations calculated as the product of $c_0$ and
predicted height growth. The simulated height-age series were saved for
calculation of confidence intervals in Step 6. Additional values of $c_0$ ($c_0 = 0.50$,
$c_0 = 0.60$) were also used to determine the effect of this parameter on bias and
uncertainty. The same three approaches to negative height growth described in
Step 2 were implemented in the simulation of these annual height-age
observations.
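A sketch of one annual series from Step 4 is given below. It rests on several assumptions that the paper does not spell out: the growth model is $\alpha_1 x_1^{\alpha_2}(x_2 - \alpha_3)$, growth for each year is predicted from the mid-year age and the previous height, and the serially correlated errors follow a first-order autoregressive recursion with unit variance. The names `predicted_growth` and `simulate_annual` are hypothetical.

```python
import numpy as np

# Growth-model parameters derived from the reported height-age estimates
# under the assumed mapping a1 = b2*b3, a2 = b3 - 1, a3 = b1.
ALPHA = (-0.0123 * 1.1409, 1.1409 - 1.0, 82.0107)

def predicted_growth(age, height, alpha):
    a1, a2, a3 = alpha
    return a1 * age ** a2 * (height - a3)

def simulate_annual(rng, age0, height0, alpha, c0, rho,
                    n_years=50, approach="A", max_tries=10_000):
    """Simulate annual heights; whole series are rejected under B and C."""
    for _ in range(max_tries):
        heights = np.empty(n_years)
        height, err, accepted = height0, 0.0, True
        for t in range(n_years):
            z = rng.standard_normal()
            # AR(1) error with unit variance and lag-one correlation rho
            err = z if t == 0 else rho * err + np.sqrt(1.0 - rho ** 2) * z
            growth = predicted_growth(age0 + t + 0.5, height, alpha)
            resid = c0 * growth * err          # SD is c0 times predicted growth
            if approach == "B" and growth + resid < 0.0:
                accepted = False               # negative height growth
            if approach == "C" and abs(resid) > growth:
                accepted = False               # residual exceeds expected growth
            if not accepted:
                break
            height = height + growth + resid
            heights[t] = height
        if accepted:
            return heights
    raise RuntimeError("no acceptable series found")

rng = np.random.default_rng(0)
series = simulate_annual(rng, age0=6.0, height0=8.0, alpha=ALPHA,
                         c0=0.39, rho=0.67, approach="B")
```

Setting `c0=0.0` removes the residuals and reproduces the deterministic trajectory, which is a convenient check before adding random variability.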
Step 5. Replication
Steps 2-4 were replicated 1000 times for each approach to the negative height
growth problem and each value of $c_0$.
Step 6. Construction of confidence intervals
For each year in the 50-year series, the 1000 annual height predictions were
ordered from smallest to largest, and those corresponding to the 2.5, 50.0, and 97.5
percentiles were selected. The curve resulting from the 50th percentile predictions
across all 50 ages approximates a median curve, while the other two percentiles
jointly approximate a 95% confidence interval. Finally, an expected height-age
curve with no random variability added was calculated using Eq. [3] with
parameters derived directly from the population parameters selected in Step 1.
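The percentile selection in Step 6 can be illustrated with NumPy. The simulated array below is synthetic placeholder data, not output from the study, and `percentile_bands` is a hypothetical name. Note that `np.percentile` interpolates linearly between order statistics by default, which differs slightly from picking the ordered values directly.

```python
import numpy as np

def percentile_bands(simulated, probs=(2.5, 50.0, 97.5)):
    """simulated has shape (n_replicates, n_years); returns three curves."""
    lower, median, upper = np.percentile(simulated, probs, axis=0)
    return lower, median, upper

# Placeholder data: 1000 replicates of a 50-year series (synthetic only).
rng = np.random.default_rng(1)
simulated = 8.0 + np.cumsum(rng.normal(1.0, 0.4, size=(1000, 50)), axis=1)
lower, median, upper = percentile_bands(simulated)
```

The pair `lower` and `upper` gives the approximate 95% interval for each year, and `median` gives the median curve.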
Strictly for comparison purposes, 1000 simulations were also conducted using
an ordinary nonlinear least squares (ONLS) approach. These simulations were
similar to those previously described with the following exceptions: (1)
independent, normal residuals with variance, $\sigma^2$, were used in Steps 2 and 4; (2)
the negative height growth problem was completely ignored in Steps 2 and 4; and
(3) ONLS was used for parameter estimation in Step 3.
RESULTS AND CONCLUSIONS
Parameter estimates for the height-age population curve (Fig. 1) were obtained
using the nonlinear mixed effects model routine: $\hat{\beta} = (82.0107, -0.0123, 1.1409)$;
$\sigma^2 = 1.92$; $c_0 = 0.39$; and $\rho = 0.67$. The cumulative distribution of standardized
residuals indicated the assumption of normal residuals was justified.
For each simulation of 15 height-age observations in Step 2, there were
approximately 0.15 rejections for Approach B and approximately 0.50 rejections
for Approach C. In Step 4, rejections occurred approximately 0.20 times for each
simulation of 50 annual observations for Approach B and approximately 0.50
times for Approach C. With respect to Approach A, confidence intervals for both
Approaches B and C exhibited bias. For Approach B, both the median curve and
lower confidence limit exhibited positive bias, while for Approach C the median
curve was unbiased, but the confidence interval was too narrow. Although the
magnitude of the bias was not pronounced for $c_0 = 0.39$, it increased with larger
values of $c_0$ (Fig. 2).
The median curve for the ONLS method was unbiased, but the 95% confidence
interval was too large. This result confirms the necessity of accommodating serial
correlation and heterogeneity of variance when using Monte Carlo simulations and
when estimating parameters from longitudinal, repeated measures data.
The results suggest that for $c_0 \geq 0.40$, an unpleasant dilemma may arise: either
(1) accept statistical bias as the price to pay for biological realism, or (2) tolerate
biologically impossible conditions in order to avoid statistical bias. From a
strictly statistical perspective, however, negative height growth poses no inherent
conceptual difficulties. Even though the expected curve is monotonically
increasing, the underlying statistical assumptions do not require subsequent
observations to be larger than previous observations; the only requirement is that
the residuals satisfy the distributional assumptions. Thus, procedures for
constructing confidence intervals that properly reflect residual uncertainty and
uncertainty in parameter estimates must ignore negative height growth.
[Figure 1. Height-age relationships. Legend: observation; individual tree. Horizontal axis: Age (yr).]
[Figure 2. Median curves and 95% confidence intervals for $c_0 = 0.50$. Horizontal axis: Age (yr).]
Overall, by the fiftieth year of predicted height, confidence intervals increased
to a width of about 15-20 feet, depending on the initial condition (Fig. 3).
Coefficients of variation were estimated by dividing one-fourth the width of the confidence
interval by predicted height. The estimates were generally greater when the initial
age and the length of the prediction interval were small (Table 1). The general
guideline that coefficients of variation ought not exceed about 0.10 (Gertner 1994)
suggests that for some situations it may be extremely difficult to precisely predict
future height using an annual growth model. In addition, the results obtained in
this study must be considered nearly optimal, because the calibration data
represented only the tree to which the growth model was applied. For an actual
application, the calibration data would include variation among a large sample of
trees as an additional source of uncertainty and would produce much larger
coefficients of variation.
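The coefficient of variation calculation amounts to a single ratio. In the sketch below the fraction of the interval width is left as a parameter; the default of one-fourth treats a 95% interval as roughly four standard deviations wide, which is an assumption of this example. The interval limits and predicted height are hypothetical numbers chosen only to show the arithmetic.

```python
def coefficient_of_variation(lower, upper, predicted, fraction=0.25):
    """Approximate CV: a fraction of the interval width over predicted height."""
    return fraction * (upper - lower) / predicted

# Hypothetical example: a 16-ft-wide interval around a predicted 57.7 ft.
cv = coefficient_of_variation(lower=49.7, upper=65.7, predicted=57.7)
```

For these hypothetical values the result is about 0.07, which falls below the 0.10 guideline cited in the text.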
[Figure 3. 95% confidence intervals. Horizontal axis: Age (yr).]

Table 1. Coefficients of variation.

Initial age          Length of prediction (yr)
(yr)            10       20       30       40       50
  6            0.17     0.13     0.11     0.08     0.07
 16            0.09     0.08     0.07     0.07     0.06*
 26            0.05     0.05     0.06     0.05*    0.06*
 36            0.04     0.05     0.05*    0.07*    0.08*
 46            0.03     0.04*    0.06*    0.08*    0.09*

* Beyond range of calibration data.
REFERENCES
Carmean, W.H. 1978. Site index curves for northern hardwoods in Northern
Wisconsin and Upper Michigan. USDA For. Serv. Res. Pap. NC-160.
Gertner, G.Z. 1994. A quality assessment of a Weibull based growth projection
system. In: Growth and Yield Estimation from Successive Forest
Inventories. Proc. IUFRO Conf., Copenhagen, Denmark, June 14-17, 1993.
Gertner, G.Z., and Dzialowy, P.J. 1984. Effects of measurement errors on an
individual tree-based growth projection system. Can. J. For. Res. 14:311-316.
Makela, A. 1988. Performance analysis of a process-based stand growth model
using Monte Carlo techniques. Scand. J. For. Res. 3:315-331.
McRoberts, R.E. 1996. Estimating variation in field crew estimates of site index.
Can. J. For. Res. (in press).
McRoberts, R.E. 1995. Assumptions underlying estimation methods for nonlinear
mixed effects models. In: Am. Stat. Assn. 1995 Proc. Biom. Section,
August 13-17, 1995, Orlando, FL. (in press).
McRoberts, R.E. 1994. Variation in forest inventory field measurements. Can.
J. For. Res. 24:1766-1770.
Mowrer, H.T. 1994. Monte Carlo techniques for propagating uncertainty through
simulation models and raster-based GIS. In: Proceedings of the
International Symposium on the Spatial Accuracy of Natural Resource Data
Bases, May 16-20, 1994, Williamsburg, VA, R.G. Congalton, ed. Am. Soc.
Photogrammetry and Rem. Sens. pp 179-188.
Mowrer, H.T. 1992. Open-architecture Monte Carlo uncertainty analysis for
forest stand dynamics models. In: Research on Growth and Yield with
Emphasis on Mixed Stands, IUFRO Centennial Meeting Session of Section
4.01, Berlin/Eberswalde, Germany, August 31-September 4, 1992. pp 75-84.
BIOGRAPHICAL SKETCH
Ron McRoberts is a member of the Forest Inventory and Analysis unit of
the North Central Forest Experiment Station, USDA Forest Service, where he
develops individual tree growth models. His interests include nonlinear regression
modeling, repeated measures, and propagation of variance.