A Finite Mixture Model for Characterizing the Diameter Distributions of Mixed-Species Forest Stands

Chuangmin Liu, Lianjun Zhang, Craig J. Davis, Dale S. Solomon, and
Jeffrey H. Gove
ABSTRACT. A finite mixture model is used to describe the diameter distributions of mixed-species
forest stands. A three-parameter Weibull function is assumed as the component probability density
function in the finite mixture model. Four example plots, each with two species, are selected to
demonstrate model fitting and comparison. It appears that the finite mixture model is flexible enough
to fit irregular, multimodal, or highly skewed diameter distributions. Compared with traditional methods
in which a single Weibull function is fit to either the whole plot or each species component separately,
the finite mixture model produces much smaller root mean square error and bias, and fits the entire
distribution of the plots with extreme peaks, bimodality, or heavy tails well. In some cases, a single
Weibull function fitted to individual species separately may produce more accurate estimations for the
component distributions of the two species than the finite mixture model. The summation of the two
independent species results, however, may not produce a better prediction for the entire plot. This
study shows that the finite mixture model is a promising alternative method for modeling the diameter
distribution of multispecies mixed forest stands. For. Sci. 48(4):653–661.
Key Words: Weibull function, maximum likelihood, goodness-of-fit test, model comparison.
Diameter-class distribution models have become a
useful tool in forest management, growth and yield
modeling, and forest inventories. Various probability density functions (pdf) such as normal, log-normal, gamma,
beta, Johnson’s SB, and Weibull have been utilized to characterize the diameter frequency distributions of forest stands
(e.g., Bailey and Dell 1973, Burkhart and Strub 1974, Hafley
and Schreuder 1977, Little 1983, Kilkki and Paivinen 1986,
Kilkki et al. 1989). Of these pdfs, the Weibull distribution is
popular due to the relative simplicity of estimating its parameters and its flexibility in fitting a variety of shapes and
degrees of skewness. Over the last 30 yr, many studies have
been conducted to: (1) estimate the parameters of various
pdfs by different statistical methods such as maximum likelihood (ML), moments, and percentiles (e.g., Garcia 1981,
Burk and Newberry 1984, Cao and Burkhart 1984, Zarnoch
and Dell 1985, Borders et al. 1987, Borders and Patterson
1990); (2) compare the suitability of pdfs or parameter
estimation methods for fitting tree diameter distributions
(e.g., Hafley and Schreuder 1977, Maltamo et al. 1995,
Nanang 1998, Maltamo et al. 2000); (3) model the pdf’s
parameters as a function of stand variables (parameter prediction) (e.g., Hyink and Moser 1983, Kilkki and Paivinen
1986, Kilkki et al. 1989); (4) solve the pdf’s parameters from
the moments of the diameter distribution, which are expressed
as a function of stand characteristics (parameter recovery)
(e.g., Hyink and Moser 1983, Bowling et al. 1989, Lindsay et
al. 1996); and (5) characterize tree diameter distributions by
nonparametric approaches (e.g., Haara et al. 1997, Maltamo
and Kangas 1998). More recently, researchers have attempted
to calibrate the predictions from diameter distribution models to obtain compatible estimates for other stand characteristics (Gove and Patil 1998, Kangas and Maltamo 2000).
Chuangmin Liu, Research Assistant, Phone: (315) 446-0980, E-mail: cliu06@syr.edu; Lianjun Zhang, Associate Professor, Phone: (315) 470-6558,
Fax: (315) 470-6535, E-mail: lizhang@syr.edu, and Craig J. Davis, Professor, Phone: (315) 470-6569, E-mail: cjdavis@syr.edu; all at Faculty of
Forestry, State University of New York, College of Environmental Science and Forestry, One Forestry Drive, Syracuse, NY 13210. Dale S. Solomon,
Project Leader, Phone: (603) 868-7666, E-mail: dsolomon@fs.fed.us, and Jeffrey H. Gove, Research Forester, Phone: (603) 868-7667, E-mail:
jgove@fs.fed.us, USDA Forest Service Northeastern Research Station, Durham, NH 03824.
Acknowledgments: The authors thank Peter D.M. Macdonald, Professor of Statistics, Department of Mathematics and Statistics, McMaster
University, Canada, for helping us with S-Plus functions. We also appreciate the Associate Editor and three anonymous reviewers for their constructive
comments and suggestions.
Manuscript received November 11, 2000, accepted August 20, 2001.
Copyright © 2002 by the Society of American Foresters
Forest Science 48(4) 2002
Most diameter-class distribution models are “whole-stand” models fit to the distribution of the entire stand (e.g.,
Bailey and Dell 1973, Hafley and Schreuder 1977, Rennolls et
al. 1985, Magnussen 1986). Attempts have been made to apply
the distribution models to mixed-species stands and uneven-aged stands (e.g., Little 1983, Lynch and Moser 1986, Bare and
Opalach 1987, Tham 1988, Bowling et al. 1989, Maltamo 1997,
Siipilehto 1999). The problem with fitting the diameter frequency data of mixed-species stands is that these stands, unlike
single-species stands, may have highly irregular shapes. The use
of unimodal statistical distributions can lead to oversimplified
descriptions of stand structure (Cao and Burkhart 1984, Maltamo
and Kangas 1998, Maltamo et al. 2000). Distribution-free methods such as percentile prediction (Borders et al. 1987) and
nonparametric statistical methods such as kernel estimation
(Dressler and Burk 1989), and k-nearest-neighbor regression
(Haara et al. 1997, Maltamo and Kangas 1998) have been tried
to describe multimodal distributions. Although nonparametric
methods are flexible for fitting multimodal distributions, they
need a large amount of individual tree data for fitting the models
and also require appropriate reference sample stands in order to
obtain the estimation for a target stand (Maltamo and Kangas
1998). In most of the cases studied, Haara et al. (1997) found that
the Weibull-based method was more accurate than the nearest-neighbor method. Tham (1988) used the Johnson SB distribution
to investigate the structure of mixed Norway spruce and two
birch species, and found that the Johnson SB fitted well to all
three species separately and to the entire stand. Similarly,
Maltamo (1997) applied the Weibull function to study the
distributions of mixed Scots pine and Norway spruce stands. The
Weibull was fitted to the entire stand as well as the separate
distributions of each species. When the model for the entire stand
was used to predict the distribution of each species, it underestimated the species that is relatively larger in size and overestimated the one with smaller trees. Note that the distributions of
the entire stand in the above studies were basically unimodal.
However, studies have shown that neither the Johnson SB nor
Weibull distribution can accurately represent bimodal distributions (Eriksson and Sallnas 1987, Tham 1988). These studies
treated each tree species in the mixed stand independently and
ignored the relationship between the species.
A frequency distribution made up of two or more component distributions is defined as a “mixture” distribution.
Finite mixture models (FMM) have been used extensively to
analyze such distributions in many fields including medicine,
biology, fisheries, environmental science, engineering, and
economics (e.g., Hasselblad 1966, Bhattacharya 1967,
Lepeltier 1969, Titterington 1976, Macdonald and Pitcher
1979, Schnute and Fournier 1980, Leytham 1984, Miller
1987, Macdonald 1987, Lui et al. 1988, McLachlan and
Gordon 1989, Crawford et al. 1992, McLachlan and McGiffin
1994, Jiang and Murthy 1998). Everitt (1996) and Titterington
(1997) provide an introduction to finite mixture distributions.
The reference books by Everitt and Hand (1981) and
Titterington et al. (1985) define continuous and finite mixture
distributions, discuss the most frequently used mixtures, and
describe the main problems of inference related to mixture
data. Although the most commonly used component distribution is Gaussian, mixtures with other types of component
such as Poisson, binomial, exponential, and Weibull have
also been studied (e.g., Rider 1961, Cohen 1965a, 1965b).
Titterington et al. (1985, p. 16–21) provide an extensive list
of applications of the finite mixture models, including the
types of distribution and estimation methods used. Research
on finite mixture distributions has focused on methods for
estimating model parameters and statistical tests for identifying the number of components in the mixture distribution
underlying a particular set of data (Titterington 1990). Graphical, moment, ML, minimum distance, and Bayesian methods
have been applied for the parameter estimation of the finite
mixture models. In recent years, the dominant approach has
been ML primarily because of the advance of high-speed
electronic computers (Everitt and Hand 1981, Titterington et
al. 1985, Redner and Walker 1984, Everitt 1996).
To date, no work that we are aware of has been published
on modeling the diameter frequency distributions of mixed-species stands using the FMM. In this study we are particularly interested in the finite mixture of two Weibull distributions. Although the finite mixture distribution is capable of
modeling any distribution with multiple components, we
only consider the mixture distribution with two different tree
species for simplicity in introducing the topic. The Weibull is
chosen because it is the most commonly used pdf for fitting
tree diameter distributions, as discussed above. Although
nonparametric approaches have been used to fit irregular diameter distributions, we choose to compare the FMM
against two traditional parametric methods: (1) fitting the
Weibull function to the entire plot (treat all trees of the two
different species as a whole) and (2) fitting the Weibull
function to each of the two species separately. To demonstrate model fitting and facilitate comparison, four example
forest plots, each composed of two species, were selected.
Theoretical Background
Suppose a mixture distribution consists of k components; then the distribution of the ith individual component is
described by a specific pdf, fi(x), and the general pdf, f(x), for
the mixture distribution can be expressed as

$$f(x) = \sum_{i=1}^{k} \rho_i f_i(x) = \rho_1 f_1(x) + \cdots + \rho_k f_k(x)$$

where ρi is the relative abundance of the ith component as
a proportion of the total population, and must satisfy the
constraints

$$0 \le \rho_i \le 1 \quad \text{and} \quad \sum_{i=1}^{k} \rho_i = 1.$$

We will restrict our exposition to the simplest case, where
f1(x),…,fk(x) have a common pdf with different means and,
possibly, different variances.
In this study we assume that the component pdf in the
finite mixture distribution of a random variable X (i.e., tree
diameters) is a three-parameter Weibull function given by

$$f(x; \theta) = \frac{\gamma}{\beta} \left( \frac{x - \alpha}{\beta} \right)^{\gamma - 1} \exp\left[ -\left( \frac{x - \alpha}{\beta} \right)^{\gamma} \right], \quad \alpha \le x < \infty,\ \alpha \ge 0,\ \beta > 0,\ \gamma > 0 \tag{1}$$

where θ = (α, β, γ)′, and α, β, and γ are the location, scale, and
shape parameters, respectively. Then the cumulative distribution function (cdf) is

$$F(x; \theta) = 1 - \exp\left[ -\left( \frac{x - \alpha}{\beta} \right)^{\gamma} \right].$$

Since we only consider a finite mixture distribution with
two components following the Weibull distribution in this
study, the pdf of the mixture distribution is

$$f(x; \psi) = \rho f_1(x; \theta_1) + (1 - \rho) f_2(x; \theta_2) \tag{2}$$

where ψ = (ρ, θ1, θ2) with θi = (αi, βi, γi)′, and i = 1, 2, and
0 ≤ ρ ≤ 1. Similarly, the corresponding cdf of the mixture
distribution is

$$F(x; \psi) = \rho F_1(x; \theta_1) + (1 - \rho) F_2(x; \theta_2).$$

Therefore, this particular mixture distribution is characterized by seven parameters: a location, scale, and shape
parameter for each of the two components (i.e., α1, β1, γ1, α2,
β2, and γ2) and a proportion parameter (i.e., ρ) characterizing
the mixture. For estimating these Weibull parameters, researchers have applied graphical procedures (Kao 1959),
method of moments (Rider 1961, Falls 1970), and ML
(Mason 1968, Jiang and Murthy 1998). In this study, the ML
method was used for the parameter estimation. The joint
likelihood density function is as follows

$$L = \prod_{j=1}^{n} \left[ \rho f_1(x_j; \theta_1) + (1 - \rho) f_2(x_j; \theta_2) \right]$$

where xj is the jth observed diameter and n is the total number of tree diameter observations in
a sample. The natural logarithm of the likelihood function is
expressed by

$$\log L = \sum_{j=1}^{n} \log\left[ \rho \left( f_1(x_j; \theta_1) - f_2(x_j; \theta_2) \right) + f_2(x_j; \theta_2) \right]$$

The first partial derivatives of log L are taken with respect
to each of the seven parameters of the mixture distribution.
These partial derivatives are set equal to 0 and then solved by
a numerical iterative algorithm such as the Newton-Raphson
approach to yield the ML estimates.
Example Plots and Modeling Methods
Four plots were selected from the database used for the
development of FIBER 3.0 (Solomon et al. 1995) representing the mixed spruce-fir forest type in the Northeast. Each
plot had two tree species comprising the majority of stand
basal area: balsam fir (Abies balsamea [L.] Mill.) and red
spruce (Picea rubens Sarg.). One plot had white spruce (Picea
glauca [Moench] Voss) instead of red spruce. Table 1 gives
the descriptive statistics of tree diameters (measured to the
nearest 0.1 cm) for each plot and each species in the plot. The
observed frequency distribution of each plot is illustrated in
Figure 1. The histograms show the frequencies by 2 cm
diameter classes of the entire plot (1 cm diameter classes are
used for Plot 2 due to its young age), while the curves
represent the observed distributions of two tree species within
the plot.

Table 1. Descriptive statistics of tree diameters for the four example plots and the two species components in each plot.

Plot and   Number of   Observed     Mean   SD     Min    Max    Skewness†   Kurtosis††
species    trees       proportion   (cm)   (cm)   (cm)   (cm)
Plot 1        58                    13.3   5.2    3.5    22.7   –0.15       –1.09
  BF*         22       0.38          8.5   3.5    3.5    15.9    0.8        –0.24
  RS*         36       0.62         16.2   3.6    8.4    22.7   –0.42       –0.11
Plot 2       116                     3.7   2.1    0.6     8.8    0.58       –0.61
  BF*         87       0.75          3     1.6    0.6     7.2    0.73       –0.05
  RS*         29       0.25          6     1.8    2.6     8.8   –0.38       –0.79
Plot 3        84                    10.7   4.2    4.5    24      1.11        1.24
  BF*         70       0.83          9.7   3.2    4.5    20.6    0.8         0.9
  RS*         14       0.17         15.5   5.2    6.8    24      0.18       –1.01
Plot 4        37                     9.1   3.7    2.9    17.9    0.34       –0.3
  BF*         25       0.68          8.6   3.6    2.9    17.9    0.47        0.49
  WS*         12       0.32         10.3   4      4.7    16.5    0.04       –1.02

* BF = balsam fir, RS = red spruce, WS = white spruce.
† The moment estimator for skewness (√b1) is

$$\sqrt{b_1} = \frac{\sqrt{n} \, \sum_{i=1}^{n} (d_i - \bar{d})^3}{\left[ \sum_{i=1}^{n} (d_i - \bar{d})^2 \right]^{3/2}}$$

†† The moment estimator for kurtosis (b2) is

$$b_2 = \frac{n \, \sum_{i=1}^{n} (d_i - \bar{d})^4}{\left[ \sum_{i=1}^{n} (d_i - \bar{d})^2 \right]^{2}}$$

where d is tree diameter and n is number of observations.
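The moment estimators in the footnotes to Table 1 can be sketched in a few lines of Python. This is an illustration only, not the authors' S-Plus code. The function names and the sample diameters are hypothetical, and the `excess` option is an assumption: the ratio b2 can never fall below 1, so the negative kurtosis values in Table 1 suggest that 3 was subtracted, although the footnote does not say so.

```python
import math

def moment_skewness(diams):
    """sqrt(b1) = sqrt(n) * sum((d - mean)^3) / (sum((d - mean)^2))^(3/2)."""
    n = len(diams)
    mean = sum(diams) / n
    s2 = sum((d - mean) ** 2 for d in diams)
    s3 = sum((d - mean) ** 3 for d in diams)
    return math.sqrt(n) * s3 / s2 ** 1.5

def moment_kurtosis(diams, excess=True):
    """b2 = n * sum((d - mean)^4) / (sum((d - mean)^2))^2.

    With excess=True (an assumption, see the lead-in) 3 is subtracted,
    so a normal distribution scores 0.
    """
    n = len(diams)
    mean = sum(diams) / n
    s2 = sum((d - mean) ** 2 for d in diams)
    s4 = sum((d - mean) ** 4 for d in diams)
    b2 = n * s4 / s2 ** 2
    return b2 - 3.0 if excess else b2

# Hypothetical diameters (cm); these are not data from the example plots.
sample = [3.5, 5.1, 6.4, 7.0, 8.2, 8.8, 9.5, 11.3, 13.0, 15.9]
print(round(moment_skewness(sample), 3), round(moment_kurtosis(sample), 3))
```

A positive skewness from such a calculation corresponds to a longer right tail, as described for balsam fir in Plot 1.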
The frequency distribution of Plot 1 has two distinct
modes (one at 8 cm and the other at 16 cm). Balsam fir has
diameters ranging from 3.5 to 15.9 cm, while red spruce
ranges from 8.4 to 22.7 cm. The two component species,
however, have opposite skewness. The distribution of balsam
fir is positively skewed, and the distribution of red spruce is
negatively skewed. Plot 2 is a young balsam fir–red spruce
stand with a positively skewed diameter distribution. Seventy-five percent of the trees are balsam fir (diameters ranging from 0.6 to 7.2 cm), and 25% are red spruce (diameters
ranging from 2.6 to 8.8 cm). The observed component distributions of the two species are similar to those of Plot 1. Plot
3 has a skewed distribution with a heavy tail to the right due
to a few large-sized red spruce trees. Balsam fir (83%) has a
positively skewed frequency distribution, and red spruce has
a flat distribution from 6.8 to 24.0 cm. The
frequency distributions of the two components (balsam fir
and white spruce) and the whole plot of Plot 4 have a similar
diameter range and curve shape (close to normal).
For the purpose of model comparison, we selected example plots with known information on each species component. However, it should be emphasized that the FMM
simultaneously estimates the proportion and component diameter distributions of different tree species in the mixed-species stand. Thus, there is no need to separate the individual
species distributions for estimation. In other words, it is not
necessary to classify the two or more components of a
multispecies distribution a priori during data collection.
Three methods were utilized for fitting the frequency
distribution of each plot as follows:
Method 1. The finite mixture model of the two Weibull
distributions [Equation (2)] was fit to the tree diameter
data of each plot. The predicted number of trees per
diameter class for the whole plot and the predicted proportion or number of trees per class for each tree species were
obtained.
Method 2. The three-parameter Weibull function [Equation
(1)] was fit to the entire growing stock of each plot; i.e.,
treating the trees as a whole regardless of tree species. The
predicted number of trees was computed for each diameter
class only for the whole plot.
Method 3. The three-parameter Weibull distribution [Equation (1)] was fit to each tree species separately in the mixed
stand. Thus the predicted number of trees per diameter
class can be calculated for each tree species. To obtain the
prediction for the whole plot, the estimations from the
component models of the two species were summed.
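The predicted number of trees per diameter class in the three methods follows from the fitted cdf: the expected count in a class is the number of trees times the difference in cdf between the upper and lower class limits. The sketch below illustrates this for Method 1, using the Plot 1 estimates reported in Table 2. It is not the authors' code. The function names are hypothetical, and the class limits (midpoint plus or minus 1 cm, with midpoints at 4 to 24 cm) are an assumption about how the 2 cm classes were defined.

```python
import math

def weibull_cdf(x, alpha, beta, gamma):
    """Three-parameter Weibull cdf; zero at or below the location alpha."""
    if x <= alpha:
        return 0.0
    return 1.0 - math.exp(-(((x - alpha) / beta) ** gamma))

def mixture_cdf(x, rho, theta1, theta2):
    """Two-component mixture cdf: rho * F1 + (1 - rho) * F2."""
    return rho * weibull_cdf(x, *theta1) + (1.0 - rho) * weibull_cdf(x, *theta2)

def predicted_counts(n_trees, midpoints, width, cdf):
    """Expected trees per class: n * [F(upper limit) - F(lower limit)]."""
    half = width / 2.0
    return [n_trees * (cdf(m + half) - cdf(m - half)) for m in midpoints]

# Method 1 estimates for Plot 1 (Table 2); each theta is (alpha, beta, gamma).
rho = 0.38
theta1 = (2.6061, 5.6119, 2.4054)   # first component (balsam fir)
theta2 = (8.3488, 9.6038, 3.3641)   # second component (red spruce)
midpoints = list(range(4, 26, 2))   # assumed class midpoints, 4 to 24 cm

# Whole plot (58 trees) and the first component (58 * rho trees).
whole = predicted_counts(58, midpoints, 2.0,
                         lambda x: mixture_cdf(x, rho, theta1, theta2))
fir = predicted_counts(58 * rho, midpoints, 2.0,
                       lambda x: weibull_cdf(x, *theta1))
for m, w, f in zip(midpoints, whole, fir):
    print(f"{m:2d} cm  plot: {w:5.2f}  component 1: {f:5.2f}")
```

Methods 2 and 3 would use the same `predicted_counts` step with a single Weibull cdf in place of the mixture cdf; for Method 3 the two species predictions are then added class by class.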
In this study, special functions for maximum likelihood
estimation were written for S-Plus 2000 (Mathsoft, Inc.
1999) to estimate the parameters for the finite mixture models
and the Weibull distributions.
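For readers without S-Plus, the quantities being maximized can be sketched in Python. The block below evaluates the component pdf of Equation (1), the mixture pdf of Equation (2), and the log-likelihood for a given parameter set. It is an illustrative sketch rather than the authors' functions: the names are hypothetical, the diameters are made up, and the maximization step itself (Newton-Raphson in the paper) is not shown.

```python
import math

def weibull_pdf(x, alpha, beta, gamma):
    """Three-parameter Weibull pdf, Equation (1).

    Returns 0 at or below alpha (a simplification at x = alpha
    when gamma <= 1).
    """
    if x <= alpha:
        return 0.0
    z = (x - alpha) / beta
    return (gamma / beta) * z ** (gamma - 1.0) * math.exp(-(z ** gamma))

def mixture_pdf(x, rho, theta1, theta2):
    """Two-component mixture pdf, Equation (2)."""
    return rho * weibull_pdf(x, *theta1) + (1.0 - rho) * weibull_pdf(x, *theta2)

def log_likelihood(diams, rho, theta1, theta2):
    """Sum of log mixture densities; -inf if any diameter has zero density."""
    total = 0.0
    for x in diams:
        f = mixture_pdf(x, rho, theta1, theta2)
        if f <= 0.0:
            return float("-inf")
        total += math.log(f)
    return total

# Hypothetical diameters (cm), for illustration only.
diams = [4.2, 5.8, 6.9, 7.5, 8.1, 9.4, 12.6, 14.8, 16.3, 17.0, 18.9, 21.5]
theta1 = (2.6061, 5.6119, 2.4054)   # Plot 1, Method 1, component 1 (Table 2)
theta2 = (8.3488, 9.6038, 3.3641)   # Plot 1, Method 1, component 2 (Table 2)
for rho in (0.2, 0.38, 0.6):
    print(rho, round(log_likelihood(diams, rho, theta1, theta2), 3))
```

In practice a general-purpose numerical optimizer would be applied to the negative of `log_likelihood` over all seven parameters, subject to 0 ≤ ρ ≤ 1, β > 0, γ > 0, and each location parameter lying below the smallest diameter assigned positive density.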
Figure 1. Observed frequency distribution of tree diameters for the four example plots. The histogram
represents the distribution of the entire plot, and the two curves exhibit the component distributions
of the two species (symbol “x” for balsam fir and “o” for red spruce or white spruce) for (a) Plot 1, (b)
Plot 2, (c) Plot 3, and (d) Plot 4.

The criteria for model comparison were the root mean
square error and bias (Maltamo et al. 1995). Denote the model
residual (R) as the difference between observation and prediction for the diameter sums of each diameter-class in a plot:

$$R_j = D_j - \hat{D}_j$$

where Dj and D̂j are observed and predicted diameter sum
of trees, respectively, in the jth diameter class. Positive
residuals represent underprediction by the model and negative residuals represent overprediction by the model. Since
Dj and D̂j are basically the product of the midpoint tree
diameter and the number of trees in the jth diameter class,
the Rj emphasizes the large-sized trees. In other words,
large-sized diameter classes have a larger impact on the
residuals given the same number of trees. Then the average residual across all diameter classes is the model bias
and calculated as follows

$$\text{Bias} = \frac{\sum_{j=1}^{m} (D_j - \hat{D}_j)}{m}$$

where m is the number of diameter classes. The root mean
square error (RMSE) for the diameter sums was computed as
follows:

$$\text{RMSE} = \sqrt{\frac{\sum_{j=1}^{m} (D_j - \hat{D}_j)^2}{m}}.$$

The likelihood-ratio χ2 test was chosen for testing “goodness of fit” such that

$$\chi^2 = -2 \sum_{j=1}^{m} O_j \log\left( \frac{E_j}{O_j} \right)$$

where Oj is the observed frequency for the jth diameter class,
and Ej is the predicted frequency from the models for the jth
class. The χ2 has (m – k – 1) degrees of freedom, where k is
the number of estimated parameters.
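The comparison criteria above can be sketched in Python as follows. This is a minimal illustration with hypothetical function names and made-up class frequencies, not the authors' code. Treating classes with no observed trees as contributing zero to the χ2 statistic is an assumption, since the text does not say how empty classes were handled.

```python
import math

def diameter_sums(midpoints, counts):
    """Diameter sum per class: class midpoint times number of trees."""
    return [m * c for m, c in zip(midpoints, counts)]

def bias(obs, pred):
    """Mean residual (observed - predicted); positive means underprediction."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean square of the residuals across classes."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def lr_chi2(obs_freq, exp_freq):
    """-2 * sum(O_j * log(E_j / O_j)); classes with O_j = 0 are skipped
    (an assumption, see the lead-in)."""
    return -2.0 * sum(o * math.log(e / o)
                      for o, e in zip(obs_freq, exp_freq) if o > 0)

# Hypothetical class midpoints (cm) and frequencies, for illustration only.
midpoints = [4, 6, 8, 10, 12]
observed = [3, 9, 12, 7, 2]
predicted = [2.6, 9.8, 11.5, 6.9, 2.2]

D = diameter_sums(midpoints, observed)
D_hat = diameter_sums(midpoints, predicted)
print(bias(D, D_hat), rmse(D, D_hat), lr_chi2(observed, predicted))
```

Because the residuals are computed on diameter sums rather than tree counts, the same miscount of one tree moves `bias` and `rmse` more in a 12 cm class than in a 4 cm class.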
Results and Discussion
Table 2 presents the estimated parameters for each of the
three fitting methods and the four plots. For Method 1, the
estimated proportion for the first component in the mixture
distribution (i.e., balsam fir in each plot) is given by ρ̂, while
the proportion for the second component is (1 – ρ̂). The estimate
ρ̂ was close to the observed proportion for balsam fir in three of the four plots:
0.38 vs. 0.38 (observed) for Plot 1; 0.76 vs. 0.75 (observed)
for Plot 2; and 0.68 vs. 0.68 (observed) for Plot 4. For Plot 3, ρ̂ was
higher than the observed proportion (0.95 vs. 0.83). This feature of the finite mixture
model is useful when the species information is not available
in a reliable form, e.g., from remotely sensed data. Thus, the
proportions of species or species groups in the multispecies
stands can be estimated.
Model Comparison for the Entire Distribution of the Plots
After using the three methods to fit the plots, the predicted
frequencies by diameter classes were obtained from each
model for each plot. Recall that the prediction for the entire
plot by Method 3 was the summation of the two Weibull
functions fitted separately to each tree species. Then the
predictions from each method were compared with the observed frequencies. The root mean square error (RMSE),
bias, χ2, and P-value for the χ2 test were computed for each
method and each plot (Table 3). The observed frequency
distribution (histograms) and the three prediction curves are
illustrated for each plot in Figure 2. The residuals computed
across diameter classes for the three models are exhibited for
each plot in Figure 3.
Table 2. Parameter estimates of the three fitting methods for the four example plots.

Method 1
Plot     ρ̂      α̂1       β̂1       γ̂1       α̂2       β̂2        γ̂2
Plot 1   0.38   2.6061   5.6119   2.4054   8.3488    9.6038    3.3641
Plot 2   0.76   0.4932   2.6743   1.6127   0.0000    5.4682    6.8893
Plot 3   0.95   4.1644   6.6021   1.8386   6.3691   16.0979   12.3585
Plot 4   0.68   0.0000   9.6148   2.6970   0.0000   11.5244    2.9244

Method 2
Plot     α̂        β̂         γ̂
Plot 1   0.0000   15.0245   2.8808
Plot 2   0.4939    3.5683   1.5386
Plot 3   4.3615    7.0095   1.5467
Plot 4   1.7067    8.3861   2.1103

Method 3
Plot     α̂ fir    β̂ fir    γ̂ fir    α̂ spruce   β̂ spruce   γ̂ spruce
Plot 1   3.2463   5.7998   1.5347   0.0000     17.7193    5.3817
Plot 2   0.5013   2.7378   1.5628   0.0000      6.5905    3.9631
Plot 3   4.2389   6.1224   1.7522   4.3978     12.5652    2.3912
Plot 4   1.6952   7.7827   2.0414   4.3401      6.4185    1.3832

Figure 2. Model comparison for the four example plots. The histogram represents the observed
diameter distribution, with Method 1 (solid line), Method 2 (dashed line), and Method 3 (dotted line) for (a) Plot 1, (b) Plot
2, (c) Plot 3, and (d) Plot 4.

Since Plot 1 had two distinct modes, it required a flexible
distribution to fit the two peaks and the valley between them.
For this plot, the FMM (Method 1) was the only one to meet
the requirement [Figure 2(a)]. The RMSE for Method 1 of
10.76 was 3.3 times smaller than Method 2, and 1.5 times
smaller than Method 3 (Table 3). A single Weibull function
fitted to Plot 1 (Method 2) missed the two peaks as well as the
valley, producing positive biases (underprediction) at the 8,
16, 18 and 20 cm diameter classes and large negative biases
(overprediction) at the 10, 12, and 14 cm diameter classes
[Figure 3(a)]. When the two species were fitted separately by
the Weibull function (Method 3), it improved the prediction
for the whole plot to some degree. But Method 3 still
generated positive bias at the 8 cm diameter class and large
negative biases at the 10 and 12 cm diameter classes [Figure
3(a)]. On the average, the bias of Method 2 was 4.4 times
larger than that of Method 1, and the bias of Method 3 was
about 2 times larger than that of Method 1 (Table 3).
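The ratios quoted for Plot 1 can be checked directly from the RMSE and bias values reported in Table 3. The short Python sketch below does only that arithmetic; the helper name is hypothetical.

```python
# RMSE and bias for Plot 1 as reported in Table 3.
rmse = {"Method 1": 10.76, "Method 2": 35.47, "Method 3": 16.54}
bias = {"Method 1": 0.2101, "Method 2": 0.9245, "Method 3": 0.4096}

def ratios_to_method1(values):
    """Ratio of each other method's statistic to that of Method 1."""
    base = values["Method 1"]
    return {k: v / base for k, v in values.items() if k != "Method 1"}

rmse_ratios = ratios_to_method1(rmse)
bias_ratios = ratios_to_method1(bias)
print({k: round(v, 1) for k, v in rmse_ratios.items()})
# {'Method 2': 3.3, 'Method 3': 1.5}
print({k: round(v, 1) for k, v in bias_ratios.items()})
# {'Method 2': 4.4, 'Method 3': 1.9}
```

The bias ratio for Method 3 rounds to 1.9, which the text reports as "about 2 times".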
Plot 2 was a younger stand in which balsam fir maintained 75% of the stems and larger sized red spruce held
the remaining 25%. The diameter distribution of balsam
fir peaked at 3 cm while the distribution of red spruce had
a mode at 6 cm, but with a magnitude of approximately
one-half of its counterpart. Similar to Plot 1, the FMM fit
the plot better than the other two methods. The RMSE of
Method 1 was 1.9 times smaller than Method 2, and 1.3
times smaller than Method 3 (Table 3). Figures 2(b) and
3(b) show that both Methods 2 and 3 produced larger
overprediction at the 4 and 5 cm diameter classes, and
Method 2 also underpredicted at the 6 and 7 cm diameter
classes.
Plot 3 had a heavy right tail (skewness = 1.11) and was
relatively peaked at the 10 cm diameter class (kurtosis =
1.24). The skewness was due to a few large-sized red spruce
trees. Again, the FMM fit the plot well at the peak as well as
the heavy tail. On the other hand, Methods 2 and 3
underpredicted at the peak and overpredicted at the right tail
(Figures 2(c) and 3(c)).
Compared with the other three plots, Plot 4 was relatively
close to normal, and the two species components had similar
diameter ranges. In this situation, all three methods fit the plot
equally well on the average [Table 3 and Figure 2(d)]. All
three methods yield similar residuals in pattern and magnitude across diameter classes [Figure 3(d)].

Figure 3. Residuals (unit: diameter sums) produced by the three methods across diameter classes for
the four example plots with Method 1 (solid line), Method 2 (dashed line), and Method 3 (dotted line) for (a) Plot 1, (b) Plot
2, (c) Plot 3, and (d) Plot 4.

Table 3. The root mean square error (RMSE), bias, and χ2 test of the three fitting methods for the four example plots.

         Method 1                            Method 2                             Method 3
Plot     RMSE    Bias     χ2       P-value   RMSE    Bias     χ2        P-value   RMSE    Bias     χ2       P-value
Plot 1   10.76   0.2101   1.5839   0.4530    35.47   0.9245   14.2380   0.0271    16.54   0.4096   4.0733   0.2537
Plot 2    8.78   1.0473   2.9601   0.0853    16.51   1.4914    8.0595   0.1530    11.20   1.0661   3.8386   0.1467
Plot 3   20.84   0.7570   4.4578   0.2161    29.04   1.0183    8.9324   0.2576    26.77   1.1088   7.7840   0.0998
Plot 4    6.38   0.3642   0.9096   0.3402     6.45   0.3483    0.8334   0.9749     8.53   0.6904   1.5453   0.4618

Model Comparison for the Species Component Distributions of the Plots
Since both Method 1 and Method 3 can estimate the
component distributions of the two species, the RMSE and
bias of the two models were summarized by species for each
plot in Table 4. It appears that, on average, Method 3
predicted the diameter sums for the two species components
more accurately than the FMM for Plots 1 and 3. For Plots 2
and 4, however, both methods produced similar biases for the
two species components. The estimations for the species
components in the FMM are constrained to the combined
species distribution for each plot. A single Weibull function
may fit the distribution of each individual species better in
some cases because it fits the species distribution independently. However, the summation of the two independent
species results may not produce a better prediction for the
entire plot as discussed above. It is also worthwhile to note
that the residuals, RMSE, and bias are computed based on the
differences between observed and predicted diameter sums
of each diameter class, and, thus, the impact of a mispredicted
individual tree increases as tree diameter increases.

Table 4. The root mean square error (RMSE) and bias of
Method 1 and Method 3 for each species component of four
example plots.

           Method 1           Method 3
Plot       RMSE    Bias       RMSE    Bias
Plot 1
  BF       14.63     2.26     11.70   0.39
  RS       12.00    –2.05      8.53   0.03
Plot 2
  BF       53.92     0.88      6.72   0.70
  RS        5.42     0.17      6.04   0.37
Plot 3
  BF       18.57   –10.11     16.14   0.69
  RS       20.10    10.87     19.52   0.42
Plot 4
  BF        6.38     0.47      6.45   0.41
  WS        9.97    –0.11     11.86   0.28
Conclusion
It appears that the FMM is a promising alternative
method for modeling the diameter distributions of mixed-species forest stands. There are advantages and disadvantages to this approach. Advantages include (1) it is more
flexible than a traditional Weibull function fit either to the
whole plot or to individual species separately, especially
when the diameter distribution is multimodal or highly
skewed. This is because the FMM simultaneously considers the proportion and component diameter distributions
of different tree species in the mixed-species stand; (2)
unlike fitting a single Weibull function to each species
separately (Method 3), it is not necessary to classify the
two or more components of a multimodal distribution a
priori during data collection; and (3) the proportions of
each species component or species group can be estimated
when the information is not available in the data. One
disadvantage of the FMM approach is that it may not
predict each species component as accurately as fitting the
Weibull to each species separately in some cases.
In this study, we considered mixed stands with only two
species for simplicity. The FMM is capable of modeling the
diameter frequency distribution of multispecies stands. In the
modeling process it is important and necessary to decide on
the number of components in a mixture distribution. For
modeling the diameter distributions of a mixed-species stand,
foresters usually have good knowledge of how many dominant tree species exist in the stand from a field inventory,
while minor species can be classified into major species
categories if desired. The number of components can also be chosen in an exploratory way
by fitting several finite mixtures and settling on the one with
the best fit statistics. When the number of component species has been decided, the tree diameter data can be
easily input into available software for estimating the model
parameters. It should be emphasized that, unlike Method 3,
the estimation of parameters in the FMM works on the full
distribution of all species at once; thus, there is no need to
separate the individual species distributions for estimation.
Finally, we have only discussed fitting the FMMs for
diameter distributions based on individual species classifications. It is easy, however, to envision the use of the FMMs for
distributions that can be categorized by other attributes such
as age, since different cohorts of multi-aged stands may
appear as distinct modes in the overall stand distribution.
Certain timber-marking procedures may also cause this same
phenomenon over time when coupled with underlying growth
dynamics. Commercial packages are also available for fitting
the FMMs with various component distributions such as
normal, lognormal, gamma, exponential, or Weibull
(Haughton 1997).
In this article, we have presented the necessary rudiments
for fitting the FMM to the diameter distribution of two-species mixed stands, with a limited number of examples.
Further research is necessary to fit the FMM to a large number
of mixed-species forest stands to evaluate the suitability of
fitting various species compositions and distribution shapes.
An additional subject for the future would be to examine how
well the estimated parameters of the FMMs could be regressed on stand attributes to facilitate growth and
yield modeling of mixed-species forests. The suitability of the
FMM fit to irregular diameter distributions should also be
compared with other modeling methods such as nonparametric approaches.
Literature Cited
BAILEY, R.L., AND T.R. DELL. 1973. Quantifying diameter distributions with
the Weibull function. For. Sci. 19:97–104.
BARE, B.R., AND D. OPALACH. 1987. Optimizing species composition in
uneven-aged forest stands. For. Sci. 33:958–970.
BHATTACHARYA, C.G. 1967. A simple method of resolution of a distribution
into Gaussian components. Biometrics 23:115–135.
BORDERS, B.E., AND W.D. PATTERSON. 1990. Projecting stand tables: A
comparison of the Weibull diameter distribution method, a percentile-based method, and a basal area growth projecting method. For. Sci.
36:413–424.
GARCIA, O. 1981. Simplified methods-of-moments estimation for the Weibull
distribution. N.Z. J. For. Sci. 11:304–306.
GOVE, J.H., AND G.P. PATIL. 1998. Modeling basal area-size distribution of
forest stands: A compatible approach. For. Sci. 44:285–297.
HAARA, A., M. MALTAMO, AND T. TOKOLA. 1997. The k-nearest-neighbor
method for estimating basal area diameter distribution. Scand. J. For. Res.
12:200–208.
HAFLEY, W.L., AND H.T. SCHREUDER. 1977. Statistical distributions for
fitting diameter and height data in even-aged stands. Can. J. For.
Res. 4:481–487.
HASSELBLAD, V. 1966. Estimation of parameters for a mixture of normal
distribution. Technometrics 8:431–444.
HAUGHTON, D. 1997. Packages for estimating finite mixtures: A review. Am.
Stat. 51:194–205.
HYINK, D.M., AND J.W. MOSER, JR. 1983. A generalized framework for
projecting forest yield and stand structure using diameter distributions.
For. Sci. 29:85–95.
JIANG, R., AND D.N.P. MURTHY. 1998. Mixture of Weibull distributions—
parametric characterization of failure rate function. Appl. Stoch. Models
and Data Anal. 14:47–65.
KANGAS, A., AND M. MALTAMO. 2000. Calibrating predicted diameter distribution with additional information. For. Sci. 46:390–396.
KAO, J.H.K. 1959. A graphical estimation of mixed Weibull parameters in
life-testing of electron tubes. Technometrics 1:389–407.
KILKKI, P., AND R. PAIVINEN. 1986. Weibull function in the estimation of the
basal area dbh-distribution. Silva Fenn. 20:149–156.
KILKKI, P., M. MALTAMO, R. MYKKANEN, AND R. PAIVINEN. 1989. Use of the
Weibull function in estimating the basal area dbh-distribution. Silva
Fenn. 23:311–318.
LEPELTIER, C. 1969. A simplified statistical treatment of geochemical data by
graphical representation. Econ. Geol. 64:538–550.
BORDERS, B.E., R.A. SOUTER, R.L. BAILEY, AND K.D. WARE. 1987. Percentile-based distributions characterize forest stand tables. For. Sci. 33:570–576.
LEYTHAM, K.M. 1984. Maximum likelihood estimation for the parameters of
mixture distributions. Water Resour. Res. 20:896–902.
BOWLING, E.H., H.E. BURKHART, T.E. BURK, AND D.E. BECK. 1989. A stand-level multispecies growth model for Appalachian hardwoods. Can. J. For.
Res. 19:405–412.
LINDSAY, S.R., G.R. WOOD, AND R.C. WOOLLONS. 1996. Stand table modeling
through the Weibull distribution and usage of skewness information. For.
Ecol. Manage. 81:19–23.
BURK, T.E., AND J.D. NEWBERRY. 1984. A simple algorithm for moment-based
recovery of Weibull distribution parameters. For. Sci. 30:329–332.
LITTLE, S.N. 1983. Weibull diameter distributions for mixed stands of western
conifers. Can. J. For. Res. 13:85–88.
BURKHART, H.E., AND M.R. STRUB. 1974. A model for simulation of planted
loblolly pine stands. P. 128–135 in Growth models for tree and stand
simulation, Fries, J. (ed.). Royal Coll. of For., Res. Notes No. 30,
Stockholm, Sweden.
LUI, K.J., W.W. DARROW, AND G.W. RUTHERFORD. 1988. A model-based
estimate of the mean incubation period for AIDS in homosexual men.
Science 240:1333–1335.
CAO, Q.V., AND H.E. BURKHART. 1984. A segmented distribution approach for
modeling diameter frequency data. For. Sci. 30:129–137.
LYNCH, T.B., AND J.W. MOSER. 1986. A growth model for mixed species
stands. For. Sci. 32:697–706.
COHEN, A.C., JR. 1965a. Estimation in mixture of two normal distributions.
Univ. of Georgia, Inst. of Statist., TR No. 13.
MACDONALD, P.D.M. 1987. Analysis of length-frequency distributions. P.
371–384 in Age and growth of fish. Summerfelt, R.C., and G.E. Hall
(eds.). Iowa State University Press, Ames, Iowa.
COHEN, A.C., JR. 1965b. Estimation in mixture of Poisson, and mixture
of exponential distributions. NASA Tech. Memo. TM X-53245.
Unclassified.
MACDONALD, P.D.M., AND T.J. PITCHER. 1979. Age-groups from size-frequency data: A versatile and efficient method of analyzing distribution
mixtures. J. Fish. Res. Board Can. 36:987–1001.
CRAWFORD, S.L., M.H. DEGROOT, J.B. KADANE, AND M.J. SMALL. 1992.
Modeling lake-chemistry distributions: Approximate Bayesian methods
for estimating a finite-mixture model. Technometrics 34:441–453.
MASON, T.J. 1968. Maximum likelihood estimation in a mixture of two
Weibull distributions. M.S. thesis, Univ. of Georgia. 25 p.
DROESSLER, T.D., AND T.E. BURK. 1989. A test of nonparametric smoothing of
diameter distributions. Scand. J. For. Res. 4:407–415.
ERIKSSON, L.O., AND O. SALLNAS. 1987. A model for predicting log yield from
stand characteristics. Scand. J. For. Res. 2:253–261.
EVERITT, B.S. 1996. An introduction to finite mixture distributions. Stat.
Meth. Med. Res. 5:107–127.
EVERITT, B.S., AND D.J. HAND. 1981. Finite mixture distributions. Chapman
and Hall, London and New York. 143 p.
FALLS, L.W. 1970. Estimation of parameters in compound Weibull distributions. Technometrics 12:399–407.
MAGNUSSEN, S. 1986. Diameter distributions in Picea abies described by the
Weibull model. Scand. J. For. Res. 1:493–502.
MALTAMO, M. 1997. Comparing basal area diameter distributions estimated
by tree species and for the entire growing stock in a mixed stand. Silva
Fenn. 31:53–65.
MALTAMO, M., AND A. KANGAS. 1998. Methods based on k-nearest neighbor
regression in estimation of basal area diameter distribution. Can. J. For.
Res. 28:1107–1115.
MALTAMO, M., J. PUUMALAINEN, AND R. PAIVINEN. 1995. Comparison of
beta and Weibull functions for modeling basal area diameter distribution in stands of Pinus sylvestris and Picea abies. Scand. J. For.
Res. 10:284–295.
MALTAMO, M., A. KANGAS, J. UUTTERA, T. TORNIAINEN, AND J. SARAMAKI. 2000.
Comparison of percentile based prediction methods and the Weibull
distribution in describing the diameter distribution of heterogeneous
Scots pine stands. For. Ecol. Manage. 133:263–274.
MATHSOFT, INC. 1999. S-Plus 2000 Programmer’s Guide. Mathsoft Inc.,
Seattle, WA.
MCLACHLAN, G.J., AND R.D. GORDON. 1989. Mixture models for partially
unclassified data: A case study of renal venous renin in hypertension. Stat.
Med. 8:1291–1300.
SCHNUTE, J., AND D. FOURNIER. 1980. A new approach to length-frequency
analysis: Growth structure. Can. J. Fish. Aquat. Sci. 37:1337–1351.
SIIPILEHTO, J. 1999. Improving the accuracy of predicted basal-area diameter
distribution in advanced stands by determining stem number. Silva Fenn.
33:281–301.
SOLOMON, D.S., D.A. HERMAN, AND W.B. LEAK. 1995. FIBER 3.0: An
ecological growth model for Northeastern forest types. USDA For. Serv.
Gen. Tech. Rep. NE-204.
MCLACHLAN, G.J., AND D.C. MCGIFFIN. 1994. On the role of finite mixture
models in survival analysis. Stat. Meth. Med. Res. 3:211–226.
THAM, Å. 1988. Structure of mixed Picea abies (L.) Karst. and Betula pendula
Roth and Betula pubescens Ehrh. stands in south and middle Sweden.
Scand. J. For. Res. 3:355–370.
MILLER, R.B. 1987. Maximum likelihood estimation of mixed stock fishery
composition. Can. J. Fish. Aquat. Sci. 44:583–590.
TITTERINGTON, D.M. 1976. Updating a diagnostic system using unconfirmed
cases. Appl. Stat. 25:238–247.
NANANG, D.M. 1998. Suitability of the normal, log-normal and Weibull
distributions for fitting diameter distributions of neem plantations in
Northern Ghana. For. Ecol. Manage. 103:1–7.
TITTERINGTON, D.M. 1990. Some recent research in the analysis of mixture
distributions. Statistics 21:619–641.
REDNER, R.A., AND H.F. WALKER. 1984. Mixture densities, maximum likelihood and the EM algorithm. SIAM Review 26:195–239.
RENNOLLS, K., D.N. GEARY, AND T.J.D. ROLLINSON. 1985. Characterizing
diameter distributions by the use of the Weibull distribution. Forestry
58:58–66.
RIDER, P.R. 1961. Estimating the parameters of mixed Poisson, binomial and
Weibull distributions by the methods of moments. Bull. de l’Institut
International de Statistique 38 part 2.
TITTERINGTON, D.M. 1997. Mixture distributions (update). P. 399–407 in
Encyclopedia of statistical sciences. Update Volume 1. Kotz, S.M. (ed.).
Wiley, New York.
TITTERINGTON, D.M., A.F.M. SMITH, AND U.E. MAKOV. 1985. Statistical
analysis of finite mixture distributions. Wiley, New York. 243 p.
ZARNOCH, S.J., AND T.R. DELL. 1985. An evaluation of percentile and maximum likelihood estimators of Weibull parameters. For. Sci. 31:260–268.
Forest Science 48(4) 2002