A Finite Mixture Model for Characterizing the Diameter Distributions of Mixed-Species Forest Stands

Chuangmin Liu, Lianjun Zhang, Craig J. Davis, Dale S. Solomon, and Jeffrey H. Gove

ABSTRACT. A finite mixture model is used to describe the diameter distributions of mixed-species forest stands. A three-parameter Weibull function is assumed as the component probability density function in the finite mixture model. Four example plots, each with two species, are selected to demonstrate model fitting and comparison. It appears that the finite mixture model is flexible enough to fit irregular, multimodal, or highly skewed diameter distributions. Compared with traditional methods in which a single Weibull function is fit to either the whole plot or each species component separately, the finite mixture model produces much smaller root mean square error and bias, and fits well the entire distribution of plots with extreme peaks, bimodality, or heavy tails. In some cases, a single Weibull function fitted to individual species separately may produce more accurate estimates of the component distributions of the two species than the finite mixture model. The summation of the two independent species results, however, may not produce a better prediction for the entire plot. This study shows that the finite mixture model is a promising alternative method for modeling the diameter distribution of multispecies mixed forest stands. For. Sci. 48(4):653–661.

Key Words: Weibull function, maximum likelihood, goodness-of-fit test, model comparison.

Diameter-class distribution models have become a useful tool in forest management, growth and yield modeling, and forest inventories.
Various probability density functions (pdf) such as normal, log-normal, gamma, beta, Johnson’s SB, and Weibull have been utilized to characterize the diameter frequency distributions of forest stands (e.g., Bailey and Dell 1973, Burkhart and Strub 1974, Hafley and Schreuder 1977, Little 1983, Kilkki and Paivinen 1986, Kilkki et al. 1989). Of these pdfs, the Weibull distribution is popular due to the relative simplicity of estimating its parameters and its flexibility in fitting a variety of shapes and degrees of skewness. Over the last 30 yr, many studies have been conducted to: (1) estimate the parameters of various pdfs by different statistical methods such as maximum likelihood (ML), moments, and percentiles (e.g., Garcia 1981, Burk and Newberry 1984, Cao and Burkhart 1984, Zarnoch and Dell 1985, Borders et al. 1987, Borders and Patterson 1990); (2) compare the suitability of pdfs or parameter estimation methods for fitting tree diameter distributions (e.g., Hafley and Schreuder 1977, Maltamo et al. 1995, Nanang 1998, Maltamo et al. 2000); (3) model the pdf’s parameters as a function of stand variables (parameter prediction) (e.g., Hyink and Moser 1983, Kilkki and Paivinen 1986, Kilkki et al. 1989); (4) solve the pdf’s parameters from the moments of the diameter distribution, which are expressed as a function of stand characteristics (parameter recovery) (e.g., Hyink and Moser 1983, Bowling et al. 1989, Lindsay et al. 1996); and (5) characterize tree diameter distributions by nonparametric approaches (e.g., Haara et al. 1997, Maltamo and Kangas 1998). More recently, researchers have attempted to calibrate the predictions from diameter distribution models to obtain compatible estimates for other stand characteristics (Gove and Patil 1998, Kangas and Maltamo 2000).

Chuangmin Liu, Research Assistant, Phone: (315) 446-0980, E-mail: cliu06@syr.edu; Lianjun Zhang, Associate Professor, Phone: (315) 470-6558, Fax: (315) 470-6535, E-mail: lizhang@syr.edu, and Craig J. Davis, Professor, Phone: (315) 470-6569, E-mail: cjdavis@syr.edu; all at Faculty of Forestry, State University of New York, College of Environmental Science and Forestry, One Forestry Drive, Syracuse, NY 13210. Dale S. Solomon, Project Leader, Phone: (603) 868-7666, E-mail: dsolomon@fs.fed.us, and Jeffrey H. Gove, Research Forester, Phone: (603) 868-7667, E-mail: jgove@fs.fed.us, USDA Forest Service Northeastern Research Station, Durham, NH 03824.

Acknowledgments: The authors thank Peter D.M. Macdonald, Professor of Statistics, Department of Mathematics and Statistics, McMaster University, Canada, for helping us with S-Plus functions. We also thank the Associate Editor and three anonymous reviewers for their constructive comments and suggestions.

Manuscript received November 11, 2000, accepted August 20, 2001. Copyright © 2002 by the Society of American Foresters. Forest Science 48(4) 2002.

Basically, most diameter-class distribution models are “whole-stand” models and are fit to the distribution of the entire stand (e.g., Bailey and Dell 1973, Hafley and Schreuder 1977, Rennolls et al. 1985, Magnussen 1986). Attempts have been made to apply the distribution models to mixed-species stands and uneven-aged stands (e.g., Little 1983, Lynch and Moser 1986, Bare and Opalach 1987, Tham 1988, Bowling et al. 1989, Maltamo 1997, Siipilehto 1999). The problem with fitting the diameter frequency data of mixed-species stands is that these stands, unlike single-species stands, may have highly irregular shapes. The use of unimodal statistical distributions can lead to oversimplified descriptions of stand structure (Cao and Burkhart 1984, Maltamo and Kangas 1998, Maltamo et al. 2000). Distribution-free methods such as percentile prediction (Borders et al. 1987) and nonparametric statistical methods such as kernel estimation (Droessler and Burk 1989) and k-nearest-neighbor regression (Haara et al.
1997, Maltamo and Kangas 1998) have been used to describe multimodal distributions. Although nonparametric methods are flexible for fitting multimodal distributions, they need a large amount of individual tree data for fitting the models and also require appropriate reference sample stands in order to obtain the estimation for a target stand (Maltamo and Kangas 1998). In most of the cases studied, Haara et al. (1997) found that the Weibull-based method was more accurate than the nearest-neighbor method. Tham (1988) used the Johnson SB distribution to investigate the structure of mixed Norway spruce and two birch species, and found that the Johnson SB fitted all three species separately, and the entire stand, well. Similarly, Maltamo (1997) applied the Weibull function to study the distributions of mixed Scots pine and Norway spruce stands. The Weibull was fitted to the entire stand as well as to the separate distributions of each species. When the model for the entire stand was used to predict the distribution of each species, it underestimated the species that was relatively larger in size and overestimated the one with smaller trees. Note that the distributions of the entire stand in the above studies were basically unimodal. However, studies have shown that neither the Johnson SB nor the Weibull distribution can accurately represent bimodal distributions (Eriksson and Sallnas 1987, Tham 1988). These studies treated each tree species in the mixed stand independently and ignored the relationship between the species.

A frequency distribution made up of two or more component distributions is defined as a “mixture” distribution.
Finite mixture models (FMM) have been used extensively to analyze such distributions in many fields including medicine, biology, fisheries, environmental science, engineering, and economics (e.g., Hasselblad 1966, Bhattacharya 1967, Lepeltier 1969, Titterington 1976, Macdonald and Pitcher 1979, Schnute and Fournier 1980, Leytham 1984, Miller 1987, Macdonald 1987, Lui et al. 1988, McLachlan and Gordon 1989, Crawford et al. 1992, McLachlan and McGiffin 1994, Jiang and Murthy 1998). Everitt (1996) and Titterington (1997) provide an introduction to finite mixture distributions. The reference books by Everitt and Hand (1981) and Titterington et al. (1985) define continuous and finite mixture distributions, discuss the most frequently used mixtures, and describe the main problems of inference related to mixture data. Although the most commonly used component distribution is Gaussian, mixtures with other types of components such as Poisson, binomial, exponential, and Weibull have also been studied (e.g., Rider 1961, Cohen 1965a, 1965b). Titterington et al. (1985, p. 16–21) provide an extensive list of applications of the finite mixture models, including the types of distribution and estimation methods used.

Research on finite mixture distributions has focused on methods for estimating model parameters and statistical tests for identifying the number of components in the mixture distribution underlying a particular set of data (Titterington 1990). Graphical, moment, ML, minimum distance, and Bayesian methods have been applied for the parameter estimation of the finite mixture models. In recent years, the dominant approach has been ML, primarily because of the advance of high-speed electronic computers (Everitt and Hand 1981, Titterington et al. 1985, Redner and Walker 1984, Everitt 1996).

To date no work that we are aware of has been published on modeling the diameter frequency distributions of mixed-species stands using the FMM.
In this study we are particularly interested in the finite mixture of two Weibull distributions. Although the finite mixture distribution is capable of modeling any distribution with multiple components, we only consider the mixture distribution with two different tree species for simplicity in introducing the topic. The Weibull is chosen because it is the most commonly used pdf for fitting tree diameter distributions as discussed above. Although other nonparametric approaches have been used to fit irregular diameter distributions, we choose to compare the FMM against two traditional parametric methods: (1) fitting the Weibull function to the entire plot (treating all trees of the two different species as a whole) and (2) fitting the Weibull function to each of the two species separately. To demonstrate model fitting and facilitate comparison, four example forest plots, each composed of two species, were selected.

Theoretical Background

Suppose a mixture distribution consists of k components; then the distribution of the ith individual component is described by a specific pdf, $f_i(x)$, and the general pdf, $f(x)$, for the mixture distribution can be expressed as

$$f(x) = \sum_{i=1}^{k} \rho_i f_i(x) = \rho_1 f_1(x) + \cdots + \rho_k f_k(x)$$

where $\rho_i$ is the relative abundance of the ith component as a proportion of the total population, and must satisfy the constraints $0 \le \rho_i \le 1$ and

$$\sum_{i=1}^{k} \rho_i = 1.$$

We will restrict our exposition to the simplest case, where $f_1(x), \ldots, f_k(x)$ have a common pdf with different means and, possibly, different variances. In this study we assume that the component pdf in the finite mixture distribution of a random variable X (i.e., tree diameters) is a three-parameter Weibull function given by

$$f(x; \theta) = \frac{\gamma}{\beta}\left(\frac{x-\alpha}{\beta}\right)^{\gamma-1} \exp\left[-\left(\frac{x-\alpha}{\beta}\right)^{\gamma}\right], \quad \alpha \le x < \infty,\ \alpha \ge 0,\ \beta > 0,\ \gamma > 0 \qquad (1)$$

where θ = (α, β, γ)′, and α, β, and γ are the location, scale, and shape parameters, respectively. Then the cumulative distribution function (cdf) is

$$F(x; \theta) = 1 - \exp\left[-\left(\frac{x-\alpha}{\beta}\right)^{\gamma}\right].$$

Since we only consider a finite mixture distribution with two components following the Weibull distribution in this study, the pdf of the mixture distribution is

$$f(x; \psi) = \rho f_1(x; \theta_1) + (1-\rho) f_2(x; \theta_2) \qquad (2)$$

where $\psi = (\rho, \theta_1, \theta_2)$ with $\theta_i = (\alpha_i, \beta_i, \gamma_i)'$, i = 1, 2, and 0 ≤ ρ ≤ 1. Similarly, the corresponding cdf of the mixture distribution is

$$F(x; \psi) = \rho F_1(x; \theta_1) + (1-\rho) F_2(x; \theta_2).$$

Therefore, this particular mixture distribution is characterized by seven parameters: a location, scale, and shape parameter for each of the two components (i.e., α1, β1, γ1, α2, β2, and γ2) and a proportion parameter (i.e., ρ) characterizing the mixture. For estimating these Weibull parameters, researchers have applied graphical procedures (Kao 1959), method of moments (Rider 1961, Falls 1970), and ML (Mason 1968, Jiang and Murthy 1998). In this study, the ML method was used for the parameter estimation. The joint likelihood function is as follows:

$$L = \prod_{j=1}^{n} \left[\rho f_1(x_j; \theta_1) + (1-\rho) f_2(x_j; \theta_2)\right]$$

where n is the total number of tree diameter observations in a sample and $x_j$ is the jth observed diameter. The natural logarithm of the likelihood function is expressed by

$$\log L = \sum_{j=1}^{n} \log\left[\rho\left(f_1(x_j; \theta_1) - f_2(x_j; \theta_2)\right) + f_2(x_j; \theta_2)\right].$$

The first partial derivatives of log L are taken with respect to each of the seven parameters of the mixture distribution. These partial derivatives are set equal to 0 and then solved by a numerical iterative algorithm such as the Newton-Raphson approach to yield the ML estimates.

Example Plots and Modeling Methods

Four plots were selected from the database used for the development of FIBER 3.0 (Solomon et al. 1995) representing the mixed spruce-fir forest type in the Northeast. Each plot had two tree species comprising the majority of stand basal area: balsam fir (Abies balsamea [L.] Mill) and red spruce (Picea rubens Sarg). One plot had white spruce (Picea glauca [Moench] Voss) instead of red spruce. Table 1 gives the descriptive statistics of tree diameters (measured to the nearest 0.1 cm) for each plot and each species in the plot. The observed frequency distribution of each plot is illustrated in Figure 1. The histograms show the frequencies by 2 cm diameter classes of the entire plot (1 cm diameter classes are used for Plot 2 due to its young age), while the curves represent the observed distributions of the two tree species within the plot.

The frequency distribution of Plot 1 has two distinct modes (one at 8 cm and the other at 16 cm). Balsam fir has diameters ranging from 3.5 to 15.9 cm, while red spruce ranges from 8.4 to 22.7 cm. The two component species have opposite skewness: the distribution of balsam fir is positively skewed, and the distribution of red spruce is negatively skewed.

Table 1. Descriptive statistics of tree diameters for the four example plots and the two species components in each plot.

| Plot and species* | Number of trees | Observed proportion | Mean (cm) | SD (cm) | Min (cm) | Max (cm) | Skewness† | Kurtosis†† |
|---|---|---|---|---|---|---|---|---|
| Plot 1 | 58 | | 13.3 | 5.2 | 3.5 | 22.7 | –0.15 | –1.09 |
| BF | 22 | 0.38 | 8.5 | 3.5 | 3.5 | 15.9 | 0.80 | –0.24 |
| RS | 36 | 0.62 | 16.2 | 3.6 | 8.4 | 22.7 | –0.42 | –0.11 |
| Plot 2 | 116 | | 3.7 | 2.1 | 0.6 | 8.8 | 0.58 | –0.61 |
| BF | 87 | 0.75 | 3.0 | 1.6 | 0.6 | 7.2 | 0.73 | –0.05 |
| RS | 29 | 0.25 | 6.0 | 1.8 | 2.6 | 8.8 | –0.38 | –0.79 |
| Plot 3 | 84 | | 10.7 | 4.2 | 4.5 | 24.0 | 1.11 | 1.24 |
| BF | 70 | 0.83 | 9.7 | 3.2 | 4.5 | 20.6 | 0.80 | 0.90 |
| RS | 14 | 0.17 | 15.5 | 5.2 | 6.8 | 24.0 | 0.18 | –1.01 |
| Plot 4 | 37 | | 9.1 | 3.7 | 2.9 | 17.9 | 0.34 | –0.30 |
| BF | 25 | 0.68 | 8.6 | 3.6 | 2.9 | 17.9 | 0.47 | 0.49 |
| WS | 12 | 0.32 | 10.3 | 4.0 | 4.7 | 16.5 | 0.04 | –1.02 |

* BF: balsam fir, RS: red spruce, WS: white spruce.

† The moment estimator for skewness ($\sqrt{b_1}$) is

$$\sqrt{b_1} = \frac{\sqrt{n}\,\sum_{i=1}^{n}(d_i - \bar{d})^3}{\left[\sum_{i=1}^{n}(d_i - \bar{d})^2\right]^{3/2}}$$

†† The moment estimator for kurtosis ($b_2$) is

$$b_2 = \frac{n\,\sum_{i=1}^{n}(d_i - \bar{d})^4}{\left[\sum_{i=1}^{n}(d_i - \bar{d})^2\right]^{2}}$$

where d is tree diameter and n is number of observations.
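The moment estimators in the Table 1 footnotes can be sketched in a few lines of Python. This is only an illustration of the two formulas as reconstructed here; the function names and the sample diameters are hypothetical and are not the plot data. Note also that Table 1 reports negative kurtosis values, which the raw ratio $b_2$ cannot produce, so the tabulated values may be on an excess scale ($b_2 - 3$). That reading is an inference, not something the text states.

```python
import math


def moment_skewness(diameters):
    """sqrt(b1) = sqrt(n) * sum((d - mean)^3) / (sum((d - mean)^2))^(3/2)."""
    n = len(diameters)
    mean = sum(diameters) / n
    s2 = sum((d - mean) ** 2 for d in diameters)
    s3 = sum((d - mean) ** 3 for d in diameters)
    return math.sqrt(n) * s3 / s2 ** 1.5


def moment_kurtosis(diameters):
    """b2 = n * sum((d - mean)^4) / (sum((d - mean)^2))^2 (not excess kurtosis)."""
    n = len(diameters)
    mean = sum(diameters) / n
    s2 = sum((d - mean) ** 2 for d in diameters)
    s4 = sum((d - mean) ** 4 for d in diameters)
    return n * s4 / s2 ** 2


# Hypothetical diameters (cm), chosen only to exercise the formulas.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 2.0, 10.0]

print(moment_skewness(symmetric), moment_kurtosis(symmetric))
print(moment_skewness(right_tailed))
```

A symmetric sample gives zero skewness, while a sample with one large tree gives a positive value, which is the pattern the text describes for a heavy right tail.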
Plot 2 is a young balsam fir–red spruce stand with a positively skewed diameter distribution. Seventy-five percent of the trees are balsam fir (diameters ranging from 0.6 to 7.2 cm), and 25% are red spruce (diameters ranging from 2.6 to 8.8 cm). The observed component distributions of the two species are similar to those of Plot 1. Plot 3 has a skewed distribution with a heavy tail to the right due to a few large-sized red spruce trees. Balsam fir (83%) has a positively skewed frequency distribution, and red spruce has a flat distribution from 6.8 cm to 24.0 cm. The frequency distributions of the two components (balsam fir and white spruce) and the whole plot of Plot 4 have a similar diameter range and curve shape (close to normal).

For model comparison, we selected example plots with known information on each species component. However, it should be emphasized that the FMM simultaneously estimates the proportion and component diameter distributions of different tree species in the mixed-species stand. Thus, there is no need to separate the individual species distributions for estimation. In other words, it is not necessary to classify the two or more components of a multispecies distribution a priori during data collection.

Three methods were utilized for fitting the frequency distribution of each plot as follows:

Method 1. The finite mixture model of the two Weibull distributions [Equation (2)] was fit to the tree diameter data of each plot. The predicted number of trees per diameter class for the whole plot and the predicted proportion or number of trees per class for each tree species were obtained.

Method 2. The three-parameter Weibull function [Equation (1)] was fit to the entire growing stock of each plot; i.e., treating the trees as a whole regardless of tree species. The predicted number of trees was computed for each diameter class only for the whole plot.

Method 3.
The three-parameter Weibull distribution [Equation (1)] was fit to each tree species separately in the mixed stand. Thus the predicted number of trees per diameter class can be calculated for each tree species. To obtain the prediction for the whole plot, the estimates from the component models of the two species were summed.

In this study, special functions for maximum likelihood estimation were written for S-Plus 2000 (Mathsoft, Inc. 1999) to estimate the parameters for the finite mixture models and the Weibull distributions. The criteria for model comparison were the root mean square error and bias (Maltamo et al. 1995). Denote the model residual (R) as the difference between observation and prediction for the diameter sums of each diameter class in a plot:

$$R_j = D_j - \hat{D}_j$$

where $D_j$ and $\hat{D}_j$ are the observed and predicted diameter sums of trees, respectively, in the jth diameter class. Positive residuals represent underprediction by the model and negative residuals represent overprediction by the model. Since $D_j$ and $\hat{D}_j$ are basically the product of the midpoint tree diameter and the number of trees in the jth diameter class, $R_j$ emphasizes the large-sized trees. In other words, large-sized diameter classes have a larger impact on the residuals given the same number of trees. Then the average residual across all diameter classes is the model bias, calculated as follows:

$$\text{Bias} = \frac{\sum_{j=1}^{m} (D_j - \hat{D}_j)}{m}$$

where m is the number of diameter classes. The root mean square error (RMSE) for the diameter sums was computed as follows:

$$\text{RMSE} = \sqrt{\frac{\sum_{j=1}^{m} (D_j - \hat{D}_j)^2}{m}}.$$

The likelihood-ratio χ2 test was chosen for testing “goodness of fit” such that

$$\chi^2 = -2 \sum_{j=1}^{m} O_j \cdot \log\left(\frac{E_j}{O_j}\right)$$

where $O_j$ is the observed frequency for the jth diameter class, and $E_j$ is the predicted frequency from the models for the jth class. The χ2 has (m – k – 1) degrees of freedom, where k is the number of estimated parameters.

Figure 1. Observed frequency distribution of tree diameters for the four example plots. The histogram represents the distribution of the entire plot, and the two curves exhibit the component distributions of the two species (symbol “x” for balsam fir and “o” for red spruce or white spruce) for (a) Plot 1, (b) Plot 2, (c) Plot 3, and (d) Plot 4.

Results and Discussion

Table 2 presents the estimated parameters for each of the three fitting methods and the four plots. For Method 1, the estimated proportion for the first component in the mixture distribution (i.e., balsam fir in each plot) is given by ρ̂, while the proportion for the second component is (1 – ρ̂). It appears that ρ̂ was generally close to the observed proportion for balsam fir: 0.38 vs. 0.38 (observed) for Plot 1; 0.76 vs. 0.75 (observed) for Plot 2; 0.95 vs. 0.83 (observed) for Plot 3; and 0.68 vs. 0.68 (observed) for Plot 4. This feature of the finite mixture model is useful when the species information is not available in a reliable form, e.g., from remotely sensed data. Thus, the proportions of species or species groups in the multispecies stands can be estimated.

Model Comparison for the Entire Distribution of the Plots

After using the three methods to fit the plots, the predicted frequencies by diameter classes were obtained from each model for each plot. Recall that the prediction for the entire plot by Method 3 was the summation of the two Weibull functions fitted separately to each tree species. Then the predictions from each method were compared with the observed frequencies. The root mean square error (RMSE), bias, χ2, and P-value for the χ2 test were computed for each method and each plot (Table 3). The observed frequency distribution (histograms) and the three prediction curves are illustrated for each plot in Figure 2. The residuals computed across diameter classes for the three models are exhibited for each plot in Figure 3.
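The comparison criteria reported in Table 3 (bias and RMSE of the diameter sums, and the likelihood-ratio χ2) can be sketched as follows. This is a minimal illustration, not the authors' S-Plus code: the function names, the class midpoints, and the observed and predicted counts are all hypothetical, and skipping classes with zero observed frequency in the χ2 sum is a convention assumed here rather than one stated in the text.

```python
import math


def diameter_sums(midpoints, counts):
    """Diameter sum per class: class midpoint times number of trees."""
    return [d * c for d, c in zip(midpoints, counts)]


def bias(observed, predicted):
    """Mean residual R_j = D_j - D_hat_j over the m diameter classes."""
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square of the residuals over the m diameter classes."""
    m = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / m)


def lr_chi_square(obs_freq, exp_freq):
    """chi2 = -2 * sum(O_j * log(E_j / O_j)); classes with O_j = 0 are skipped (assumption)."""
    return -2.0 * sum(o * math.log(e / o) for o, e in zip(obs_freq, exp_freq) if o > 0)


# Hypothetical 2-cm classes and tree counts, for illustration only.
midpoints = [6.0, 8.0, 10.0, 12.0]
observed_counts = [4, 10, 8, 2]
predicted_counts = [5.0, 9.0, 7.0, 3.0]

d_obs = diameter_sums(midpoints, observed_counts)
d_hat = diameter_sums(midpoints, predicted_counts)

print(bias(d_obs, d_hat), rmse(d_obs, d_hat))
print(lr_chi_square(observed_counts, predicted_counts))
```

Because each residual is weighted by the class midpoint, a one-tree error in the 12 cm class moves the bias and RMSE twice as much as the same error in the 6 cm class, which is the emphasis on large trees noted in the text.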
Since Plot 1 had two distinct modes, it required a flexible distribution to fit the two peaks and the valley between them. For this plot, the FMM (Method 1) was the only one to meet the requirement [Figure 2(a)]. The RMSE for Method 1 of 10.76 was 3.3 times smaller than that of Method 2, and 1.5 times smaller than that of Method 3 (Table 3). A single Weibull function fitted to Plot 1 (Method 2) missed the two peaks as well as the valley, producing positive biases (underprediction) at the 8, 16, 18, and 20 cm diameter classes and large negative biases (overprediction) at the 10, 12, and 14 cm diameter classes [Figure 3(a)]. When the two species were fitted separately by the Weibull function (Method 3), the prediction for the whole plot improved to some degree. But Method 3 still generated positive bias at the 8 cm diameter class and large negative biases at the 10 and 12 cm diameter classes [Figure 3(a)].

Table 2. Parameter estimates of the three fitting methods for the four example plots.

Method 1

| Plot | ρ̂ | α̂1 | β̂1 | γ̂1 | α̂2 | β̂2 | γ̂2 |
|---|---|---|---|---|---|---|---|
| Plot 1 | 0.38 | 2.6061 | 5.6119 | 2.4054 | 8.3488 | 9.6038 | 3.3641 |
| Plot 2 | 0.76 | 0.4932 | 2.6743 | 1.6127 | 0.0000 | 5.4682 | 6.8893 |
| Plot 3 | 0.95 | 4.1644 | 6.6021 | 1.8386 | 6.3691 | 16.0979 | 12.3585 |
| Plot 4 | 0.68 | 0.0000 | 9.6148 | 2.6970 | 0.0000 | 11.5244 | 2.9244 |

Method 2

| Plot | α̂ | β̂ | γ̂ |
|---|---|---|---|
| Plot 1 | 0.0000 | 15.0245 | 2.8808 |
| Plot 2 | 0.4939 | 3.5683 | 1.5386 |
| Plot 3 | 4.3615 | 7.0095 | 1.5467 |
| Plot 4 | 1.7067 | 8.3861 | 2.1103 |

Method 3

| Plot | α̂ fir | β̂ fir | γ̂ fir | α̂ spruce | β̂ spruce | γ̂ spruce |
|---|---|---|---|---|---|---|
| Plot 1 | 3.2463 | 5.7998 | 1.5347 | 0.0000 | 17.7193 | 5.3817 |
| Plot 2 | 0.5013 | 2.7378 | 1.5628 | 0.0000 | 6.5905 | 3.9631 |
| Plot 3 | 4.2389 | 6.1224 | 1.7522 | 4.3978 | 12.5652 | 2.3912 |
| Plot 4 | 1.6952 | 7.7827 | 2.0414 | 4.3401 | 6.4185 | 1.3832 |

Figure 2. Model comparison for the four example plots. The histogram represents the observed diameter distribution, with Method 1 (solid line), Method 2 (− − −), and Method 3 (⋅⋅⋅⋅⋅⋅) for (a) Plot 1, (b) Plot 2, (c) Plot 3, and (d) Plot 4.
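As a rough check on how the two-component mixture of Equation (2) produces the two peaks of Plot 1, the sketch below evaluates the mixture density at the Method 1 estimates for Plot 1 taken from Table 2 and looks for local maxima on a 0.1 cm grid. The function names, the grid, and the choice of 7 to 9 cm as the bounds of the 8 cm class are assumptions made for illustration; this is not the authors' S-Plus implementation, and it does not refit anything.

```python
import math


def weibull_pdf(x, alpha, beta, gamma):
    """Three-parameter Weibull density, Equation (1)."""
    if x <= alpha:
        return 0.0  # boundary convention chosen for this sketch
    z = (x - alpha) / beta
    return (gamma / beta) * z ** (gamma - 1.0) * math.exp(-(z ** gamma))


def weibull_cdf(x, alpha, beta, gamma):
    """Three-parameter Weibull cumulative distribution function."""
    if x <= alpha:
        return 0.0
    return 1.0 - math.exp(-(((x - alpha) / beta) ** gamma))


def mixture_pdf(x, rho, theta1, theta2):
    """Two-component mixture density, Equation (2)."""
    return rho * weibull_pdf(x, *theta1) + (1.0 - rho) * weibull_pdf(x, *theta2)


def mixture_cdf(x, rho, theta1, theta2):
    """Two-component mixture cdf."""
    return rho * weibull_cdf(x, *theta1) + (1.0 - rho) * weibull_cdf(x, *theta2)


# Method 1 estimates for Plot 1 (Table 2): (alpha, beta, gamma) per component.
RHO = 0.38
THETA_1 = (2.6061, 5.6119, 2.4054)  # first component (balsam fir)
THETA_2 = (8.3488, 9.6038, 3.3641)  # second component (red spruce)

# Evaluate the density on an assumed grid from 0 to 30 cm and find local maxima.
grid = [i / 10.0 for i in range(301)]
density = [mixture_pdf(x, RHO, THETA_1, THETA_2) for x in grid]
modes = [grid[i] for i in range(1, len(grid) - 1)
         if density[i] > density[i - 1] and density[i] > density[i + 1]]

# Expected number of trees in the 8 cm class, assuming class bounds of 7 to 9 cm
# and n = 58 trees for Plot 1 (Table 1).
N_TREES = 58
expected_8cm_class = N_TREES * (mixture_cdf(9.0, RHO, THETA_1, THETA_2)
                                - mixture_cdf(7.0, RHO, THETA_1, THETA_2))

print(modes, round(expected_8cm_class, 2))
```

The same cdf difference, applied to every diameter class, would give the predicted frequencies that feed the comparison criteria, under the assumed class bounds.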
On the average, the bias of Method 2 was 4.4 times larger than that of Method 1, and the bias of Method 3 was about 2 times larger than that of Method 1 (Table 3).

Plot 2 was a younger stand in which balsam fir accounted for 75% of the stems and larger-sized red spruce for the remaining 25%. The diameter distribution of balsam fir peaked at 3 cm while the distribution of red spruce had a mode at 6 cm, but with a magnitude of approximately one-half of its counterpart. Similar to Plot 1, the FMM fit the plot better than the other two methods. The RMSE of Method 1 was 1.9 times smaller than that of Method 2, and 1.3 times smaller than that of Method 3 (Table 3). Figures 2(b) and 3(b) show that both Methods 2 and 3 produced larger overprediction at the 4 and 5 cm diameter classes, and Method 2 also underpredicted at the 6 and 7 cm diameter classes.

Plot 3 had a heavy right tail (skewness = 1.11) and was relatively peaked at the 10 cm diameter class (kurtosis = 1.24). The skewness was due to a few large-sized red spruce trees. Again, the FMM fit the plot well at the peak as well as the heavy tail. On the other hand, Methods 2 and 3 underpredicted at the peak and overpredicted at the right tail [Figures 2(c) and 3(c)].

Compared with the other three plots, Plot 4 was relatively close to normal, and the two species components had similar diameter ranges. In this situation, all three methods fit the plot equally well on the average [Table 3 and Figure 2(d)]. All three methods yielded similar residuals in pattern and magnitude across diameter classes [Figure 3(d)].

Figure 3. Residuals (unit: diameter sums) produced by the three methods across diameter classes for the four example plots with Method 1 (solid line), Method 2 (− − −), and Method 3 (⋅⋅⋅⋅⋅⋅) for (a) Plot 1, (b) Plot 2, (c) Plot 3, and (d) Plot 4.

Table 3. The root mean square error (RMSE), bias, and χ2 test of the three fitting methods for the four example plots.

| Method | Plot | RMSE | Bias | χ2 | P-value |
|---|---|---|---|---|---|
| Method 1 | Plot 1 | 10.76 | 0.2101 | 1.5839 | 0.4530 |
| Method 1 | Plot 2 | 8.78 | 1.0473 | 2.9601 | 0.0853 |
| Method 1 | Plot 3 | 20.84 | 0.7570 | 4.4578 | 0.2161 |
| Method 1 | Plot 4 | 6.38 | 0.3642 | 0.9096 | 0.3402 |
| Method 2 | Plot 1 | 35.47 | 0.9245 | 14.2380 | 0.0271 |
| Method 2 | Plot 2 | 16.51 | 1.4914 | 8.0595 | 0.1530 |
| Method 2 | Plot 3 | 29.04 | 1.0183 | 8.9324 | 0.2576 |
| Method 2 | Plot 4 | 6.45 | 0.3483 | 0.8334 | 0.9749 |
| Method 3 | Plot 1 | 16.54 | 0.4096 | 4.0733 | 0.2537 |
| Method 3 | Plot 2 | 11.20 | 1.0661 | 3.8386 | 0.1467 |
| Method 3 | Plot 3 | 26.77 | 1.1088 | 7.7840 | 0.0998 |
| Method 3 | Plot 4 | 8.53 | 0.6904 | 1.5453 | 0.4618 |

Model Comparison for the Species Component Distributions of the Plots

Since both Method 1 and Method 3 can estimate the component distributions of the two species, the RMSE and bias of the two models were summarized by species for each plot in Table 4. It appears that, on average, Method 3 predicted the diameter sums for the two species components more accurately than the FMM for Plots 1 and 3. For Plots 2 and 4, however, both methods produced similar biases for the two species components. The estimates for the species components in the FMM are constrained to the combined species distribution for each plot. A single Weibull function may fit the distribution of each individual species better in some cases because it fits the species distribution independently. However, the summation of the two independent species results may not produce a better prediction for the entire plot, as discussed above. It is also worthwhile to note that the residuals, RMSE, and bias are computed based on the differences between observed and predicted diameter sums of each diameter class, and, thus, the impact of a mispredicted individual tree increases as tree diameter increases.

Table 4. The root mean square error (RMSE) and bias of Method 1 and Method 3 for each species component of the four example plots.

| Plot | Species | Method 1 RMSE | Method 1 Bias | Method 3 RMSE | Method 3 Bias |
|---|---|---|---|---|---|
| Plot 1 | BF | 14.63 | 2.26 | 11.70 | 0.39 |
| Plot 1 | RS | 12.00 | –2.05 | 8.53 | 0.03 |
| Plot 2 | BF | 53.92 | 0.88 | 6.72 | 0.70 |
| Plot 2 | RS | 5.42 | 0.17 | 6.04 | 0.37 |
| Plot 3 | BF | 18.57 | –10.11 | 16.14 | 0.69 |
| Plot 3 | RS | 20.10 | 10.87 | 19.52 | 0.42 |
| Plot 4 | BF | 6.38 | 0.47 | 6.45 | 0.41 |
| Plot 4 | WS | 9.97 | –0.11 | 11.86 | 0.28 |

Conclusion

It appears that the FMM is a promising alternative method for modeling the diameter distributions of mixed-species forest stands. There are advantages and disadvantages to this approach. The advantages are as follows: (1) the FMM is more flexible than a traditional Weibull function fit either to the whole plot or to individual species separately, especially when the diameter distribution is multimodal or highly skewed, because it simultaneously considers the proportion and component diameter distributions of different tree species in the mixed-species stand; (2) unlike fitting a single Weibull function to each species separately (Method 3), it is not necessary to classify the two or more components of a multimodal distribution a priori during data collection; and (3) the proportions of each species component or species group can be estimated when the information is not available in the data. One disadvantage of the FMM approach is that, in some cases, it may not predict each species component as accurately as fitting the Weibull to each species separately.

In this study, we considered mixed stands with only two species for simplicity. The FMM is capable of modeling the diameter frequency distribution of multispecies stands. In the modeling process it is important and necessary to decide on the number of components in a mixture distribution.
For modeling the diameter distributions of a mixed-species stand, foresters usually have good knowledge of how many dominant tree species exist in the stand from a field inventory, while minor species can be classified into major species categories if desired. The number of components can also be chosen in an exploratory way by fitting several finite mixtures and settling on the one with the best inherent fit statistics. When the number of component species has been decided, the tree diameter data can be easily input into available software for estimating the model parameters. It should be emphasized that, unlike Method 3, the estimation of parameters in the FMM works on the full distribution of all species at once; thus, there is no need to separate the individual species distributions for estimation.

Finally, we have only discussed fitting the FMMs for diameter distributions based on individual species classifications. It is easy, however, to envision the use of the FMMs for distributions that can be categorized by other attributes such as age, since different cohorts of multi-aged stands may appear as distinct modes in the overall stand distribution. Certain timber-marking procedures may also cause this same phenomenon over time when coupled with underlying growth dynamics. Commercial packages are also available for fitting the FMMs with various component distributions such as normal, lognormal, gamma, exponential, or Weibull (Haughton 1997).

In this article, we have presented the necessary rudiments for fitting the FMM to the diameter distribution of two-species mixed stands, with a limited number of examples. Further research is necessary to fit the FMM to a large number of mixed-species forest stands to evaluate the suitability of fitting various species compositions and distribution shapes.
An additional subject for the future would be to examine how well the estimated parameters of the FMMs could be regressed on stand attributes to facilitate growth and yield modeling of mixed-species forests. The suitability of the FMM fit to irregular diameter distributions should also be compared with other modeling methods such as nonparametric approaches.

Literature Cited

BAILEY, R.L., AND T.R. DELL. 1973. Quantifying diameter distributions with the Weibull function. For. Sci. 19:97–104.
BARE, B.R., AND D. OPALACH. 1987. Optimizing species composition in uneven-aged forest stands. For. Sci. 33:958–970.
BHATTACHARYA, C.G. 1967. A simple method of resolution of a distribution into Gaussian components. Biometrics 23:115–135.
BORDERS, B.E., AND W.D. PATTERSON. 1990. Projecting stand tables: A comparison of the Weibull diameter distribution method, a percentile-based method, and a basal area growth projecting method. For. Sci. 36:413–424.
BORDERS, B.E., R.A. SOUTER, R.L. BAILEY, AND K.D. WARE. 1987. Percentile-based distributions characterize forest stand tables. For. Sci. 33:570–576.
BOWLING, E.H., H.E. BURKHART, T.E. BURK, AND D.E. BECK. 1989. A stand-level multispecies growth model for Appalachian hardwoods. Can. J. For. Res. 19:405–412.
BURK, T.E., AND J.D. NEWBERRY. 1984. A simple algorithm for moment-based recovery of Weibull distribution parameters. For. Sci. 30:329–332.
BURKHART, H.E., AND M.R. STRUB. 1974. A model for simulation of planted loblolly pine stands. P. 128–135 in Growth models for tree and stand simulation, Fries, J. (ed.). Royal Coll. of For., Res. Notes No. 30, Stockholm, Sweden.
CAO, Q.V., AND H.E. BURKHART. 1984. A segmented distribution approach for modeling diameter frequency data. For. Sci. 30:129–137.
COHEN, A.C., JR. 1965a. Estimation in mixture of two normal distributions. Univ. of Georgia, Inst. of Statist., TR No. 13.
COHEN, A.C., JR. 1965b. Estimation in mixture of Poisson, and mixture of exponential distributions. NASA Tech. Memo. TM X-53245. Unclassified.
CRAWFORD, S.L., M.H. DEGROOT, J.B. KADANE, AND M.J. SMALL. 1992. Modeling lake-chemistry distributions: Approximate Bayesian methods for estimating a finite-mixture model. Technometrics 34:441–453.
DROESSLER, T.D., AND T.E. BURK. 1989. A test of nonparametric smoothing of diameter distributions. Scand. J. For. Res. 4:407–415.
ERIKSSON, L.O., AND O. SALLNAS. 1987. A model for predicting log yield from stand characteristics. Scand. J. For. Res. 2:253–261.
EVERITT, B.S. 1996. An introduction to finite mixture distributions. Stat. Meth. Med. Res. 5:107–127.
EVERITT, B.S., AND D.J. HAND. 1981. Finite mixture distributions. Chapman and Hall, London and New York. 143 p.
FALLS, L.W. 1970. Estimation of parameters in compound Weibull distributions. Technometrics 12:399–407.
GARCIA, O. 1981. Simplified methods-of-moments estimation for the Weibull distribution. N.Z. J. For. Sci. 11:304–306.
GOVE, J.H., AND G.P. PATIL. 1998. Modeling basal area-size distribution of forest stands: A compatible approach. For. Sci. 44:285–297.
HAARA, A., M. MALTAMO, AND T. TOKOLA. 1997. The k-nearest-neighbor method for estimating basal area diameter distribution. Scand. J. For. Res. 12:200–208.
HAFLEY, W.L., AND H.T. SCHREUDER. 1977. Statistical distributions for fitting diameter and height data in even-aged stands. Can. J. For. Res. 4:481–487.
HASSELBLAD, V. 1966. Estimation of parameters for a mixture of normal distribution. Technometrics 8:431–444.
HAUGHTON, D. 1997. Packages for estimating finite mixtures: A review. Am. Stat. 51:194–205.
HYINK, D.M., AND J.W. MOSER, JR. 1983. A generalized framework for projecting forest yield and stand structure using diameter distributions. For. Sci. 29:85–95.
JIANG, R., AND D.N.P. MURTHY. 1998. Mixture of Weibull distributions—parametric characterization of failure rate function. Appl. Stoch. Models and Data Anal. 14:47–65.
KANGAS, A., AND M. MALTAMO. 2000. Calibrating predicted diameter distribution with additional information. For. Sci. 46:390–396.
KAO, J.H.K. 1959. A graphical estimation of mixed Weibull parameters in life-testing of electron tubes. Technometrics 1:389–407.
KILKKI, P., AND R. PAIVINEN. 1986. Weibull function in the estimation of the basal area dbh-distribution. Silva Fenn. 20:149–156.
KILKKI, P., M. MALTAMO, R. MYKKANEN, AND R. PAIVINEN. 1989. Use of the Weibull function in estimating the basal area dbh-distribution. Silva Fenn. 23:311–318.
LEPELTIER, C. 1969. A simplified statistical treatment of geochemical data by graphical representation. Econ. Geol. 64:538–550.
LEYTHAM, K.M. 1984. Maximum likelihood estimation for the parameters of mixture distributions. Water Resour. Res. 20:896–902.
LINDSAY, S.R., G.R. WOOD, AND R.C. WOOLLONS. 1996. Stand table modeling through the Weibull distribution and usage of skewness information. For. Ecol. Manage. 81:19–23.
LITTLE, S.N. 1983. Weibull diameter distributions for mixed stands of western conifers. Can. J. For. Res. 13:85–88.
LUI, K.J., W.W. DARROW, AND G.W. RUTHERFORD. 1988. A model-based estimate of the mean incubation period for AIDS in homosexual men. Science 20:1333–1335.
LYNCH, T.B., AND J.W. MOSER. 1986. A growth model for mixed species stands. For. Sci. 32:697–706.
MACDONALD, P.D.M. 1987. Analysis of length-frequency distributions. P. 371–384 in Age and growth of fish, Summerfelt, R.C., and G.E. Hall (eds.). Iowa State University Press, Ames, Iowa.
MACDONALD, P.D.M., AND T.J. PITCHER. 1979. Age-groups from size-frequency data: A versatile and efficient method of analyzing distribution mixtures. J. Fish. Res. Board Can. 36:987–1001.
MAGNUSSEN, S. 1986. Diameter distributions in Picea abies described by the Weibull model. Scand. J. For. Res. 1:493–502.
MALTAMO, M. 1997. Comparing basal area diameter distributions estimated by tree species and for the entire growing stock in a mixed stand. Silva Fenn. 31:53–65.
MALTAMO, M., AND A. KANGAS. 1998. Methods based on k-nearest neighbor regression in estimation of basal area diameter distribution. Can. J. For. Res. 28:1107–1115.
MALTAMO, M., J. PUUMALAINEN, AND R. PAIVINEN. 1995. Comparison of beta and Weibull functions for modeling basal area diameter distribution in stands of Pinus sylvestris and Picea abies. Scand. J. For. Res. 10:284–295.
MALTAMO, M., A. KANGAS, J. UUTTERA, T. TORNIAINEN, AND J. SARAMAKI. 2000. Comparison of percentile based prediction methods and the Weibull distribution in describing the diameter distribution of heterogeneous Scots pine stands. For. Ecol. Manage. 133:263–274.
MASON, T.J. 1968. Maximum likelihood estimation in a mixture of two Weibull distributions. M.S. thesis, Univ. of Georgia. 25 p.
MATHSOFT, INC. 1999. S-Plus 2000 Programmer’s Guide. Mathsoft Inc., Seattle, WA.
MCLACHLAN, G.J., AND R.D. GORDON. 1989. Mixture models for partially unclassified data: A case study of renal venous renin in hypertension. Stat. Med. 8:1291–1300.
MCLACHLAN, G.J., AND D.C. MCGIFFIN. 1994. On the role of finite mixture models in survival analysis. Stat. Meth. Med. Res. 3:211–226.
MILLER, R.B. 1987. Maximum likelihood estimation of mixed stock fishery composition. Can. J. Fish. Aquat. Sci. 44:583–590.
SCHNUTE, J., AND D. FOURNIER. 1980. A new approach to length-frequency analysis: Growth structure. Can. J. Fish. Aquat. Sci. 37:1337–1351.
SIIPILEHTO, J. 1999. Improving the accuracy of predicted basal-area diameter distribution in advanced stands by determining stem number. Silva Fenn. 33:281–301.
SOLOMON, D.S., D.A. HERMAN, AND W.B. LEAK. 1995. FIBER 3.0: An ecological growth model for Northeastern forest types. USDA For. Serv. Gen. Tech. Rep. NE-204.
THAM, Å. 1988. Structure of mixed Picea abies (L.) Karst. and Betula pendula Roth and Betula pubescens Ehrh. stands in south and middle Sweden. Scand. J. For. Res. 3:355–370.
TITTERINGTON, D.M. 1976.
Updating a diagnostic system using unconfirmed cases. Appl. Stat. 25:238–247. NANANG, D.M. 1998. Suitability of the normal, log-normal and Weibull distributions for fitting diameter distributions of neem plantations in Northern Ghana. For. Ecol. Manage. 103:1–7. TITTERINGTON, D.M. 1990. Some recent research in the analysis of mixture distributions. Statistics. 21:619–641. REDNER, R.A., AND H.F. WALKER. 1984. Mixture densities, maximum likelihood and the EM algorithm. SIAM Review 26:195–239. RENNOLLS., K., D.N. GEARY, AND T.J.D. ROLLINSON. 1985. Characterizing diameter distributions by the use of the Weibull distribution. Forestry 58:58–66. RIDER, P.R. 1961. Estimating the parameters of mixed Poisson, binomial and Weibull distributions by the methods of moments. Bull. de l’institute international de statistique 38 part 2. TITTERINGTON, D.M. 1997. Mixture distributions (update). P 399–407 in Encyclopedia of statistical sciences. Update Volume 1. Kotz, S.M. (ed.). Wiley, New York. TITTERINGTON, D. M., A.F.M. SMITH, AND U.E. MAKOV. 1985. Statistical analysis of finite mixture distributions. Wiley, New York. 243 p. ZARNOCH, S.J., AND T.R. DELL. 1985. An evaluation of percentile and maximum likelihood estimators of Weibull parameters. For. Sci. 31:260–268. Forest Science 48(4) 2002 661