This file was created by scanning the printed publication. Errors identified by the software have been corrected; however, some errors may remain.

Cross-correlations among Single Tree Growth Models

Hubert Hasenauer¹, Robert A. Monserud², Timothy G. Gregoire³

Abstract. - Single tree growth and yield models basically consist of a number of equations to update tree parameters over time. Although it seems reasonable to assert that these equations are interrelated from a biological standpoint, it is customary to consider them independently and apply linear or nonlinear regression techniques separately rather than jointly. Using more than 7,500 Norway spruce (Picea abies) trees, we compare an individual tree basal area increment model, a height increment model, and a crown model using least squares methods separately and jointly by applying three-stage least squares (3SLS) regression techniques. Results indicate a strong cross-equation correlation between the basal area and the height increment model and between the height increment and the crown model. This suggests that the use of joint regression techniques would be superior.

INTRODUCTION

Individual tree forest growth and yield models usually employ a set of equations to describe stand development over time. A typical single-tree stand simulator (Monserud 1975, Wykoff et al. 1982, Burkhart et al. 1987, Hasenauer 1994) consists of different equations for predicting periodic diameter or basal area increment, height increment, and the probability of mortality for each sample tree. These equations are usually developed separately. From a biological standpoint, it seems reasonable to assert that the change in a tree's basal area, height, and risk of mortality are not uncorrelated phenomena (Dixon et al. 1990).
Depending on their interrelationships, joint estimation of the models' parameters may be necessary in order to provide estimates that are consistent, or it may be desirable in order to provide more precise estimates than can be obtained otherwise. Seminal work by Aitken (1934-35), Haavelmo (1943), Theil (1953), Zellner (1962), and Zellner and Theil (1962) resulted in almost all methods currently available for estimating the parameters in intercorrelated (simultaneous) systems of equations.

The objective of this paper is to evaluate and compare joint versus separate regression techniques for single tree growth and yield modeling. We estimate the parameters of individual tree models for basal area increment, height increment, and crown ratio using least squares methods separately and jointly by applying two-stage (Theil 1953) and three-stage least squares (Zellner and Theil 1962) techniques. We specifically investigate (1) the differences in the estimated coefficients and (2) the correlation between the predictions.

¹ Biometrician, Institut für Waldwachstumsforschung, Universität für Bodenkultur, Wien, Austria.
² Biometrician, Intermountain Research Station, USDA Forest Service, Moscow, ID 83843, USA.
³ Biometrician, Dept. Forestry, Virginia Polytechnic Institute & St. Univ., Blacksburg, VA, USA.

METHODS

Independent Regressions

We begin with a system of three individual tree growth equations for stand conditions in Austria: basal area increment, height increment, and crown ratio. These equations were developed independently using the same dataset.
Basal area increment: After eliminating the qualitative site descriptors chosen by Monserud and Sterba (1996), which reduced the variance explained by only 2.6%, we are left with the following model for $Y_1$:

$$Y_1 = \ln(\mathit{BAI}) = [\ldots] \qquad (1)$$

with BAI the 5-year basal area increment (outside bark), D the diameter at breast height (1.3 m) in cm, C = (1/CR) - 1 where CR is the crown ratio, BAL the basal area (m²/ha) of trees larger in diameter than the subject tree, CCF the crown competition factor of Krajicek et al. (1961), ELEV the elevation in hectometers, and SL the tangent of the slope angle (%/100).

Height increment: Hasenauer and Monserud (1996a) used a similar formulation to predict 5-year height increment $\Delta H$, where H is the tree height, and all other parameters are as previously defined. The second equation ($Y_2$) in the system is:

$$\ln(\Delta H) = a + b_1 \ln(D) + b_2 H^2 + b_3 \ln(C) + c_1\,\mathit{BAL} + c_2\,\mathit{CCF} + d_1 \cdot [\ldots] + d_2\,\mathit{SL} + \varepsilon \qquad (2)$$

Crown ratio: To ensure that the predictions of crown ratio (defined as the crown length divided by tree height) are bounded between 0 and 1, Hasenauer and Monserud (1996b) chose a logistic function. After linearizing the logistic and rearranging, we are left with the following logarithmic transformation of crown ratio ($Y_3 = \ln(C)$):

$$Y_3 = \ln(C) = [\ldots] \qquad (3)$$

where C = (1/CR) - 1, H/D is the height/diameter ratio (m/cm), AZ is the azimuth in radians, and all other parameters are as previously defined. Change in crown ratio is not available, because height to crown base was not remeasured after the initial inventory.

Simultaneous Equation Systems

In the system above, it is likely that the errors $\varepsilon$ in equations (1) - (3) are intercorrelated because they are associated with various attributes of the same tree. If this is the only common influence among the three equations, then Zellner's (1962) seemingly unrelated regression (SUR) procedure would be appropriate because the equations are related through contemporaneous correlations in the variance-covariance matrix.
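The idea of contemporaneous correlation can be illustrated with a minimal Python/NumPy sketch. It simulates three equations whose errors are correlated within each tree, fits each equation separately by OLS, and inspects the correlations among the residuals. The error covariance `true_cov`, the coefficients, and the covariates are synthetic assumptions chosen for illustration; none of them come from the Austrian inventory data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # number of synthetic "trees"

# Hypothetical error covariance across the three equations (assumed values).
true_cov = np.array([[1.0, 0.5, 0.1],
                     [0.5, 1.0, 0.3],
                     [0.1, 0.3, 1.0]])
errors = rng.multivariate_normal(np.zeros(3), true_cov, size=n)


def ols_residuals(X, y):
    """Fit y = X b by ordinary least squares and return the residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b


residuals = []
for k in range(3):
    # Each equation gets its own synthetic covariates plus an intercept.
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    beta = np.array([1.0, 0.5, -0.3])  # assumed coefficients
    y = X @ beta + errors[:, k]
    residuals.append(ols_residuals(X, y))

# 3 x 3 matrix of contemporaneous residual correlations.
resid_corr = np.corrcoef(np.vstack(residuals))
print(np.round(resid_corr, 2))
```

Off-diagonal entries that are clearly nonzero are the kind of evidence that would motivate a joint estimator rather than three separate fits.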
We write the multivariate regression model as

$$Y = X\beta + \varepsilon \qquad (4)$$

where $Y$ is a $3n \times 1$ vector of dependent (endogenous) variables, $X$ is the $3n \times (p_1 + p_2 + p_3)$ design matrix, $\beta$ is the $(p_1 + p_2 + p_3) \times 1$ vector of coefficients to be estimated, and $\varepsilon$ is the $3n \times 1$ error vector. The errors $\varepsilon$ have fixed mean $E[\varepsilon] = 0$ and variance

$$\mathrm{Var}[\varepsilon] = E[\varepsilon\varepsilon'] = \begin{pmatrix} W_{11} & W_{12} & W_{13} \\ W_{21} & W_{22} & W_{23} \\ W_{31} & W_{32} & W_{33} \end{pmatrix} = \Omega \qquad (5)$$

where $W_{ii} = \sigma_i^2 I$ are the main diagonal variances and $W_{ij} = \sigma_{ij} I$ are the covariances, with $I$ the n-dimensional identity matrix. The ordinary least squares (OLS) estimator of $\beta = (\beta_1', \beta_2', \beta_3')'$ is

$$b = (X'X)^{-1} X'Y \qquad (6)$$

with variance

$$\mathrm{cov}(b) = (X'X)^{-1} (X'\Omega X) (X'X)^{-1} \qquad (7)$$

In situations where $\sigma_{ij} \neq 0$ and $\sigma_i^2 \neq \sigma^2$ for some constant $\sigma^2$, then $b$ is inefficient and the usual estimator of (7), namely

$$\widehat{\mathrm{cov}}(b) = (X'X)^{-1} \hat{\sigma}^2 \qquad (8)$$

is biased. Zellner's (1962) seemingly unrelated regression procedure is a form of feasible generalized least squares (Judge et al. 1980) in which $\Omega$ is estimated by $\hat{\Omega}$, where $\hat{\Omega}$ has the same form as (5) but with moment estimators, say $\hat{\sigma}_i^2$ and $\hat{\sigma}_{ij}$, in place of the unknown parameters $\sigma_i^2$ and $\sigma_{ij}$. With $\hat{\Omega}$, an asymptotically efficient estimator of $\beta$ is

$$\hat{\beta} = (X'\hat{\Omega}^{-1}X)^{-1} X'\hat{\Omega}^{-1}Y \qquad (9)$$

The variance of $\hat{\beta}$ is estimated by

$$\widehat{\mathrm{cov}}(\hat{\beta}) = (X'\hat{\Omega}^{-1}X)^{-1} \qquad (10)$$

For our set of single-tree equations there are $p_1 + p_2 + p_3 = 8 + 8 + 10 = 26$ estimated parameters, plus the main diagonal partition of the variance-covariance matrix for each model ($\hat{\Sigma}_{11}, \hat{\Sigma}_{22}, \hat{\Sigma}_{33}$). The remaining off-diagonal partitions ($\hat{\Sigma}_{12}, \hat{\Sigma}_{13}, \hat{\Sigma}_{23}$) contain the cross-equation covariances between each estimated parameter. If $X_1 = X_2 = X_3$ then SUR is identical to OLS, even when the cross-equation covariances are nonzero. Otherwise, SUR provides an asymptotically more precise estimator of $\beta$ than OLS, and the comparative efficiency of $\hat{\beta}$ increases with increasing cross-equation correlation and with increasing dissimilarity among the regression matrices $X_1$, $X_2$ and $X_3$.

The use of $\ln(C) = Y_3$ in the models for $\ln(\mathit{BAI})$ and $\ln(\Delta H)$, eqns. (1) and (2) respectively, precludes the straightforward use of SUR as outlined. Whenever a response or endogenous variable from one model appears as a regressor variable in another model, both the OLS and SUR estimators of $\beta$ will be biased and inconsistent (Judge et al. 1980). In this case two-stage least squares or 2SLS (see Judge et al. 1980) can be used to provide consistent and asymptotically unbiased estimates of $\beta$. An even more efficient estimator was invented by Zellner and Theil (1962): three-stage least squares (3SLS). By combining Theil's (1953) 2SLS procedure with Zellner's (1962) SUR procedure, the resulting 3SLS estimators of $\beta$ are consistent and asymptotically more efficient than 2SLS. We denote the 3SLS estimates of $\beta$ as $\tilde{\beta}$ and its estimated variance as

$$\hat{\Sigma} = \widehat{\mathrm{cov}}(\tilde{\beta}) \qquad (11)$$

It is convenient to partition $\hat{\Sigma}$ as

$$\hat{\Sigma} = \begin{pmatrix} \hat{\Sigma}_{11} & \hat{\Sigma}_{12} & \hat{\Sigma}_{13} \\ \hat{\Sigma}_{21} & \hat{\Sigma}_{22} & \hat{\Sigma}_{23} \\ \hat{\Sigma}_{31} & \hat{\Sigma}_{32} & \hat{\Sigma}_{33} \end{pmatrix} \qquad (12)$$

where $\hat{\Sigma}_{ij}$ is the $p_i \times p_j$ estimated covariance matrix of $\tilde{\beta}_i$ and $\tilde{\beta}_j$. For details concerning the 3SLS estimator in a forestry context see Murphy and Beltz (1981), Borders and Bailey (1986), and Gregoire (1987).

In an individual tree model, estimates of $E[Y_1]$, $E[Y_2]$, and $E[Y_3]$ are needed for each tree. Let $x_1$ denote the $p_1 \times 1$ vector of covariate values for a particular tree for which the estimated coefficients will be applied, such that

$$\hat{y}_1 = x_1'\tilde{\beta}_1 \qquad (13)$$

serves as an estimate of $E[Y_1 \mid x_1]$. Let $\hat{y}_2$ and $\hat{y}_3$ be similarly defined. The distributional properties of the random errors $\varepsilon$ in conjunction with the estimator of $\beta$ determine the statistical properties of $\tilde{\beta}_1$ and $\hat{y}_1$. The scalar covariance between the random variables $\hat{y}_1$ and $\hat{y}_2$ is $x_1'\Sigma_{12}x_2$, where $\Sigma_{12}$ is the $p_1 \times p_2$ covariance matrix of $\tilde{\beta}_1$ and $\tilde{\beta}_2$. Given an estimate of $\Sigma_{12}$, say $\hat{\Sigma}_{12}$ as in (12), the covariance between $\hat{y}_1$ and $\hat{y}_2$ can be estimated by $x_1'\hat{\Sigma}_{12}x_2$. Therefore, the correlation between $\hat{y}_1$ and $\hat{y}_2$ can be expressed as

$$\hat{r}_{12} = \frac{x_1'\hat{\Sigma}_{12}x_2}{\sqrt{(x_1'\hat{\Sigma}_{11}x_1)(x_2'\hat{\Sigma}_{22}x_2)}} \qquad (14)$$

The correlation $\hat{r}_{13}$ between $\hat{y}_1$ and $\hat{y}_3$ and the correlation $\hat{r}_{23}$ between $\hat{y}_2$ and $\hat{y}_3$ can be estimated similarly.
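The estimators in (6), (9), (10) and the prediction correlation in (14) can be sketched in Python with NumPy. This is a hedged illustration, not the authors' procedure: it uses a two-equation SUR system without endogenous regressors as a stand-in for the 3SLS fit, and all data, coefficients, and error covariances are synthetic assumptions. Equation (14) is applied in the same way to any partitioned coefficient covariance matrix, whichever estimator produced it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Synthetic two-equation system with different regressors (all values assumed).
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
err = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
y1 = X1 @ np.array([1.0, 0.5]) + err[:, 0]
y2 = X2 @ np.array([-1.0, 0.3, 0.2]) + err[:, 1]

# Stack the system as in eq. (4): block-diagonal design, stacked response.
p1, p2 = X1.shape[1], X2.shape[1]
X = np.zeros((2 * n, p1 + p2))
X[:n, :p1] = X1
X[n:, p1:] = X2
Y = np.concatenate([y1, y2])

# Step 1: OLS on the stacked system, eq. (6).
b_ols = np.linalg.solve(X.T @ X, X.T @ Y)
resid = (Y - X @ b_ols).reshape(2, n)  # row 0: equation 1, row 1: equation 2

# Step 2: moment estimates of the 2 x 2 error covariance; the inverse of
# Omega-hat = S (x) I is S^-1 (x) I (Kronecker product).
S = resid @ resid.T / n
omega_inv = np.kron(np.linalg.inv(S), np.eye(n))

# Step 3: feasible GLS estimator, eq. (9), and its estimated covariance, eq. (10).
cov_beta = np.linalg.inv(X.T @ omega_inv @ X)
beta_sur = cov_beta @ X.T @ omega_inv @ Y

# Partition the coefficient covariance as in eq. (12).
S11 = cov_beta[:p1, :p1]
S12 = cov_beta[:p1, p1:]
S22 = cov_beta[p1:, p1:]


def prediction_correlation(x1, x2):
    """Correlation between the two predictions for one tree, eq. (14)."""
    cov12 = x1 @ S12 @ x2
    return cov12 / np.sqrt((x1 @ S11 @ x1) * (x2 @ S22 @ x2))


r12 = prediction_correlation(X1[0], X2[0])
print(beta_sur, r12)
```

Forming the full 2n x 2n matrix `omega_inv` is only practical for small n; it is used here because it mirrors the notation of (5) and (9) directly.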
DATA

Data were obtained from the Austrian National Forest Inventory (Forstliche Bundesversuchsanstalt 1981), a systematic hidden permanent sample plot design over the whole of Austria, with a 5-yr remeasurement interval. In a given year a fifth of the plots are remeasured, ensuring a representative sample of all Austrian forests each year. The total inventory comprises 22,000 permanent plots. We restricted ourselves to the 4,135 forested plots containing remeasured Norway spruce (Picea abies), not crossed by roads, and in a single ownership.

Permanent sample plots were established from 1981 to 1985. Trees with a diameter at breast height (DBH, 1.3 m) larger than 10.4 cm were selected by angle count sampling using a BAF of 4 m²/ha. Trees with a DBH between 5 and 10.4 cm were measured within a circle of 2.6 m radius located at plot center; smaller trees were not recorded. At plot establishment, the following data were recorded for every sample tree: species, DBH to the nearest mm, and distance and azimuth from plot center. Total height and height to the crown base were measured to the nearest decimeter on every fifth tree. Plot descriptors were evaluated within a circle of 300 m². Elevation is measured to the nearest 100 m, slope to the nearest 10%, and aspect to the nearest 45°. Additional site descriptors were measured but not used in this comparison study.

Plots were remeasured from 1986 to 1990, 5 years after establishment. In the remeasurement, height and diameter were measured on the same trees, but not height to crown base. In summary, observations of 7,797 Norway spruce trees from 4,135 different permanent sample plots are available throughout Austria.

RESULTS

Coefficient estimates by OLS vs. 3SLS

Using the SYSLIN procedure in the Econometrics-Time Series module of SAS (SAS Institute 1988), the parameters in equations (1) to (3) were first estimated independently by applying ordinary least squares (OLS) and then jointly by using 2SLS and 3SLS.
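The estimates in this study were obtained with SAS. As a rough illustration of why the two-stage step matters, the following Python/NumPy sketch compares OLS and 2SLS on a small synthetic system in which one response appears as a regressor in another equation. The variable names, coefficients, and error correlation are assumptions made for the example and are unrelated to the inventory data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Synthetic system (assumed): y2 is endogenous in the equation for y1
# because the two error terms are correlated.
z = rng.normal(size=n)  # exogenous variable, used as the instrument
err = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.7], [0.7, 1.0]], size=n)
y2 = 0.5 + 1.0 * z + err[:, 1]
y1 = 1.0 + 2.0 * y2 + err[:, 0]  # true slope on y2 is 2.0


def ols(X, y):
    """Return the ordinary least squares coefficients for y = X b."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b


ones = np.ones(n)

# OLS ignores the endogeneity of y2, so its slope estimate is inconsistent.
b_ols = ols(np.column_stack([ones, y2]), y1)

# Stage 1: regress the endogenous regressor on the exogenous variables.
Z = np.column_stack([ones, z])
y2_hat = Z @ ols(Z, y2)

# Stage 2: replace y2 by its first-stage fitted values.
b_2sls = ols(np.column_stack([ones, y2_hat]), y1)

print("OLS slope :", round(b_ols[1], 3))
print("2SLS slope:", round(b_2sls[1], 3))
```

In this setup the OLS slope drifts away from the true value of 2.0 while the 2SLS slope stays close to it. 3SLS adds a third, SUR-type step that uses the cross-equation error covariance to gain efficiency.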
Attention was immediately focused on the ln(C) and CCF terms in the height increment model (2). These two terms were both significant ($\alpha$ = 0.05) with OLS. With 2SLS, the ln(C) term remained significant but the CCF term became strongly non-significant. With 3SLS, both the ln(C) and CCF terms were non-significant. Thus the height increment model (2) was reduced from 8 parameters to 6. The size of the matrix partitions in the variance-covariance matrix in (12) is reduced accordingly. Now $\hat{\Sigma}_{22}$ is a 6x6 variance-covariance matrix for $\tilde{\beta}_2$, and the off-diagonal partitions $\hat{\Sigma}_{12}$ and $\hat{\Sigma}_{23}$ are 8x6 and 6x10 matrices, respectively, containing the cross-equation covariances between the estimated parameters. Because there are now $p_1 + p_2 + p_3 = 8 + 6 + 10 = 24$ estimated parameters in the three growth equations, there are 24² = 576 variance-covariance elements in (12), a reduction of 100 elements from the 26² = 676 of the full system.

Correlation between predictions

To investigate the strength of the interrelationships among our system of equations and evaluate their importance, the correlations between the predictions of ln(BAI), ln($\Delta H$) and ln(C) (eqns. (1), (2), and (3)) are calculated for each observation according to equation (14). Fig. 1 displays the cross-equation correlations $\hat{r}_{ij}$ between each pair of predictions vs. the respective predicted variables. In Fig. 1a and 1b, $\hat{r}_{12}$ is massed around 0.23, with a maximum correlation of 0.33 between the predictions for ln(BAI) ($\hat{y}_1$) and ln($\Delta H$) ($\hat{y}_2$). This indicates that the first two equations in the system are fairly interdependent. The correlations $\hat{r}_{13}$ between predictions from the first ($\hat{y}_1$) and third equations ($\hat{y}_3$) are weak, with a mean of -0.07 and extrema at 0.05 and -0.11. The correlations $\hat{r}_{23}$ between predictions from the second ($\hat{y}_2$) and third equations ($\hat{y}_3$) have a mean near zero (0.01), but are more dispersed than those for $\hat{r}_{12}$, ranging from 0.26 to -0.10 with a standard deviation of 0.08.
These correlations between the basal area and height increment models and between the height increment and crown models confirm our assumption of cross-equation correlations.

Fig. 1: The cross-equation correlation $\hat{r}_{ij}$ between each pair of predictions; $\hat{y}_1$ indicates the predicted ln(BAI), $\hat{y}_2$ the predicted ln($\Delta H$), and $\hat{y}_3$ the predicted ln((1/CR) - 1).

DISCUSSION

Because individual tree forest growth models are based on multivariate attributes observed on the same individuals (e.g. basal area increment, height increment, crown ratio), the resulting set of growth equations can be considered a simultaneous system. Therefore, joint regression techniques should be considered for simultaneously determining the parameter estimates. If endogenous variables do not appear on the right hand side (RHS), the seemingly unrelated regression (Zellner 1962) will still improve the efficiency of the parameter estimates. If endogenous variables are used as predictor variables then multi-stage estimation techniques (2SLS or 3SLS) are necessary to obtain parameter estimates that are consistent; such estimates will also be asymptotically efficient. Ignoring the simultaneous nature of the system by separately applying OLS to each model can result in estimates that are biased and inconsistent.

Using 3SLS with our system of equations allowed the detection and deletion of two non-significant terms (CCF and ln(C) in eq. (2)) that OLS had determined as significant. This led to a simplification of the simultaneous structure of our system because ln(C) was originally considered as an endogenous variable on the RHS.

One of the advantages of joint regression techniques is that the correlations among the predictor variables both within a certain equation and across all equations are available. Fig. 1 indicates that cross-equation correlations may change their degree of dependency in the system and that these correlations can be used in a simulator to better predict stand and tree dynamics.
Predictions from the first two growth equations (basal area and height increment) are fairly well correlated, while correlations between predictions from the first and third (basal area increment and crown ratio) are rather weak.

ACKNOWLEDGEMENTS

This research was conducted when Hubert Hasenauer was a Visiting Scientist at Virginia Tech Department of Forestry in Blacksburg, and at the Intermountain Research Station's Forestry Sciences Laboratory in Moscow, Idaho. Hasenauer was working on a Schrödinger research grant from the Austrian Science Foundation. We are grateful to Karl Schieler and Klemens Schadauer of the Federal Forest Research Center in Vienna for making the Forest Inventory data available.

LITERATURE

Aitken, A.C. 1934-35. On the least-squares and linear combination of observations. Proceedings of the Royal Society of Edinburgh 55: 42-48.

Borders, B.E., and R.L. Bailey. 1986. A compatible system of growth and yield equations for slash pine fitted with restricted three-stage least squares. For. Sci. 32: 185-201.

Burkhart, H.E., Farrar, K.D., Amateis, R.A., and Daniels, R.F. 1987. Simulation of individual tree growth and stand development in loblolly pine plantations on cutover, site-prepared areas. Coll. For. and Wildlife Resources, Virg. Tech. Inst., Blacksburg, Va. Publication FWS-1-87: 47 pp.

Dixon, R.K., Meldahl, R.S., Ruark, G.A., and Warren, W.G. 1990. Process modeling of forest growth responses to environmental stress. Timber Press, Portland, OR. 441 pp.

Forstliche Bundesversuchsanstalt. 1981. Instruktionen für die Feldarbeit der Österreichischen Forstinventur 1981-1985. Forstliche Bundesversuchsanstalt, Wien. 172 pp.

Gregoire, T.G. 1987. Generalized error structure for forestry yield models. For. Sci. 33: 423-444.

Haavelmo, T. 1943. The statistical implications of a system of simultaneous equations. Econometrica 11: 1-12.

Hasenauer, H. 1994. Ein Einzelbaumwachstumssimulator für ungleichaltrige Fichten-Kiefern- und Buchen-Fichtenmischbestände. Forstl. Schriftenreihe, Univ. f. Bodenkultur, Wien. Österr. Ges. f. Waldökosystemforschung und experimentelle Baumforschung an der Univ. f. Bodenkultur. Band 8: 152 pp.

Hasenauer, H., and R.A. Monserud. 1996a. Biased statistics from height increment predictions. (submitted to Ecol. Modeling).

Hasenauer, H., and R.A. Monserud. 1996b. A crown ratio model for Austrian forests. Forest Ecology and Management (in press).

Judge, G.G., W.E. Griffiths, R.C. Hill, and T.C. Lee. 1980. The theory and practice of econometrics. John Wiley & Sons, Inc., New York. 810 pp.

Krajicek, J.E., K.A. Brinkman, and S.F. Gingrich. 1961. Crown competition: A measure of density. For. Sci. 7: 35-42.

Monserud, R.A. 1975. Methodology for simulating Wisconsin northern hardwood stand dynamics. Ph.D. Thesis, Univ. of Wisconsin, Madison. 156 pp.

Monserud, R.A., and H. Sterba. 1996. A basal area increment model for individual trees growing in even- and uneven-aged forest stands in Austria. For. Ecol. Manage. (in press).

Murphy, P.A., and R.C. Beltz. 1981. Growth and yield of shortleaf pine in the West Gulf region. South. Forest Exp. Station, New Orleans, LA. USDA For. Serv. Res. Paper SO-169: 15 pp.

SAS Institute. 1988. SAS/ETS user's guide, version 6. Cary, NC. 560 pp.

Theil, H. 1953. Repeated least-squares applied to complete equation systems. The Hague: The Central Planning Bureau, The Netherlands.

Wykoff, W.R., Crookston, N.L., and Stage, A.R. 1982. User's Guide to the Stand Prognosis Model. USDA For. Serv. GTR INT-133, 112 pp.

Zellner, A. 1962. An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Stat. Association 57: 348-368.

Zellner, A., and Theil, H. 1962. Three-stage least squares: simultaneous estimation of simultaneous equations. Econometrica 30: 54-78.