Standardization, Residualization, and Adjusted Statistical Thresholds for Forest Health Indicators William A. Bechtold 1 Stanley J. Zarnoch 1 Stand-level CCV Indicators Michael E. Schomaker 2 1 USDA Forest Service, Southern Research Station, Asheville, NC Colorado State Forest Service, Ft. Collins, Co Corresponding author: W.A. Bechtold, USDA Forest Service, Southern Research Station, P.O. Box 2680, Asheville, NC 28802 Phone: 828-257-4357 e-mail: wabechtold@fs.fed.us 2 Exam ple Standardized and Residualized Indicator Values Standardized Indicators An indicator can be adjusted for different statistical distributions among species by standardizing to a mean of zero and a standard deviation of one. Values are thus expressed in ter ms of standard deviation units from the mean for a given species. This adjustment results in more meaningful interpretations w hen comparisons are made among species, and enables summations across species to yield stand-level indicators. The standardization of an indicator is defined as I ij′ = I ij − I j sj (1) where I′ij = the standardized indicator for tree i w ithin species j, Iij = the non-standardized indicator for tree i w ithin species j, Ij sj = the average for the non-standardized indicator for species j, and Composite Crow n Volume ( CCV), a composite indicator derived from multiple tree crow n parameters, w as computed for 6,179 individual trees located on 250 FHM plots distributed across Virginia, Georgia, and Alabama: CC V = 0.5 π R 2 C L C D (4) w here π = 3.14159, R = crow n diameter / 2, H = total tree length, CL = H (crow n ratio) / 100, and CD = (crow n density / 100)(1 – dieback / 100)(1 – transparency / 100) Tree-level CCV Indicators Raw and Standardized Values Trees w ere then grouped by species. Mean CCV differed substantially by species, ranging from 697 cu ft (slash pine) to 6,422 cu ft (American beech), w ith coefficients of variation often exceeding 100%, thus demonstrating the need to standardize across species. Equation (1) w as used to obtain adjusted tree-level CCV values that w ere standardized, by species, to a mean of zero and a standard deviation of one. Indicators standardized in this manner reflect the deviation of all trees from their species mean (in ter ms of standard deviation units) and are invariant to the species distribution on the plot. At this point, it is valid to combine or compare data across species, since all tree-level values are now on the same basis w ith respect to species. Residualized and Standardized-Residualized Values The distributional properties of CCV in its raw (equation 4, figure 1a) and standardized (equation 1, figure 1b) for ms are show n in figure 1. Although standardization results in a mean of zero and standard deviation of one, note that standardization alone has little effect on skew ness. Skew ness is often improved by residualization w ith models, so the follow ing model designed to adjust CCV for tree size and stand density w as employed: CCV = b 0 + b 1 (dbh) + b 2 (BA) 12 0 50 mean = 1905 40 kurtosis = 0.913 20 10 dbh = tree diameter (inches) at breast height, BA = stand-level basal area per acre, and bi = regression parameters estimated from the data. Note that the residuals (equation 2, figure 1c) and standardized residuals (equation 3, figure 1d) associated w ith the model (equation 5) have substantially smaller skew ness coefficients and are more nor mally distributed than standardization alone (figure 1b). Reduction in skew ness is advantageous w hen performing statistical tests based on assumptions of nor mality. It also diminishes reliance on transformations and nonparametric statistical methods. 0 0 1000 2000 3000 4000 5000 me an = 16 11 sd = 2469 3 500 ske wne ss = 4 .6 66 3 000 ku rto sis = 36.28 8 15 00 10 00 Number of Trees 20 00 Number of Trees 4 000 mean = 0 .0 000 sd = 0.997 6 skewne ss = 4 .1 67 kurtosis = 28.67 1 1 2 800 0 5 percent r = ile0.46 600 0 400 0 200 0 -1 0 1 2 Fi gu re 3 . Cl as sifi cati on of p lo ts b a sed o n Me a n C om po si te C row n V ol um e a n d its sta n da rd ize d re si du a l (2 50 pl ots). 2 000 1 500 Conclusions 0 0 5000 10000 15000 2000 0 2500 0 3000 0 3500 0 400 00 -2 0 2 4 6 8 10 12 14 St andardiz ed C omposite C rown Volume 1b) Compos ite C rown Volume (St andardized) C omposite C olume rowm Volume 1a) C om posit e C row n V (R aw) (2) 35 00 (3) 4 000 me an = 0.00 00 mean = 0 .00 00 30 00 sd = 1577 3 500 sd = 0.997 6 25 00 ske wne ss = 3 .5 18 3 000 skewne ss = 3 .0 67 2 500 kurtosis = 26.91 5 Number of Trees Number of Trees Like raw indicator values, residualized indicators can be standardized for comparisons across species: ku rto sis = 39.52 3 20 00 15 00 10 00 2 000 1 500 1 000 5 00 500 0 0 - 10000 0 10000 20000 300 00 1c) C ompos ite C rown Residual Volume (R es idual) -5 0 5 10 15 1d) Compos ite SCtandardized rown Volume R es (Stidual andardized Residual) Rij = the residualized indicator for tree i w ithin species j, = the average for the residualized indicator for species j, and = the standard deviation for the residualized indicator for species j. 0 2 500 500 0 Standardized-Residualized Indicators Rj sj -1 2b) C ompos ite Crow n Volume (Standardized Res idual) Standardized R es idual 1 000 = the predicted value of the indicator for tree i w ithin spec ies j based = the standardized-residualized indicator for tree i w ithin species j, -2 2a) Compos ite Crow n Volume (R aw ) Composite Crown Volume 5 00 on the appropr iate model. Rij′ 7000 Standardized Residual The associated residualized indicator is then defined as where 6000 5 percenti le 25 00 = an indicator for tree i w ithin species j, and sj 40 0 Another method to adjust an indicator is to define the indicator as its residual from a model based on tree and/or stand conditions. Each tree is thus adjusted for its specific competitive situation, resulting in an indicator more suitable for detecting abnor malities because such adjustment identifies individuals that do not confor m to model predictions. Spec ifically, let Rij − Rj kurtosi s = 1.891 60 20 0 -2 Rij′ = 80 Figure 2. Stand-level distributions of Composite Crown Volume and its standardized residual (250 plots). (5) = the standard deviation for the non-standardized indicator for species j. Rij = Yij − Yˆij sd = 0.4659 skewness = 0.361 skewness = 1.0 06 30 where Residualized Indicators Yij Yˆij mean = -0.0139 10 0 sd = 1236 Composite Crown Volume Thresholds are important w hen assessing forest health because they separate the sampled population into categor ies of good and poor. Ideally, thresholds should be developed on a biological basis. Biological thresholds, the point at w hich a tree becomes noticeably stressed and begins to decline, are difficult to pinpoint. This requires extensive research designed to establish correlations betw een indicator variables and other signs of tree stress such as damage symptoms, reduced grow th, and prospective mortality. Threshold establishment is further complicated because thresholds are often species dependent, and because the effect of normal stand dynamics must be partitioned from the analysis. The ultimate goal in establishing any threshold is to identify signals that appear to be beyond the range of w hat is expected. Statistical thresholds have advantages and disadvantages w hen compared to biological thresholds. They are disadvantaged because they are somew hat arbitrary, and alw ays result in a set of observations designated as poor, even in the absence of a problem. On the other hand, they are easily established by isolating observations at the tails of statistical distributions, and they are quite useful for detecting spatial patterns, establishing empirical correlations, and measuring change over time. Since biological thresholds have yet to be developed, arbitrary thresholds for some indicators (especially visual crow n ratings) have been used in a variety of recent forest health reports. Such analyses have the virtue of simplicity, but contain no adjustment for species differences, other tree attr ibutes, or stand conditions. These types of adjustments and related analyses can be accomplished w ith statistical distributions and thresholds. This poster utilizes data from the Crow n Indicator to demonstrate how such adjustments are accomplished by standardization and residualization of indicator values. These techniques w ork for individual indicator variables, as w ell as composite values that may be derived from multiple indicator variables. N umber of Plots Thresholds and Statistical Distributions Standardization of indicators makes it feasible to produce plot- level indicators (by averaging tree- level values across all trees on each plot). Plot-level averages (across all species) for 250 plots w ere calculated from the raw and standardizedresidualized tree-level CCV values. Distributions of these stand-level means are show n in figures 2a and 2b. Note that the standardized residuals do not have a mean of 0 and standard dev iation of 1 (as seen w ith tree level-indicators). This is because standardization w as initially performed at the individual tree and species level, and spec ies distributions differ across the plots. As w ith the tree- level indicators, residualization reduces the s kew ness coefficients of the stand-level indicator. Further comparison of average raw stand-level values and their standardizedresidualized counterparts w as performed by classifying 250 plots into good and poor categor ies. For demonstration, a threshold for the poor class w as set at the low er 5-percentile of the statistical distribution. Each pair of raw and standardizedresidual values w as then plotted on a scatter diagram, w ith the threshold value indicated by reference lines (figure 3). Agreement betw een raw values and their standardized residuals is attained for all plots located in the upper right and low er left quadrants. The other tw o quadrants represent opposite classifications by the indicators. The raw indicator and its standardized-residual counterpart classify the same plots as poor only tw ice, confirming that the raw and adjusted values are measuring different aspects of crow n condition. The reason for differences in classification is based on the adjustment potential of the regression models and its effect on the creation of the standardized-residual indicators. When using raw CCV values, stands w ith high percentages of trees w ith small crow ns are classif ied as poor, even if crow ns are nor mally small for those particular species in those types of stands. How ever, when using the adjusted values, stands are classified as poor only if they have high percentages of trees at the low er end of their respective species statistical distributions after adjusting for tree and stand conditions. N umber of Plots Abstract. Recent forest health reports have focused on arbitrary thresholds for indicator values that are not adjusted for differences among species. To circumvent this difficulty, we show how indicators can be standardized, by species, to a mean of zero and variance of one. This enables direct comparison among species and permits tree-level indicators to be summed across species to yield plot-level (i.e., stand-level) indicators. We also demonstrate how to produce residualized indicators (which can also be standardized) that adjust for tree size and local stand conditions. Indicators based on crown volume are given as examples. Figure 1. Tree-level distributions of Composite Crown Volume and its adjusted counterparts (6,167 trees). FHM Posters home page | FHM 2002 Posters The utility of raw indicators can be extended by standardization across species. Analyses can be further enhanced w ith model residuals that adjust for the effect of tree size, stand density, and other parameters. Model residuals can also be enhanced by standardization. Combinations of these techniques result in a variety of analytical tools that can be tailored to address issues concerning forest health. The appropriateness of standardized, res idualized, or raw values depends on the circumstances. Standardized values are useful w hen compar ing or summing across individuals w ith different statistical distributions for the indicator of interest. Res idualized indicators are useful for detecting deviations from expected values. Raw values are useful for detecting unadjusted net change. Like raw indicator values, adjusted values can be correlated w ith other indicators (e.g., lichen diversity, or soil erosion) to establish biological thresholds. They can be correlated w ith other plot-based data or spatial over lays such as elevation, forest type, or physiographic class to deter mine if there are statistically significant differences betw een categories. They can be used for spatial analyses designed to detect clusters or gradients of unusually good or poor indicator values.