Appendix S2 Issues of discrete or continuous characterizations of functional traits

Background

Some traits are clearly characterized by continuous variation between species (and also within species), e.g. specific leaf area or seed size (Westoby 1998; Westoby et al. 2002). Other functional traits seem best characterized as discrete classes, e.g. growth form, dispersal mode and root symbiont strategy (Cornelissen et al. 2003). There is a long tradition of describing species with reference to growth forms (Raunkiaer 1934). Perhaps as a result, identification of discrete functional types has received more attention than continuous variation (Gitay & Noble 1997; Smith et al. 1997; Gitay et al. 1999). A recent shift has been to ask: can we do better than these traditional categories? This has emerged with the recognition of the great strategic variation within what were considered relatively homogeneous functional types (McIntyre et al. 1999; Weiher et al. 1999; Ackerly in press). There have been suggestions for how to construct functional types (Gitay & Noble 1997; Smith et al. 1997), but little discussion of when it is appropriate to regard them as discrete rather than continuous. One suggestion has been the use of fuzzy categories (Pillar 1999).

Issues of defining continua

Under one definition, a trait forms a continuum if any species’ probability of sprouting could take any value between 0 and 1, without prejudice. Under this definition, the continuum corresponds to the species model. The species model can potentially be compared to models with different numbers of groups of species, using information criterion methods (see next section).

Under a second definition, a trait forms a continuum if species probabilities are spread uniformly along the range 0-1. This corresponds to the null model simulations used in this paper. An observed distribution can be compared with such simulations using a goodness-of-fit test.
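The second definition can be illustrated with a small Monte Carlo sketch in Python. It is not the simulation code used in this paper: the binning of observed proportions, the chi-square-style discrepancy statistic, the example data and the function names (`proportion_histogram`, `simulate_uniform_null`, `monte_carlo_gof`) are all assumptions made for illustration.

```python
import random


def proportion_histogram(successes, trials, n_bins=5):
    """Count species whose observed proportion k/n falls in each equal-width bin on [0, 1]."""
    counts = [0] * n_bins
    for k, n in zip(successes, trials):
        # Integer arithmetic avoids rounding problems; k == n goes in the top bin.
        counts[min(k * n_bins // n, n_bins - 1)] += 1
    return counts


def simulate_uniform_null(trials, rng):
    """One null dataset: p ~ Uniform(0, 1) for each species, then n binomial trials."""
    simulated = []
    for n in trials:
        p = rng.random()
        simulated.append(sum(rng.random() < p for _ in range(n)))
    return simulated


def monte_carlo_gof(successes, trials, n_sims=2000, n_bins=5, seed=1):
    """Monte Carlo p-value for the uniform-continuum null (illustrative statistic)."""
    rng = random.Random(seed)
    sims = [proportion_histogram(simulate_uniform_null(trials, rng), trials, n_bins)
            for _ in range(n_sims)]
    # Expected bin counts under the null, estimated from the simulations themselves.
    expected = [sum(h[b] for h in sims) / n_sims for b in range(n_bins)]

    def discrepancy(hist):
        return sum((o - e) ** 2 / e for o, e in zip(hist, expected) if e > 0)

    observed = discrepancy(proportion_histogram(successes, trials, n_bins))
    n_extreme = sum(discrepancy(h) >= observed for h in sims)
    return (n_extreme + 1) / (n_sims + 1)


# Hypothetical data: 40 species, 10 plants each, all-or-nothing sprouting.
k_obs = [0] * 20 + [10] * 20
n_obs = [10] * 40
p_value = monte_carlo_gof(k_obs, n_obs)
```

With these made-up data the p-value is small, because the observed proportions sit only at 0 and 1 whereas the simulated null datasets spread across the whole range. The simulation uses each species' own sample size, so unequal sample sizes are carried through to the null distribution.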
Such a test would allow the continuum assumption to be accepted or rejected in favour of some form of clumping, but it would not determine whether a particular alternative model, e.g. two groups, was appropriate.

A serious problem is that a continuum could also reasonably be defined as species probabilities spread randomly or evenly, but not across the full range 0-1. Such a continuum would misleadingly test as clumped against either of the 0-1 definitions. Without some a priori basis for setting the edges of the continuum at locations other than 0 and 1, there seems to be no easy solution to this problem.

Issues of model fitting

Standard hypothesis-testing approaches raise two issues for mixture modelling. The first is technical. When comparing two models that differ in the number of component groups, say two and three groups, the null hypothesis is that the probability of belonging to the third group is zero. In this case, the hypothesized value lies on the boundary of the range of possible values (0-1), so the usual regularity conditions fail and deviances are not well approximated by χ². This failure of regularity may possibly be sidestepped by the use of resampling approaches, but the matter is currently unresolved (McLachlan & Peel 2000: p. 185).

The second issue is philosophical. Hypothesis-testing techniques are not designed to find the model best approximating reality. Rather, they test for significance of adding individual parameters and often lead to over-parameterized models (Burnham & Anderson 1998). Techniques based on parsimony and a trade-off between bias and variance are preferred for model selection. Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC) are such methods. They compare progressively more complicated models by penalizing them for the number of parameters used (Burnham & Anderson 1998; McLachlan & Peel 2000). These criteria are subject to the same regularity assumptions as likelihood ratio tests (McLachlan & Peel 2000). Nevertheless, information criteria are still used for identifying the numbers of components.
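For reference, the two criteria in their standard textbook form are written out below. The symbols used ($\hat{L}$, $k$, $n$, $g$, $S$) are notation introduced here for illustration, and the parameter counts given afterwards are an assumption based on the models described in this appendix rather than counts stated by the authors.

```latex
\begin{align}
\mathrm{AIC} &= -2\ln\hat{L} + 2k, \\
\mathrm{BIC} &= -2\ln\hat{L} + k\ln n,
\end{align}
```

Here $\hat{L}$ is the maximized likelihood, $k$ the number of estimated parameters and $n$ the number of observations; lower values indicate the preferred model. Assuming a binomial mixture with $g$ groups has $g$ sprouting probabilities and $g-1$ free mixing proportions, $k = 2g - 1$, while a species model with $S$ species has $k = S$. What should count as $n$ for BIC (species or individual plants) is a choice this appendix does not specify.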
The consequences of violating the regularity assumptions are unclear (McLachlan & Peel 2000).

We have applied AIC, BIC and variants (hereafter referred to collectively as AIC) to different datasets, comparing the species model (the first continuum definition above) with binomial-mixture models having different numbers (1-4) of species groups. Information criteria were previously useful in the analysis of a single clip-and-burn experiment with small and equal sample sizes (Vesk 2002). However, for some of the literature-compilation datasets in the present paper, results were less promising (P.A. Vesk unpublished results). In these investigations, AIC favoured larger numbers of component groups than we thought reasonable. In these datasets there is group structure arising from groups of studies and groups of species with differing sample sizes, as well as whatever group structure may arise from underlying biology. AIC may have been strictly correct in identifying the datasets as better described by numerous groups than by simpler 2-, 3- or 4-group models, but the gain in explanatory power did not seem worthwhile to us when weighed against the degrees of freedom needed to parameterize the distributions.

Definitions of mixtures of categories

Defining discrete types also raises conceptual issues. In the binomial mixture models we have examined here, the single binomial parameter defines both the location and the spread of the data for a category of species. So if a group of species (a functional type) all have the same probability of sprouting, then the spread within the group is predefined by that mean probability. Such a single-parameter definition of a distribution could also apply to count data (Poisson distribution). This would not be the case for (say) specific leaf area, or seed mass, or any number of other continuous traits. For these, a parameter for spread within the group (a standard deviation or similar) would have to be defined, as well as a parameter for the mean.
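To make the binomial case concrete, the following is a minimal Python sketch of fitting mixtures with 1 to 4 groups and scoring them with AIC. The fitting method used for the analyses is not described in this appendix, so the use of the EM algorithm here is an assumption, as are the starting values, the parameter count of 2g - 1, the example data and all function names.

```python
import math


def log_binom_pmf(k, n, p):
    """Log binomial probability of k sprouters among n plants."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep the logarithms finite
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def component_logs(k, n, weights, probs):
    """Log of weight times binomial probability, one entry per group."""
    return [math.log(w) + log_binom_pmf(k, n, p) for w, p in zip(weights, probs)]


def fit_binomial_mixture(successes, trials, g, n_iter=500):
    """EM fit of a g-group binomial mixture; returns (weights, probs, loglik)."""
    weights = [1.0 / g] * g
    probs = [(j + 0.5) / g for j in range(g)]  # evenly spaced starting values
    data = list(zip(successes, trials))
    for _ in range(n_iter):
        # E-step: posterior probability that each species belongs to each group.
        resp = []
        for k, n in data:
            logs = component_logs(k, n, weights, probs)
            total = log_sum_exp(logs)
            resp.append([math.exp(v - total) for v in logs])
        # M-step: one probability per group sets both its mean and its spread.
        for j in range(g):
            r_sum = sum(r[j] for r in resp)
            weights[j] = max(r_sum / len(data), 1e-12)
            hits = sum(r[j] * k for r, (k, n) in zip(resp, data))
            plants = sum(r[j] * n for r, (k, n) in zip(resp, data))
            probs[j] = hits / max(plants, 1e-12)
    loglik = sum(log_sum_exp(component_logs(k, n, weights, probs)) for k, n in data)
    return weights, probs, loglik


def aic(loglik, n_params):
    return -2.0 * loglik + 2.0 * n_params


# Hypothetical data: two clear groups of 20 species, 10 plants per species.
k_obs = [1] * 20 + [9] * 20
n_obs = [10] * 40
scores = {}
for g in range(1, 5):
    _, _, loglik = fit_binomial_mixture(k_obs, n_obs, g)
    scores[g] = aic(loglik, 2 * g - 1)  # g probabilities + (g - 1) free weights
best_g = min(scores, key=scores.get)
```

With these made-up data the two-group model has the lowest AIC. Nothing in the fit controls within-group spread separately: for a group with sprouting probability p and n plants per species, the variance of the count is n p (1 - p), so it is fixed once p is estimated.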
For such continuous traits, a plausible model for a continuum could be a single functional type with a wide spread. Statistically, as spreads within types become wider, a 3-group model becomes less distinguishable from a 2-group model, and so forth. Conceptually, having groups only makes sense if the spread within groups is not too wide compared with the separation between the central values. But there is no agreement on how narrow the spread has to be for it to make sense to identify groups. To construct a test, one would first have to agree on some objective criterion for setting a maximum acceptable spread within groups.

REFERENCES

Ackerly, D.D. (in press) Functional strategies of chaparral shrubs in relation to seasonal water stress and disturbance. Ecology.
Burnham, K.P. & Anderson, D.R. (1998) Model Selection and Inference: A Practical Information-Theoretic Approach. Springer-Verlag, New York.
Cornelissen, J.H.C., Lavorel, S., Garnier, E., Díaz, S., Buchmann, N., Gurvich, D.E., Reich, P.B., ter Steege, H., Morgan, H.D., van der Heijden, M.G.A., Pausas, J.G., & Poorter, H. (2003) A handbook of protocols for standardised and easy measurement of plant functional traits worldwide. Australian Journal of Botany, 51, 335-380.
Gitay, H. & Noble, I.R. (1997) What are functional types and how should we seek them? In Plant Functional Types: Their Relevance to Ecosystem Properties and Global Change (eds T.M. Smith, H.H. Shugart & F.I. Woodward), pp. 3-19. Cambridge University Press, Cambridge, UK.
Gitay, H., Noble, I.R., & Connell, J.H. (1999) Deriving functional types for rainforest trees. Journal of Vegetation Science, 10, 641-650.
McIntyre, S., Lavorel, S., Landsberg, J., & Forbes, T.D.A. (1999) Disturbance response in vegetation - towards a global perspective on functional traits. Journal of Vegetation Science, 10, 621-630.
McLachlan, G.J. & Peel, D. (2000) Finite Mixture Models. Wiley-Interscience, New York.
Pillar, V.D. (1999) On the identification of optimal plant functional types. Journal of Vegetation Science, 10, 631-640.
Raunkiaer, C. (1934) The Life Forms of Plants and Statistical Geography. Clarendon Press, Oxford.
Smith, T.M., Shugart, H.H., & Woodward, F.I., eds. (1997) Plant Functional Types: Their Relevance to Ecosystem Properties and Global Change. Cambridge University Press, Cambridge, UK.
Vesk, P.A. (2002) Plant functional types, grazing and fire in the rangelands. PhD thesis, Macquarie University, Sydney.
Weiher, E., van der Werf, A., Thompson, K., Roderick, M.L., Garnier, E., & Eriksson, O. (1999) Challenging Theophrastus: a common core list of plant traits for functional ecology. Journal of Vegetation Science, 10, 609-620.
Westoby, M. (1998) A leaf-height-seed (LHS) plant ecology strategy scheme. Plant and Soil, 199, 213-227.
Westoby, M., Falster, D.S., Moles, A.T., Vesk, P.A., & Wright, I.J. (2002) Plant ecological strategies: some leading dimensions of variation between species. Annual Review of Ecology and Systematics, 33, 125-159.