Fleming, 2014, plant ecology research lesson, S10 Lab 6

Lab 6: Indirect Ordination Methods and Their Interpretation

NOTE: This lab contains many terms and concepts that are likely unfamiliar to beginning ecologists. I recommend consulting an experienced colleague to discuss these topics. Meanwhile, a helpful website with a glossary of terms, examples, and other useful conceptual information can be found here: http://ordination.okstate.edu/ (last accessed 3/31/15). Your instructor will walk you through some examples before you explore data on your own.

General Introduction to Ordination

Ordination means, simply, putting things into some rational order. In practice, it includes many methods aimed at finding patterns in very messy data composed of species data from sampled locations (a.k.a. “plots” to ecologists). Often, when environmental data are available, a second data set of sample plots and environmental data is used for ordination as well.

The process of indirect gradient analysis/ordination has evolved from simple graphical and geometric methods to methods based on certain assumptions about how species respond to environmental factors (detrended correspondence analysis, or DCA, the method we’ll learn today), and finally to methods that make few assumptions about the data (NMS, or nonmetric multidimensional scaling) and direct gradient analysis/ordination, both of which we will use later in the class.

The great plant ecologist Robert Whittaker developed the direct gradient analysis/ordination technique based on a keen understanding of environmental gradients. This analytical approach led to concepts such as the climate mosaic diagram and to wide acceptance that vegetation patterns are largely gradational. To find and statistically explain patterns that can vary across n dimensions, multivariate methods of data analysis examine data in ways that can reveal associations between variation in vegetation and environmental variables.
Note that multivariate methods of analysis compare the degree of difference between members of a group simultaneously, based on measured values of individual members of the group. Modern indirect gradient analyses are concerned only with species values in sample plots, and we can graph the sample locations or species relative to each other. The methods vary, but in general, the more similar two samples or species are to one another, the closer they will appear in the compressed ordination space representing the original vegetation space. The challenge of using indirect gradient analysis is to find the environmental variables most likely to underlie the pattern.

Modern direct gradient analysis plots samples in two or more dimensions that represent environmental factors. Whittaker’s work demonstrated that moisture and factors associated with elevation often reveal rational vegetation patterns. Recognizing that environmental variables often covary strongly, modern ecologists often eliminate certain covarying factors to better pinpoint environmental drivers of plant community composition.

There are several methods of indirect ordination. In this exercise, we will explore one indirect method called DCA and attempt to apply post-facto methods of interpretation to sample data. In a future lab, we’ll learn one direct gradient analysis method.

Ordination Diagrams

The results of indirect ordinations are usually graphed against axes that have been derived from the data. For example, sample locations (plots, or stands) can be arrayed in a synthetic “ordination space”. Note that ordination does not label the axes in graphs; the ecologist must infer what the ordination axes mean. More on this later. Alternatively, the species positions in the space can be plotted. To visualize patterns, one could graph such values as a species cover value (above) in this space, or diversity, etc.
These graphs may help in the interpretation of (a) relationships between the entities graphed and/or (b) what the axes actually represent.

One good use of DCA is to look at species change (e.g. succession) over time. DCA tells us a lot about beta-diversity, or species turnover. The closer points are together in ordination space, the more similar their community composition. This graph shows four permanent plots from Mount St. Helens as they change over time (del Moral, R. 2007. J. Veg. Sci. 18: 479-488). Note how Groups I and J have converged in succession; Group H is trending away from the other two groups over time, indicating a different community composition than Groups I and J.

Statistical Analyses

Environmental data could be used as independent variables to graph against the ordination axes (dependent variables) in an effort to find correlations. These multiple correlations of species patterns to environmental factors can be performed with standard statistics, or one could use CCA (see Lab 7).

Typical Questions that can be Addressed with Indirect Ordination

What factors control the distribution of species and communities in space and in time?
What secondary (subtle) factors affect community distributions, especially those that might not be obvious in the field?
How do attributes of the plant community, such as its structure, vary with respect to environmental factors? For example, can you predict the relative diversity of a sample if you know the moisture and temperature regimes?
Over time, does the vegetation tend to converge or diverge from its initial conditions?

Detrended Correspondence Analysis (DCA)

DCA is an ordination method that can “handle” many zero values in a data matrix and downweights rare species so that the broad patterns in a community can be clearly described. DCA evolved from other methods that assume that all species have a normal distribution along a single environmental gradient.
Those assumptions were appealing because both species and sample ordinations emerged from this analysis. However, these methods badly distort relationships between species and plots and exaggerate the uniqueness of plots with rare species. DCA attempts to correct some of these problems by removing some of the distortions of reduced dimensions. This correction is done by dividing the first axis into segments and then rescaling along the axis so that the variance of species scores is equalized across segments. Rescaling tends to equalize the species turnover rate along the axis, which leads to the most useful feature of DCA: the axes on a DCA graph are good estimates of beta diversity, i.e. the differences in community composition along gradients, or between communities in the same landscape.

DCA, though useful, also has some unfortunate properties and makes several assumptions about the distribution of samples and species in environmental space. We will emphasize the DCA method in this class due to its ease of use and intuitive nature. However, those conducting detailed studies or wishing to publish their research should use an indirect ordination method called nonmetric multidimensional scaling (NMS). It is more complicated than DCA, and I would be happy to talk with you about it outside of class.

How DCA works

First, the matrix is analyzed to produce eigenvectors: values of samples on an ordination axis, which represent response to environmental factors. This method assumes that there is only one strong gradient and that the species respond with hump-shaped curves that approximate a normal distribution, with the mean positions of the species spread out over the gradient. This method relates to Whittaker’s method of gradient analysis. In addition to forming the basis of DCA, calculating eigenvectors is also integral to CCA (next lab).
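To make the eigenvector step more concrete, here is a minimal sketch of reciprocal averaging, one classical way to obtain the first correspondence analysis axis before any detrending. This handout does not describe how PC-ORD computes its axes, so treat the function name `reciprocal_averaging`, the iteration count, and the tiny data matrix as illustrative assumptions rather than the program's actual procedure.

```python
def reciprocal_averaging(matrix, n_iter=200):
    """First correspondence-analysis axis by reciprocal averaging.

    matrix: rows are samples, columns are species (no empty rows or columns).
    Returns (eigenvalue, sample_scores, species_scores).
    """
    n_cols = len(matrix[0])
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(row[j] for row in matrix) for j in range(n_cols)]
    grand = sum(row_tot)
    species = [float(j) for j in range(n_cols)]  # arbitrary starting scores
    eigenvalue, samples = 0.0, [0.0] * len(matrix)
    for _ in range(n_iter):
        # Sample score = abundance-weighted mean of the species scores.
        samples = [sum(v * s for v, s in zip(row, species)) / t
                   for row, t in zip(matrix, row_tot)]
        # Center (weighted by sample totals) to drop the trivial constant axis.
        centre = sum(t * x for t, x in zip(row_tot, samples)) / grand
        samples = [x - centre for x in samples]
        # The weighted spread left after one round trip estimates the eigenvalue.
        eigenvalue = (sum(t * x * x for t, x in zip(row_tot, samples)) / grand) ** 0.5
        samples = [x / eigenvalue for x in samples]
        # Species score = abundance-weighted mean of the sample scores.
        species = [sum(row[j] * x for row, x in zip(matrix, samples)) / col_tot[j]
                   for j in range(n_cols)]
    return eigenvalue, samples, species


# Hypothetical presence/absence data: plots 1 and 2 share species, plot 3 shares none.
plots = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 1]]
eigenvalue, sample_scores, species_scores = reciprocal_averaging(plots)
```

In this toy case the two plots with identical species receive identical scores, the third plot falls on the opposite side of the axis, and the eigenvalue approaches 1, which is what correspondence analysis gives when groups of plots share no species at all.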
In effect, axes are rotated in species space and in sample space so that the correspondence between the two is maximized. Two linear equations result, and after extracting eigenvalues (each a measure of the variation associated with a particular dimension), these manipulations result in two matrices. One matrix contains the ordination of species, the other contains the ordination of samples, and both can be graphed as you saw above.

Ordination axes explain some of the variance in the system, and often correspond to some environmental factor(s). We can calculate this “variance explained” by dividing the axis eigenvalue by the total variance in the system (ex: total variance = 0.36 and eigenvalue = 0.16, so the variance explained by that axis is 0.16 / 0.36 = 44.4%). This number offers a basis for comparing communities, much as H’ summarizes community diversity. If “variance explained” is high, we may have sampled well, but most likely our data are suspect because the system is overly “tidy”, or controlled by 1-2 factors. If “variance explained” is low, we may have sampled poorly, but most likely the system responds strongly to many factors and is thus quite “messy”. Low variance explained often leads ecologists to scratch their heads and wonder how to ask better questions, formulate more direct hypotheses, and design better experiments.

The position of samples is detrended as follows: the primary axis is cut into an arbitrary number of segments (n=26 is the default) and the ordination scores on the next higher axis are adjusted so that their mean within each segment is centered at 0.0. Visualize cutting the ordination into strips and sliding the strips up or down so that each has the same mean, resulting in detrending. Then the axis is rescaled. The goal is to have each species turn over at the same rate.
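The variance-explained arithmetic and the strip-sliding picture of detrending can be sketched in a few lines. This is a simplified illustration that assumes equal-width segments and plain within-segment centering; the detrending in actual DCA software is more elaborate, and the function names and the arch-shaped example scores are hypothetical.

```python
from statistics import mean


def variance_explained(eigenvalue, total_variance):
    """Fraction of the total variance attributed to one ordination axis."""
    return eigenvalue / total_variance


def detrend_by_segments(axis1, axis2, n_segments=26):
    """Center axis-2 scores at 0.0 within equal-width segments of axis 1."""
    lo, hi = min(axis1), max(axis1)
    width = (hi - lo) / n_segments or 1.0  # fall back if all scores are equal
    # Assign each sample to a segment; the maximum score joins the last segment.
    segment = [min(int((x - lo) / width), n_segments - 1) for x in axis1]
    detrended = list(axis2)
    for s in set(segment):
        members = [i for i, k in enumerate(segment) if k == s]
        strip_mean = mean(axis2[i] for i in members)
        for i in members:
            detrended[i] = axis2[i] - strip_mean  # slide the strip up or down
    return detrended


# The example from the text: eigenvalue 0.16 out of a total variance of 0.36.
fraction = variance_explained(0.16, 0.36)

# Hypothetical arch-shaped scores: axis 2 rises and falls along axis 1.
axis1 = [0, 1, 2, 3, 4, 5]
axis2 = [0, 2.5, 4, 4, 2.5, 0]
flattened = detrend_by_segments(axis1, axis2, n_segments=3)
```

With three segments, each pair of neighboring samples forms one strip, and after centering every strip has a mean of 0.0 on the second axis, so the arch no longer shows up as a trend across strips.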
Put another way, the goal of rescaling is for the distance between points in an ordination graph to mean the same thing in different parts of the graph, or between different ordination graphs. The species ordination is rescaled so that each segment is expanded or contracted until the mean variation of species within each segment is the same (called the pooled within-segment standard deviation). The width of each segment is adjusted so that each half change in species composition spans about 100 DCA units on an ordination axis. In other words, if two sample points in an ordination graph are separated by ~100 DCA units, about ½ of the species between them are different (ex: species A, B, C, D comprise one sample point, while species A, B, F, G, or some other combination retaining two of the original four species, are present in the second sample point).

What you see in the output after analysis

You will see an eigenvalue, which is an estimate of the variation represented by the axis.
You will, for each axis, see a value representing the length of segments.

What to look at and report

Limit your interpretation to the “significant” axes (I use NMS to suggest that number).
Justification for the use of DCA (to observe broad patterns and conjecture what may cause them) rather than another method.
How much variation is on the axes? This can be estimated by correlating relative Euclidean distances in species space with the same distances in the ordination space.
How you present the data (ex: you display axes 1 and 2, 1 and 3, or 2 and 3) and why you show it this way.
Interpretive aids (highlight groups, correlations with axes, species patterns) for environmental factors, etc.

Warnings

This method is best considered a way to summarize information and to generate hypotheses. It is not very good at hypothesis testing because detrending can obscure relationships. Interpretations can be changed simply by making different analysis choices. Complex gradients are not amenable to this method.
Several authors suggest that the only valid use of DCA is to determine gradient lengths, under restricted situations.

DCA is an ordination method that attempts to minimize distortion caused by reducing the dimensionality (e.g. gradients and species’ responses to those gradients) of a complex ecosystem. Think of shining a light through a Ferris wheel and projecting its shadow onto a giant screen. By reducing the Ferris wheel from three dimensions to two, we distort some of the real relationships of the Ferris wheel components. Some apparent distortions of species’ relationships resulting from ordination are actually real responses by species to gradients in their environment. Said yet another way, the way plants perceive the gradients of their environment is different from how we can measure or portray those gradients.

Activity:

Working in small groups, use computers with PC-ORD installed. A sample data file, created in MS Excel, will be used. It is called “foods.xls” and describes the relative frequency with which groups of students from certain European countries eat particular foods. You will have an opportunity to run several analyses, create graphs, and interpret results. The menus are fairly simple and will be explained during lab.

Practice running and interpreting DCA ordinations. In all cases, print a graph of your ordination and report eigenvalues and variance explained for ordination axes 1 and 2. Try the following different options to see the effects of modifying the original data set:

1. Run DCA on unaltered data. I will walk you through this example.
2. Data transformation is commonly used by ecologists when we wish to make data conform to a normal distribution, as required by some hypothesis tests. Try transforming your data by using the square root of all raw values. Run DCA and graph the output. Have relationships or variance explained changed? If so, how?
3. Use the unaltered data and remove one country (you choose). Run DCA and graph the output. Have relationships or variance explained changed? If so, how?
4. Use the unaltered data and remove three countries (you choose). Run DCA and graph the output. Have relationships or variance explained changed? If so, how?
5. Try a transformation of raw values AND removing one or three countries. Run DCA and graph the output. Have relationships or variance explained changed? If so, how?
6. Imagine you are a country of one. Add a row in the Excel file that represents you and fill in each cell with a value from 0-100 that represents your preference for that food. Assume 0 = “I hate it” and 100 = “if I could only ever eat this for the rest of my life I’d step over my own mother to do it.” IMPORTANT: remember to change the number of columns in your Excel file before importing to PC-ORD! Run DCA and graph the output. Have relationships or variance explained changed? If so, how?

Now that you have some familiarity with DCA, use the file “sample_species_data.xls” to find the broad patterns in species and plots. Try modifying data as you did for the food example and see how your interpretation of species and plots sampled may change.
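If you would like to see what the data modifications in the activity do to the numbers before running them through PC-ORD, the sketch below applies a square-root transformation and removes a row from a small matrix. The values and the labels A, B, and C are made up for illustration and are not the contents of “foods.xls”; the helper names are likewise hypothetical, and in lab you would normally make these changes in Excel or through the PC-ORD menus.

```python
import math


def sqrt_transform(matrix):
    """Replace every raw value with its square root."""
    return [[math.sqrt(value) for value in row] for row in matrix]


def drop_rows(labels, matrix, to_remove):
    """Remove the rows whose labels appear in to_remove."""
    kept = [(label, row) for label, row in zip(labels, matrix)
            if label not in to_remove]
    return [label for label, _ in kept], [row for _, row in kept]


# Hypothetical values, NOT the contents of foods.xls.
countries = ["A", "B", "C"]
raw = [[0, 4, 100],
       [1, 9, 64],
       [16, 25, 36]]

transformed = sqrt_transform(raw)                     # option 2 in the activity
labels, reduced = drop_rows(countries, raw, {"B"})    # option 3 in the activity
```

In the first row, the largest raw value is 25 times the middle value (100 vs. 4) but only 5 times as large after transformation (10 vs. 2), which is why a square-root transformation reduces the influence of the most dominant items on the ordination.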