Vegetation Analysis Final Report Ordination, Cluster Analysis, and Indicator Species Analysis David Golowo, Jr. December 5, 2005 TA: Lee Ann Jacobson Fundamentals of Ecology – BIOL 112 Introduction Often in plant ecology, species of interest are grouped for proper analysis. One method used to achieve proper analysis is ordination. According to Gauch, ordination primarily endeavors to represent sample and species relationships as faithfully as possible in a low-dimensional space (Gauch 1982). UNC Ecology professor R.K. Peet (and professor of this course) goes on to state that ordination itself can assist with subjective classifications (Peet 1980). Ordination is thus described as a technique used in plant ecology that enables one to group multiple objects of interest. These objects may include species, samples, plots, quadrats, etc. The objects are ordered along axes according to their relative resemblances. So, a main goal in using ordination is to compare many objects that are related and then place the most similar of those objects closer together in a 2 or 3-dimensional space. Area of Study The tree species studied in this work are species found in North Carolinian forests. Most of these species like the hickory, maple, white oak, etc are species native to North Carolina and are commonly found in the Duke Forest, the Blue Ridge Mountains, and the Croatan National Forest regions that were visited during the course of the Fall 2005 semester. These regions provide an abundance of species and an environment suitable for vegetational analysis. The ordination method used throughout this work is the PCOrd method the method was NMS (nonmetric multidimensional scaling) and PCord was the software used. Methods The ordination method used throughout this work is the PCOrd method. Designed by McCune, this program is a program that performs multivariate analysis of ecological data. It offers many ordination and classification techniques that may not be available in many statistical packages. It has newer features contains software for better analysis (eg. it has autopilot mode for NMS), it has better graphic (eg it allows the plotting of scree plots in NMS), has newer features like the edit matrix spreadsheet and enhanced graphics. The laboratory experiment was performed in three parts. In the first part, an ordination analysis was run and the results were interpreted. In the second part, groups within the data were identified and the ordination analysis was interpreted using the groups. In the last part, indicator species analysis was used to identify the species the define the groups within the data. Analyses In the first part of the lab, an ordination was run and the data results were interpreted. The PCOrd file was opened and the tree data file was then opened. In the tree data file, the first Main Matrix file opened was the TreeLong.wk1 file don’t need this much detail (can leave out file names, etc). Next, the step-down ordination was run. This was achieved by selecting NMS from the Ordination pull-down menu. In NMS, Autopilot was left unchecked; the Distance Measure selected was Sorensen (Bray-Curtis); the Output options selected were Write final configuration, Varimax rotation, Run log, and Plot stress vs. iteration; and the Parameter Setup features were as follows: Number of axes k, 6; Number of runs with real data, 20; Stability criterion, 0.0005; Iterations to evaluate stability, 20; Maximum number of iterations, 400; the Step down in dimensionality box was checked; and the Random numbers button was selected. Thereafter, the OK button was clicked. In the window, NMS Random Numbers, the following were selected: User Suppliers Seed > a random number was typed in > and OK clicked. A descriptive title was chosen for the results. The title chosen was TreeLong_NMSstep > OK clicked. After this, the ordination was run and three smaller windows namely the Main- TreeLong.wk1, GraphGraphrow.gph, and Result-Result.txt windows appeared in the main PCOrd window. The files were then examined. Stress, instability, and the scree plot were three key components that were examined. Stress measures how far the data in the ordination space diverged from the original data. Since the data are rearranged in ordination space over many iterations, the stress reaches a point of stabilization or until the maximum number of iterations has been reached. If the stress value is less than 5, then there is a very good representation of the data. This, however, is not frequently achieved; 5 – 10 is a good representation; 10 – 20 is a good representation but may give misleading results; and stress above 20 most likely gives misleading results. Instability is the amount of stress that continues to change with the number of increasing iterations. When the analysis reaches a solution, the stress should reach a state of stability. At this point, the stress is low. An acceptable instability is 10E-3. The scree plot shows the decrease in stress as each axis is added to the analysis. The focal NMS was then run. All but the Number of axes k remained the same. The number of axes was changed from 6 to 3. The new run was made and the result was saved as Week1Results.txt Scree plots were then generated. TreeLong was chosen as a second matrix and overlaid in the first plot.. All results will be shown and their interpretations will be given in later in the report. In the second part, groups were identified within the data and the ordination analysis was interpreted using groups. The sets include the following: TreeLong.wk1 was opened as the main matrix and as the second matrix. In the menu bar, Groups pull-down menu was opened and Cluster Analysis was selected. The following settings were then chosen: Set beta, -0.25; Set Group membership level, 6; Set Group variable name, Group; the caption To Write all higher level groupings was checked; the results was given the descriptive name TreeLongWeek2.wk2 and OK was clicked. At the completion of this analysis, Temp2.wk1 was chosen a the new second matrix and it was saved as a results.txt file to be used later. The descriptive name given it was TreeLong_6Groups.wk1. These results were saved and the second matrix was closed as well. Back in PCOrd, EnvLong.wk1 was opened as a second matrix and the above steps were repeated to add grouping levels to the environmental data file. These new groupings allow each of them to be used as a second matrix to display the ordination with the groups displayed in the biplot. From this point, a dendogram was generated from the Graph menu. The grouping level was changed or adjusted by choosing Select Grouping Variable from the Groups menu. This dendogram was also saved. Plots with extreme values were identified by the Outlier analysis. This was achieved by selecting Outlier Analysis from the Summary menu. For plots with extreme environmental variables, such plots were identified in the EnvLong file. Before removing the plots that were indicated as outliers, care was first taken to examine the ordination graph again to see if the plots consistently fell on the boundary of the ordination. The results were compared, the modifications were made be specific – what modifications did you make? Did you remove plots? Which ones?, and the results were saved. In order to determine the most biologically appropriate grouping level, the groups were identified qualitatively by examining them in biplots with the ordination results from the first part of the study (we didn’t do an experiment). TreeLong_6Groups from Part I was opened as the second matrix and the graph file that contained the first graphs were also opened. By opening the Graph Ordination and overlaying the EnvLong_6Groups (thus forming a new graph) information was then added to TreeLong_6Groups for better interpretation. In the third part, indicator species analysis was used to identify the species that define the groups within the data. Key components for analysis in this part are indicator species analysis, abundance, and frequency. Indicator species analysis combines species abundance and frequency to calculate the indicator value for each species. Abundance (or fidelity) is the percentage of the abundance for a species that occurs within a particular group. Frequency ( or constancy ) measures the percentage of plots within a group that contains a particular species. This section is divided into two parts: Step 1: TreeLong_6Groups was opened as a second matrix. In the Groups submenu, Indicator Species Analysis was selected. These classes were defined: Select Group 3, Monte Carlo Test was checked, a random number was typed as the supplier seed, 1000 runs was entered, and the result was save as TreeLong_indicators_3Groups.txt. The same process was repeated for Groups 4, 5, and 6. In order to determine the groups to use for the dataset, average p-values and indicators were analyzed from each group. Each group was first imported into Excel. The group was located using the import tab. The row where the p-values begun was selected, the Delimited button and tab were selected, and OK was clicked. This lead to the display of the p-values, standard deviation, and all essential values in the Spreadsheet. The pvalue data was then sorted according to size and the average was taken. The number of indicators below the threshold value of 0.05 was selected. This process was repeated for each of the grouping level. The grouping level with both the highest number of significant indicators and the lowest average p-value was selected and used for the rest of the analysis. The selected group was 6Groups why? Explain how you made this decision; a new Spreadsheet was then opened and the entire 6Group data was imported into the new Spreadsheet without changing any settings. Worksheets for Abundance, Frequency, and Indicator values were then inserted and the results were analyzed. Good detail, but too much! Would be more useful to use the space to explain what each method does and why it is used, rather than to list the specific steps Results More useful to start results section with text to explain results before introducing graphs and tables Part I NewTreeLong1_NMSstep Axis 2 QUAL CACR ULAL OXAR ACRU QUPR Axis 1 Figure 1 Axis 1 x Axis 2 LIST NewTreeLong1_NMSstep FAGR LITU CACR ULAL OXAR Axis 3 QUAL JUVI QUST Axis 1 Figure 2: Axis 1 x Axis 3 LIST NewTreeLong1_NMSstep FAGR LITU Axis 3 ACRU QUPR JUVI QUST Axis 2 Figure 3: Axis 2 x Axis 3 Figures 1 – 3 Biplots of species in NMS ordination space. Fig. 1 and 3 show the separation along Axis 2 between hardwood species such as the Red Maple, Acer rubrum; Sweet-gum, Liquidambar styraciflua; the post oak, Quercus rubra, etc. Fig. 2 shows similar separation among hardwood trees along Axes 1 and 3. Should explain and interpret patterns you see Part II ClusterSetUp2 Distance (Objective Function) 1.6E-02 5.1E+00 100 75 1E+01 1.5E+01 2E+01 Information Remaining (%) 00001 PSP37 00018 00021 00574 00004 00008 00014 00002 00005 00007 00509 00520 00016 00024 00023 00033 00010 00031 00020 00042 00019 00069 00517 PSP36 00555 00581 00598 00589 00571 00579 00009 00012 00011 00015 00017 00022 00067 00618 00513 00514 00620 00537 00619 00501 00504 00524 00502 PSP88 PSP86 00508 PSP87 00617 PSP35 PSP34 00081 00606 00590 00510 00596 00512 00511 00607 00621 00608 00609 00003 00612 00616 00611 00614 00615 00622 00624 00575 00593 00582 00013 00029 PSP44 PSP61 00503 00507 00025 00026 00583 00584 00585 00032 00505 00506 00625 PSP10 00027 00587 00518 00588 00602 00515 PSP43 00028 00516 00030 00610 00613 00623 50 25 0 Figure 4. Clustering Dendogram Part III Column 6 27 28 41 42 48 51 57 58 59 60 29 21 12 56 31 54 55 10 34 8 26 35 44 24 5 38 52 20 18 16 45 47 3 2 25 15 19 13 30 49 50 17 1 33 11 23 Maxgrp CACR JUVI LIST QUAL QUCO QUPR QUST Group6 Group5 Group4 Group2 LITU FAGR CAPA ULRU MORU ULAL ULAM CAOL OXAR CACO JUNI PITA QUMA ILDE BENI PLOC QUVE DIVI CRUN COFL QUMI QUPH ACSA ACRU ILOP CEOC CRAT CATO MATR QURU QUSH CRMA ACNE OSVI CAOV ILAM Value (IV) 3 1 3 1 36 36 1 36 36 36 3 3 3 36 3 3 3 3 1 36 3 3 1 36 3 3 3 1 36 36 1 3 3 3 36 3 3 1 1 3 3 3 3 3 3 3 36 61.8 69.8 70.4 68.6 38.9 98.9 57.7 63.8 72.1 76.3 60 54.2 54.8 18.2 37.5 47.6 39.4 20 39.9 42.6 17.5 17.1 28.9 17 20.9 12.5 10 37.9 26.6 8 42.5 11.9 16.7 21 45.1 19.2 13.4 9.1 39.9 7.5 35.6 7.5 8 7.5 22.7 35.2 8 Mean 21.5 34.5 27.4 34.1 12.6 10.9 21.8 41.6 44 42.9 36.7 28 20.5 4.5 12.8 25.7 19.5 7.1 22.2 29.4 6.9 7.5 17.4 8.7 11.8 5.8 5.1 27.5 15.6 4.5 36.7 6.2 10.2 13.4 39.2 12.8 9.4 5.8 33.6 4.6 30.4 4.4 5.6 4.5 18.9 30.5 5.8 S.Dev 6.07 6.65 5.89 4.53 4.71 4.35 5.39 5.09 6.34 5.43 2.25 6 6.27 2.76 5.57 6.59 6.16 3.6 6.3 5.1 3.88 3.99 5.31 4.23 4.97 3.35 3.07 5.34 5.87 2.49 4.03 3.41 4.65 5.23 4.67 5.51 4.51 3.28 6.24 2.8 6.08 2.92 3.2 2.99 6.11 6.52 3.21 p * 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.002 0.003 0.004 0.007 0.008 0.008 0.017 0.02 0.021 0.022 0.03 0.035 0.037 0.041 0.057 0.058 0.058 0.059 0.071 0.082 0.082 0.083 0.09 0.109 0.126 0.132 0.137 0.139 0.152 0.16 0.165 0.167 0.172 0.185 0.2 0.2 22 14 9 36 37 43 40 53 46 39 4 7 32 equal = p proportion -------- FRAX CECA CAGL PIEC PIVI QUFA PRSE SAAL QUNI PRAM AMAR CACA NYSY to Group = of -------- 1 3 1 36 36 1 1 36 3 1 1 3 1 38.3 23.4 34.3 17.9 15 15.3 26.4 10.3 4.1 3.6 5.8 2.5 33.2 or identifier (1 randomized ----- exceeding for + trials ------ Indicator values finished ************************* 33.5 19 30.9 16.7 14.4 14.6 29 11.9 4.7 3.9 6.8 3.7 36.7 the group number with ------- 7.15 6.05 6.08 5.57 5.35 5.12 8.27 4.53 2.89 2.68 3.68 2.54 5.35 observed with of indicator 0.203 0.207 0.229 0.313 0.333 0.335 0.551 0.565 0.567 0.606 0.607 0.726 0.731 indicator maximum runs value 0.14872 value. observed >= Figure 5. Indicator values for the species data which group is which? Not clear what you’re presenting in the table Discussion The principal species in this community analysis are the Acer rubrum species names in italics, Carpinus caroliniana, Liquidambar styraciflua, Quercus alba, Quercus stellata, Fagus grandifolia, and Ulmus rubra. In the first part, the biplots show how the species separate along the axes. In Figure 1,Carpinus caroliniana is consistent with Liquidambar styraciflua. Acer rubrum is consistent with the oaks. This same trend is seen in Fig 2 and 3 as the Red maple is seen in higher concentration along with the oaks axes 2 and 3.. The indication is that the maple and the oaks likely survive better in the same habitat, pH, at the same elevation, or in the same soil type. Species that are grouped together succeed better in the same environmental conditions. As a result, where they do better, their concentrations tend to be higher along that axis ( eg Carpinus caroliniana and Liquidambar styraciflua in Fig 2). In the second part of the experiment, groups are classified based on species composition. Species that seem to be extreme are weeded out by the outlier analysis?. The dendogram indicates that the tree species in the plot are biologically related as is seen by the lines that links from one plot to the next. Consequently, those species that are closely we did not look at relatedness among species tend to agglomerate in higher concentrations. They tend to do well in similar conditions such as soil type, pH, elevation, etc. In the third part, indicator species analysis is used to identify the groups of plots within the data. Four groups were made and the quality of a group was based on the group with both the lowest average p- values and the highest number of significant indicators. Group 6 not group 6, but a grouping level of 6, meaning we divided the plots into 6 groups, so your analysis should identify and describe those 6 groups fit that description and was the model group selected. It had a p- IV observed)/(1 value average of 0.14872 and 30 indicators. This group had greater abundance of species, more plots that contained particular species (fidelity), and larger number of indicators species. These indicator species are important in determining the ecosystem type and the types of species that will likely succeed there. Their decline may be due to disturbances such as fires, floods, etc. References Gauch, Jr. H. G. Noise Reduction by Eigenvalue Ordinations. 1982 Ecology 63: 1643 – 9. Peet, R. K. Ordination as a tool for analyzing complex data sets. 1980 Vegatio 42: 171 – 4 15/26