Multivariate Analysis in Ecology I: Unconstrained Ordination Jari Oksanen Oulu January 2016 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Introduction January 2016 1 / 103 What is Ordination? Multivariate Analysis and Ordination Basic ordination methods to simplify multivariate data into low dimensional graphics Analysis of multivariate dependence and hypotheses Analyses can be performed in R statistical software using vegan package and allies Course homepage http://cc.oulu.fi/~jarioksa/opetus/metodi/ Vegan homepage https://github.com/vegandevs/vegan/ http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 2 / 103 Introduction What is Ordination? Outline 1 Introduction What is Ordination? Gradient Analysis 2 Unconstrained Ordination NMDS Eigenvector Methods PCA CA Graphics Environmental Variables Gradient Model and Ordination http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Introduction January 2016 3 / 103 What is Ordination? Why Ordination? Nobody should want to make an ordination, but they are desperate with multivariate data Map multidimensional table into low-dimensional display Cluster Dendrogram −1.0 14 0.0 0.5 1.0 24 13 7 6 18 3 19 23 20 22 16 15 14 4 2 21 1.5 DCA1 http://cc.oulu.fi/ jarioksa/ (Oulu) 9 10 5 24 25 0.1 6 713 4 −1.0 15 22 16 11 12 23 20 18 27 28 3 11 0.5 28 27 19 0.3 109 2 12 Height 0.5 0.0 DCA2 1.0 21 25 5 0.7 Ordination uh.dis hclust (*, "average") Multivariate Analysis in Ecology January 2016 4 / 103 Introduction What is Ordination? Two Ways of Analysing Data I II Blinacu 5 6 7 III IV 8 Bryupse V 5 Chilpol 6 7 8 Dichfal Response 1.0 0.8 0.6 0.4 0.2 0.0 Fisspus Fontant Fontdal Hygralp Hygroch Jungpum Leptrip Marsema 1.0 0.8 0.6 0.4 0.2 0.0 1.0 0.8 0.6 0.4 0.2 0.0 Palufal Scapund Schiaga Schiriv 1.0 0.8 0.6 0.4 0.2 0.0 5 6 7 8 5 6 7 8 pH 1.5 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Introduction 0.0 0.5 1.0 Gradient Analysis Dim2 January 2016 5 / 103 Gradient Analysis Blinacu Hygralp Marsema Hygrsmi Bryupse Scapund Dichfal Schiaga Schiriv Jungpum Conduc Hygrlur Fisspus Chilpol slope Callgig Palufal Amblflu Jungexs Instab pH width −0.5 Fontdal Gradient Analysis developed Partsize in 1950s in USA, with R. H. Whittaker as the Currvel Fontant Depth main founding father Platrip Leptrip −1.0 Only two or three environmental variables, or Gradients needed to explain Hygroch complicated community patterns −1.5 Against classification: Species responses smooth along gradients Against organism analogies: Species responses individualistic −1.5 theory −1.0 and −0.5 praxis: 0.0 Ordination 0.5 1.0 and 1.5 Gradient 2.0 The basis of modern modelling of communities Dim1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 6 / 103 Introduction Gradient Analysis The Gradient Model R. H. Whittaker (1956) Smoky Mountains. http://cc.oulu.fi/ jarioksa/ (Oulu) Vegetation of The Ecological Monographs 26, Multivariate Analysis in Ecology Introduction January 2016 Great 1–80. 7 / 103 Gradient Analysis Types of Gradients 1 Direct gradients: Influence organims but are not consumed. Correspond to conditions. 2 Resource gradients: Consumed Correspond to resources. Complex gradients. Covarying direct and/or resource gradients: Impossible to separate effects of single gradients. Most observed gradients. http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 8 / 103 Introduction Gradient Analysis Landscapes and Gradients D Gradient space C A Landscape A C D B B D B http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Introduction January 2016 9 / 103 Gradient Analysis Species responses 0.8 0.6 0.0 0.2 0.4 Response 0.6 0.4 0.2 0.0 Response 0.8 1.0 1.0 Species have non-linear responses along gradients. Often assumed to be Gaussian. . . 4.5 5.0 http://cc.oulu.fi/ jarioksa/ (Oulu) 5.5 6.0 6.5 7.0 Multivariate Analysis in Ecology 7.5 8.0 4.5 January 2016 10 / 103 Introduction Gradient Analysis Gaussian Response Function Height of the response h in the units of response height µ ++++++++ +++++ ++++++++++++ + 0.8 ++ ++ (x − u)2 µ = h × exp− 2t2 0.6 h 0.2 0.4 t 0.0 Width of the response t in the units of gradient x BAUERUBI Location of the optimum u on the gradient x 1.0 Gaussian Response Function has three interpretable parameters that define the expected response µ along the gradient x u + + ++ +++++ ++ ++ +++++ ++++++ ++++++++++++++++++++++++++ + ++++++ 1000 1100 1200 1300 Altitude http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Introduction January 2016 11 / 103 Gradient Analysis Dream of species packing Species have Gaussian responses and divide the gradient optimally: Equal heights h. 0.8 0.4 0.0 Response Equal widths t. Evenly distributed optima u. 0 1 2 3 4 5 6 Gradient http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 12 / 103 Introduction Gradient Analysis Evidence for Gaussian Responses Whittaker reported a large number of different response types Only a small proportion were symmetric, bell shaped responses Still became the standard of our times Comparison of ordination methods based on simulation, and many of those use Gaussian responses We need to use simulation because then we know the truth that should be found http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 13 / 103 Unconstrained Ordination Ordination Ordination maps multivariate data onto low dimensional displays: “Most data sets have 2.5 dimensions” Gradients define vegetation: ordination tries to find the underlying gradients Basic ordination uses only community composition: Indirect Gradient Analysis Constrained ordination studies only the variation that can be explained by the available environmental variables: Often called Direct Gradient Analysis Distinct flavours of tools: Nonmetric MDS the most robust method PCA duly despised Flavours of Correspondence Analysis popular Canonical method: Constrained Correspondence Analysis http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 14 / 103 Unconstrained Ordination NMDS Nonmetric Multidimensional Scaling Rank-order relation with (1) community dissimilarities and (2) ordination distances: No specified form of regression, but the the best shape is found from the data. Non-linear regression can cope with non-linear species responses of various shapes: Not dependent on Gaussian model. Iterative solution: No guarantee of convergence. Must be solved separately for each number of dimensions: A lower dimensional solutions is not a subset of a higher, but each case is solved individually. A test winner, and a natural choice. . . http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 15 / 103 NMDS From Ranks of Dissimilarities to Ordination Distances Observed dissimilarities: A B C D E B 0.467 C 0.636 0.511 D 0.524 0.356 0.634 E 0.843 0.600 0.753 0.513 F 0.922 0.893 0.852 0.667 0.606 Observed rank: Ordination distance 0.2 E A 1: 0.133 2: 0.301 0.0 D 5: 0.323 −0.1 B 9: 0.539 4: 0.307 6: 0.414 15: 0.922 11: 0.612 −0.2 7: 0.416 10: 0.605 14: 0.636 F 8: 0.421 3: 0.303 −0.3 NMDS2 0.1 12: 0.615 13: 0.636 Ranks of observed dissimilarities: A B C D E B 2 C 9 3 D 5 1 8 E 12 6 11 4 F 15 14 13 10 7 C −0.4 −0.2 0.0 0.2 0.4 NMDS1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Ordination distances: A B C D E B 0.301 C 0.539 0.303 D 0.323 0.133 0.421 E 0.615 0.414 0.612 0.307 F 0.922 0.636 0.636 0.605 0.416 January 2016 16 / 103 Unconstrained Ordination NMDS MDS is a map Map of Europe from road distances Stockholm MDS tries to draw a map using distance data. ● ● Copenhagen MDS tries to find an underlying configuration from dissimilarities. Only the configuration counts: ● Hamburg ● Hook of Holland ● ● ● Cologne Calais Brussels Cherbourg ● ● Paris ● ● Munich Vienna ● ● Lyons Geneva No origin, but only the constellations. No axes or natural directions, but only a framework for points. Lisbon Madrid ● Milan ● Marseilles Barcelona ● ● ● Rome ● Gibraltar ● Athens ● Lambert conical conformal projection http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 17 / 103 January 2016 18 / 103 NMDS Shepard Diagram ● 3.0 Non−metric fit, R2 = 0.967 Linear fit, R2 = 0.835 ● 2.0 1.5 0.5 1.0 Ordination Distance 2.5 ● ● ● 0.2 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ●●●● ●● ● ● ●● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●● ●● ●● ●● ● ●● ●●● ● ● ●● ●● ●● ● ● ● ●● ● ● ●●● ● ●● ● ●● ●● ●● ●● ● ●●●● ● ● ●●●● ●● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ●● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ●●● ● ●● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ●● ● ● ●● ●● ● ●●● ● ● ● ● ● ●● ● ● ● ● ● ●● 0.3 0.4 0.5 0.6 0.7 Observed Dissimilarity http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination NMDS Iterative Optimization http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 19 / 103 NMDS Recommended procedure NMDS may be good, but its use needs special care: Not every NMDS automatically is good 1 Use adequate dissimilarity indices: An adequate index gives a good rank-order relation between community dissimilarity and gradient distance. 2 No convergence guaranteed: Start with several random starts and inspect those with lowest stress. 3 Satisfied only if minimum stress configurations are similar. http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 20 / 103 Unconstrained Ordination NMDS metaMDS I > vare.mds <- metaMDS(varespec) Square root transformation Wisconsin double standardization Run 0 stress 0.184 Run 1 stress 0.196 Run 2 stress 0.185 ... procrustes: rmse 0.0494 max resid 0.158 Run 3 stress 0.209 Run 4 stress 0.215 Run 5 stress 0.235 Run 6 stress 0.196 Run 7 stress 0.234 Run 8 stress 0.196 Run 9 stress 0.222 Run 10 stress 0.185 Run 11 stress 0.195 Run 12 stress 0.229 Run 13 stress 0.184 ... New best solution ... procrustes: rmse 3.6e-05 max resid 0.000139 *** Solution reached http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 21 / 103 January 2016 22 / 103 NMDS metaMDS II > vare.mds Call: metaMDS(comm = varespec) global Multidimensional Scaling using monoMDS Data: wisconsin(sqrt(varespec)) Distance: bray Dimensions: 2 Stress: 0.184 Stress type 1, weak ties Two convergent solutions found after 13 tries Scaling: centring, PC rotation, halfchange scaling Species: expanded scores based on ‘wisconsin(sqrt(varespec))’ http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination NMDS Plot metaMDS 0.5 ● + + + + ● + + + + ● ● ● ● + + ● 0.0 ++ + ● ● + ● NMDS2 + ● ● + + + ++ ● + + + + + ● + + ● ● + + ● + + + ● + ● ● + + ● ● + ● + + + + −0.5 + ● + + −1.0 + −0.5 0.0 0.5 1.0 NMDS1 http://cc.oulu.fi/ jarioksa/ (Oulu) > plot(vare.mds) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 23 / 103 NMDS Numbers Badness of fit measure stress is based on the residuals from the non-linear regression A proportional measure in the range 0 (perfect) . . . 1 (desperate) related to goodness of fit measure 1 − R 2 Random configuration typically ≈ 0.4 and 0 degenerate Often given in percents (but omitting the percent sign: 15 = 0.15, since cannot be > 1) Orientation, rotation, scale and origin of the coordinates (scores) are indeterminate: only the constellation matters Vegan arbitrarily fixes some of these: Axes are centred, but the origin has no special meaning Axes are rotated so that the first is the longest (technically: rotated to principal components) Axes are scaled so that one unit corresponds to halving of similarity from the “replicate similarity” The sign (direction) of the axes still undefined http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 24 / 103 Unconstrained Ordination NMDS Half-change Scaling in NMDS 1.2 + 0.8 0.6 + + ++ + + + 0.0 Linear area of ordination distance – dissimilarity: 0 . . . 0.8 0.4 Maximum dissimilarity = 1: nothing in common + ++++ + + + + + +++ + + + ++ ++ +++ ++ + Threshold + + + + ++ ++ + +++++ + + ++++ + + + + + + + ++ ++++++++++ ++++ + ++ +++ Half−change + + + + + + +++++++++ + + ++ ++++++++ + + + + +++ +++++ ++++++ + +++ + + +++++ ++ ++ + ++ + ++ ++ ++ ++ ++ + + + +++ + ++ + ++ Replicate dissimilarity ++ + ++ 0.2 Replicate similarity: dissimilarity at ordination distance = 0 Community dissimilarity 1.0 + + + 0.0 0.5 1.0 1.5 2.0 Ordination distance http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 25 / 103 NMDS What happened in metaMDS? 1 Square root transformation and Wisconsin double standardization 2 Bray–Curtis dissimilarities 3 monoMDS with several random starts and stopping after finding two identical minimum stress solutions 4 Solution rotated to PCs 5 Solution scaled to half-change units 6 Species scores as weighted averages http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 26 / 103 Unconstrained Ordination NMDS 0.6 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 6.5 7.0 7.5 8.0 0.2 Response 0.8 1.0 pH 0.0 Should use dissimilarities which reach their maximum (1) when no species are shared (like those listed above) Indices with no bound maximum are usually bad (Euclidean distance etc.) 0.4 Dissimilarity 0.2 0.0 Wisconsin double standardization often helpful 0.6 Bray–Curtis (Steinhaus), Jaccard, Kulczyński 0.4 Use a dissimilarity that describes correctly gradient separation 0.8 1.0 Dissimilarity measures 4.5 5.0 5.5 6.0 pH http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 27 / 103 NMDS Procrustes rotation Procrustes rotation to maximal similarity between two configurations: Translate the origin. Rotate the axes. Deflate or inflate the axis scale. Single points can move a lot, although the stress is fairly constant: Especially in large data sets. http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 28 / 103 Unconstrained Ordination NMDS Procrustes Rotation > > > > > tmp <- wisconsin(sqrt(varespec)) dis <- vegdist(tmp) vare.mds0 <- monoMDS(dis, trace = 0) pro <- procrustes(vare.mds, vare.mds0) pro Call: procrustes(X = vare.mds, Y = vare.mds0) Procrustes sum of squares: 0.186 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 29 / 103 January 2016 30 / 103 NMDS Plot Procrustes Rotations 0.4 0.6 Procrustes errors ● ● ● ● ● ● ● 0.0 Dimension 2 0.2 ● ● ● ● ● ● ● −0.2 ● ● ● ● ● ● ● ● −0.4 ● ● −0.4 −0.2 0.0 0.2 0.4 0.6 Dimension 1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination NMDS Number of dimensions In NMDS, 2D solution is not a plane in 3D space Solution must be found separately for each dimensionality Some people very disturbed: how do they know the correct number Answer is easy: there is no correct number, although some numbers may be worse than others “Most data sets have 2.5 dimensions” Typically you try with 2 and 3 Do you need more dimensions to explain species patterns and environmental data? Is convergence very slow? Try another number of dimensions Scree plot or stress against the number of dimensions often suggested but rarely works http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 31 / 103 January 2016 32 / 103 NMDS Scree Plot 0.15 stress 0.20 0.25 ● 0.10 ● 0.05 ● ● ● ● 1 2 3 4 5 6 No. of Dimensions http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination Eigenvector Methods Simplified mapping: Eigen analysis NMDS uses non-linear mapping for any dissimilarity measure: This is very difficult Things are much simpler if we accept only certain dissimilarity indices and map them linearly onto ordination Linear mapping is only a rotation, and can be solved using eigenvector techniques Sometimes said that certain methods are model-based (CA), but they also employ a distance method metric mapping NMDS MDS PCA CA any any Euclidean Chi-square nonlinear linear linear weighted linear http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 33 / 103 Eigenvector Methods Why Not PCA? We admit that PCA is just a rotation, but it is a linear method PCA works with species space, but we boldly go to gradient space CA is an optimal scaling method Sites with similar species composition packed close to each other Species that occur together simultaneously packed close to each other CA can handle unimodal species responses, even approximate one dimensional species packing model http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 34 / 103 Unconstrained Ordination PCA Species space 60 20 What is the ideal viewing angle to the species space? 40 Cla.ste Some species show more of the configuration than others 80 Graphical presentations of data matrix: Species are axes and span the space where sites are points 0 Shows as much as possible of all species in just two or three dimensions 0 10 20 30 40 50 60 Cla.ran http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 35 / 103 PCA 3 Rotate the axes so that the first axis (1) is as close to all points as possible, and (2) explains as much of the variance as possible 20 Move the origin to the centroid Cla.arb 2 10 Put sites into species space 0 1 30 40 Rotation in species space 0 10 20 30 40 50 60 Cla.ran http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 36 / 103 Unconstrained Ordination PCA Goodness of Fit The total variation (Λ) is the sum of squared distances of points from the origin Λ can be expressed as the sum of squares (SS) or variance (SS/n or SS/(n − 1)) The points are projected on the axis, and the sum of projected squared distances is the eigenvalue of the axis (λi ) The eigenvalues are ordered and Ppnon-negative λ1 ≥ λ2 ≥ · · · ≥ λp ≥ 0, and sum up to total variance Λ = i=1 λi λi /Λ gives the proportion that an axis explains of the total variance, and λ1 explains the largest proportion Cumulative sum gives the proportion of variance explained by the first axes: often emphasized but rather useless statistic PCA is often used to reduce data into a few linearly independent components that explain the most of the original variables http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 37 / 103 PCA ∑ xijxik 40 40 Euclidean Metric of PCA + cosθjk = Explained 10 sk + 30 40 50 60 0 http://cc.oulu.fi/ jarioksa/ (Oulu) djk = ∑ (xij − xik)2 i θjk + + s= +++ j + + ++ + + + Cla.ran + + + + + 10 ++ 20 + + 0 10 0 + + + + + ++ + + + + ++ + +k ∑ x2ij ∑ x2ik i i 30 + + + + i 20 20 + i=2 (Cla.arb) Total + 0 Cla.arb 30 + Residual 10 20 ∑ x2ij ++ j i + 30 40 50 60 i=1 (Cla.ran) Multivariate Analysis in Ecology January 2016 38 / 103 Unconstrained Ordination PCA Running PCA I > (ord <- rda(dune)) Call: rda(X = dune) Inertia Rank Total 84.1 Unconstrained 84.1 19 Inertia is variance Eigenvalues for unconstrained axes: PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 24.80 18.15 7.63 7.15 5.70 4.33 3.20 2.78 (Showed only 8 of all 19 unconstrained eigenvalues) > head(summary(ord), 3, 1) http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 39 / 103 January 2016 40 / 103 PCA Running PCA II Call: rda(X = dune) Partitioning of variance: Inertia Proportion Total 84.1 1 Unconstrained 84.1 1 Eigenvalues, and their contribution to the variance Importance of components: PC1 PC2 PC3 PC4 PC5 PC6 Eigenvalue 24.795 18.147 7.6291 7.153 5.6950 4.3333 Proportion Explained 0.295 0.216 0.0907 0.085 0.0677 0.0515 Cumulative Proportion 0.295 0.510 0.6011 0.686 0.7539 0.8054 PC7 PC8 PC9 PC10 PC11 PC12 Eigenvalue 3.199 2.7819 2.4820 1.854 1.7471 1.3136 Proportion Explained 0.038 0.0331 0.0295 0.022 0.0208 0.0156 Cumulative Proportion 0.843 0.8765 0.9060 0.928 0.9488 0.9644 PC13 PC14 PC15 PC16 PC17 Eigenvalue 0.9905 0.63779 0.55083 0.35058 0.19956 Proportion Explained 0.0118 0.00758 0.00655 0.00417 0.00237 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination PCA Running PCA III Cumulative Proportion 0.9762 0.98377 0.99032 0.99448 0.99686 PC18 PC19 Eigenvalue 0.14880 0.11575 Proportion Explained 0.00177 0.00138 Cumulative Proportion 0.99862 1.00000 Scaling 2 for species and site scores * Species are scaled proportional to eigenvalues * Sites are unscaled: weighted dispersion equal on all dimensions * General scaling constant of scores: 6.3229 Species scores PC1 PC2 PC3 PC4 PC5 PC6 Achimill -0.6038 0.124 0.00846 0.160 0.4087 0.1279 Agrostol 1.3740 -0.964 0.16691 0.266 -0.0877 0.0474 Airaprae 0.0234 0.251 -0.19477 -0.326 0.0557 -0.0796 .... Callcusp 0.5385 0.180 0.17509 0.239 0.2553 0.1692 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 41 / 103 January 2016 42 / 103 PCA Running PCA IV Site scores (weighted sums of species scores) 1 2 3 .... 20 PC1 PC2 PC3 PC4 PC5 PC6 -0.857 -0.172 2.608 -1.130 0.4507 -2.4911 -1.645 -1.230 0.887 -0.986 2.0346 1.8106 -0.440 -2.383 0.930 -0.460 -1.0278 -0.0518 2.341 1.299 0.903 http://cc.oulu.fi/ jarioksa/ (Oulu) 0.718 -0.0757 -0.9691 Multivariate Analysis in Ecology Unconstrained Ordination PCA Row and Column scores The scores are centred (= their mean is zero) and either normalized (= all have equal spread) or proportional to eigenvalues (= spread is higher when eigenvalue is high) Normalized scores give the regression coefficients between the axis and the variables: often used for species Scores proportional to the eigenvalue give the true configuration of points in the space defined by normalized scores: often used for sites (hence in species space) Together these scores give a linear least square approximation of the data Graphical presentation called biplot However, there are many alternative scaling systems http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 43 / 103 January 2016 44 / 103 PCA Default Plot 2 19 18 17 20 1 11 14 10 0 7 5 PC2 15 6 Anthodor Planlanc Bracruta Salirepe ScorautuHyporadi Airaprae Trifprat Empenigr Callcusp Achimill Vicilath Comapalu Rumeacet Ranuflam Trifrepe Chenalbu Juncarti Cirsarve 1 Bromhord Bellpere Lolipere Eleopalu 16 Sagiproc Juncbufo Poaprat −1 Elymrepe Agrostol 8 2 Alopgeni Poatriv 12 −2 9 4 13 3 −2 −1 0 1 2 3 PC1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination PCA Reading the Plot Origin: all species (variables) at their average values The distance from the origin for a row (site) implies how much the point differs from the average The distance from the origin for a column (species, variable) implies how much the point increases to that direction The change is measured in absolute scale: big changes, long distances from the origin Implies a linear model of species response against axes The angle between two points implies correlations 90◦ means zero correlation, < 90◦ positive correlation, > 90◦ negative correlation, 0◦ implies r = 1 Arrow biplots often used instead of point biplot http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 45 / 103 January 2016 46 / 103 PCA Arrow Biplot 2 19 18 17 20 1 11 14 10 0 7 5 PC2 15 6 Anthodor Planlanc Bracruta Salirepe ScorautuHyporadi Airaprae Trifprat Empenigr Callcusp Achimill Vicilath Comapalu Rumeacet Ranuflam Trifrepe Chenalbu Juncarti Cirsarve 1 Bromhord Bellpere Lolipere Eleopalu 16 Sagiproc Juncbufo Poaprat −1 Elymrepe Agrostol 8 2 Alopgeni Poatriv 12 −2 9 4 13 3 −2 −1 0 1 2 3 PC1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination PCA 2 0 Expected Response 4 6 Linear Model −2 −1 0 1 2 3 Principal Component 1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 47 / 103 PCA Variances and Correlations Analysis of raw data explains variances: variables with high variance are most important If the variables are standardized to unit variance before analysis z = (x − x̄)/sx all variables are equally important and the analysis explains correlations among variables Standardization can be used when we want all variables to have equal weights Standardization must be used when variables are measured in different scales, such as for environmental measurements http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 48 / 103 Unconstrained Ordination PCA Reducing the Number of Correlated Environmental Variables I > (pc <- rda(varechem, scale=TRUE)) Call: rda(X = varechem, scale = TRUE) Inertia Rank Total 14 Unconstrained 14 14 Inertia is correlations Eigenvalues for unconstrained axes: PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 PC12 5.19 3.19 1.69 1.07 0.82 0.71 0.44 0.37 0.17 0.15 0.09 0.07 PC13 PC14 0.04 0.02 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 49 / 103 PCA The Number of Components PCA is a rotation in species (character) space and retains the original configuration The number of PC’s is min(N, S), and all together give the original data First axes are most important and we may ignore the minor axes We can either use the axes as variables in other models, or use them to identify major (almost) independent variables Often we want to retain a certain proportion of the variance, say 50 % Sometimes we would like to retain “significant” axes There really is no way of doing this, but some people suggest comparing eigenvalues against broken stick distribution http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 50 / 103 Unconstrained Ordination PCA Broken Stick and Eigenvalues 5 pc Broken Stick 4 ● Inertia 3 ● 2 ● ● ● 1 ● ● ● ● ● 0 ● PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 51 / 103 January 2016 52 / 103 PCA Two Dimensions, but which? Humdepth 0.5 Baresoil Mn PC2 0.0 N Ca Mg K Zn −0.5 P Mo S pH −1.0 Fe Al −1.0 −0.5 0.0 0.5 PC1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination PCA Methods Related to PCA Metric Scaling a.k.a. Principal Coordinates Analysis Used dissimilarities instead of raw data With Euclidean distances equal to PCA, but can use other dissimilarities Factor Analysis A statistical method that makes a difference between systematic components and random error In PCA we just ignore latter components, but here we really identify the real components Much used in human sciences and often referred to in ecology (but usually misunderstood) http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 53 / 103 January 2016 54 / 103 PCA Confirmatory Factor Analysis ε P 0.812 0.340 ε K 0.399 0.776 ξ ε Ca 0.163 ε Mg 0.310 http://cc.oulu.fi/ jarioksa/ (Oulu) 0.915 0.831 Multivariate Analysis in Ecology Unconstrained Ordination CA Correspondence Analysis 0.20 0.15 0.10 0.05 0.00 Minor variant of PCA: Weighted Principal Components with Chi-square metric 23 19 22 16 28 13 14 20 25 7 5 6 3 4 2 9 12 10 11 fij − eij χij = √ eij 27 Chi-square transformation tells how much the observed proportions fij differ from the expected proportions eij : 24 Null model: Species composition is identical in all sampling units 15 Site and species marginal profiles define the expected abundances 21 Multivariate Analysis in Ecology Unconstrained Ordination 0.03 18 Cal.vul Emp.nig Led.pal Vac.myr Vac.vit Pin.syl Des.fle Bet.pub Vac.uli Dip.mon Dic.sp Dic.fus Dic.pol Hyl.spl Ple.sch Pol.pil Pol.jun Pol.com Poh.nut Pti.cil Bar.lyc Cla.arb Cla.ran Cla.ste Cla.unc Cla.coc Cla.cor Cla.gra Cla.fim Cla.cri Cla.chl Cla.bot Cla.ama Cla.sp Cet.eri Cet.isl Cet.niv Nep.arc Ste.sp Pel.aph Ich.eri Cla.cer Cla.def Cla.phy All sites should have all species in in the same proportions as in the whole data http://cc.oulu.fi/ jarioksa/ (Oulu) 0.00 January 2016 55 / 103 CA Chi-squared metric Cla.ste Cla.ran Cla.arb eij Cal.vul Emp.nig (fij − eij) Vac.vit Cla.unc Dic.fus Ple.sch Vac.myr Dic.sp eij 10 2 9 3 http://cc.oulu.fi/ jarioksa/ (Oulu) 4 12 5 6 11 7 13 18 19 21 Multivariate Analysis in Ecology 23 20 14 16 15 27 22 24 25 28 January 2016 56 / 103 Unconstrained Ordination CA 2 Relative proportions are axes and points have weights 3 Chi-square transformation 4 Weighted rotation 5 De-weighting CA2 Sites in a species space −0.5 0.0 1 0.5 1.0 CA Rotation −1.0 −0.5 0.0 0.5 1.0 CA1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 57 / 103 January 2016 58 / 103 CA Running CA I > (ord <- cca(dune)) Call: cca(X = dune) Inertia Rank Total 2.12 Unconstrained 2.12 19 Inertia is mean squared contingency coefficient Eigenvalues for unconstrained axes: CA1 CA2 CA3 CA4 CA5 CA6 CA7 CA8 0.536 0.400 0.260 0.176 0.145 0.108 0.092 0.081 (Showed only 8 of all 19 unconstrained eigenvalues) > head(summary(ord), 2, 1) http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination CA Running CA II Call: cca(X = dune) Partitioning of mean squared contingency coefficient: Inertia Proportion Total 2.12 1 Unconstrained 2.12 1 Eigenvalues, and their contribution to the mean squared contingency coefficient Importance of components: CA1 CA2 CA3 CA4 CA5 CA6 Eigenvalue 0.536 0.400 0.260 0.1760 0.1448 0.108 Proportion Explained 0.253 0.189 0.123 0.0832 0.0684 0.051 Cumulative Proportion 0.253 0.443 0.565 0.6486 0.7170 0.768 CA7 CA8 CA9 CA10 CA11 Eigenvalue 0.0925 0.0809 0.0733 0.0563 0.0483 Proportion Explained 0.0437 0.0382 0.0347 0.0266 0.0228 Cumulative Proportion 0.8117 0.8500 0.8847 0.9113 0.9341 CA12 CA13 CA14 CA15 CA16 Eigenvalue 0.0412 0.0352 0.02053 0.01491 0.00907 Proportion Explained 0.0195 0.0167 0.00971 0.00705 0.00429 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 59 / 103 January 2016 60 / 103 CA Running CA III Cumulative Proportion 0.9536 0.9702 0.97995 0.98700 0.99129 CA17 CA18 CA19 Eigenvalue 0.00794 0.00700 0.00348 Proportion Explained 0.00375 0.00331 0.00164 Cumulative Proportion 0.99505 0.99836 1.00000 Scaling 2 for species and site scores * Species are scaled proportional to eigenvalues * Sites are unscaled: weighted dispersion equal on all dimensions Species scores CA1 CA2 CA3 CA4 CA5 CA6 Achimill -0.909 0.0846 -0.586 -0.00892 -0.660 -0.1888 Agrostol 0.934 -0.2065 0.282 0.02429 -0.139 -0.0226 .... Callcusp 1.952 0.5674 -0.859 -0.09897 -0.557 0.2328 Site scores (weighted averages of species scores) http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination CA Running CA IV CA1 CA2 CA3 CA4 CA5 CA6 1 -0.812 -1.083 -0.1448 -2.107 -0.393 -1.8346 2 -0.633 -0.696 -0.0971 -1.187 -0.977 0.0658 .... 20 1.944 1.069 -0.6660 -0.553 1.596 -1.7029 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 61 / 103 CA Goodness of Fit of Scores Inertia is “mean square contingency coefficient”: Chi-squared of a matrix standardized to unit sum, or Chi-square of Px x Eigenvalues are non-negative and ordered like in PCA, but they are bound to maximum 1 The origin gives the expected abundances for all species and all sites The deviant species and deviant sites are far away from the origin CA is weighted analysis, and the weighted sum of squared scores is the eigenvalue The species and site scores are (scaled) weighted averages of each other: proximity matters Rare species have low weights: they are further away from the origin http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 62 / 103 Unconstrained Ordination CA Weighted Average? For presence/absence data: weighted average of a species is in the middle (“barycentre”) of plots where the species occurs 3 ● For quantitative data: plots wheres species is abundant are heavier and the weighted average is closer to them ● 1 CA2 2 ● ● ● ● ● 0 ● ● Lolper ● −1 ● ● −1 Sampling units (SU) are close to species that occur on them ● ● ●● CA is a weighted average method: it tries to put SUs close to the species that occur in them, and all SUs with similar species composition close to each other: Unimodal response ● ● ● 0 1 2 CA1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 63 / 103 January 2016 64 / 103 CA Default Plot and Effect of Scaling scaling = 1 5 Empenigr Airaprae 3 4 Hyporadi 2 19 1 17 Anthodor 0 Callcusp Comapalu 20 Vicilath Scorautu Eleopalu Ranuflam 18 Bracruta Planlanc 14 15 11 Juncarti Achimill 16 Sagiproc 6 Trifrepe 10 Trifprat 57 8 Agrostol Rumeacet 2 12 Poaprat 34 9 13 Bellpere Lolipere Bromhord 1 Poatriv Alopgeni Juncbufo Elymrepe Cirsarve Chenalbu −1 CA2 Salirepe −2 −1 0 1 2 3 4 CA1 http://cc.oulu.fi/ jarioksa/ (Oulu) scaling =Multivariate 2 Analysis in Ecology Unconstrained Ordination CA Weighted Averages Species scores are [proportional to] weighted averages of site scores, and simultaneously Site scores are [proportional to] weighted averages of species scores Either one (but not both) of these can be a direct weighted average of other If sites scores are weighted averages of species scores, site point is in the middle of points of species that occurs in the site The location of the point is meaningful whereas in PCA the main things were distance and direction from the origin (but these, too, matter) Can approximate unimodal response model and therefore CA is much better for community ordination than PCA http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 65 / 103 CA Linear and Unimodal Models PCA implies linear relations between axes and species abundances 80 60 0 20 40 cladstel 60 40 20 0 cladstel 80 CA packs species and approximates a unimodal model −4 −2 0 2 −1 0 PCA1 http://cc.oulu.fi/ jarioksa/ (Oulu) 1 2 3 CA1 Multivariate Analysis in Ecology January 2016 66 / 103 Unconstrained Ordination CA Optimal Scaling 0.8 0.6 0.4 0.2 0.0 The species responses should be narrow: width is measured as SSw Suboptimal scaling Response The locations of species optima (tops) should be widespread: spread is measured as SSB The total variance is their sum SST = SSB + SSw 1 4 5 0.4 0.0 Response 0.8 Optimal scaling Scaling is optimal if most of variance is between species and SSB is high The criterion of variance is the eigenvalue maximized in CA: λ = SSB /SST 0 1 2 3 4 5 Gradient Multivariate Analysis in Ecology Unconstrained Ordination 3 Gradient High SSB means that species have different optima, and low SSw means that species have narrow tolerance http://cc.oulu.fi/ jarioksa/ (Oulu) 2 January 2016 67 / 103 CA Goodness of Fit Statistics: Repetition NMDS: stress of nonlinear transformation from observed dissimilarities to ordination distances In range 0 . . . 1 (0 . . . 100 %), but in practice 0.4 for random configuration 0.1 is good, and 0.2 is not bad, 0 is suspect PCA: sum of eigenvalues is variance (or SS) Upper limit is total variance, large is good CA: sum of all eigenvalues is (scaled) Chi-square Single eigenvalue maximum 1 high is good, but λ < 0.2 may not be bad Eigenvalues λ > 0.7 are suspect: disjunct or very heterogeneous data http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 68 / 103 Unconstrained Ordination CA Nonlinear and Linear Mapping ● ● ● ●● 0.2 0.4 0.6 0.8 1.0 8 Observed Dissimilarity 10 12 14 16 0.2 0.3 0.4 0.5 0 5 Ordination Distance ● ● ●● ●● ● ● ● ●● ● ●● ● ● ●● ● ● ● ● ●● ●● ● ● ●●●● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ●● ● ● ●● ● ●● ● ● ●● ● ●● ●● ●● ●● ● ●● ● ● ● ● ● ● ●● ●● ●● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ●● ●● ● ●● ●● ●● ●● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ● ●●● ●● ● ● ● ● ●●●● ●● ● ● ● ● ●● ● ● ● ● ●●●● ● ● ●● ●● ●● ●● ●● ●● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●●● ● ● ●●● ● ●● ● ● ● ● ●● ● ● ●●●● ● ●● ● ● ●● ●● ● ● ● ● ●●● ● ● ● ●● ● ● ● ● ●●● ● ● ●●●● ● ● ● ●● ●● ● ●● ● ● ● ●● ●●● ● ● ● ● ●● ● ● ●● ● ● ●● ● ●● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ●● ● ●●● ● ● ● ●●● ●● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ●● ● ● 0.1 ● ● 0.0 15 ● ● ●● ● ●● ● 2.5 2.0 1.5 1.0 0.5 0.0 ● Ordination Distance Non−metric fit, R2 = 0.986 Linear fit, R2 = 0.926 Ordination Distance CA 0.6 PCA ● 10 3.0 NMDS 18 ● ●● ●● ●● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ●●● ● ●● ● ●● ●● ● ●● ●● ● ●● ●● ● ● ●● ●● ● ●● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ●● ●● ●● ● ● ● ● ● ● ●● ● ●● ● ●● ●● ● ●● ●● ● ● ●●● ● ● ● ●●●● ●● ● ● ● ● ●● ●●● ●● ● ● ● ● ●● ● ●● ● ● ●● ● ● ● ● ● ●● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● 0.2 Observed Dissimilarity 0.3 0.4 0.5 0.6 0.7 Observed Dissimilarity Dune Meadow data http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 69 / 103 CA Nonlinear and Linear Mapping: A Difficult Case CA 0.4 PCA 7 NMDS 0.1 0.2 Ordination Distance 4 2 3 Ordination Distance 3 2 0 0.0 1 1 0 Ordination Distance 5 0.3 6 4 Non−metric fit, R2 = 0.98 Linear fit, R2 = 0.92 0.2 0.4 0.6 0.8 Observed Dissimilarity 1.0 2 4 6 Observed Dissimilarity 8 0.2 0.4 0.6 0.8 Observed Dissimilarity Bryce Canyon http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 70 / 103 Unconstrained Ordination Graphics Anatomy of a Plot 0.5 2 CladcervCladstel Cladphyl 24 Dicrpoly Cetrisla Cladchlo 12 Pinusylv 11 0.0 9 4 Flavniva NMDS2 Polypili Barbhatc 10 Ptilcili Pohlnuta Empenigr Vaccviti Cladrang 19 23 Cladunci Cladgrac Cladcris 6 13 Claddefo Cetreric Cladcorn 20 Cladfimb Callvulg Peltapht 15 3 18Cladsp 16 Cladarbu Cladcocc 7 Polyjuni 14 5Stersp Diphcomp 21 Cladbotr Dicrsp Pleuschr Betupube 28 Vaccmyrt Rhodtome 22 27 Polycomm Hylosple Descflex Dicrfusc −0.5 Vacculig 25 Cladamau Icmaeric −1.0 Nepharct −0.5 0.0 0.5 1.0 NMDS1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 71 / 103 Graphics Plotting functions All vegan ordination functions have a plot function, and ordiplot can be used for other functions as well For full control, use first plot(x, type="n") and then add configurable points or text Congested plots can displayed with orditorp or edited with orditkplot Lattice graphics can be made with ordixyplot, ordicloud or ordisplom Dynamic, spinnable 3D plots can be made with ordirgl function in the vegan3d package Items can be added to the plots with ordiarrows, ordihull, ordispider, ordihull, ordiellipse, ordisegments, or ordigrid 2 24 12 11 (Oulu) http://cc.oulu.fi/ jarioksa/ 10 Multivariate Analysis in Ecology 21 January 2016 72 / 103 Unconstrained Ordination Environmental Variables Ordination and Environment We take granted that vegetation is controlled by environment, so 1 Two sites close to each other in ordination have similar vegetation 2 If two sites have similar vegetation, they have similar environment 3 Two sites far away from each other in ordination have dissimilar vegetation, and perhaps 4 If two sites have different vegetation, they have different environment http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 73 / 103 Environmental Variables Fitted Vectors 1.0 0.5 pH 0.0 width Instab Currvel −0.5 Partsize Depth −1.0 MDS2 Conduc slope −1.5 For every arrow, there is an equally long arrow into opposite direction: Decreasing direction of the gradient. Implies a linear model: Project sample plots onto the vector for expected value. 1.5 Direction of fitted vector shows the gradient of the environmental variable, length shows its importance. −1.5 −1.0 −0.5 0.0 0.5 1.0 1.5 2.0 MDS1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 74 / 103 Unconstrained Ordination Environmental Variables Interpretation of Arrow ● 0.5 ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.0 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● NMDS2 ● ● ● ● ● ● ● ● ● ● ● −0.5 ● ● ● ● WatrCont −1.0 ● ● −1.0 −0.5 0.0 0.5 1.0 1.5 NMDS1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 75 / 103 Environmental Variables Alternatives to Vectors Fitted vectors natural in constrained ordination, since these have linear constraints. Distant sites are different, but may be different in various ways: Environmental variables may have a non-linear relation to ordination. Linear 0.5 Dim1 http://cc.oulu.fi/ jarioksa/ (Oulu) 1.0 1.5 1.0 0.5 −0.5 Dim2 0.5 −1.5 + pH −0.5 + + + + + +++ + + ++ ++ + + + + + + + + +++ + + ++ pH +++ + ++++++ + +++ ++ + +++ + + + + + + + + ++++++ + +++ ++ + ++ +++ + + +++++ + + + + + + ++ +++++++ ++++ ++ ++ + + ++ + + + + + + + + ++ + + + ++ 1.0 + + + 0.0 1.5 1.5 + + + −1.0 Fitted surface −1.5 + Observed values Dim2 + −1.5 −0.5 Dim2 0.5 1.0 1.5 + −1.0 0.0 0.5 1.0 Dim1 Multivariate Analysis in Ecology 1.5 + + + + + + + + + + +++ + + ++ ++ + + + + + + + + +++ + + ++ pH +++ + ++++++ + +++ ++ + +++ + + + + + + + + ++++++ + +++ ++ + ++ +++ + + +++++ + + + + + + ++ +++++++ ++++ ++ ++ + + ++ + + + + + + + + ++ + + + ++ + + + −1.0 0.0 0.5 1.0 1.5 Dim1 January 2016 76 / 103 Unconstrained Ordination Environmental Variables Fitting Environmental Vectors I > (ef <- envfit(vare.mds, varechem, permu = 999)) ***VECTORS NMDS1 NMDS2 r2 Pr(>r) N -0.050 -0.999 0.21 0.098 P 0.687 0.727 0.18 0.135 K 0.827 0.562 0.17 0.147 Ca 0.750 0.661 0.28 0.029 Mg 0.697 0.717 0.35 0.015 S 0.276 0.961 0.18 0.143 Al -0.838 0.546 0.52 0.002 Fe -0.862 0.507 0.40 0.013 Mn 0.802 -0.597 0.53 0.001 Zn 0.665 0.747 0.18 0.146 Mo -0.849 0.529 0.05 0.581 Baresoil 0.872 -0.490 0.25 0.035 Humdepth 0.926 -0.377 0.56 0.001 pH -0.799 0.601 0.26 0.042 --Signif. codes: 0 ‘***’ 0.001 ‘**’ Permutation: free Number of permutations: 999 http://cc.oulu.fi/ jarioksa/ (Oulu) . * * ** * *** * *** * 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Multivariate Analysis in Ecology Unconstrained Ordination January 2016 77 / 103 January 2016 78 / 103 Environmental Variables Plotting Environmental Vectors 0.6 Limit p < 0.1 ● 0.4 ● Mg Ca ● ● pH ● ● ● 0.0 ● ● ● ● ● ● ● ● ● −0.2 ● ● ● ● Baresoil ● Humdepth ● ● Mn N −0.4 NMDS2 0.2 Al Fe ● −0.4 −0.2 0.0 0.2 0.4 0.6 NMDS1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination Environmental Variables Fitting Environmental surfaces > > > > > ef <- envfit(vare.mds ~ Al + Ca, varechem) plot(vare.mds, display = "sites") plot(ef) tmp <- with(varechem, ordisurf(vare.mds, Al, add = TRUE)) tmp <- with(varechem, ordisurf(vare.mds, Ca, add = TRUE, col = "green4")) http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 79 / 103 January 2016 80 / 103 Environmental Variables 0.6 Plotting Environmental Surfaces ● 10 0 0.4 ● ● Al ● Ca 700 0.2 650● ● ● 65 0 550 0.0 ● 500 ● 450 0 25 ● ● ● ● ● ● 350 −0.2 ● ● 400 ● ● ● ● 200 ● ● 150 50 0 −0.4 NMDS2 600 ● −0.4 −0.2 0.0 0.2 0.4 0.6 NMDS1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination Environmental Variables Factor Fitting I > dune.ca <- cca(dune) > ef <- envfit(dune.ca ~ A1 + Management, data=dune.env, perm=999) > ef ***VECTORS CA1 CA2 r2 Pr(>r) A1 0.9980 0.0606 0.31 0.052 . --Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Permutation: free Number of permutations: 999 ***FACTORS: Centroids: ManagementBF ManagementHF ManagementNM ManagementSF CA1 -0.73 -0.39 0.65 0.34 CA2 -0.14 -0.30 1.44 -0.68 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 81 / 103 January 2016 82 / 103 Environmental Variables Factor Fitting II Goodness of fit: r2 Pr(>r) Management 0.44 0.003 ** --Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Permutation: free Number of permutations: 999 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination Environmental Variables Plotting Fitted Factors 3 19 2 17 20 1 CA2 ManagementNM 18 14 15 0 11 16 A1 6 10 ManagementBF 5 7 ManagementHF −1 2 8 12 ManagementSF 4 9 3 13 1 −2 −1 0 1 2 CA1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 83 / 103 Environmental Variables Environmental Interpretation Environmental variables need not be parallel to ordination axes. Axes cannot be taken as gradients, but gradients are oblique to axes: You cannot tear off an axis from an ordination. Never calculate a correlation between an axis and an environmental variable. Environmental variables need not be linearly correlated with the ordination, but locations in ordination can be exceptional. http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 84 / 103 Unconstrained Ordination Gradient Model and Ordination Gradient Model and Ordination Species packing gradient Single gradients appear as curves in linear ordination methods PCA horseshoe: curve bends inward and gives wrong ordering of points on axis 1 PCA CA CA arch: axis 1 retains the correct ordering of sites despite the curve Environmental interpretation by vector fitting or surface bound to be biased Axes cannot be interpreted as “gradients” http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 85 / 103 Gradient Model and Ordination The birth of the curve There is a curve in the species space and PCA shows it correctly CA deals better wit unimodal responses, but the second optimal scaling axis is folded first axis Gradient space Species space CA1 CA2 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 86 / 103 Unconstrained Ordination Gradient Model and Ordination Solutions to the Curvature Detrended Correspondence Analysis (DCA) CA axis retains the correct ordering: keep that, but instead of orthogonal axes, use detrended axes Programme DECORANA additionally rescales axes to sd units approximating t parameter of the Gaussian model Distorts space, introduces new artefacts and probably should be avoided Nonmetric Multidimensional Scaling (NMDS) should be able to cope with moderately long gradients Constrained ordination may linearize the responses http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 87 / 103 January 2016 88 / 103 Gradient Model and Ordination Running Detrended Correspondence Analysis > (ord <- decorana(dune)) Call: decorana(veg = dune) Detrended correspondence analysis with 26 segments. Rescaling of axes with 4 iterations. DCA1 DCA2 DCA3 DCA4 Eigenvalues 0.512 0.304 0.1213 0.1427 Decorana values 0.536 0.287 0.0814 0.0481 Axis lengths 3.700 3.117 1.3005 1.4789 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination Gradient Model and Ordination Default plot 3 Empenigr Airaprae Hyporadi 2 Salirepe 19 Sagiproc Scorautu Bracruta 1 17 Vicilath 20 Ranuflam Cirsarve Juncarti Callcusp 15 18 Juncbufo Agrostol 1416 12 Trifrepe 11 Eleopalu 13 8 Chenalbu Alopgeni 6 Planlanc 9 4 Comapalu Rumeacet7 510 3 Trifprat 2 Poatriv Poaprat Bellpere −1 0 DCA2 Anthodor Lolipere 1 −2 Bromhord Elymrepe Achimill −3 −2 −1 0 1 2 3 DCA1 http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 89 / 103 Gradient Model and Ordination Community Pattern Simulation PCA PCoA DCA NMDS 4 x 2.5t Truth CA http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 90 / 103 Unconstrained Ordination Gradient Model and Ordination Short Gradients: Is There a Niche for PCA? PCA 2 x 1t Folklore: PCA with short gradients (≤ 2t). Not based on research, but simulation finds PCA uniformly worse than CA: With short gradients about as good as CA, but usually worse. There should be no species optimum within gradient: Shortness alone not sufficient. CA PCA best used for really linear cases (environment) or for reduction of variables into principal components (but see FA). Noise dominates over signal in homogeneous data. http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology Unconstrained Ordination January 2016 91 / 103 Gradient Model and Ordination Long Gradients: DCA or NMDS CA Curvature with long gradients: Need either DCA or NMDS. DCA NMDS is a test winner: More robust than DCA. DCA more popular. DCA may produce new artefacts, since it twists the space. MDS 8x3t http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 92 / 103 Unconstrained Ordination Gradient Model and Ordination Extended Dissimilarities and Step-across 0.8 0.7 0.6 0.3 0.4 Flexible shortest path or their approximations, extended dissimilarities 0.5 Use step-across points to estimate their distance Community Dissimilarity 0.9 1.0 How different are sites that have nothing in common? +++ +++ ++++ ++ ++ ++ + + +++ + + ++ + ++ ++ ++ + ++ + ++ ++ + + +++ ++ ++ ++ ++ +++ + + ++ + + + ++ + ++ + + + ++ + + ++ + + + + + + + + + + + + + + + + + ++ ++ + ++ + + ++ + + + + + + + + + + + + + + ++ + +++ ++ ++ + + + + + ++ + + + + + + + + +++ + + ++ ++ + ++ ++ + ++ + + +++ + ++ + + ++ ++ ++ + + + + + ++ + ++ ++ ++ ++ + ++ + + + + + + + ++ ++ ++ + + + + + + + ++ + + ++ + + + + ++ ++ + + ++ ++ ++ + + ++ + + + + ++ ++ + + + ++ + ++ ++ + ++++++ ++++ ++ + ++++ ++++ ++++ + +++ + ++ ++ + + ++ + + ++ + ++ + +++++ + +++ + + + 2 4 6 8 http://cc.oulu.fi/ jarioksa/ (Oulu) 2.0 1.5 1.0 Extended Dissimilarity No shared species since rare species were not observed: Swan transformation estimates the probability of finding an unobserved species 0.5 Extended dissimilarity: use only one-site steps, do not update dissimilarities below a threshold 2.5 Gradient Distance + + +++ +++ + ++ + ++ ++ + + + ++ ++ + + + + + ++ + ++ ++ ++ + + + ++++ ++ + +++ + ++ + ++ ++ ++ ++ + + + + + ++ + + + + + + + + + + ++ +++ + ++ + + + ++ ++ + + + ++ ++ ++ ++ ++ ++ + ++ ++ + ++ + +++ ++ +++ + + + + + + + + + + + ++ + + + ++ + +++ + ++ ++ ++ ++ ++ + + + ++ ++ + + + ++ + + ++ ++ ++ ++ ++ + ++ + + ++ ++++ + ++ + ++ ++ + + + + ++ ++ + + + ++ + + ++ + + ++ +++ ++ ++ + + + + + + + + ++ + + + ++ ++ ++ +++ ++ + + + ++ + + + +++ ++ +++ + ++ + ++ ++ + ++ ++ ++ + + ++ ++ ++ ++ ++ + + ++ + + ++ + ++ ++ + ++ ++ + + ++ ++ + + ++ ++ + ++ + ++ + + ++ ++ ++ ++ + + + + + + + + + ++ + + ++ + ++ ++ + ++ ++ + + ++ + + + + + + + ++ + + + ++ + ++ + ++ + + + + ++ + ++ + ++ ++ + 2 + 4 6 Multivariate Analysis in Ecology Unconstrained Ordination 8 Gradient Distance January 2016 93 / 103 Gradient Model and Ordination Strong and Weak Ties strong weak stepacross swan Maximum dissimilarities (no shared species) are tied Strong tie treatment tries to keep tied values together and puts maximum dissimilarites to a circle Weak tie treatment allows breaking ties and straightens the axes: now the default in vegan, whereas earlier was impossible 8 x 1.5 sd 8 × 1.5 sd units, Gaussian binary response http://cc.oulu.fi/ jarioksa/ (Oulu) Multivariate Analysis in Ecology January 2016 94 / 103