Appendix A
Simulation procedures to assess the performance of the beta-diversity and phylogenetic community composition approaches
The first set of simulations was based on a community matrix of 100 sites and 50 species, whereas the second set of simulations was based on a grassland plant community data set (Kembel and Cahill 2011) containing 76 species distributed across 27 communities in grasslands in Alberta, Canada. The data set also included five environmental variables: habitat type (mixedgrass or fescue grassland), slope, aspect, slope position, and moisture regime. All variables were continuous, with the exception of habitat type, which was coded as either 1 (fescue) or 2 (mixedgrass).
The first set of simulations was generated as follows:
1 – Generate an environmental vector E (100 communities x 1) containing uniformly distributed random values between 0 and 100.
2 – Generate a phylogenetically structured vector P (50 species x 1) using a Brownian evolved “trait”. The trait was generated using the function fastBM in the R package phytools (Revell 2012). P was then transformed to vary between -1 and 101.
3 – Generate a vector h (50 species x 1) containing uniformly distributed random values between 0 and 30. These values represent the height (expected maximum abundance) of any given species at its optimum.
4 – Generate a vector σ (50 species x 1) containing normally distributed random values with standard deviation 10 and a mean tolerance µtol. We considered two simulation scenarios in which µtol was set to either 5 or 10, the latter providing a greater tolerance and hence a weaker expected signal between species distributions, phylogeny and environment.
5 – Generate a unimodal response for the jth species at the ith site as follows:
$$L_{ij} = h_j \exp\left[\frac{-(E_i - P_j)^2}{2\sigma_j^2}\right]$$
The values in Lij were then transformed into Poisson deviates in order to generate a species distribution abundance matrix. Although the values in L represent abundances, we have in this paper considered only the case of presence-absence data and therefore values were transformed accordingly.
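Steps 1 to 5, together with the Poisson and presence-absence transformations, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the simulations: the original trait was generated with fastBM on a phylogeny, whereas the hypothetical helper `brownian_trait` below evolves the trait along a random bifurcating tree with unit branch lengths. The function names, the seed, and the use of Knuth's multiplication method for the Poisson draws are likewise choices made only for this sketch.

```python
import math
import random


def brownian_trait(species, start, rng, out):
    """Evolve a trait by Brownian motion down a random bifurcating tree.

    Assumption: stand-in for phytools::fastBM. Every branch has unit
    length, so each branch adds an independent N(0, 1) increment.
    """
    if len(species) == 1:
        out[species[0]] = start
        return
    cut = rng.randint(1, len(species) - 1)  # random split into two clades
    for clade in (species[:cut], species[cut:]):
        brownian_trait(clade, start + rng.gauss(0.0, 1.0), rng, out)


def rescale(values, lo, hi):
    """Linearly map values onto the interval [lo, hi]."""
    vmin, vmax = min(values), max(values)
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def poisson(lam, rng):
    """Draw one Poisson deviate with mean lam (Knuth's multiplication method)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate(n_sites=100, n_species=50, mu_tol=5.0, seed=1):
    rng = random.Random(seed)
    E = [rng.uniform(0, 100) for _ in range(n_sites)]            # step 1
    trait = {}
    brownian_trait(list(range(n_species)), 0.0, rng, trait)      # step 2
    P = rescale([trait[j] for j in range(n_species)], -1, 101)
    h = [rng.uniform(0, 30) for _ in range(n_species)]           # step 3
    sigma = [rng.gauss(mu_tol, 10) for _ in range(n_species)]    # step 4
    L = [[h[j] * math.exp(-(E[i] - P[j]) ** 2 / (2 * sigma[j] ** 2))
          for j in range(n_species)] for i in range(n_sites)]    # step 5
    counts = [[poisson(lam, rng) for lam in row] for row in L]   # Poisson deviates
    presence = [[1 if c > 0 else 0 for c in row] for row in counts]
    return E, P, h, sigma, L, presence
```

Calling `simulate(mu_tol=10.0)` corresponds to the scenario with the greater mean tolerance.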
We have also considered the situation of two gradients. In this case, we generated two independent phylogenetically structured traits (Pj1 and Pj2) and two independent environmental gradients (Ei1 and Ei2), and L was generated as follows:
$$L_{ij} = h_j \exp\left[\frac{-(E_{i1} - P_{j1})^2}{2\sigma_j^2}\right] \exp\left[\frac{-(E_{i2} - P_{j2})^2}{2\sigma_j^2}\right]$$
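The two-gradient response is simply the product of two single-gradient Gaussian terms sharing one tolerance per species. A minimal Python sketch follows; the function name and argument layout are hypothetical and chosen only for illustration.

```python
import math


def response_two_gradients(h_j, sigma_j, e_i, p_j):
    """Expected abundance L_ij of species j at site i on two gradients.

    e_i = (E_i1, E_i2) are the site's environmental values and
    p_j = (P_j1, P_j2) the species' optima; sigma_j is used on both gradients.
    """
    value = h_j
    for e, p in zip(e_i, p_j):
        value *= math.exp(-(e - p) ** 2 / (2 * sigma_j ** 2))
    return value
```

A species sitting at its optimum on both gradients reaches its full height `h_j`; moving one tolerance unit away on a single gradient multiplies the response by exp(-1/2).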
Note that the same tolerance was used for each species in both gradients.
In order to assess the type I error and statistical power of the two frameworks, we considered four scenarios:
1 – Both phylogeny and environment were unimportant in structuring L. Instead of using a phylogenetically structured trait, we used a P simply containing normally distributed values without any phylogenetic signal. After L was generated, another E vector also containing uniformly distributed random values between 0 and 100 was used in the gradient analyses instead.
2 – Only environment but not phylogeny was important. Instead of using a phylogenetically structured trait, we generated L based on a P vector containing normally distributed values without any phylogenetic signal, and the original E used to generate L was used in the gradient analyses.
3 – Only phylogeny but not environment structured species distributions. L was generated using the original P vector, but once L was generated, E was replaced by another randomly generated vector that was then used in the gradient analysis instead of the original one.
4 – Both phylogeny and environment structured species distributions. L was generated using the original P and E, which in turn were also used in the gradient analyses.
For each scenario, we generated 1000 sample matrices with 1 or 2 gradients based on two values for µtol (5 or 10), giving a total of 16 000 simulations (4 scenarios x 2 numbers of gradients x 2 tolerance values x 1000 matrices) involving the three test procedures (row, column and row/column). Given that calculation of the phylogenetic beta-diversity matrices (100 x 100) was computationally intensive, especially because they need to be recalculated at each permutation involving column randomizations, we limited the number of permutations to 99.
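The four scenarios differ only in which P is used to build L and which E is handed to the gradient analysis. The Python sketch below makes that bookkeeping explicit for the single-gradient case. It is an illustration under assumptions: the function name `scenario_inputs` is hypothetical, and the mean and standard deviation of the signal-free P (here 0 and 1, before any rescaling) are placeholders, since the text does not specify them.

```python
import itertools
import random


def scenario_inputs(scenario, P_phylo, n_sites, rng):
    """Return (P used to build L, E used to build L, E used in the analysis)."""
    E_gen = [rng.uniform(0, 100) for _ in range(n_sites)]
    if scenario in (1, 2):
        # Phylogeny unimportant: P carries no phylogenetic signal.
        # Assumption: N(0, 1) placeholder values.
        P_gen = [rng.gauss(0.0, 1.0) for _ in P_phylo]
    else:
        P_gen = list(P_phylo)
    if scenario in (1, 3):
        # Environment unimportant: analyse a fresh, unrelated E.
        E_test = [rng.uniform(0, 100) for _ in range(n_sites)]
    else:
        E_test = E_gen
    return P_gen, E_gen, E_test


# Full design: scenario x number of gradients x mean tolerance.
DESIGN = list(itertools.product((1, 2, 3, 4), (1, 2), (5, 10)))
N_MATRICES = 1000
TOTAL_SIMULATIONS = len(DESIGN) * N_MATRICES  # 16 combinations x 1000 matrices
```

Type I error is then assessed from the combinations in which the tested factor played no role in generating or analysing L, and power from the others.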
Using the phylogenetic community composition approach (see results for the empirical data set), both the row- and column-based permutation tests showed a significant link between the grassland species distributions, phylogeny, and environmental affinities. Moreover, because both the row- and column-based approaches were significant, we knew that both phylogeny and environment were related to the grassland species distributions.
Because the phylogeny was related to species distributions, we needed to condition species distributions on phylogenetic information that was capable of mimicking the scenarios used in the first set of simulations:
1 – Both phylogeny and environment were unimportant in structuring L. We created a vector containing normally distributed values without any phylogenetic signal and built a “phylogeny” (dendrogram) based on this vector that was then applied in both frameworks; a row-permuted version of matrix E was used instead of the original matrix of environmental variables.
2 – Only environment but not phylogeny was important. A random phylogeny was created in the same way as in the first case, but the original matrix E was used in the gradient analysis.
3 – Only phylogeny but not environment structured species distributions. We created a vector containing normally distributed values with a phylogenetic signal and built a “phylogeny” (dendrogram) based on this vector that was then used in the gradient analyses; a row-permuted version of E was used instead of the original one.
4 – Both phylogeny and environment structured species distributions. We used a “phylogeny” created in the same way as in scenario 3 and E was not manipulated.
Two types of phylogenetically structured vectors were used in this latter scenario: one containing a weak signal based on a “trait” evolved under a Brownian motion model, and another containing a strongly phylogenetically conserved trait. The conserved trait was generated by manipulating the phylogenetic tree according to Pagel’s (1999) delta transformation. By giving delta a value of 0.01, branch lengths were much shorter than the original values, and using this tree allowed us to generate highly phylogenetically conserved traits. We conducted 1000 simulations for each scenario. Although the phylogenetic community composition (PCC) approach is computationally much faster, we restricted the number of permutations to 99 so that type I error and power estimates (the number of rejections over 1000) were comparable across both frameworks.
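The way 99 permutations per test turn into a rejection rate over 1000 simulations can be illustrated as follows. This Python sketch rests on assumptions not stated in the text: a one-sided test in which larger statistics are more extreme, the conventional p-value that counts the observed statistic among the possible outcomes, and a significance level of 0.05. Both function names are hypothetical.

```python
def permutation_p_value(observed, permuted):
    """One-sided permutation p-value.

    The observed statistic is counted as one of the possible outcomes,
    so 99 permutations give a smallest attainable p-value of 1/100.
    """
    n_as_extreme = sum(1 for s in permuted if s >= observed)
    return (n_as_extreme + 1) / (len(permuted) + 1)


def rejection_rate(p_values, alpha=0.05):
    """Proportion of simulations in which the null hypothesis was rejected.

    This estimates type I error when the null is true by construction
    and statistical power otherwise. alpha = 0.05 is an assumption.
    """
    return sum(1 for p in p_values if p <= alpha) / len(p_values)
```

With 1000 simulated matrices per scenario, `rejection_rate` is the number of rejections over 1000 referred to above.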