ele12346-sup-0001-AppendixS1

Appendix S1 Model performance
1) Additional details of data collection and analysis
Exclusion of species pairs with weak genetic divergence: Sister pairs were included if cytochrome b
sequences (of at least 500 base pairs in length) were available for both members of the pair, and if
GTR-Γ distances (calculated in PAUP* 4.0b10; Swofford 2002) exceeded 0.75 percent divergence. Very
young sisters, whose ages are estimated with error (due to variability in coalescent times and the
stochastic nature of sequence evolution), can have large impacts on parameter values estimated from
the character evolution models that we use. We therefore excluded sisters differing by less than 0.75
percent divergence in order to eliminate the most problematically dated pairs. This cut-off
corresponds to the 350,000 year average difference between cytochrome b coalescence and actual
population splitting (Moore 1995). Sisters with less than 0.75 percent divergence are unlikely to
represent phylogenetically distinct (e.g. reciprocally monophyletic) taxa. This cut-off also eliminates
most sisters whose young estimated divergence ages reflect recent mtDNA introgression across
species boundaries.
Phylogeny generation: In order to determine the approximate age of each sister pair, a phylogeny was
created in BEAST version 1.7.5 (Drummond et al. 2012) from cytochrome b sequence data using a
lognormal relaxed clock (Yule speciation prior) with a GTR-Γ model. The tree topology was fixed (using
Barker et al. 2004 for interfamily relationships and a large number of published molecular phylogenies
for relationships between species and genera within families) and BEAST was used to estimate branch
lengths using a 2% clock (Weir & Schluter 2008). Bayesian analyses were run for 20 million generations
and were sampled every 1000 generations following a burn-in of 10 million generations. Median node
heights were used to estimate species ages (see Appendix S2).
Details of climatic data sorting: Because seasonality may not be synchronous in different geographic
regions, the monthly values for each climatic measure were sorted from highest to lowest for each
locality within a species range, and analyses were performed on the sorted values. For example, if June had
the highest mean monthly precipitation at a given locality, then June would be placed in first rank for
this measure. Sorting reduced overestimation of climatic divergence between sister pairs with
asynchronous seasonality.
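The effect of sorting can be illustrated with a minimal sketch. The monthly values, the six-month offset and the mean absolute difference used as a divergence measure are all hypothetical choices made for illustration; the analysis itself was performed on principal components of the sorted climatic data.

```python
def sort_monthly(values):
    """Return the monthly values ranked from highest to lowest."""
    return sorted(values, reverse=True)


def mean_abs_difference(a, b):
    """Illustrative divergence measure (an assumption, not the paper's metric)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical mean monthly precipitation (mm), January to December.
locality_a = [10, 20, 40, 80, 120, 160, 150, 110, 70, 40, 20, 10]
# The same seasonal cycle shifted by six months (asynchronous seasonality).
locality_b = locality_a[6:] + locality_a[:6]

# Comparing calendar months treats the two localities as very different.
unsorted_divergence = mean_abs_difference(locality_a, locality_b)

# Comparing ranked months shows that the two seasonal cycles are identical.
sorted_divergence = mean_abs_difference(
    sort_monthly(locality_a), sort_monthly(locality_b)
)

print(unsorted_divergence > 0, sorted_divergence == 0)
```

In this toy case the calendar-month comparison reports a large difference that is entirely due to timing, while the ranked comparison reports none.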
Climatic divergences: L2 in Equation 1 refers to the degree of climatic divergence for each sister pair
(|X1i – X2i|) divided by its expected standard deviation under the best fit model (which was an OU
model for PC1 to PC3):

Equation S1
$$L_2 = \frac{|X_{1i} - X_{2i}|}{\sqrt{\dfrac{\beta_i}{\alpha}\left(1 - \exp(-2\alpha T_i)\right)}}$$

where βi and Ti are the evolutionary rate and age of divergence respectively for sister pair i, and α is the
evolutionary constraint parameter. β and α take the maximum likelihood estimates obtained from
fitting the climatic data (Table 1). For PC2, the best fit model was OUnull, and all βi take the same value
estimated from the model. For PC1 and PC3, the best fit model was OUβ-linear and βi is equal to:

Equation S2
$$\beta_i = b_\beta L_{1i} + c_\beta$$

where bβ and cβ are the slope and intercept parameters describing how β changes as a linear function of
the latitude for each sister pair (L1i).
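As a worked illustration of Equations S1 and S2, the following sketch computes the standardized divergence for a single sister pair. The function names and all numerical values are hypothetical; the actual analysis uses the maximum likelihood estimates reported in Table 1.

```python
import math


def beta_linear(latitude, b_beta, c_beta):
    """Equation S2: evolutionary rate as a linear function of latitude."""
    return b_beta * latitude + c_beta


def expected_sd_ou(beta, alpha, age):
    """Denominator of Equation S1: expected SD of divergence under an OU model."""
    return math.sqrt(beta / alpha * (1.0 - math.exp(-2.0 * alpha * age)))


def standardized_divergence(x1, x2, beta, alpha, age):
    """Equation S1: observed divergence divided by its expected SD."""
    return abs(x1 - x2) / expected_sd_ou(beta, alpha, age)


# Hypothetical values for one sister pair (not estimates from the paper).
beta_i = beta_linear(latitude=40.0, b_beta=0.02, c_beta=0.1)
l2 = standardized_divergence(x1=1.5, x2=0.3, beta=beta_i, alpha=0.5, age=2.0)
print(beta_i, l2)
```

Because the exponential term vanishes for old pairs, the expected standard deviation levels off at sqrt(βi/α), which is how the constraint parameter α bounds divergence in the OU model.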
2) Model bias
OU models applied to whole phylogenies (i.e. with a single trait optimum applied to all species) are
reported to suffer bias in the estimation of the parameter α (Thomas et al. in press). Bias in α is detected by
simulating data under a BM model (where α = 0), and then fitting an OU model to the simulated data.
Bias occurs when the OU fit results in a non-zero estimate of α across a series of simulated datasets.
Here we estimate potential bias for our three PCs. We simulated 2000 climatic datasets under the BMnull
model for each of our three climatic PCs. Data were simulated using the maximum likelihood estimates
of evolutionary rate under the BMnull model for each PC, and each simulated dataset was then fit to more
complex models (OUnull and OUβ-linear). Median values of α (for the OUnull and OUβ-linear models)
were estimated to be very close to zero for all three PCs, suggesting almost no bias in α (Table S1).
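The logic of this bias check can be sketched for sister-pair data as follows. This is an illustration under stated assumptions, not the EvoRAG implementation: divergences are treated as signed normal deviates with variance (β/α)(1 − exp(−2αT)), which tends to 2βT as α approaches 0 (the BM case); ages, rate, sample sizes and the α grid are hypothetical; and α is estimated by a simple grid search with β profiled out analytically.

```python
import math
import random
import statistics


def ou_scale(alpha, age):
    """(1 - exp(-2*alpha*T)) / alpha, which equals 2*T in the BM limit alpha = 0."""
    if alpha == 0.0:
        return 2.0 * age
    return -math.expm1(-2.0 * alpha * age) / alpha


def profile_loglik(alpha, diffs, ages):
    """Log-likelihood at a given alpha, with beta set to its analytic ML value."""
    scales = [ou_scale(alpha, t) for t in ages]
    n = len(diffs)
    beta_hat = sum(d * d / s for d, s in zip(diffs, scales)) / n
    return -0.5 * sum(math.log(2.0 * math.pi * beta_hat * s) for s in scales) - 0.5 * n


# Hypothetical search grid: alpha = 0 (BM) plus values from 0.001 to 10.
ALPHA_GRID = [0.0] + [10.0 ** (-3.0 + 0.1 * k) for k in range(41)]


def fit_alpha(diffs, ages):
    """Grid-search ML estimate of alpha."""
    return max(ALPHA_GRID, key=lambda a: profile_loglik(a, diffs, ages))


def median_alpha_under_bm(n_datasets=200, n_pairs=60, beta=1.0, seed=1):
    """Simulate BM data (alpha = 0), fit OU to each dataset, return the median alpha."""
    rng = random.Random(seed)
    ages = [rng.uniform(0.5, 8.0) for _ in range(n_pairs)]  # hypothetical ages
    estimates = []
    for _ in range(n_datasets):
        diffs = [rng.gauss(0.0, math.sqrt(2.0 * beta * t)) for t in ages]
        estimates.append(fit_alpha(diffs, ages))
    return statistics.median(estimates)


median_alpha = median_alpha_under_bm()
print(median_alpha)
```

A median estimate near zero in such a simulation indicates little bias, which is the criterion applied to the values in Table S1.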
Table S1 Estimates of bias in the alpha parameter of OU models.
Trait   Simulated model   Test model    Bias in alpha
PC1     BMnull            OUnull        0.00233
PC1     BMnull            OUβ-linear    0.00006
PC2     BMnull            OUnull        0.00019
PC2     BMnull            OUβ-linear    0.00000
PC3     BMnull            OUnull        0.00136
PC3     BMnull            OUβ-linear    0.00008
2) Model selection thresholds and type I error rates
The model with the lowest AICc (or AIC) is chosen as the best fit of the candidate models. However, less
optimal models may have only slightly higher AICc values than the best fit model, and may need to be
considered. A general “rule of thumb” is to reject all candidate models with AICc values greater than 2
units above the best fit model (∆AICc). However, this threshold value has been reported to result in
elevated Type I error rates in some contexts (e.g. Rabosky 2006), and an appropriate rejection threshold
may depend on the models being compared, the sample size, the distribution of species ages in the
dataset and other factors (see Rabosky 2006; Gavin et al. 2014).
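For reference, the quantities involved can be computed as in the sketch below, which uses the standard definitions of AICc, ∆AICc and Akaike weights (the latter are reported in Table S4). The log-likelihoods, parameter counts and sample size are hypothetical placeholders.

```python
import math


def aicc(log_likelihood, n_params, n_obs):
    """AIC with the small-sample correction."""
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)


def delta_and_weights(scores):
    """Return (delta, Akaike weight) dictionaries for a dict of AICc scores."""
    best = min(scores.values())
    deltas = {model: score - best for model, score in scores.items()}
    relative = {model: math.exp(-0.5 * d) for model, d in deltas.items()}
    total = sum(relative.values())
    weights = {model: r / total for model, r in relative.items()}
    return deltas, weights


# Hypothetical maximised log-likelihoods and parameter counts for 100 sister pairs.
fits = {"BMnull": (-160.0, 1), "BMlinear": (-158.2, 2), "OUnull": (-150.5, 2)}
scores = {model: aicc(ll, k, n_obs=100) for model, (ll, k) in fits.items()}
deltas, weights = delta_and_weights(scores)
print(deltas, weights)
```

The best-fit model always has a ∆AICc of 0, and the weights sum to 1 across the candidate set.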
We used simulation to calculate Type I error for BM and OU models applied to our climatic dataset, and
used these simulations to calculate an appropriate ∆AICc threshold necessary to reject less
parameter-rich models (“null” models) in favor of more parameter-rich models (“test” models). Our approach
follows that of Rabosky (2006). Type I error is the probability of rejecting a true “null” hypothesis in
favor of a more parameter-rich model. To calculate Type I error we used the MLE of parameters fit to
the data under the null models (BMnull and OUnull) in which climatic rates do not vary with latitude, and
simulated climatic Euclidean distances using the same number of sister pairs as in our actual dataset,
with the same ages and latitudes as our actual data. Four thousand datasets were simulated in EvoRAG
for each of the three climatic PCs. For each simulated dataset, we then calculated the likelihood fit to
the set of null models and to models in which climatic rates vary with latitude (BMlinear, OUβ-linear). The
Type I error is the proportion of simulations for which AIC was lower (and thus favored) for the
rate-variable models.
Type I errors are shown in Table S2 and never exceed 0.17 for the three climatic PCs. A Type I
error rate less than 0.05 indicates that the “null” model can be rejected in favor of the “test” model
whenever the ∆AIC between “null” and “test” is greater than 0. When Type I error rates exceed 0.05,
a positive threshold for ∆AICc scores between the “null” and “test” models needs to be established
in order to maintain a Type I error rate ≤ 0.05. The 95th percentile of the distribution of ∆AIC values
between the set of null and alternative models defines this threshold level. In order to reject rate
constancy across latitude in favor of a model in which rates vary with latitude, a ∆AICc value of 1.5 to 1.9
is required, depending on the PC (Table S2). Here we use the more conservative value of 1.9 throughout for
all PCs.
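The two summaries described here (the proportion of null simulations that favor the test model, and the 95th percentile of ∆AIC) can be sketched as follows. The ∆AIC values in this example are not EvoRAG output. As a labeled assumption, they are drawn from the large-sample approximation for a test model with one extra parameter, in which twice the log-likelihood gain follows a chi-square distribution with one degree of freedom, so that ∆AIC = AIC(null) − AIC(test) is that deviate minus 2.

```python
import random


def type_i_error(delta_aic):
    """Proportion of null simulations in which the test model has the lower AIC."""
    return sum(d > 0 for d in delta_aic) / len(delta_aic)


def percentile(values, q):
    """q-th percentile (0 to 100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


rng = random.Random(1)
# ASSUMPTION: asymptotic stand-in for simulated fits (chi-square(1) deviate minus 2).
delta_aic = [rng.gauss(0.0, 1.0) ** 2 - 2.0 for _ in range(100_000)]

error = type_i_error(delta_aic)        # share of simulations with delta AIC > 0
threshold = percentile(delta_aic, 95)  # delta AIC needed for a 5% error rate
print(round(error, 3), round(threshold, 2))
```

Under this approximation the error rate at a threshold of 0 is close to 0.157 and the 95th percentile is close to 1.84, which is of the same order as the simulated BMnull values in Table S2. Simulation remains necessary because the approximation need not hold for the OU models or for the actual distribution of ages and latitudes.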
Table S2 Estimates of Type I error, and the threshold ∆AICc required to reject a null model of no effect
of latitude on climatic rates with a Type I error rate ≤ 0.05.
Trait   Null model   Type I error   Threshold ∆AICc
PC1     BMnull       0.16           1.9
PC1     OUnull       0.12           1.5
PC2     BMnull       0.15           1.9
PC2     OUnull       0.13           1.5
PC3     BMnull       0.17           1.9
PC3     OUnull       0.14           1.7
3) Power analysis
At the request of reviewers we performed retrospective power analyses for our BM and OU models, but
note that such analyses are controversial, and many statisticians deem them logically flawed when used
to try to interpret non-significant results (e.g. Nakagawa & Foster 2004). To determine the retrospective
statistical power under our hypothesis that rates of climatic evolution vary with latitude, we found the
maximum likelihood parameter estimates of the best fit gradient model (BMlinear, OUβ-linear) for our data
(performed separately for PC1, PC2, and PC3), simulated 1000 climatic datasets under those parameter
estimates and then fit the gradient models and null models (BMnull and OUnull) to each simulated dataset.
Statistical power is the proportion of simulations that correctly rejected the null models in favor of the
gradient models. We used a ∆AICc threshold of 1.9 (see above) as our rejection criterion in order to
maintain a Type I error rate ≤ 0.05.
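A simplified version of this procedure, restricted to the BM models, might look like the sketch below. It is not the EvoRAG implementation. As labeled assumptions, divergences are simulated as signed normal deviates with variance 2βiTi, ages and latitudes are drawn at random rather than taken from the data, AIC is used without the small-sample correction, and the gradient model is fit by a grid search over the ratio of slope to intercept.

```python
import math
import random


def loglik_at_ml_scale(diffs, shape):
    """Normal log-likelihood with Var(d_i) = c * shape_i, at the ML value of c."""
    n = len(diffs)
    c_hat = sum(d * d / s for d, s in zip(diffs, shape)) / n
    return -0.5 * sum(math.log(2.0 * math.pi * c_hat * s) for s in shape) - 0.5 * n


# Hypothetical grid for r = slope / intercept; r = 0 recovers the null model.
R_GRID = [-0.015, -0.01, -0.005, 0.0] + [0.005 * 1.35 ** k for k in range(20)]


def delta_aic(diffs, ages, lats):
    """AIC(BMnull) - AIC(BMlinear); positive values favor the gradient model."""
    ll_null = loglik_at_ml_scale(diffs, [2.0 * t for t in ages])
    ll_test = max(
        loglik_at_ml_scale(
            diffs, [2.0 * t * (1.0 + r * lat) for t, lat in zip(ages, lats)]
        )
        for r in R_GRID
    )
    return (2.0 * 1 - 2.0 * ll_null) - (2.0 * 2 - 2.0 * ll_test)


def power(slope, intercept, threshold=1.9, n_sims=200, n_pairs=80, seed=1):
    """Proportion of datasets simulated under the gradient model that reject the null."""
    rng = random.Random(seed)
    ages = [rng.uniform(0.5, 8.0) for _ in range(n_pairs)]   # hypothetical ages
    lats = [rng.uniform(0.0, 60.0) for _ in range(n_pairs)]  # hypothetical latitudes
    rejections = 0
    for _ in range(n_sims):
        diffs = [
            rng.gauss(0.0, math.sqrt(2.0 * (slope * lat + intercept) * t))
            for t, lat in zip(ages, lats)
        ]
        if delta_aic(diffs, ages, lats) > threshold:
            rejections += 1
    return rejections / n_sims


power_gradient = power(slope=0.05, intercept=0.1)  # strong hypothetical gradient
power_flat = power(slope=0.0, intercept=1.0)       # no gradient: rejection rate near alpha
print(power_gradient, power_flat)
```

Running the same function with a slope of zero gives the false-positive rate at the chosen threshold, which provides a quick check that the threshold behaves as intended.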
Results are shown in Table S3 and indicate very high retrospective statistical power for PC1 and PC3. For
PC2, retrospective statistical power was very low. This is not surprising given that OUnull was the best fit for
this PC. The retrospective statistical power for PC2 is not informative about whether OUnull was the best
fit because it really is the true model, or because OUβ-linear simply lacked statistical power. However,
visual inspection of the raw data (shown in Fig. 2) does not show any clear differences in how climatic
divergence accumulates through time for tropical and temperate sister pairs, and we conclude that if
there is a latitudinal effect that went undetected due to low power, then the effect was likely very weak.
Table S3 Retrospective statistical power for climatic PCs 1 to 3. The “test” model is the alternative model
under which data are simulated. The “null” model is the less parameter-rich model fit to the simulated
data. The statistical power is calculated using a ∆AICc threshold value of 2.3 in order to reject the “null”
model in favor of the “test” model.
Trait   Test models              Null models        Statistical power using critical ∆AICc value of 2.3
PC1     BMlinear & OUβ-linear    BMnull & OUnull    0.893
PC2     BMlinear & OUβ-linear    BMnull & OUnull    0.058
PC3     BMlinear & OUβ-linear    BMnull & OUnull    0.985
4) Analysis without Neotropical migrants included
Excluding Neotropical migrants (so that only resident species and species that migrate locally
within the tropics or within the Nearctic are included) had little effect on model fits. Each PC continued
to favor the same model as when Neotropical migrants were included (Table S4).
Table S4 Support for BM and OU models of climatic niche evolution when excluding Neotropical
migrants. ΔAICc scores (AICc for each model – smallest AICc score) and Akaike weights (wAICc) are used
as metrics of model support. The best-fit model has the smallest ΔAICc value of 0 (bold). Akaike weights
indicate the relative support for each model within the candidate set. N indicates the number of parameters in each model. β
slope describes how the evolutionary rate changes with latitude.
                  PC1                        PC2                        PC3
MODEL        N    ΔAICc   wAICc   β slope    ΔAICc   wAICc   β slope    ΔAICc   wAICc   β slope
BMnull       1    37.48   0.000   NA         19.25   0.000   NA         36.52   0.000   NA
BMlinear     2    19.58   0.000   0.173      18.04   0.000   0.054      10.18   0.006   0.082
OUnull       2    14.69   0.001   NA          0.00   0.706   NA         20.57   0.000   NA
OUβ-linear   4     0.00   0.999   3.798       1.75   0.294   0.084       0.00   0.994   0.025
Literature Cited
Barker, F.K., Cibois, A., Schikler, P.A., Feinstein, J. & Cracraft, J. (2004). Phylogeny and diversification of
the largest avian radiation. Proc. Natl. Acad. Sci., 101, 11040–11045.
Drummond, A.J., Suchard, M.A., Xie, D. & Rambaut, A. (2012). Bayesian phylogenetics with BEAUti and
the BEAST 1.7. Mol. Biol. Evol., 29, 1969–1973.
Moore, W.S. (1995). Inferring phylogenies from mtDNA variation: mitochondrial gene trees vs. nuclear
gene trees. Evolution, 49, 718–726.
Nakagawa, S. & Foster, T.M. (2004). The case against retrospective statistical power analyses with an
introduction to power analysis. Acta Ethol., 7, 103–108.
Rabosky, D. (2006). Likelihood methods for detecting temporal shifts in diversification rates. Evolution,
60, 1152–1165.
Swofford, D.L. (2002). PAUP* 4.0b10: phylogenetic analysis using parsimony (*and other methods).
Sunderland, MA: Sinauer Associates.
Thomas, G.H., Cooper, N., Venditti, C., Meade, A. & Freckleton, R.P. (In press). Bias and measurement
error in comparative analyses: a case study with the Ornstein Uhlenbeck model. bioRxiv,
http://dx.doi.org/10.1101/004036
Weir, J.T. & Schluter, D. (2008). Calibrating the avian molecular clock. Mol. Ecol., 17, 2321–2328.