Appendix 1. Mathematical formulas used to calculate the surface area and volume of the spheroid, a
geometrical model of a pollen grain; results obtained with General Linear Mixed Model fitting; and the
methodology of phylogenetic tree construction.
Pollen morphology characteristics
In our study, we used measurements of mean polar (P) and equatorial (E) axes of pollen grains to calculate the
characteristics of pollen morphology used in the analysis. As a geometrical model of a pollen grain we chose a
spheroid (i.e., ellipsoid with two equal radii, 0.5P, 0.5E, 0.5E) with a surface area S and volume V represented
by equations 1-4 (public communication: http://mathworld.wolfram.com/Spheroid.html).
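(1) $V = \frac{\pi}{6} P E^{2}$
(2) for P>E: $S = \frac{\pi E^{2}}{2} + \frac{\pi E P}{2z} \arcsin(z)$, where $z = \sqrt{1 - \left(\frac{E}{P}\right)^{2}}$
(3) for P<E: $S = \frac{\pi}{2} \left( E^{2} + \frac{P^{2}}{2z} \ln\frac{1+z}{1-z} \right)$, where $z = \sqrt{1 - \left(\frac{P}{E}\right)^{2}}$
(4) for P=E: $S = \pi P^{2}$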
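Equations 1-4 can be evaluated directly from the two measured axes. The following is a minimal Python sketch, not the authors' code; the function names spheroid_volume and spheroid_surface are illustrative, and the logarithm in equation 3 is taken to be the natural logarithm, as in the cited MathWorld formulas.

```python
import math


def spheroid_volume(P, E):
    """Volume of a spheroid with polar axis P and equatorial axis E (equation 1)."""
    return math.pi / 6.0 * P * E ** 2


def spheroid_surface(P, E):
    """Surface area of a spheroid with polar axis P and equatorial axis E (equations 2-4)."""
    if P <= 0 or E <= 0:
        raise ValueError("P and E must be positive")
    if math.isclose(P, E):
        # Equation 4: the spheroid is a sphere of diameter P.
        return math.pi * P ** 2
    if P > E:
        # Equation 2: prolate spheroid.
        z = math.sqrt(1.0 - (E / P) ** 2)
        return math.pi * E ** 2 / 2.0 + math.pi * E * P / (2.0 * z) * math.asin(z)
    # Equation 3: oblate spheroid.
    z = math.sqrt(1.0 - (P / E) ** 2)
    return math.pi / 2.0 * (E ** 2 + P ** 2 / (2.0 * z) * math.log((1.0 + z) / (1.0 - z)))
```

As a consistency check, both the prolate and the oblate expression approach the sphere value πP² as P approaches E, so the three cases join continuously.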
Results of General Linear Mixed Model fitting
To check the robustness of our GLM analyses, we fitted General Linear Mixed Models with the packages
nlme and lme4 implemented in R (Bates et al. 2015; Pinheiro et al. 2015; R Core Team 2015). In the
analysis, PC1poll was declared as the response variable, PCclim (PC1clim or PC2clim) as a fixed continuous
predictor and ‘taxon’ as a random categorical predictor. Model selection was performed with the lme4
package (function lmer), followed by a test of effect significance for the best-fit model performed with
the lme function (package nlme). The model parameters were estimated using the Maximum Likelihood (ML)
method.
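Written out, the additive model PC1poll ~ PCclim + taxon can be sketched as below. This assumes that ‘taxon’ enters as a random intercept; the text does not spell out the random-effects structure, and the symbols β0, β1, u_j, ε_ij, σ_u and σ are notation introduced here, with i indexing observations within taxon j.

```latex
\mathrm{PC1poll}_{ij} = \beta_0 + \beta_1\,\mathrm{PCclim}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^{2}), \quad \varepsilon_{ij} \sim N(0, \sigma^{2})
```

Under this reading, β1 is the slope shared by all taxa and u_j shifts the intercept for taxon j.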
For both climate principal components, PC1clim and PC2clim, the model with no interaction, i.e.
PC1poll ~ PCclim + taxon, fitted the data best according to the Akaike information criterion (AIC) and
Akaike weights wi (see Table 1 in Appendix 1). Pollen morphology, described by PC1poll, was
significantly associated with the climate variables (p=0.0015 for PC1clim and p=0.0143 for PC2clim).
PC1poll increased with increasing values of the climate principal components (common slope ± SE:
0.107±0.033 for PC1clim and 0.070±0.028 for PC2clim). The results obtained with the General Linear Mixed
Model were very similar to those obtained with the GLM analysis reported in the main text (see Results in
the main text).
Table 1. Appendix 1. Summary of model selection performed with General Linear Mixed Models. The
best-fitting models (given in bold) are indicated by the lowest AIC value (given in parentheses) and the
highest Akaike weight wi (given in italics). Akaike weights indicate the probability that a given model
provides the best fit out of all tested models.
Model                                      Fixed predictor
                                           PC1clim         PC2clim
PC1poll ~ taxon                            0.02 (175.3)    0.10 (175.3)
PC1poll ~ PCclim                           0.00 (213.4)    0.00 (217.5)
PC1poll ~ PCclim + taxon                   0.94 (167.7)    0.89 (170.9)
PC1poll ~ PCclim + taxon + taxon×PCclim    0.04 (173.9)    0.01 (179.9)
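The Akaike weights in Table 1 can be recomputed from the listed AIC values with the standard formula wi = exp(-Δi/2) / Σj exp(-Δj/2), where Δi is the difference between the AIC of model i and the lowest AIC in the set. The Python sketch below is illustrative only (the function name akaike_weights is not from the source), and it uses the AIC values as rounded in the table.

```python
import math


def akaike_weights(aic_values):
    """Return Akaike weights for a list of AIC values from one candidate set."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best model.
    rel = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Model order: taxon; PCclim; PCclim + taxon; PCclim + taxon + taxon x PCclim.
aic = {
    "PC1clim": [175.3, 213.4, 167.7, 173.9],
    "PC2clim": [175.3, 217.5, 170.9, 179.9],
}
weights = {name: [round(w, 2) for w in akaike_weights(values)]
           for name, values in aic.items()}
print(weights)
```

Rounded to two decimals, the recomputed weights agree with the values listed in Table 1.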
Phylogenetic tree construction
We extracted all the sequences available for the species included in our study from the GenBank nucleotide
database. The data were clustered into sets of homologous sequences using the BLASTclust toolkit (Biegert
et al. 2006), and only phylogenetically informative clusters, i.e. those having four or more sequences, were
taken into account in further analyses (Table 2 in Appendix 1). The clusters were automatically aligned using
MAFFT v. 7.221 (Katoh & Toh 2008) and obvious misalignments were corrected manually. The model of
nucleotide substitution and the partitioning scheme were selected using PartitionFinder (Lanfear et al. 2012).
The phylogeny was inferred using the Maximum Likelihood approach implemented in RAxML v. 8.1.18
(Stamatakis 2006) for 69 species from eight taxonomic groups: Cardueae, Cayaponia, Centaureinae,
Lessingianthus, Matthioleae, Mutisia, Sisymbrieae and Stachys. Insufficient data were available in GenBank
for Brunellia and Vernonia, so representatives of these genera were not included in our phylogenetic
analyses. To make sure that the algorithm was not caught in a local optimum, we performed 200
independent searches starting from randomized stepwise addition parsimony trees. The resulting tree was
rooted in the middle of the branch connecting monocots and the remaining eudicots. To make the tree
ultrametric, and thus feasible for the comparative method, the root depth was arbitrarily fixed to 100 and
branches were scaled using the Penalized Likelihood method implemented in r8s v. 1.8 (Sanderson 2003).
Prior to the phylogenetically informed comparative analysis, we pruned from the tree the species belonging to
Mutisia (two species), Lessingianthus (two species) and Sisymbrieae (five species), as the number of species
was lower than the total number of taxa considered in comparative analyses. As a result, we obtained a
composite phylogeny for 60 species of 6 taxonomic groups: Cardueae, Cayaponia, Centaureinae, Matthioleae,
Mutisia, Stachys (see Appendix 4), with tree topology and branch lengths based on the outcome of the genetic
data analysis described above.
Table 2. Appendix 1. The phylogenetically informative clusters divided by taxon group: ITS – Internal
Transcribed Spacer; trnL – tRNA-Leu gene; trnL-trnF – trnL-trnF intergenic spacer; matK – maturase K gene;
rbcL – ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit gene; rps16 – ribosomal protein S16
gene, intron; ndhF – NADH dehydrogenase subunit F gene.
Taxon            ITS   trnL   trnL-trnF   matK   rbcL   rps16   ndhF
Brunellia          0      0           0      0      0       0      0
Cardueae          13     12           0      6      6       0      6
Cayaponia          6      6           6      0      0       0      0
Centaureinae      15      2           0      5      3       0      3
Lessingianthus     2      2           0      1      0       0      1
Matthioleae        7      0           0      0      0       0      2
Muscari            0      0           0      0      2       0      1
Mutisia            3      5           0      3      3       0      2
Sisymbrieae        5      2           0      0      1       0      1
Stachys           10      9           9      0      3       9      0
Vernonia           0      0           0      0      0       0      0
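The counts in Table 2 can be tallied per marker and per taxon to see which groups contributed no sequences. The Python sketch below simply transcribes the table; the variable names are illustrative, and treating each table cell as a number of sequences is an assumption, since the caption does not state the unit.

```python
markers = ["ITS", "trnL", "trnL-trnF", "matK", "rbcL", "rps16", "ndhF"]

# Rows transcribed from Table 2, in the same column order as `markers`.
counts = {
    "Brunellia":      [0, 0, 0, 0, 0, 0, 0],
    "Cardueae":       [13, 12, 0, 6, 6, 0, 6],
    "Cayaponia":      [6, 6, 6, 0, 0, 0, 0],
    "Centaureinae":   [15, 2, 0, 5, 3, 0, 3],
    "Lessingianthus": [2, 2, 0, 1, 0, 0, 1],
    "Matthioleae":    [7, 0, 0, 0, 0, 0, 2],
    "Muscari":        [0, 0, 0, 0, 2, 0, 1],
    "Mutisia":        [3, 5, 0, 3, 3, 0, 2],
    "Sisymbrieae":    [5, 2, 0, 0, 1, 0, 1],
    "Stachys":        [10, 9, 9, 0, 3, 9, 0],
    "Vernonia":       [0, 0, 0, 0, 0, 0, 0],
}

# Column totals: number of entries per marker across all taxa.
per_marker = {m: sum(row[i] for row in counts.values()) for i, m in enumerate(markers)}

# Row totals: number of entries per taxon across all markers.
per_taxon = {taxon: sum(row) for taxon, row in counts.items()}

# Taxa with no entries at all.
no_data = [taxon for taxon, n in per_taxon.items() if n == 0]

print(per_marker)
print(no_data)
```

The taxa with empty rows are Brunellia and Vernonia, the two genera that the text reports as excluded for lack of GenBank data.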
References
Bates D., Maechler M., Bolker B. and Walker S. 2015. lme4: Fitting Linear Mixed-Effects Models.
R package version 1.1-9, https://CRAN.R-project.org/package=lme4.
Biegert A., Mayer C., Remmert M., Söding J. and Lupas A.N. 2006. The MPI Bioinformatics toolkit
for protein sequence analysis. Nucleic Acids Research 34: W335-W339.
Katoh K. and Toh H. 2008. Recent developments in the MAFFT multiple sequence alignment
program. Briefings in Bioinformatics 9: 286-298.
Lanfear R., Calcott B., Ho S.Y.W. and Guindon S. 2012. PartitionFinder: Combined Selection of
Partitioning Schemes and Substitution Models for Phylogenetic Analyses. Molecular Biology and
Evolution 29: 1695-1701.
Pinheiro J., Bates D., DebRoy S., Sarkar D. and R Core Team. 2015. nlme: Linear and Nonlinear Mixed
Effects Models. R package version 3.1-122, http://CRAN.R-project.org/package=nlme.
R Core Team. 2015. A Language and Environment for Statistical Computing. R Foundation for
Statistical Computing, Vienna, Austria.
Sanderson M.J. 2003. r8s: inferring absolute rates of molecular evolution and divergence times in the
absence of a molecular clock. Bioinformatics 19: 301-302.
Stamatakis A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with
thousands of taxa and mixed models. Bioinformatics 22: 2688-2690.