Appendix 1. Mathematical formulas used for calculation of the surface area and volume of the spheroid, a geometrical model of a pollen grain, results obtained with General Linear Mixed Model fitting and methodology of phylogenetic tree construction.

Pollen morphology characteristics

In our study, we used measurements of the mean polar (P) and equatorial (E) axes of pollen grains to calculate the pollen morphology characteristics used in the analysis. As a geometrical model of a pollen grain we chose a spheroid (i.e., an ellipsoid with two equal radii: 0.5P, 0.5E, 0.5E) with a surface area S and volume V represented by equations 1-4 (public communication: http://mathworld.wolfram.com/Spheroid.html).

$$V = \frac{\pi}{6} P E^{2} \tag{1}$$

for P > E:

$$S = \frac{\pi E^{2}}{2} + \frac{\pi E P}{2z} \arcsin(z), \quad \text{where } z = \sqrt{1 - \frac{E^{2}}{P^{2}}} \tag{2}$$

for P < E:

$$S = \frac{\pi E^{2}}{2} + \frac{\pi P^{2}}{4z} \log\left(\frac{1 + z}{1 - z}\right), \quad \text{where } z = \sqrt{1 - \frac{P^{2}}{E^{2}}} \tag{3}$$

for P = E:

$$S = \pi P^{2} \tag{4}$$

Results of General Linear Mixed Model fitting

To check the robustness of our GLM analyses we performed General Linear Mixed Model fitting with the packages nlme and lme4 in R (Bates et al. 2015; Pinheiro et al. 2015; R Core Team 2015). In the analysis, PC1poll was declared as the response variable, PCclim (PC1clim or PC2clim) as a fixed continuous predictor and 'taxon' as a random categorical predictor. Model selection was performed with the lme4 package (function lmer), followed by a test of effect significance for the best-fit model performed with the lme function (package nlme). The model parameters were estimated using Maximum Likelihood (ML). For both climate principal components, PC1clim and PC2clim, the model without an interaction, i.e. PC1poll ~ PCclim + taxon, fitted the data best according to the Akaike information criterion (AIC) and Akaike weights wi (see Table 1 in Appendix 1). Pollen morphology, described by PC1poll, was significantly associated with the climate variables: p=0.0015 (PC1clim) and p=0.0143 (PC2clim).
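As an illustration of the model selection step, the sketch below shows how Akaike weights can be derived from AIC values using the standard definition (the relative likelihood exp(-0.5 × ΔAIC) of each model, normalised to sum to one). It is written in Python purely for illustration and is not the R code used in the study; the function name akaike_weights is hypothetical. The AIC values are those listed in Table 1 in Appendix 1.

```python
import math


def akaike_weights(aic_values):
    """Return Akaike weights for a list of AIC values, in the same order."""
    best = min(aic_values)
    # Relative likelihood of each model, based on its AIC difference
    # from the best (lowest-AIC) model in the candidate set.
    rel = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel)
    # Normalise so that the weights sum to one.
    return [r / total for r in rel]


# Candidate models, in the row order of Table 1 in Appendix 1.
models = [
    "PC1poll ~ taxon",
    "PC1poll ~ PCclim",
    "PC1poll ~ PCclim + taxon",
    "PC1poll ~ PCclim + taxon + taxon x PCclim",
]

# AIC values copied from Table 1 in Appendix 1.
aic = {
    "PC1clim": [175.3, 213.4, 167.7, 173.9],
    "PC2clim": [175.3, 217.5, 170.9, 179.9],
}

for predictor, values in aic.items():
    for model, w in zip(models, akaike_weights(values)):
        print(f"{predictor}: {model}: w = {w:.2f}")
```

Rounded to two decimals, the weights computed this way agree with those reported in Table 1 in Appendix 1, with the no-interaction model PC1poll ~ PCclim + taxon receiving most of the weight for both climate components.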
The principal component PC1poll increased with increasing values of the climate principal components (common slope ± SE: 0.107 ± 0.033 for PC1clim and 0.070 ± 0.028 for PC2clim). The results obtained with the General Linear Mixed Model were very similar to the results of the GLM analysis reported in the main text (see Results in the main text).

Table 1. Appendix 1. Summary of model selection performed with General Linear Mixed Models. Each cell gives the Akaike weight wi followed by the AIC value in parentheses. The best-fitting model (given in bold) is indicated by the lowest AIC value and the highest Akaike weight. Akaike weights indicate the probability that a given model provides the best fit out of all tested models.

| Model | Fixed predictor: PC1clim | Fixed predictor: PC2clim |
|---|---|---|
| PC1poll ~ taxon | 0.02 (175.3) | 0.10 (175.3) |
| PC1poll ~ PCclim | 0.00 (213.4) | 0.00 (217.5) |
| **PC1poll ~ PCclim + taxon** | **0.94 (167.7)** | **0.89 (170.9)** |
| PC1poll ~ PCclim + taxon + taxon × PCclim | 0.04 (173.9) | 0.01 (179.9) |

Phylogenetic tree construction

We extracted all sequences available for the species included in our study from the GenBank nucleotide database. The data were clustered into sets of homologous sequences using the BLASTclust toolkit (Biegert et al. 2006), and only phylogenetically informative clusters, i.e. those having four or more sequences, were taken into account in further analyses (Table 2 in Appendix 1). The clusters were automatically aligned using MAFFT v. 7.221 (Katoh & Toh 2008) and obvious misalignments were corrected manually. The model of nucleotide substitution and the partitioning scheme were selected using PartitionFinder (Lanfear et al. 2012). The phylogeny was inferred using the Maximum Likelihood approach implemented in RAxML v. 8.1.18 (Stamatakis 2006) for 69 species from eight taxonomic groups: Cardueae, Cayaponia, Centaureinae, Lessingianthus, Matthioleae, Mutisia, Sisymbrieae and Stachys. Insufficient data were available in GenBank for Brunellia and Vernonia, so representatives of these genera were not included in our phylogenetic analyses.
To make sure that the algorithm was not caught in a local optimum, we performed 200 independent searches starting from randomized stepwise addition parsimony trees. The resulting tree was rooted in the middle of the branch connecting monocots and eudicots. To make the tree ultrametric, and thus suitable for the comparative method, the root depth was arbitrarily fixed to 100 and branches were scaled using the Penalized Likelihood method implemented in r8s v. 1.8 (Sanderson 2003). Prior to the phylogenetically informed comparative analysis, we pruned from the tree species belonging to Mutisia (two species), Lessingianthus (two species) and Sisymbrieae (five species), as the number of species in each of these groups was lower than the total number of taxa considered in the comparative analyses. As a result, we obtained a composite phylogeny for 60 species of 6 taxonomic groups: Cardueae, Cayaponia, Centaureinae, Matthioleae, Mutisia, Stachys (see Appendix 4), with tree topology and branch lengths based on the outcome of the genetic data analysis described above.

Table 2. Appendix 1. The phylogenetically informative clusters divided by taxon group: ITS – Internal Transcribed Spacer; trnL – tRNA-Leu gene; trnL-trnF – trnL-trnF intergenic spacer; matK – maturase K gene; rbcL – ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit gene; rps16 – ribosomal protein S16 gene, intron; ndhF – NADH dehydrogenase subunit F gene.

| Taxon | ITS | trnL | trnL-trnF | matK | rbcL | rps16 | ndhF |
|---|---|---|---|---|---|---|---|
| Brunellia | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Cardueae | 13 | 12 | 0 | 6 | 6 | 0 | 6 |
| Cayaponia | 6 | 6 | 6 | 0 | 0 | 0 | 0 |
| Centaureinae | 15 | 2 | 0 | 5 | 3 | 0 | 3 |
| Lessingianthus | 2 | 2 | 0 | 1 | 0 | 0 | 1 |
| Matthioleae | 7 | 0 | 0 | 0 | 0 | 0 | 2 |
| Muscari | 0 | 0 | 0 | 0 | 2 | 0 | 1 |
| Mutisia | 3 | 5 | 0 | 3 | 3 | 0 | 2 |
| Sisymbrieae | 5 | 2 | 0 | 0 | 1 | 0 | 1 |
| Stachys | 10 | 9 | 9 | 0 | 3 | 9 | 0 |
| Vernonia | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

References

Bates D., Maechler M., Bolker B. and Walker S. 2015. lme4: Fitting Linear Mixed-Effects Models. R package version 1.1-9, https://CRAN.R-project.org/package=lme4.
Biegert A., Mayer C., Remmert M., Söding J. and Lupas A.N. 2006. The MPI Bioinformatics toolkit for protein sequence analysis. Nucleic Acids Research 34: W335-W339.
Katoh K. and Toh H. 2008. Recent developments in the MAFFT multiple sequence alignment program. Briefings in Bioinformatics 9: 286-298.
Lanfear R., Calcott B., Ho S.Y.W. and Guindon S. 2012. PartitionFinder: Combined Selection of Partitioning Schemes and Substitution Models for Phylogenetic Analyses. Molecular Biology and Evolution 29: 1695-1701.
Pinheiro J., Bates D., DebRoy S., Sarkar D. and R Core Team. 2015. nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1-122, http://CRAN.R-project.org/package=nlme.
R Core Team. 2015. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Sanderson M.J. 2003. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19: 301-302.
Stamatakis A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688-2690.