Electronic Supplementary Material – Text and Figure Captions

Bytebier et al. - Estimating the age of fire in the Cape flora of South Africa from an orchid phylogeny

Expanded Materials and Methods

Phylogenetic analyses
Phylogenetic relationships were inferred for 7 outgroup and 136 ingroup taxa, representing 70% of all recognised Disa species and infraspecific taxa. One nuclear and two plastid gene regions were sequenced and compiled in a matrix with 4094 characters, 1096 (26.8%) of which were parsimony informative. In a parsimony analysis, 87 nodes of 142 (61%) were supported with bootstrap support values of 75% or higher, while the topology resulting from a Bayesian inference analysis had 101 (71%) nodes with a posterior probability of 0.95 or above. The phylogenetic analysis is discussed in detail in Bytebier et al. (2007).

Molecular dating
Several methods have been proposed for estimating absolute divergence times in phylogenies, ranging from strict molecular clocks to methods that allow each lineage to evolve at its own rate (Renner 2005; Rutschmann 2006). Since a robust age estimation was critical to this study, we applied two widely used algorithms that make different assumptions: Penalized Likelihood (Sanderson 2002) and BEAST (Drummond & Rambaut 2007). In all cases phylogenetic uncertainty was taken into account.

A strict molecular clock for the maximum a posteriori (MAP) tree was rejected by a likelihood ratio test (Felsenstein 1981). The Penalized Likelihood analysis was run in the software r8s v. 1.71 (Sanderson 2002). A cross-validation procedure was conducted to identify the best-fitting smoothing value for the MAP tree, by testing 14 different smoothing values ranging from -3.0 to 3.5 log10(smooth).
The optimal value found (0.01) was then used for independently dating 1,000 randomly selected trees from the stationary Bayesian sample, using the settings num_restarts=3, num_time_guesses=3, penalty=yes, maxiter=2000. We then performed 5 independent runs in BEAST v. 1.5.4 (Drummond & Rambaut 2007), with 10 million generations each, sampling every 500th tree, using a Yule process of speciation as tree prior, an uncorrelated lognormal molecular clock, and the same substitution and site heterogeneity models (GTR+Γ+I) as for the MrBayes analysis (Bytebier et al. 2007). Convergence of the independent runs and effective sample sizes (>100 for all parameters) were assessed in Tracer v. 1.5 (Rambaut & Drummond 2007). Summary statistics were calculated using TreeAnnotator v. 1.5.4 (Drummond & Rambaut 2007).

For the PL analysis, the tree with the maximum a posteriori score among 45,000 trees from a post burn-in Bayesian sample was provided as reference for summarizing the results, whereas for the BEAST analysis it was the tree with the highest sum of clade probabilities.

Coding of biomes, habitats and rainfall seasonality
Disa species occurring in southern Africa were coded for occurrence in biomes and habitats by Linder et al. (2005). The biomes (Rutherford & Westfall 1986; Rutherford 1997) summarize climatic, edaphic and biotic information into broad descriptive units (table S3). Species occurring outside of southern Africa were coded to the same biomes according to information in Linder (1981a, b, c, d, e), Flora of Tropical East Africa (Summerhayes 1968), Flora Zambesiaca (la Croix & Cribb 1995), Flore d’Afrique Centrale (Geerinck 1984) and Orchids of Malawi (la Croix et al. 1991), and on the personal field experience of the authors.

Linder et al. (
2005) coded the southern African orchid species to the following habitats: Grassland, Woodland, Subalpine, Marsh, Scrub, Mature Heath, Postfire, Streambank and Epilithic (table S4). To these, we added a Southeast Cloud Zone habitat. The importance of southeast clouds as a source of water during the dry summers in the CFR was documented by Marloth (1904). Although we could not trace information on the distribution of these clouds, our field experience indicates that they occur frequently along the summits of mountains, within sight of the Indian Ocean. As for the biomes, the habitats for the species occurring outside of southern Africa were coded based on comments in monographic and floristic treatments as well as our field experience.

All Disa species were scored present or absent for winter, all year, summer or bimodal rainfall. For species occurring in southern Africa, the distribution maps from Linder & Kurzweil (1999) were superimposed on the rainfall seasonality map of Schulze (1997). For species occurring outside of southern Africa, rainfall seasonality data were extracted from Linder (1981a, b, c, d, e).

Character states (presence/absence) for each species and each categorical variable of biome, habitat and rainfall seasonality are reported in table S5.

Character optimisations
We used binary coding (absence/presence) to allow for polymorphic states at internal nodes (Hardy & Linder 2005). Thus each categorical variable was coded as either present or absent (e.g. each species was scored as present or absent for fynbos, etc.).

Maximum likelihood optimisation of ancestral states was performed with the “asymmetric Markov k-state 2 parameter model” and with “root state frequencies same as equilibrium” implemented in Mesquite version 2.6 (Maddison & Maddison 2009).
This model has one parameter for the rate of change from state 0 to 1 (the "forward" rate) and another for the rate of change from 1 to 0 (the "backward" rate) and thus allows a bias in gains versus losses. We compared the lnL scores of a two-rate (forward and backward rates independent) and a one-rate (forward and backward rates constrained to be equal) model for each character. The accuracy of parameter estimation depends on the amount of data available as well as model complexity (Mooers & Schluter 1999). For several characters (but not all) the two-rate model resulted in a significantly improved fit and we therefore preferred this model since it makes fewer assumptions (i.e. it does not assume the forward and backward rates of character change to be equal). We are aware, however, that the one-rate model handles trees with few transitions and an imbalance of character states better than the two-rate model (Mooers & Schluter 1999) and we therefore also checked the optimisations with this model. In most cases this did not give a significantly different result, and where it did, we report the differences explicitly. Initially, all characters were optimised using these assumptions over the MAP tree. Then, to take into account phylogenetic uncertainty, each character was optimised over a sample of 1,000 chronograms obtained from the Penalized Likelihood analysis, and average frequencies across trees were calculated for each node of the MAP tree.

Randomness of the distribution
We tested for phylogenetically conservative characters by calculating the number of steps each character required for a parsimony reconstruction over the MAP tree, and comparing this to the distribution of minimum steps for the same character reshuffled 1,000 times using Mesquite (Maddison & Maddison 2009), while keeping the proportions of the states constant.
If the number of steps of the observed distribution was outside 950 (95%) of the randomised state distributions, then the null hypothesis that the character states were phylogenetically randomly distributed was rejected.

Supporting References

Bytebier, B., Bellstedt, D. U. & Linder, H. P. 2007 A molecular phylogeny for the large African orchid genus Disa. Mol Phylogenet Evol 43, 75-90.
Drummond, A. & Rambaut, A. 2007 BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7, 214.
Felsenstein, J. 1981 Evolutionary trees from DNA-sequences - a maximum-likelihood approach. J Mol Evol 17, 368-376.
Geerinck, D. 1984 Orchidaceae (premiere partie). Flore d'Afrique Centrale (Zaire-Rwanda-Burundi). Meise, Belgium: Jardin botanique national de Belgique.
Hardy, C. R. & Linder, H. P. 2005 Intraspecific variability and timing in ancestral ecology reconstruction: A test case from the Cape flora. Syst Biol 54, 299-316.
la Croix, I. & Cribb, P. J. 1995 163. Orchidaceae. Flora Zambesiaca. London: Flora Zambesiaca Managing Committee.
la Croix, I. F., la Croix, E. A. S. & la Croix, T. M. 1991 Orchids of Malawi. Rotterdam & Brookfield: A.A. Balkema.
Linder, H. P. 1981a Taxonomic Studies in the Disinae (Orchidaceae). IV. A Revision of Disa Berg. sect. Micranthae Lindl. Bull Jard Bot Natl Belg 51, 255-346.
Linder, H. P. 1981b Taxonomic studies in the Disinae. V. A revision of the genus Monadenia. Bothalia 13, 339-363.
Linder, H. P. 1981c Taxonomic studies in the Disinae. VI. A revision of the genus Herschelia. Bothalia 13, 365-388.
Linder, H. P. 1981d Taxonomic studies in the Disinae: 2. A revision of the genus Schizodium Lindl. J. S. Afr. Bot. 47, 339-371.
Linder, H. P. 1981e Taxonomic studies on the Disinae. III. A revision of Disa Berg. excluding Sect.
Micranthae Lindl. Contr Bolus Herb 9.
Linder, H. P. & Kurzweil, H. 1999 Orchids of southern Africa. Rotterdam/Brookfield: A.A. Balkema.
Linder, H. P., Kurzweil, H. & Johnson, S. D. 2005 The Southern African orchid flora: composition, sources and endemism. J Biogeogr 32, 29-47.
Maddison, W. P. & Maddison, D. R. 2009 Mesquite: a modular system for evolutionary analysis.
Marloth, R. 1904 Results of experiments on Table Mountain for ascertaining the amount of moisture deposited from southeaster clouds. Trans S African Philos Soc 14, 403-408.
Mooers, A. O. & Schluter, D. 1999 Reconstructing ancestor states with maximum likelihood: Support for one- and two-rate models. Syst Biol 48, 623-633.
Rambaut, A. & Drummond, A. J. 2007 Tracer.
Renner, S. S. 2005 Relaxed molecular clocks for dating historical plant dispersal events. Trends Plant Sci 10, 550-558.
Rutherford, M. C. 1997 Categorisation of biomes. In Vegetation of Southern Africa (ed. R. M. Cowling, D. M. Richardson & S. M. Pierce), pp. 91-98. Cambridge, U.K.: Cambridge University Press.
Rutherford, M. C. & Westfall, R. 1986 Biomes of southern Africa - an objective categorisation. Mem Bot Surv South Africa 54, 1-98.
Rutschmann, F. 2006 Molecular dating of phylogenetic trees: A brief review of current methods that estimate divergence times. Divers Distrib 12, 35-48.
Sanderson, M. J. 2002 Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol Biol Evol 19, 101-109.
Schulze, R. E. 1997 South African atlas of agrohydrology and climatology. Pretoria, South Africa: Water Research Commission.
Summerhayes, V. S. 1968 Orchidaceae (Part I). Flora of Tropical East Africa. London, U.K.: Crown Agents for Oversea Governments and Administrations.

Figure Captions

Figure S1 Maximum a posteriori (MAP) cladogram yielded under a Bayesian analysis of phylogeny among 136 taxa within the genus Disa.
Dashed lines indicate nodes with Bayesian posterior probability < 0.50. Node numbers are shown within circles and referred to in Table S1. Intrageneric sections within Disa are given following the species names. For a fully annotated tree see Bytebier et al. (2007).

Figure S2 Molecular chronogram of Disa, as inferred under Penalized Likelihood implemented in r8s. Green bars at node intersections indicate 95% confidence intervals of ages (N = 1,000). The tree topology is the same as in Fig. S1, but with branch lengths equal to mean ages. Ages in millions of years from present.

Figure S3 Molecular chronogram of Disa, as inferred under a relaxed Bayesian clock implemented in BEAST. Green bars at node intersections indicate 95% highest posterior densities of ages (N = 16,000). Ages in millions of years from present.

Figure S4 Summary of ancestral state optimisations for biome, as inferred using the asymmetrical (two-rate) likelihood model implemented in Mesquite over a sample of PL chronograms (N = 1,000). Habitat shifts supported by average likelihoods ≥ 0.70 are indicated by an asterisk; all others have average likelihoods ≥ 0.95.

Figure S5 Summary of ancestral state optimisations for rainfall, as inferred using the asymmetrical (two-rate) likelihood model implemented in Mesquite over a sample of PL chronograms (N = 1,000). Habitat shifts supported by average likelihoods ≥ 0.70 are indicated by an asterisk; all others have average likelihoods ≥ 0.95.

Figure S6 Summary of ancestral state optimisations for habitat, as inferred using the asymmetrical (two-rate) likelihood model implemented in Mesquite over a sample of PL chronograms (N = 1,000). Habitat shifts supported by average likelihoods ≥ 0.70 are indicated by an asterisk; all others have average likelihoods ≥ 0.95.
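
Illustrative sketch of the randomisation test. The test described under "Randomness of the distribution" was run in Mesquite; the Python below is not that implementation, only a minimal sketch of the same idea under stated assumptions. It assumes a rooted, fully bifurcating tree written as nested pairs, a single binary character, Fitch parsimony for counting steps, and a one-tailed comparison (how often a reshuffled character needs as few steps as the observed one). The function names, the toy tree and the toy tip states are hypothetical and are not taken from the Disa data.

```python
import random


def fitch_steps(tree, states):
    """Minimum number of state changes (Fitch parsimony) on a rooted binary tree.

    tree   -- nested 2-tuples with tip names (str) as leaves
    states -- dict mapping each tip name to 0 or 1
    """
    steps = 0

    def state_set(node):
        nonlocal steps
        if isinstance(node, str):          # a tip: its state set is its own state
            return {states[node]}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:                         # children agree: no change needed here
            return common
        steps += 1                         # children disagree: one change
        return left | right

    state_set(tree)
    return steps


def tip_names(tree):
    """List the tip names of the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in tip_names(child)]


def permutation_test(tree, states, n_perm=1000, seed=0):
    """Reshuffle tip states among tips, keeping state proportions constant.

    Returns the observed step count and the fraction of reshuffled
    characters that need as few steps as observed, or fewer.
    """
    rng = random.Random(seed)
    tips = tip_names(tree)
    observed = fitch_steps(tree, states)
    values = [states[t] for t in tips]
    as_few = 0
    for _ in range(n_perm):
        rng.shuffle(values)                # permute states, proportions unchanged
        if fitch_steps(tree, dict(zip(tips, values))) <= observed:
            as_few += 1
    return observed, as_few / n_perm


# Hypothetical toy data: one clade entirely in state 1, its sister in state 0.
toy_tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
toy_states = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 0, "F": 0, "G": 0, "H": 0}

observed, fraction = permutation_test(toy_tree, toy_states)
print("observed steps:", observed)
print("fraction of reshuffles with as few steps:", fraction)
```

Under the criterion given in the text, a fraction below 0.05 would correspond to the observed step count lying outside 95% of the randomised distributions, and hence to rejecting a phylogenetically random distribution of states.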