Electronic supplementary material

Article: The effects of island ontogeny on species diversity and phylogeny

Luis M. Valente, Rampal S. Etienne, Albert B. Phillimore

Contains:
Supplementary methods
Simulation results of symmetrical and constant area ontogenies
Supplementary Figures S1-S9
Supplementary Tables S1-S4

Supplementary methods

Island ontogeny

The island has a limited lifespan, denoted by t_end. We modelled the typical ontogeny of an oceanic island by allowing island area (A) to vary with island age t. We simulated islands under three contrasting island ontogeny scenarios: volcanic, symmetrical and constant area (Fig. 1).

In the volcanic island case, the life cycle of the island follows that of a typical oceanic island of volcanic origin [1], with an initial rapid period of growth, followed by a more protracted erosive period during which area gradually declines until complete island submergence. Under this model, area is zero at island birth and rapidly increases to a maximum value at t = t_max = p·t_end (where we took p = 0.1), from which point it steadily declines until area equals zero again at t = t_end.

In the symmetrical area scenario, area is zero at island birth and increases to a maximum value at t = t_max = 0.5·t_end, from which point it declines to zero at t = t_end. Under the symmetrical area case, the period of island growth has the same length as the period of island decay.

In the volcanic and symmetrical island ontogeny scenarios, the area A_t at island age t is given by:

$$A_t = A_{\max}\,\frac{\left(\dfrac{t}{t_{\mathrm{end}}}\right)^{c}\left(1-\dfrac{t}{t_{\mathrm{end}}}\right)^{d}}{\left(\dfrac{c}{c+d}\right)^{c}\left(\dfrac{d}{c+d}\right)^{d}}$$

where c, d and f are given by:

$$c = \frac{f\,t_{\max}}{1+f}, \qquad d = \frac{t_{\max}}{1+f}, \qquad f = \frac{1/t_{\mathrm{end}}}{1 - 1/t_{\mathrm{end}}}$$

Furthermore, A_max is the maximum island area and t_max is the island age at which maximum area is reached. We set t_end = 10, and therefore t_max = 1 and t_max = 5 for the volcanic and the symmetrical area island ontogeny scenarios, respectively.
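As an illustration of the area function above, the following is a minimal Python sketch (the original analyses were implemented in R; this is not the authors' code). The function name `island_area` and its argument names are hypothetical. One assumption is made explicit: f is computed from the proportional peak position t_max/t_end, so that the maximum falls at t_max in both scenarios; for the volcanic case (t_max = 1) this coincides with the expression for f given in the text.

```python
def island_area(t, t_end=10.0, t_max=1.0, a_max=5000.0):
    """Island area at age t under the beta-shaped ontogeny described above.

    Defaults correspond to the volcanic scenario (t_end = 10, t_max = 1,
    A_max = 5000). Use t_max = 5 for the symmetrical scenario.
    """
    if not 0.0 <= t <= t_end:
        raise ValueError("t must lie in [0, t_end]")
    x = t / t_end                # proportion of the island lifespan elapsed
    p = t_max / t_end            # ASSUMPTION: proportional position of the peak
    f = p / (1.0 - p)
    c = f * t_max / (1.0 + f)    # exponent of the growth term
    d = t_max / (1.0 + f)        # exponent of the decay term
    # Normalising constant so that the maximum of the curve equals a_max
    norm = (c / (c + d)) ** c * (d / (c + d)) ** d
    return a_max * x ** c * (1.0 - x) ** d / norm


if __name__ == "__main__":
    for age in (0.0, 0.5, 1.0, 5.0, 9.0, 10.0):
        print(age, round(island_area(age), 1), round(island_area(age, t_max=5.0), 1))
```

Under these assumptions the area is zero at island birth and at t_end, and equals A_max at t_max in both the volcanic and the symmetrical scenario.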
In the constant area scenario, A remains constant through time and A > 0. This can be viewed as a null model for island ontogeny. In the main text we focus on the results of the volcanic ontogeny, but we note cases where the other ontogenies give rise to very different results and report their full results here in the supplementary material.

Simulation algorithm

We simulated under continuous time, starting at island birth (t_0) and ending at t_end. At t_0 there are no species on the island (n = 0). As immigration, cladogenesis and extinction rates are stochastic and change through time as a function of A, we employed the Gillespie algorithm [2] to improve the efficiency of simulating all events. In the basic version of this algorithm, first the time to the next event is drawn from an exponential distribution with parameter equal to the sum of all total rates, and then the type of event is drawn randomly with probabilities proportional to the relative contribution of each event to the total rate. Because rates are time-dependent we used a modified version of the algorithm that takes into account temporal rate variation [3].

When the selected event is immigration, one species from the mainland pool is randomly chosen to immigrate to the island. If the immigrant species is not already present on the island, n increases by one and the age of the population of the species on the island is set to zero at the time of the event (the colonization time). If a population of the immigrant species is already present on the island, the age of the immigrant population is reset to zero to mimic the effects of gene flow, and n is not increased. This is because introgression of genetic material from recent mainland immigrants into island populations of the same species often leads to genetic homogenization, causing the latter to appear younger in phylogenetic analyses [4]. Immigration in our framework incorporates both dispersal and establishment.
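The basic event-selection step and the immigration rule described above can be sketched in Python as follows. This is a hedged illustration only: it implements the basic Gillespie step with rates treated as fixed until the next event, not the time-dependent modification [3] used in the study, and the names `gillespie_step`, `immigrate` and the dictionary representation of the island are hypothetical.

```python
import random


def gillespie_step(rates, rng):
    """One basic Gillespie step.

    rates: dict mapping event name -> total (island-wide) rate.
    Returns (waiting_time, event_name); (inf, None) if nothing can happen.
    ASSUMPTION: rates are held constant until the next event.
    """
    total = sum(rates.values())
    if total <= 0.0:
        return float("inf"), None
    # Waiting time is exponential with parameter equal to the summed rates
    wait = rng.expovariate(total)
    # Event type is drawn with probability proportional to its rate
    events = list(rates)
    event = rng.choices(events, weights=[rates[e] for e in events], k=1)[0]
    return wait, event


def immigrate(island, species_id, now):
    """Apply the immigration rule.

    island: dict mapping mainland species id -> colonization time.
    Returns True if n increased (species was absent), False if an existing
    population only had its colonization time reset (gene flow).
    """
    is_new = species_id not in island
    island[species_id] = now
    return is_new


if __name__ == "__main__":
    rng = random.Random(42)
    island, now = {}, 0.0
    wait, event = gillespie_step({"immigration": 1.0, "extinction": 0.0}, rng)
    now += wait
    if event == "immigration":
        colonist = rng.randrange(1000)  # mainland pool of m = 1000 species
        print(event, immigrate(island, colonist, now), len(island))
```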
When the event is anagenesis, one immigrant species on the island is randomly selected to speciate (endemic species may also speciate via anagenesis, but this is not measurable so our anagenesis rate pertains only to immigrant species). The class of the species is changed to ‘anagenetic’ but retains the timing of colonization of its ancestor. The value of n does not increase.

If a cladogenetic event is selected, one island species is randomly chosen to speciate. If the species undergoing cladogenesis is immigrant or anagenetic, a new endemic clade is born, with two new endemic sister species. The phylogeny of the clade is recorded and the age of the root of the tree is equal to the timing of arrival on the island of the ancestor species. If the parent species is cladogenetic, it is replaced by its two daughter species and this divergence is added to the phylogenetic tree for the respective clade. At every cladogenetic event, n is increased by one.

When an extinction occurs, one island species is randomly selected to go extinct, with n being reduced by one. If the species has relatives on the island (i.e. is cladogenetic), it is pruned from the phylogenetic tree. If only one representative from the clade remains on the island, the class of the surviving species is changed from cladogenetic to anagenetic, in order to correspond to the classification of contemporary island species based on phylogeography alone [5].

Parameter choice

For each island ontogeny scenario (volcanic, symmetrical and constant area), we ran simulations with 16 combinations of parameter values in order to explore a wide range of the parameter space at low and high γ, μ, λa and λc (Table S1). We selected parameters to be relevant to real world islands, in order to make island sizes, time and rates interpretable in units of km², millions of years, and events/million years, respectively. In addition, we repeated all simulations with and without diversity-dependence of γ and λc (DI and DD versions of the model).
For each parameter combination we ran 1000 replicates (islands) with t_end = 10 and A_max = 5000. The size of the source pool did not qualitatively affect the results; given that the size of the source pool and the island-wide (IW) immigration rate are highly correlated, we chose to use a constant source pool size of m = 1000 in all simulations, and instead varied γ. We set K’ = ∞ and K’ = 0.05A for the DI and DD versions, respectively.

Model output

For each simulation we calculated the IW rates of immigration, extinction, anagenesis, and cladogenesis at intervals of t = 0.01, in order to average across simulations for plotting purposes. For each parameter set we calculated each rate type at each time interval across 1000 replicate simulations. We also recorded the number of total, immigrant, anagenetic and cladogenetic species present across the simulations at each time step.

In order to compare ages of colonization and species ages at different island stages, we extracted this information at t equal to 0.2, 0.5 and 0.9 proportions of t_end, representing, respectively, islands at young, intermediate (mature) and old stages. At each island life stage the age of an immigrant species is defined as the age at which it colonized the island, which is equal to the difference between island age and the timing of the most recent immigration event of the species (the same immigrant species may colonize the island several times). The age of an anagenetic species is defined as the age of colonization of its immigrant ancestor. The age of a cladogenetic species is defined as the branch length of the terminal tip that represents the species in the phylogenetic tree. The age of colonization of an endemic island radiation is equal to the age of immigration of the colonist ancestor of the radiation.

We examined the shape of phylogenetic trees of island radiations at different proportions of t_end (0.05, 0.2, 0.5, 0.8 and 0.9), encompassing the stages where the island is growing and waning in size.
We fitted different classes of birth-death models of diversification [6] to the branch lengths of all the phylogenies with five or more species, in order to investigate whether we could detect changes in speciation and extinction rates over time. We repeated the analyses fitting models only to clades with 15 or more species, but found that the results were consistent with the analyses with all clades with five or more species, and therefore report the results from the latter, as sample size is larger. We fitted the following models: constant speciation rate with no extinction (pure birth model, PB); constant speciation and extinction rates (birth-death model, BD); exponentially declining speciation rate through time and constant extinction rate (SPVAR model); exponentially declining speciation rate through time and no extinction (SPVAR_μ0 model); and exponentially increasing extinction rate through time and constant speciation (EXVAR model) [6]. Models were fitted using existing functions within the R package LASER [7] with the exception of SPVAR_μ0, for which we modified the SPVAR model. For each clade we calculated Akaike Information Criterion (AIC) weights for the five models – with a high AIC weight indicating a low relative AIC score for that model and thus high support. We then calculated the mean AIC weight for each of the models across all clades in an island and the fraction of simulations in which a model was the preferred (i.e. had the lowest AIC score) one for that island. For each parameter combination we plotted median AIC weights and mean proportion of times a model is preferred for each of the five life stages. For certain parameter combinations there were no clades with five or more species. We repeated the same procedure using small-sample corrected AIC (AICc) as a metric, with sample size equal to the number of branching times in the phylogeny. 
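The model comparison described above relies on standard definitions of AIC, AICc and AIC weights, which can be sketched in Python as follows. The function names are hypothetical, and the log-likelihoods and parameter counts in the usage example are made-up illustrative values, not results or settings from the simulations.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion for log-likelihood log_lik and k parameters."""
    return 2.0 * k - 2.0 * log_lik


def aicc(log_lik, k, n):
    """Small-sample corrected AIC; n is the sample size
    (here, the number of branching times in the phylogeny)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def aic_weights(scores):
    """Convert a dict of model -> AIC (or AICc) score into AIC weights.

    The model with the lowest score receives the highest weight, and the
    weights sum to one.
    """
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


if __name__ == "__main__":
    # HYPOTHETICAL log-likelihoods and parameter counts for one clade
    fits = {"PB": (-20.1, 1), "BD": (-20.0, 2), "SPVAR": (-17.2, 3),
            "SPVAR_mu0": (-17.3, 2), "EXVAR": (-19.8, 3)}
    scores = {m: aic(ll, k) for m, (ll, k) in fits.items()}
    weights = aic_weights(scores)
    print(max(weights, key=weights.get), weights)
```

Averaging such weights across the clades of an island, and counting how often each model has the lowest score, gives the two summaries reported per island.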
For each island life stage we also calculated the median number of radiations (clades with two or more species) and the median number of species per radiation.

The simulation model and subsequent analyses were implemented in R (R Development Core Team 2013) and code is available in the supplementary information.

Supplementary references

1. Price, J. P. & Clague, D. A. 2002 How old is the Hawaiian biota? Geology and phylogeny suggest recent divergence. Proc. Biol. Sci. 269, 2429–2435. (doi:10.1098/rspb.2002.2175)
2. Gillespie, D. T. 1976 A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 22, 403–434.
3. Allen, G. E. & Dytham, C. 2009 An efficient method for stochastic simulation of biological populations in continuous time. Biosystems 98, 37–42. (doi:10.1016/j.biosystems.2009.07.003)
4. Rhymer, J. M. & Simberloff, D. 1996 Extinction by hybridization and introgression. Annu. Rev. Ecol. Syst. 27, 83–109. (doi:10.1146/annurev.ecolsys.27.1.83)
5. Stuessy, T. F., Jakubowsky, G., Gomez, R. S., Pfosser, M., Schluter, P. M., Fer, T., Sun, B.-Y. & Kato, H. 2006 Anagenetic evolution in island plants. J. Biogeogr. 33, 1259–1265. (doi:10.1111/j.1365-2699.2006.01504.x)
6. Rabosky, D. L. & Lovette, I. J. 2008 Explosive evolutionary radiations: decreasing speciation or increasing extinction through time? Evolution 62, 1866–1875. (doi:10.1111/j.1558-5646.2008.00409.x)
7. Rabosky, D. 2006 LASER: A maximum likelihood toolkit for detecting temporal shifts in diversification rates from molecular phylogenies. Evol. Bioinform. Online 2, 247–250. (PMCID: PMC2674670)

Simulation results of symmetrical and constant area ontogenies

Island-wide rates

The results for the symmetrical area scenario (Fig. S2) are similar to those of volcanic ontogeny (Fig. S1), but the rate of extinction is not a hump-shaped function of age under high μ (instead it increases through time), and the peaks in cladogenesis and anagenesis rates tend to occur at older ages. Under the null model (constant area), IW immigration, extinction, cladogenesis and anagenesis rates reach a plateau very early on (Fig. S3).

Species richness

In the symmetrical area scenario, species richness is a hump-shaped function of age under high μ (as in the volcanic area scenario), but the peak in richness for all classes of species occurs at a later island age (Fig. S4). In the constant area scenario, the total number of species reaches a plateau at early island age, remaining in dynamic equilibrium thereafter (Fig. S5).

Species ages and colonization times

Under the symmetrical area model (Fig. S6), results are generally similar to the volcanic scenario, but at late island ages (0.9 t_end) there is no major difference in ages and timings of colonization between high and low μ scenarios. Under the constant area scenario (Fig. S7), there are no major differences in species ages and timings of colonization at different life stages, except under DD and high λc, where ages and colonization times tend to be older at older island ages.

Phylogenetic tree shape and in situ radiations

Under the symmetrical area scenario, the pure birth model is most often preferred under DI (Fig. S8). Under DD, the pure birth model tends to be preferred on islands that have recently emerged but, as in the volcanic model, declining speciation with zero extinction tends to be preferred at intermediate and old ages. Under the null model (constant area), the pure birth model is almost always preferred regardless of island age (Fig. S9).

Supplementary Figures S1 – S9

Figure S1. Time-series of island-wide (IW) immigration, anagenesis, cladogenesis and extinction rates for simulations of the volcanic island ontogeny.
Each plot shows results for a different combination of parameter values indicated in the outer axes. Low and high parameter values are given in Table S1. Dashed lines – low rate of anagenesis; solid lines – high rate of anagenesis. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. DI – diversity-independent version (a to h), K’ = ∞; DD – diversity-dependent version (i to p), K’ = 0.05A. Results are the medians for intervals of t = 0.01 over 1000 replicates. Vertical lines indicate the island ages at which maximum A and K’ are reached.

Figure S2. Time-series of island-wide (IW) immigration, anagenesis, cladogenesis and extinction rates for simulations of the symmetrical area island ontogeny. Each plot shows results for a different combination of parameter values indicated in the outer axes. Low and high parameter values are given in Table S1. Dashed lines – low rate of anagenesis; solid lines – high rate of anagenesis. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. DI – diversity-independent version (a to h), K’ = ∞; DD – diversity-dependent version (i to p), K’ = 0.05A. Results are the medians for intervals of t = 0.01 over 1000 replicates. Vertical lines indicate the island ages at which maximum A and K’ are reached.

Figure S3. Time-series of island-wide (IW) immigration, anagenesis, cladogenesis and extinction rates for simulations of the constant area island ontogeny (null model). Each plot shows results for a different combination of parameter values indicated in the outer axes. Low and high parameter values are given in Table S1. Dashed lines – low rate of anagenesis; solid lines – high rate of anagenesis. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. DI – diversity-independent version (a to h), K’ = ∞; DD – diversity-dependent version (i to p), K’ = 0.05A. Results are the medians for intervals of t = 0.01 over 1000 replicates.
Vertical lines indicate the island ages at which maximum A and K’ are reached.

Figure S4. Time-series of total, immigrant, anagenetic and cladogenetic species richness for simulations of a symmetrical area island ontogeny. Each plot shows results for a different combination of parameter values indicated in the outer axes. Low and high parameter values are given in Table S1. Dashed lines – low rate of anagenesis; solid lines – high rate of anagenesis. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. DI – diversity-independent version (a to h), K’ = ∞; DD – diversity-dependent version (i to p), K’ = 0.05A. Results are the medians for intervals of t = 0.01 over 1000 replicates. Vertical lines indicate the island ages at which maximum A and K’ are reached.

Figure S5. Time-series of total, immigrant, anagenetic and cladogenetic species richness for simulations of a constant area island ontogeny (null model). Each plot shows results for a different combination of parameter values indicated in the outer axes. Low and high parameter values are given in Table S1. Dashed lines – low rate of anagenesis; solid lines – high rate of anagenesis. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. DI – diversity-independent version (a to h), K’ = ∞; DD – diversity-dependent version (i to p), K’ = 0.05A. Results are the medians for intervals of t = 0.01 over 1000 replicates. Vertical lines indicate the island ages at which maximum A and K’ are reached.

Figure S6. Distribution of ages of immigrant, anagenetic, and cladogenetic species as well as of timings of colonization for young, mature and old island life stages for simulations of a symmetrical area island ontogeny. Each plot shows the results for a different combination of parameter values indicated in the outer axes. Low and high parameter values are given in Table S1. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate.
Simulations were run with high rates of anagenesis (λa = 10). DI (a to h) – diversity-independent scenario (K’ = ∞); DD (i to p) – diversity-dependent scenario (K’ = 0.05A). Results are 0.25-0.75 (boxes) and 0.025-0.975 (vertical lines) quantiles over 1000 replicates. Median ages are shown with a horizontal line.

Figure S7. Distribution of ages of immigrant, anagenetic, and cladogenetic species as well as of timings of colonization for young, mature and old island life stages for simulations of a constant area island ontogeny (null model). Each plot shows the results for a different combination of parameter values indicated in the outer axes. Low and high parameter values are given in Table S1. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. Simulations were run with high rates of anagenesis (λa = 10). DI (a to h) – diversity-independent scenario (K’ = ∞); DD (i to p) – diversity-dependent scenario (K’ = 0.05A). Results are 0.25-0.75 (boxes) and 0.025-0.975 (vertical lines) quantiles over 1000 replicates. Median ages are shown with a horizontal line.

Figure S8. Diversification model fitting to phylogenies of island clades of symmetrical area ontogeny simulations. Each plot shows results for a different combination of parameter values (Table S1). Vertical bars represent different island life stages: 0.05, 0.2, 0.5, 0.7 and 0.9 proportions of total island age. For each parameter combination, the five bars on the left show AIC weights for each of the models at each life stage, and the five bars on the right show the proportion of times each model was preferred at each life stage. Asterisks above AIC weight bars represent the number of clades used to fit models: no asterisk, >0 and <10; *, ≥10 and <100; **, ≥100 and <1000; ***, ≥1000. If there were no clades with five or more species for a given life stage, no results are plotted. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. Simulations were run with high rates of anagenesis (λa = 10).
Figure S9. Diversification model fitting to phylogenies of island clades of constant area ontogeny simulations. Each plot shows results for a different combination of parameter values (Table S1). Vertical bars represent different island life stages: 0.05, 0.2, 0.5, 0.7 and 0.9 proportions of total island age. For each parameter combination, the five bars on the left show AIC weights for each of the models at each life stage, and the five bars on the right show the proportion of times each model was preferred at each life stage. Asterisks above AIC weight bars represent the number of clades used to fit models: no asterisk, >0 and <10; *, ≥10 and <100; **, ≥100 and <1000; ***, ≥1000. If there were no clades with five or more species for a given life stage, no results are plotted. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate. Simulations were run with high rates of anagenesis (λa = 10).

Table S1. Values of parameters used in the simulations. Numbers in brackets are values used in the scenarios with constant area (values differ because, with the standard parameter values, zero or few species were obtained in the constant area simulations).

| Parameter | Name | Low value | High value |
|---|---|---|---|
| γ0 | Initial per lineage rate of immigration | 0.001 (0.1) | 0.1 (0.3) |
| μmin | Minimum per lineage rate of extinction at maximum area | 1 | 10 (3) |
| λa | Per lineage rate of anagenesis | 0.1 | 10 |
| λc0 | Initial per lineage rate of cladogenesis | 0.00001 | DI: 0.00005; DD: 0.001 |

Table S2. Number of species per clade and number of clades obtained for simulations with volcanic island ontogeny. Low and high parameter values are given in Table S1. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate; DI – diversity-independent version; DD – diversity-dependent version; Prop_time – proportion of island age; # – number; sp. – species.
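| Version | γ | Prop_time | Low μ, low λc: median # clades | Low μ, low λc: median sp./clade | Low μ, high λc: median # clades | Low μ, high λc: median sp./clade | High μ, low λc: median # clades | High μ, low λc: median sp./clade | High μ, high λc: median # clades | High μ, high λc: median sp./clade |
|---|---|---|---|---|---|---|---|---|---|---|
| DI | Low γ | 0.2 | 0 | 0.3 | 0 | 1.1 | 0 | 0.2 | 0 | 1.1 |
| DI | Low γ | 0.5 | 0 | 1.3 | 2 | 3.5 | 0 | 1.1 | 2 | 3.4 |
| DI | Low γ | 0.7 | 1 | 1.9 | 3 | 4.2 | 0 | 1.4 | 2 | 3.9 |
| DI | Low γ | 0.8 | 1 | 2.1 | 4 | 4.3 | 0 | 0.9 | 2 | 3.3 |
| DI | Low γ | 0.9 | 1 | 2.3 | 4 | 4.4 | 0 | 0.0 | 0 | 0.2 |
| DI | High γ | 0.2 | 9 | 3.0 | 41 | 3.0 | 9 | 3.0 | 41 | 3.0 |
| DI | High γ | 0.5 | 54 | 3.0 | 201 | 3.6 | 46 | 3.0 | 184 | 3.4 |
| DI | High γ | 0.7 | 89.5 | 3.0 | 318 | 4.0 | 57 | 3.0 | 246 | 3.8 |
| DI | High γ | 0.8 | 105 | 3.0 | 368 | 4.0 | 34 | 3.0 | 194 | 3.3 |
| DI | High γ | 0.9 | 113 | 3.0 | 400 | 4.0 | 0 | 1.3 | 7 | 3.0 |
| DD | Low γ | 0.2 | 0 | 0.3 | 1 | 83.4 | 0 | 0.3 | 1 | 77.1 |
| DD | Low γ | 0.5 | 0 | 1.4 | 2 | 140.7 | 0 | 1.1 | 2 | 133.1 |
| DD | Low γ | 0.7 | 1 | 1.8 | 2 | 138.4 | 0 | 1.2 | 2 | 111.7 |
| DD | Low γ | 0.8 | 1 | 2.1 | 2 | 134.9 | 0 | 0.8 | 1 | 24.5 |
| DD | Low γ | 0.9 | 1 | 1.6 | 2 | 140.3 | 0 | 0.0 | 1 | 12.5 |
| DD | High γ | 0.2 | 4 | 3.0 | 37 | 5.2 | 4 | 3.0 | 37 | 5.1 |
| DD | High γ | 0.5 | 9 | 3.0 | 38 | 5.1 | 8 | 3.0 | 36 | 5.0 |
| DD | High γ | 0.7 | 9 | 3.0 | 37 | 5.2 | 5 | 3.0 | 31 | 4.7 |
| DD | High γ | 0.8 | 7 | 3.0 | 34 | 5.3 | 0 | 0.4 | 8 | 3.4 |
| DD | High γ | 0.9 | 8 | 3.0 | 37 | 5.1 | 0 | 0.1 | 3 | 3.2 |

Table S3. Number of species per clade and number of clades obtained for simulations with symmetrical island ontogeny. Low and high parameter values are given in Table S1. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate; DI – diversity-independent version; DD – diversity-dependent version; Prop_time – proportion of island age; # – number; sp. – species.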
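| Version | γ | Prop_time | Low μ, low λc: median # clades | Low μ, low λc: median sp./clade | Low μ, high λc: median # clades | Low μ, high λc: median sp./clade | High μ, low λc: median # clades | High μ, low λc: median sp./clade | High μ, high λc: median # clades | High μ, high λc: median sp./clade |
|---|---|---|---|---|---|---|---|---|---|---|
| DI | Low γ | 0.2 | 0 | 0.1 | 0 | 0.9 | 0 | 0.2 | 0 | 0.7 |
| DI | Low γ | 0.5 | 0 | 1.3 | 2 | 3.4 | 0 | 1.2 | 2 | 3.3 |
| DI | Low γ | 0.7 | 1 | 2.1 | 3 | 4.4 | 1 | 1.9 | 3 | 4.3 |
| DI | Low γ | 0.8 | 1 | 2.4 | 4 | 4.6 | 1 | 2.2 | 4 | 4.6 |
| DI | Low γ | 0.9 | 1 | 2.6 | 5 | 4.9 | 1 | 2.3 | 4 | 4.6 |
| DI | High γ | 0.2 | 6 | 3.0 | 29 | 3.0 | 5 | 3.0 | 25 | 3.0 |
| DI | High γ | 0.5 | 52 | 3.0 | 201 | 3.4 | 46 | 3.0 | 180 | 3.2 |
| DI | High γ | 0.7 | 102 | 3.0 | 354 | 4.0 | 89 | 3.0 | 324 | 4.0 |
| DI | High γ | 0.8 | 129 | 3.0 | 430 | 4.0 | 111 | 3.0 | 388 | 4.0 |
| DI | High γ | 0.9 | 155 | 3.0 | 497 | 4.3 | 118 | 3.0 | 427 | 4.0 |
| DD | Low γ | 0.2 | 0 | 0.2 | 1 | 30.9 | 0 | 0.2 | 1 | 25.3 |
| DD | Low γ | 0.5 | 0 | 1.2 | 2 | 119.4 | 0 | 1.2 | 2 | 125.9 |
| DD | Low γ | 0.7 | 1 | 2.1 | 2 | 127.4 | 1 | 1.8 | 2 | 124.3 |
| DD | Low γ | 0.8 | 1 | 2.4 | 2 | 124.6 | 1 | 2.2 | 2 | 120.2 |
| DD | Low γ | 0.9 | 1 | 2.5 | 2 | 125.1 | 1 | 2.2 | 2 | 114.7 |
| DD | High γ | 0.2 | 2 | 2.7 | 39 | 4.1 | 2 | 2.6 | 38 | 4.2 |
| DD | High γ | 0.5 | 8 | 3.0 | 46 | 4.5 | 8 | 3.0 | 45 | 4.6 |
| DD | High γ | 0.7 | 10 | 3.0 | 46.5 | 4.5 | 9 | 3.0 | 44 | 4.6 |
| DD | High γ | 0.8 | 10 | 3.0 | 46 | 4.5 | 9 | 3.0 | 43 | 4.5 |
| DD | High γ | 0.9 | 10 | 3.0 | 46 | 4.6 | 8 | 3.0 | 41 | 4.4 |

Table S4. Number of species per clade and number of clades obtained for simulations with constant area island ontogeny (null model). Low and high parameter values are given in Table S1. γ – immigration rate; μ – extinction rate; λc – cladogenesis rate; DI – diversity-independent version; DD – diversity-dependent version; Prop_time – proportion of island age; # – number; sp. – species.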
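| Version | γ | Prop_time | Low μ, low λc: median # clades | Low μ, low λc: median sp./clade | Low μ, high λc: median # clades | Low μ, high λc: median sp./clade | High μ, low λc: median # clades | High μ, low λc: median sp./clade | High μ, high λc: median # clades | High μ, high λc: median sp./clade |
|---|---|---|---|---|---|---|---|---|---|---|
| DI | Low γ | 0.2 | 2 | 2.5 | 9 | 3.0 | 0 | 0.7 | 1 | 2.3 |
| DI | Low γ | 0.5 | 2 | 2.8 | 14 | 3.0 | 0 | 0.8 | 1 | 2.3 |
| DI | Low γ | 0.7 | 3 | 2.8 | 15 | 3.0 | 0 | 0.7 | 1 | 2.3 |
| DI | Low γ | 0.8 | 2 | 2.8 | 15 | 3.0 | 0 | 0.7 | 1 | 2.4 |
| DI | Low γ | 0.9 | 2 | 2.8 | 15 | 3.0 | 0 | 0.7 | 1 | 2.4 |
| DI | High γ | 0.2 | 5 | 3.0 | 29 | 3.0 | 1 | 1.6 | 4 | 3.0 |
| DI | High γ | 0.5 | 7 | 3.0 | 42 | 3.0 | 1 | 1.6 | 4 | 3.0 |
| DI | High γ | 0.7 | 7 | 3.0 | 44 | 3.0 | 1 | 1.7 | 4 | 3.0 |
| DI | High γ | 0.8 | 7 | 3.0 | 44 | 3.0 | 1 | 1.7 | 4 | 3.0 |
| DI | High γ | 0.9 | 7 | 3.0 | 44 | 3.0 | 1 | 1.7 | 4 | 3.0 |
| DD | Low γ | 0.2 | 1 | 2.1 | 30 | 5.1 | 0 | 0.6 | 22 | 4.5 |
| DD | Low γ | 0.5 | 1 | 2.2 | 31 | 5.1 | 0 | 0.6 | 22 | 4.4 |
| DD | Low γ | 0.7 | 1 | 2.2 | 30 | 5.1 | 0 | 0.6 | 22 | 4.4 |
| DD | Low γ | 0.8 | 1 | 2.2 | 30 | 5.1 | 0 | 0.5 | 22 | 4.4 |
| DD | Low γ | 0.9 | 1 | 2.2 | 31 | 5.1 | 0 | 0.6 | 22 | 4.4 |
| DD | High γ | 0.2 | 1 | 2.3 | 44 | 3.9 | 0 | 1.1 | 31 | 3.6 |
| DD | High γ | 0.5 | 1 | 2.4 | 44 | 3.9 | 0 | 1.0 | 31 | 3.6 |
| DD | High γ | 0.7 | 1 | 2.4 | 44 | 3.9 | 0 | 1.1 | 31 | 3.6 |
| DD | High γ | 0.8 | 1 | 2.4 | 44 | 3.9 | 0 | 1.1 | 31 | 3.6 |
| DD | High γ | 0.9 | 1 | 2.4 | 44 | 3.9 | 0 | 1.0 | 31 | 3.6 |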