Supporting Information
Guschanski et al.
SI MATERIAL AND METHODS
Definition of Ancestral Ranges for Lagrange Analysis
To reconstruct ancestral ranges in Lagrange v20110117 (Ree et al. 2005; Ree & Smith 2008) we defined the following geographical regions (Fig. 1):
the Congo basin;
northern DRC - north of the Congo River and extending into Central African Republic, confined to the west by the Oubangui River and to the east by the Rift Valley;
northern Rift Valley - the northern part of the Albertine Rift Valley, extending as far south as Rwanda and Burundi and north to the border with Sudan;
Upper Guinean region and Lower Guinean region - the Dahomey Gap defines the border between the two;
Angola - corresponding to the northern part of the country;
southeastern DRC - Katanga (DRC), including the eastern part of Angola and the northwestern part of Zambia;
southeastern Africa - southern part of the East African region;
northeastern Africa - northern part of the East African region, extending to the Rift Valley in Tanzania and Uganda and reaching the border with Ethiopia;
Zambia - northeastern part of the country and southern Rift Valley;
Ethiopia/Sudan - only the southernmost part of Sudan is taken into account.
these regions are based on the current distribution of guenon species.
LASER Analyses on Different Taxonomic Units
The birth–death likelihood (BDL) method as implemented in LASER (Rabosky 2006) strongly relies on the taxonomic scheme. Therefore, to account for the current taxonomic uncertainty of the guenon radiation, we chose to independently analyze three different taxonomic scenarios: (A) trees pruned to a single representative per species following the taxonomy of Grubb et al. (2003) (Table S1); (B) trees pruned to a single representative per species following Wilson & Reeder (2005); and (C) trees pruned to a single representative per subspecies following Grubb et al. (2003). The result of the latter scenario is presented in the main text.
For each analysis, the critical value of ΔAIC_RC (difference in AIC between the best rate-constant model (RC) and each of the rate-variable models) was calculated based on a null distribution of ΔAIC_RC generated by fitting the candidate models to a set of 1,000 phylogenies simulated with the same number of taxa as the original chronograms and the ML birth rate estimated by LASER. To account for incomplete taxon sampling, we also calculated the critical ΔAIC_RC from a set of 1,000 phylogenies simulated with complete sampling and the ML birth rate, but pruned randomly to reach the sampled number of taxa.
Finally, in order to test whether the best-fit model for the maximum credibility trees was consistently selected across the posterior distribution of branching times, we fitted BDL models to each of 1,000 BEAST chronograms under the three taxonomic scenarios, generating a posterior distribution of ΔAIC_RC scores.
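The comparison of an observed ΔAIC_RC with a simulated null distribution can be sketched as follows. This is a minimal illustration in Python, not the LASER implementation (LASER is an R package). The function names, the 5% significance level and the use of a simple empirical quantile are assumptions made for this sketch, and the null values would in practice come from fitting the candidate models to the simulated phylogenies.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion for a fitted model."""
    return 2 * n_params - 2 * log_likelihood


def delta_aic_rc(aic_rate_constant, aic_rate_variable):
    """AIC of the best rate-constant model minus AIC of a rate-variable model."""
    return aic_rate_constant - aic_rate_variable


def critical_value(null_deltas, alpha=0.05):
    """Empirical (1 - alpha) quantile of the simulated null distribution.

    Assumption: the critical value is the order statistic at rank
    ceil((1 - alpha) * n) of the sorted null values.
    """
    ordered = sorted(null_deltas)
    rank = math.ceil((1 - alpha) * len(ordered))
    return ordered[rank - 1]


def empirical_p_value(observed, null_deltas):
    """Fraction of simulated trees whose delta AIC is at least the observed one."""
    exceed = sum(1 for d in null_deltas if d >= observed)
    return exceed / len(null_deltas)


if __name__ == "__main__":
    # Hypothetical null distribution standing in for 1,000 simulated trees.
    null = list(range(1000))
    print(critical_value(null), empirical_p_value(990, null))
```

Under this scheme, rate constancy is rejected when the observed ΔAIC_RC exceeds the critical value, which is equivalent to the empirical P value falling below the chosen level.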
SI RESULTS
Which Factors Influence DNA Recovery from Museum Specimens?
We tested whether the recovery of DNA from museum samples (measured as the number of fragments that could be mapped to the reference genome) differed between samples from different sources (dried tissue from skeleton and skull, teeth, bone material, nasal bones, ear cartilage, finger tips) or from different museums, and whether it depended on the age of the specimen or the weight of the sample (Table S2). We also analyzed whether there was an effect of the extraction set and the enrichment pool. A linear regression model in R first showed that age and weight did not have an effect on the recovery of endogenous DNA (p = 0.89 and p = 0.97, respectively). The collection dates spanned more than 100 years, from 1894 to 1997. The weight ranged from 10 to 253 mg and thus differed by more than 25-fold. Next, accounting for all other variables, we tested whether samples from different museums performed differently, which could be an indication of different storage conditions. Only samples from the Royal Belgian Institute of Natural Sciences (RBINS) showed significantly lower DNA recovery compared to other museums (pairwise t-test with adjusted p-values = 0.1 to 0.002). Different extraction conditions for the majority of these samples (9 out of 11) could explain this result (the samples were extracted in a different laboratory compared to the rest, and no siliconized tubes were used during extraction and for storage). We subsequently removed the data from this museum from further tests. Next, we tested whether there was a difference between different sources of material. There was no difference in the performance of any source material (pairwise t-test with adjusted p-values = 1). Our results are in agreement with Mason et al. (2011). Different extraction sets differed significantly from each other, with one of the extraction sets performing exceptionally well (May_4: adjusted p-values = 0.05 to 2×10^-16). There was also a significant difference among different enrichment pools, with two pools performing worse than others. One of them contained the
combination of three bait species (3 species pool 2) and the other was captured with
Erythrocebus patas (E. patas pool 1) (Table S2).
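The adjustment of p-values across pairwise comparisons can be illustrated with a short sketch. The text does not state which correction was applied, so the Holm step-down procedure used here is an assumption, and the function name and the input p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment of a list of raw p-values.

    Assumption: Holm's method is used for illustration only; the source
    does not name the correction that was applied.
    """
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values from three pairwise comparisons.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

Adjusted values are capped at 1, which is consistent with the adjusted p-values of 1 reported for the comparison of source materials.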
Sample Misidentification
To avoid possible biases in taxonomic and phylogenetic interpretation as a result of specimen mix-up, we collected and sequenced multiple representatives of the same taxon wherever possible. In many cases the representatives were collected in different museums. For instance, the unexpected placement of Cercopithecus solatus apart from other members of the C. preussi species group was confirmed by two independent specimens collected at the Natural History Museum, London, United Kingdom (NHM) and the Royal Museum for Central Africa, Tervuren, Belgium (RMCA). The only species within Cercopithecini that were represented by a single specimen and whose placement was not supported by additional samples were C. dryas and C. hamlyni. The former was collected from the type specimen in RMCA; the latter was sequenced from a high-quality sample derived from an animal in Leipzig Zoo, Germany and obtained from DPZ (German Primate Center, Göttingen, Germany).
We identified three cases of mislabeling of museum-preserved specimens or mix-up during sample collection. Notably, all of these cases are confined to specimens from the Museum für Naturkunde in Berlin (MfN), Germany. Based on the phylogenetic placement, the specimen labeled Lophocebus albigena is most likely a representative of the genus Cercocebus, whereas two specimens labeled Cercopithecus nictitans fall within the Mona group. Another potential case could involve a specimen designated as Chlorocebus tantalus from Togo. While specimens of Chl. tantalus from Cameroon and Central African Republic cluster together, this specimen forms a sister branch to the cluster of Chl. pygerythrus, Chl. cynosuros, Chl. tantalus and Chl. aethiops. Given the possibility of
sample confusion we refrain from any phylogenetic conclusions until further evidence has been collected.
Differential Diversification through Time Estimated with LASER
The Yule-3-rate model consistently obtained the lowest AIC score in all three taxonomic scenarios (Table S7). It provided a significantly better fit than the best rate-constant model (pure birth) for taxonomic scenario B (Wilson & Reeder (2005) species-level tree, ΔAIC_RC = 11.72, P < 0.01) and C (Grubb et al. (2003) subspecies tree, ΔAIC_RC = 13.75, P < 0.01), regardless of whether the critical ΔAIC_RC values were estimated from trees simulated with or without accounting for incomplete taxon sampling. For taxonomic scenario A (Grubb et al. (2003) species-level tree), Yule-3-rate was not significantly better than the rate-constant models (ΔAIC_RC = 6.56, P = 0.06), although the test approached significance. Therefore, rate constancy could not be rejected. The maximum likelihood estimate of the first shift in diversification rate was the same for all three taxonomic scenarios, i.e. 2.77 Ma (Table S7). According to the model, there was an increase of 2.2-2.8 times (r2/r1) in the rate of diversification across the guenon phylogeny as a whole. Near the present, all three trees also supported a strong decrease in diversification rate of over sevenfold. However, the date of the second shift differed depending on the taxonomic scenario (A: 1.21, B: 1.08, C: 0.44 Ma). That is to be expected, given that diversification at the subspecies level occurred more recently than at the species level.
Ancestral Ranges and Dispersal Events within Species Groups
Range overlap increased with time (Fig. 4 and S8), which for nuclear data was approximated by branch length. Based on nuclear data, the regression between branch length and range overlap was significantly positive (adjusted R-squared = 0.25, p = 0.03) and the Y-intercept (-0.1) was not significantly different from 0 (p = 0.58).
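The quantities reported for this regression (slope, Y-intercept and adjusted R-squared) can be computed as in the following sketch of ordinary least squares with a single predictor. The function name and the toy data are hypothetical; they are not the branch lengths or range overlaps from this study, and the significance tests are not reproduced here.

```python
def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, adjusted_r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    # Adjustment for one predictor: n - 2 residual degrees of freedom.
    adjusted = 1 - (1 - r_squared) * (n - 1) / (n - 2)
    return slope, intercept, r_squared, adjusted


if __name__ == "__main__":
    # Hypothetical branch lengths (x) and range overlaps (y).
    branch_length = [0.0, 1.0, 2.0, 3.0]
    range_overlap = [1.0, 3.0, 5.0, 7.0]
    print(ols_fit(branch_length, range_overlap))
```

A Y-intercept close to zero, as reported above, corresponds to little or no range overlap at the time of divergence.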
Inferring possible geographical ranges for the ancestral nodes shed light on the dispersal events within individual guenon species groups. It also illustrated how climatic events could have simultaneously affected speciation in multiple species groups. Because we consider gene divergence times here and not population divergences, which can be considerably younger, we can only speculate about the role of particular climatic events in guenon speciation.
C. mitis species group.
– The C. mitis group seems to have originated around the watershed of the lower Congo River at the border between Lower Guinea and Angola (Fig. 1). After the initial split around 2.4 Ma, the western populations dispersed into Lower Guinea to become C. nictitans and southwards towards Angola to become C. mitis mitis. The eastern populations went south- and eastwards towards Zambia and there split into two sub-lineages around 1.7 Ma: one of them to the west of the Albertine Rift Valley and the other to the east. This time period coincides with the major uplifts in the Malawi Rift (Ebinger et al. 1993), which might have triggered the separation. Members of the western sub-lineage occupied the Congo basin (C. m. heymansi) and also went northwards along the western side of the Albertine Rift Valley. Upon entering the “northern Rift Valley” region they diversified into C. m. kandti, C. m. doggetti and C. m. stuhlmanni. Cercopithecus m. kolbi, which nowadays occurs in central Kenya, could represent a remnant population of the former eastwards range expansion. Members of the eastern sub-lineage dispersed north- and southwards along the East African coast. This sub-lineage contains C. m. boutourlini, C. m. albogularis, C. m. albotorquatus, C. m. monoides, C. m. erythrarchus, C. m. moloneyi and C. m. labiatus and represents the most recent radiation event within the Mitis group (less than 1 Ma).
C. cephus species group.
– After splitting from C. preussi / C. lhoesti, the C. cephus group originated at the contact zone between Upper and Lower Guinea about 2.2 Ma. The most ancestral lineage, including C. erythrogaster pococki and C. petaurista, diversified in
Upper Guinea. The current distribution of the species in this lineage suggests a scenario of gradual westwards migration. The common ancestor entered Nigeria, which is now occupied by C. erythrogaster, and went westwards through Togo. It crossed the Dahomey Gap, and gene flow across this region seems to have ceased ca. 1.4 Ma, which gave rise to C. petaurista. This species subsequently diversified into subspecies in Upper Guinea. The second lineage of the C. cephus group diversified within Lower Guinea. It contains C. erythrotis as sister species to a subset of C. cephus, consisting of C. c. ngottoensis and a not further identified subspecies from Cameroon. Together, this C. cephus subgroup occupies the northernmost range of the species. The C. erythrotis subspecies, one of which (C. e. erythrotis) is endemic to Bioko Island, split ca. 0.4 Ma, slightly later than the C. preussi subspecies (C. p. preussi and C. p. insularis, ca. 0.7 Ma), which show a similar distribution. The third lineage contains C. cephus cephus, collected south of the lower Congo River, and C. ascanius, which occurs further to the south. C. ascanius diversified into multiple subspecies in and around the Congo basin in one of the most recent radiation events in guenons (less than 1 Ma). The most ancestral lineage within C. ascanius, C. a. schmidti, occurs north and east of the Congo River, while all other subspecies are distributed within and south of the Congo basin. In contrast to the forest cover fluctuations that seem to be responsible for many of the previously described diversification events, the changing course of the rivers might have played a more pronounced role for C. ascanius. Many of the subspecies are clearly confined to interfluvial areas, which suggests that the rivers form barriers that are impermeable to dispersal. The Congo basin is a region of low altitudinal profile and was suggested to have been occupied by a large water body as recently as the Pliocene (Goudie 2005). It is thus conceivable that the course of the Congo River tributaries was established only recently, which in turn would explain the young separation ages of C. ascanius within this area.
C. mona species group.
– The C. mona group shows a strikingly similar pattern of diversification to that of the C. cephus group. This similarity and the large range overlap between these two species groups suggest that the same environmental factors have played a role in creating today’s diversity and distribution. It is likely that the ancestor of the C. mona group occupied the contact zone of Lower and Upper Guinea, corresponding to today’s distribution range of C. mona. One of the lineages remained to the west of the Nigeria/Cameroon border; the other dispersed eastwards. The western lineage entered the Dahomey Gap, in which C. mona still occurs today (Campbell et al. 2008). About 2.7 Ma it eventually gave rise to C. campbelli and C. lowei, which are distributed further to the west, similar to the stepwise dispersal of C. erythrogaster and C. petaurista (see above). The eastern lineage, containing C. pogonias, gave rise to two clusters. One of them occurs west and north of the Congo River and contains C. p. pogonias, C. p. nigripes and C. p. grayi. Another member of this cluster could be C. p. schwarzianus from western DRC, as its relationship with the sister cluster is only weakly supported (Fig. S3). Finally, the cluster containing the other members of C. pogonias has diversified in northeastern Congo (C. p. denti) and within the Congo basin (C. p. pyrogaster, C. p. wolfi and C. p. elegans). This last radiation within the Congo basin in particular occurred at about the same time (ca. 1 Ma) as the diversification of C. ascanius in the same geographical area. Again, the fast-changing drainage system of the Congo River tributaries could have contributed to this diversification.
It is interesting to note the unexpected placement of the C. mona specimen collected from the island of Principe (São Tomé) within C. pogonias (Fig. S3 and Fig. S4). Island radiations of guenons represent a special case, which includes dispersal by humans (Horsburgh et al. 2003). It is possible that this specimen reflects past hybridization between C. pogonias and C. mona. Importantly, hybrids between C. mona and C. pogonias were previously described (Struhsaker 1970; Detwiler et al. 2005).
C. aethiops species group (genus Chlorocebus).
– Recovering an ancient geographical signal for the widely distributed members of the C. aethiops group is challenging. It is conceivable that the ancestor originated in the area around Lower Guinea and northern DRC. Chl. sabaeus, the most ancestral member of the C. aethiops group, could be the result of an ancient westwards movement of this ancestor. Another lineage subsequently spread eastwards north of the Congo River before entering the Congo basin, where its remnant population with the representative C. dryas still lives today. The northern population gave rise to the western Chl. tantalus, the eastern Chl. aethiops, and the Tanzanian Chl. pygerythrus. Another branch of this dispersal moved southwards and gave rise to Chl. cynosuros and the South African Chl. pygerythrus.
REFERENCES
Campbell G, Teichroeb J, Paterson JD 2008. Distribution of diurnal primate species in Togo and Benin. Folia Primatol. (Basel). 79:15-30.
Couvreur TL, Chatrou LW, Sosef MS, Richardson JE 2008. Molecular phylogenetics reveal multiple tertiary vicariance origins of the African rain forest trees. BMC Biol. 6:54.
Detwiler KM, Burrell AS, Jolly CJ 2005. Conservation Implications of Hybridization in African Cercopithecine Monkeys. Int. J. Primatol. 26:661-684.
Ebinger CJ, Deino AL, Tesha AL, Becker T, Ring U 1993. Tectonic Controls on Rift Basin Morphology: Evolution of the Northern Malawi (Nyasa) Rift. J. Geophys. Res. 98:17821-17836.
Goudie AS 2005. The drainage of Africa since the Cretaceous. Geomorphology. 67:437-456.
Grubb P, Butynski TM, Oates JF, Bearder SK, Disotell TR, Groves CP, Struhsaker TT 2003. Assessment of the diversity of African primates. Int. J. Primatol. 24:1301-1357.
Horsburgh KA, Matisoo-Smith E, Glenn ME, Bensen KJ 2003. Genetic Study of Translocated Guenons: Cercopithecus mona on Grenada. In: Glenn ME, Cords M, editors. The Guenons: Diversity and Adaptation in African Monkeys. New York. Kluwer Academic/Plenum Publishers. p. 289-306.
Linder HP 2001. Plant diversity and endemism in sub-Saharan tropical Africa. J. Biogeogr. 28:169-182.
Mason VC, Li G, Helgen KM, Murphy WJ 2011. Efficient cross-species capture hybridization and next-generation sequencing of mitochondrial genomes from noninvasively sampled museum specimens. Genome Res. 21:1695-1704.
Moodley Y, Bruford MW 2007. Molecular biogeography: towards an integrated framework for conserving pan-African biodiversity. PLoS ONE. 2:e454.
Olson DM, Dinerstein E, Wikramanayake ED, Burgess ND, Powell GVN, Underwood EC, D'Amico JA, Itoua I, Strand HE, Morrison JC, Loucks CJ, Allnutt TF, Ricketts TH, Kura Y, Lamoreux JF, Wettengel WW, Hedao P, Kassem KR 2001. Terrestrial ecoregions of the world: A new map of life on Earth. Bioscience. 51:933-938.
Rabosky DL 2006. Likelihood methods for detecting temporal shifts in diversification rates. Evolution. 60:1152-1164.
Ree RH, Moore BR, Webb CO, Donoghue MJ 2005. A likelihood framework for inferring the evolution of geographic range on phylogenetic trees. Evolution. 59:2299-2311.
Ree RH, Smith SA 2008. Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Syst. Biol. 57:4-14.
Struhsaker TT 1970. Phylogenetic implications of some vocalizations of Cercopithecus monkeys. In: Napier JR, Napier PH, editors. Old World Monkeys: evolution, systematics and behaviour. New York. Academic Press. p. 365-444.
Udvardy MDF 1975. A classification of the biogeographical provinces of the world. Morges, Switzerland: IUCN.
Wilson DE, Reeder DM 2005. Mammal species of the world: A taxonomic and geographic reference. Baltimore: The Johns Hopkins University Press.
TABLE LEGENDS
Table S1: Specimen taxonomy, collection locality and GenBank accession numbers.
Taxonomic classifications by Grubb et al. (2003) and Wilson and Reeder (2005) are juxtaposed and the taxonomic correspondence of the specimens is given. The table lists country and locality of origin for each specimen and indicates missing taxa.
Table S2: Sample source, extraction, sequencing and enrichment success.
The basic data for each extracted sample are summarized. The first column lists the name of the specimen as given in museum records. Tree label corresponds to the name of each sample in the phylogenetic trees (Figs. 1, 2, 3, S3, S5, S6). Museum number: museum-internal catalog number. Collection date: date the specimen was collected in the wild and thus the date that the animal was killed or died. Extracted part: part of the specimen from which the sample for DNA extraction was collected. Extraction set: groups samples that have been extracted together. Each of the seven extraction sets was complemented by two negative controls. Enrichment pool indicates which samples have been pooled for mitochondrial genome capture (C. m. monoides pool1-3, E. patas pool1-3, and 3species pool1-3, which corresponds to the combination of C. m. monoides, E. patas and C. diana mtDNA genomes). Number of merged fragments: number of fragments that have been sequenced and merged for each sample. Number of unique mapped fragments: number of the merged fragments with unique start and end coordinates that could be mapped to the reference genome. Percentage of contaminated reads: percentage of all reads that had a higher alignment score with the set of reference sequences that included the human mtDNA genome than with the set of only Cercopithecini reference sequences (see main text for more details). Average coverage: average sequencing depth per position of each sequenced mtDNA genome. Percentage of fragments mapped to reference: corresponds to the efficiency of the enrichment method to capture fragments with mitochondrial
sequence. The next three columns summarize the information about the number of sequenced nucleotides, not sequenced nucleotides and ambiguous nucleotides (1-fold coverage or inconsistent base calling) per genome. We further indicate whether the genome had more than 13,000 sequenced nucleotides and thus was used in the subset of analyses that rely on almost complete mitochondrial genomes.
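Two of the column definitions above translate directly into simple counts, sketched here for illustration. The data structures (tuples of start and end coordinates, and pairs of alignment scores) and the function names are assumptions made for this sketch; they do not describe the actual pipeline used to generate Table S2.

```python
def count_unique_mapped_fragments(mapped_fragments):
    """Number of mapped fragments with unique (start, end) coordinates.

    mapped_fragments: iterable of (start, end) tuples on the reference genome.
    """
    return len(set(mapped_fragments))


def percent_contaminated(read_scores):
    """Percentage of reads flagged as potential contamination.

    read_scores: list of (score_with_human_set, score_cercopithecini_only)
    pairs. A read is flagged when it aligns better to the reference set that
    includes the human mtDNA genome than to the Cercopithecini-only set.
    """
    flagged = sum(1 for with_human, cerco_only in read_scores
                  if with_human > cerco_only)
    return 100.0 * flagged / len(read_scores)


if __name__ == "__main__":
    # Hypothetical fragments: the first two share start and end coordinates.
    fragments = [(1, 100), (1, 100), (50, 150)]
    print(count_unique_mapped_fragments(fragments))

    # Hypothetical alignment scores for four reads.
    scores = [(60, 50), (40, 50), (50, 50), (70, 10)]
    print(percent_contaminated(scores))
```

In this sketch, reads with equal scores against both reference sets are not counted as contaminated, following the "higher alignment score" wording of the legend.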
Table S3: Mitochondrial primers.
Sequences of the primers used to produce the mitochondrial genome from a high-quality sample (Cercopithecus hamlyni). Primers used to generate long-range PCR products for bait construction are also listed. S=sequencing primer, R=re-amplification from the initial long-range PCR product.
Table S4: Best partitioning scheme identified with PartitionFinder for mtDNA data using only RAxML evolutionary models and all models. The substitution model and the data subsets are listed for each data partition.
Table S5: Taxonomic representation of nuclear data from Perelman et al. 2011.
Table S6: Lagrange analysis: geographic species distribution, adjacency matrix and allowed ranges. Range assignment as in SI Material and Methods: A: the Congo basin; B: northern DRC; C: northern Rift Valley; D: Upper Guinean region; G: Lower Guinean region; I: Angola; J: southeastern DRC; K: southeastern Africa; L: northeastern Africa; M: Zambia; N: Ethiopia/Sudan.
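The relationship between an adjacency matrix and a set of allowed ranges can be sketched as follows. This is an illustration only: the rule used here (a range is allowed if its areas form a connected group under the adjacency relation), the maximum range size and the example adjacencies are all assumptions. The actual adjacency matrix and allowed ranges are those given in Table S6.

```python
from itertools import combinations


def is_connected(areas, adjacency):
    """Check whether a set of areas forms one connected group."""
    areas = set(areas)
    start = next(iter(areas))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency.get(current, ()):
            if neighbour in areas and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == areas


def allowed_ranges(adjacency, max_size):
    """List all connected combinations of areas up to max_size areas."""
    regions = sorted(adjacency)
    ranges = []
    for size in range(1, max_size + 1):
        for combo in combinations(regions, size):
            if is_connected(combo, adjacency):
                ranges.append("".join(combo))
    return ranges


if __name__ == "__main__":
    # Hypothetical adjacencies between three of the region codes (A, B, C);
    # these are not the adjacencies from Table S6.
    example = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
    print(allowed_ranges(example, max_size=2))
```

With these example adjacencies, the range AC is excluded because areas A and C are not adjacent.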
Table S7: Tests of diversification models in LASER.
Models were fitted to three maximum clade credibility trees pruned according to three different taxonomic scenarios. AIC = Akaike information criterion; ΔAIC_RC = difference in AIC score between the best rate-constant model (pure birth) and the model with the lowest AIC score (Yule-3-rate); r1 = initial net diversification rate (speciation events per million years); r2 = diversification rate after first rate shift; r3 = diversification rate after second rate shift; a = extinction fraction; k = carrying capacity parameter; x = rate change parameter; st1, st2 = inferred time of rate shifts in Ma before present. P values based on comparing ΔAIC_RC with the critical ΔAIC_RC from: † trees simulated under rate constancy with the same number of taxa sampled in this study; †† trees simulated under rate constancy with complete taxon sampling and pruned to the number of species sampled.