A necessarily complex model to explain the biogeography of Madagascar's amphibians and reptiles

Jason L. Brown¹,², Alison Cameron³, Anne D. Yoder¹, Miguel Vences⁴

1 Department of Biology, Duke University, Durham, NC 27708, USA

2 Current address: Department of Biology, The City College of New York, NY, USA

3 School of Biological Sciences, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK

4 Zoological Institute, Technical University of Braunschweig, Mendelssohnstr. 4, 38106 Braunschweig, Germany

Abstract

A fundamental limitation of biogeographic analyses is that pattern and process are inextricably linked, and whereas we can observe pattern, we must infer process. Yet such inferences are often based on ad-hoc comparisons using a single spatial predictor such as climate, topography, vegetation, or assumed barriers to dispersal, without taking into account competing explanatory factors. Here we present an alternative approach, using mixed-spatial models to measure the predictive potential of combinations of spatially explicit hypotheses to explain observed biodiversity patterns. In this study we compiled a comprehensive dataset of 8362 occurrence records from 745 amphibian and reptile species from Madagascar. These data were used to estimate species richness, corrected weighted endemism, and species turnover (based on generalized dissimilarity modeling). We also created or incorporated, when previously available, 18 spatially explicit predictions of 12 major diversification and biogeography hypotheses, such as mid-domain, topographic heterogeneity, sanctuary, and climate-related factors. Our results clearly demonstrate that mixed models greatly improved our ability to explain the observed amphibian and reptile biodiversity patterns. Hence, the observed biogeographic patterns were likely influenced by a combination of diversification processes rather than by a single predominant mechanism. Further, selected genera of Malagasy amphibians and reptiles differed in the major factors explaining their spatial patterns of richness and endemism. These differences suggest that key factors in diversification are lineage specific and vary among major endemic clades. Our study therefore emphasizes the importance of comprehensive analyses across taxonomic, temporal, and spatial scales for understanding the complex diversification history of Madagascar's biota. A "one-size-fits-all" model does not exist.

Keywords: Conditional Autoregressive Models, Orthogonally Transformed Beta Coefficients, Generalized Dissimilarity Modelling, Species Distribution Modelling

The spatial distribution of biodiversity is at the core of biogeography, macroecology, evolutionary biology, and conservation biology [1,2]. Biodiversity mapping indices are multi-faceted concepts with the main components being local endemism, species richness, and species turnover, of which the two latter correspond to alpha- and beta-diversity as used in community ecology [3,4]. In different combinations, these components are invoked to identify biogeographic regions [5-7], prioritize geographic areas for conservation [8], assess the effects of conservation measures [9], and/or delimit centers of speciation or extinction. These indices, however, are not independent of one another. For instance, species turnover across an area is closely related to the numbers of endemic species within each geographical unit or community, which in turn is often used to estimate areas of endemism (AOE). These represent the coincident restrictedness of taxa [10-12] and are often used to identify unique geographic areas for biodiversity conservation or biogeography studies [13,14]. Clearly, there is an inescapable circularity to these measures, and thus also to the consequent inferences made regarding biogeographic processes.

Inferences of speciation mechanisms fall prey to similar limitations. For example, it is generally assumed that species formation and diversification of a range of co-distributed taxa will be triggered or inhibited by similar barriers to gene flow, topographical and geological settings, climatic conditions and shifts, and competition. Accordingly, it is the default expectation that similar barriers (e.g., rivers, ecotones, climatic transitions) will lead to similar patterns of species endemism, turnover, and richness; again, with the underlying assumption that the observation of similar patterns among diverse species reveals a general causal mechanism of diversification across all taxa. But there are additional processes by which species richness may be generated. For example, climatic factors, environmental stability, land area, habitat heterogeneity, paleogeography, and available energy all could be spatially correlated with geographical barriers. Thus, any of these mechanisms might be indirectly, but not causally, related to diversification [15]. Patterns of endemism, on the other hand, are generally considered to reflect a particular evolutionary history, with areas of endemism corresponding to centers of diversification [16] and often including some element of stochasticity. Consequently, areas of high endemism are often also characterized by high species richness, though the inverse is not necessarily true.

Species distribution models (SDMs) allow sophisticated calculations of centers of historical habitat stability [17]. Yet their spatial comparison with current patterns usually follows narrative approaches and is similar to classical hypotheses of diversification mechanisms, with no accounting for autocorrelation among the different explanatory variables. Relying on a single explanatory variable, or employing no statistics at all, biogeography researchers often make ad-hoc comparisons with the spatial distributions of single environmental factors such as climate, topography, vegetation, or assumed barriers to dispersal. As a sign of progress, many methodological advances are being developed to address the problems described here. For example, assessments of spatial biodiversity have typically used simple geographic measures as the unit of analysis, such as the distribution range of individual species, though recent methodological refinements incorporate phylogenetic relationships among species and their evolutionary age [2,7]. Moreover, carefully parameterized SDMs can generate accurate estimates of distribution ranges [18], and novel approaches are being developed to translate patterns of species richness, endemism, and turnover more objectively into the biogeographic regions in greatest need of conservation and protection [2,7,8,19-21]. Despite this progress in conceptual and statistical tools, biological explanation of these patterns is still in its methodological infancy.

Here we apply recent statistical methods to identify the causal mechanisms that have determined the spatial distribution of Madagascar's herpetofauna. Though the search for the drivers of biological diversification was initially focused on the Neotropics, considerable attention has more recently been given to other areas such as the Australian wet tropics [22] and Madagascar [23]. Madagascar is the world's fourth largest island and hosts an extraordinary number of endemic plants and animals. For example, 100% of the native species of amphibians and terrestrial mammals, 92% of reptiles, 44% of birds, and >90% of flowering plants occur nowhere else [24]. This megadiverse micro-continent, initially part of Gondwana, has been isolated from other continents since the Mesozoic. Its current vertebrate fauna is a mix of only a few ancient Gondwanan clades and numerous endemic radiations, originating from Cenozoic overseas colonizers arriving mainly from Africa [25-27].

The extraordinary levels of endemism at the level of entire clades in Madagascar, and their long isolation from their sister lineages, provide a unique opportunity to study the mechanisms driving divergence and diversification in situ [28]. Over the past decade, numerous general mechanisms and models have been formulated to explain biodiversity distribution patterns and species diversification in Madagascar, pertaining to environmental stability (or instability), solar energy input, geographic vicariance triggered by topographic or habitat complexity, intrinsic traits of organisms, or stochastic effects [23,29-36]. Evidence has supported numerous hypotheses, though it has typically been marshalled from limited or phylogenetically constrained taxa. Comprehensive statistical approaches comparing their relative importance are rare [37].

In this paper we apply an integrative approach to simultaneously test which of several competing and complementary hypotheses are most strongly correlated with empirical biodiversity patterns (Fig. 1). We first translate a total of 12 diversification mechanisms or diversity models into explicit spatial representations. We then use diverse statistical approaches to assess spatial concordance of these predictor variables with species richness, endemism, and turnover as calculated from original occurrence data of Madagascar's amphibians and reptiles, with full species-level coverage. Our results best agree with the hypothesis that various assemblages of species are under the influence of differing causal mechanisms. The clear message is that the distribution of diverse organismal lineages will depend upon idiosyncratic factors determined by their specific life-history traits combined with stochastic historical factors. Thus, any model that endeavors to explain island-wide patterns must necessarily be complex.

RESULTS

To understand spatial distribution patterns in Madagascar's herpetofauna, we first compared range sizes and computed species richness and endemism from the modeled distribution areas of amphibians and non-avian reptiles (hereafter reptiles). Mean range size (± standard deviation) in our data set is smaller in amphibians than in reptiles, both taking into account all species (41,673 ± 55,413 km² vs. 50,205 ± 84,078 km²; t = 3.981, p < 0.001; df = 649.7) and after excluding species known from only 1 or 2 localities (64,106 ± 57,532 km² vs. 95,294 ± 87,495 km²; t = 4.511, p < 0.001; df = 427.4). Microendemics (species with distributions smaller than 1000 km²) constitute 36.5% of all amphibian and 33.6% of all reptile species in Madagascar (difference not significant; Z = 0.411, p = 0.682).

Spatial patterns of species richness are quite similar between the two groups (Fig. 2A & E) and reach their highest values in the eastern rainforest; in amphibians, richness peaks in the central east, whereas in reptiles it is more evenly distributed across the rainforest biome, with some areas of high diversity also in the north, west, and southwest. Spatial patterns of endemism in both groups (Fig. 2B & F) reveal two centers, one in the north around the Tsaratanana Massif and one in the central east. Endemism values for reptiles are also high in southwestern Madagascar, the most arid region of the island.

We applied Generalized Dissimilarity Modelling (GDM) [38,39] to identify areas of endemism on the basis of turnover patterns for non-avian reptiles and amphibians together. The major AOE obtained in a 4-class categorization of the originally continuous GDM results (Fig. 2C & H) largely mirror the bioclimatic regions of Cornet (1974).

Our test includes a total of 12 predictor hypotheses, some of which focus on the geographical pattern in which species diversity is distributed, but without making any clear assumption about how the species originated (e.g., the mid-domain or topographic heterogeneity hypotheses). Others explicitly refer to mechanisms of diversification and make predictions about how these processes affected the distribution of species diversity over geographical space (see Supplementary Documents for detailed accounts). We divided the hypotheses into two categories: one in which predictions for continuous two-dimensional spatial richness and endemism can be derived, and another in which nominal AOE predictions can be derived. The first category includes (1) the Mid-domain Effect, (2) Topographic Heterogeneity, (3) Climatic Refugia, (4) Museum (montane refugia), (5) Disturbance-Vicariance, (6) Climate Stability, (7) Sanctuary, and (8) Montane Species Pump. The second category includes (9) the River-Refuge (large river model), (10) Riverine Barrier (minor and major rivers), (11) Climatic Gradient, and (12) Watershed. All these hypotheses were transformed into explicit spatial representations (Supplementary Materials) and used as predictor variables for further analyses.

We calculated unbiased correlations between the continuous predictor and test variables following the method of Dutilleul [40], which reduces the degrees of freedom according to the level of spatial autocorrelation in the two variables (detailed results in Supplementary Materials Table S3). We found that measures of both reptile and amphibian endemism were significantly correlated with the predictor hypotheses of Topographic Heterogeneity, Disturbance-Vicariance, and Museum (montane refugia). Reptile endemism (but not amphibian endemism) is also correlated with Sanctuary. Correlations with species richness did not parallel those with endemism. Whereas reptile species richness is correlated with the Mid-domain Effect (distance) and Sanctuary hypotheses, amphibian richness is correlated with the Sanctuary hypothesis as well as with the Topographic Heterogeneity, Montane Species Pump, Disturbance-Vicariance, Museum, and River-Refuge hypotheses.
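The degrees-of-freedom correction can be illustrated with a minimal sketch. It assumes that the covariance matrices of the two variables (the hypothetical inputs `cov_x` and `cov_y`) have already been estimated from their spatial autocorrelation, and it uses the commonly cited form of the effective sample size, M = 1 + tr(BΣx)·tr(BΣy) / tr(BΣxBΣy), with B the centering matrix. All function names are illustrative; this is not the code used for the analyses reported here.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def effective_sample_size(cov_x, cov_y):
    """Effective sample size M for correlating two autocorrelated variables.

    cov_x, cov_y: n x n covariance matrices of each variable (hypothetical
    inputs; estimating them from correlograms is not shown here).
    """
    n = len(cov_x)
    # Centering matrix B = I - J/n removes the sample mean.
    b = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
         for i in range(n)]
    bx = matmul(b, cov_x)
    by = matmul(b, cov_y)
    # Approximate variance of the sample correlation under autocorrelation.
    var_r = trace(matmul(bx, by)) / (trace(bx) * trace(by))
    return 1.0 + 1.0 / var_r


def corrected_t(r, m_eff):
    """t statistic for correlation r, tested with m_eff - 2 degrees of freedom."""
    return r * math.sqrt((m_eff - 2.0) / (1.0 - r * r))


def ar1_cov(n, rho):
    """Toy covariance for n sites on a line, with correlation rho**distance."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]
```

With uncorrelated sites the effective sample size equals the number of sites; positive autocorrelation in both variables shrinks it, which makes the significance test more conservative.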

In the univariate correlation analyses of nominal geospatial data (those related to AOE predictions), we compared the biogeographic zonation of Madagascar suggested by the GDM analysis of amphibian and reptile distributions with zonations derived from five predictor hypotheses. We found all predictor variables (corresponding to the hypotheses Riverine-major, Riverine-minor, Gradient, River-Refuge, and Watershed) to be significantly correlated with the 15-class GDM zonation, and all but Watershed with the 4-class GDM zonation (Table 1). Both GDM classifications share the most overlap with the Riverine and Gradient hypotheses (40.9–54.3% and 56.2–71.1%, respectively; Table 1).

Given the significant correlation of each of the spatial amphibian and reptile biodiversity patterns with various predictor variables, we used mixed conditional autoregressive spatial models (CAR models) to test the influences of various predictors simultaneously. To avoid over-parameterization we used AICc (corrected Akaike Information Criterion), an information-theoretic approach, to compare models with different sets of predictors. We found that complex models including most of the biogeography hypotheses (continuous predictor variables) performed best, based on the lowest AICc values, and consequently used these for further analysis. Detailed contributions of each predictor to the models of richness, endemism, and GDM zonation are summarized in Supplementary Materials Table S4. The top five variables contributed 49.4–75.9% to the models. For a simplified graphical representation (Fig. 3), we grouped the three Mid-domain Effect hypotheses (latitude, longitude, and distance), the three principal components representing the Climate Gradient hypothesis, and the three hypotheses focused on topography (Topographic Heterogeneity, Disturbance-Vicariance, Montane Species Pump) into one category each (Figs 3 & 4). We found notable influences of the Mid-domain Effect, especially on the GDM and on the species richness and endemism of reptiles (30.9%, 32.9%, and 45.5%, respectively). However, it is important to point out that almost all the Mid-domain correlation coefficients were negative, indicating that Mid-domain Effects do not play a key role in determining spatial patterning. Climate Gradient effects influenced all the models of biodiversity about equally, contributing roughly a quarter to each (25.1–27.7%), though in many cases the sign of the contribution varied. In this case, however, a positive correlation was not expected. The topography variables contributed positively to the richness and endemism models of amphibians and reptiles, with joint influences of 9.1% and 22.4% on richness, and 6.5% and 17.3% on endemism. The two unique hypotheses, Sanctuary and Museum, each contributed positively to all models, with Museum contributing 7.1–17.1% (one of the few hypotheses to contribute >5% and to be positively correlated with all biodiversity measurements). The Sanctuary hypothesis also contributed positively to all models, though to a lesser degree than the Museum hypothesis (which demonstrated little contribution to reptile endemism).
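The model comparison step can be sketched as follows. The example uses the standard definition AICc = AIC + 2k(k + 1)/(n − k − 1), where k is the number of estimated parameters and n the number of observations; the function names, the log-likelihoods, and the sample size are made up for illustration and are not values from this study.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(fits, n):
    """Return (name, AICc, delta) tuples sorted from best (lowest AICc) to worst.

    fits maps a model name to (log_likelihood, number_of_parameters).
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    return sorted(((name, s, s - best) for name, s in scores.items()),
                  key=lambda row: row[1])


# Made-up log-likelihoods, for illustration only (not values from this study).
example_fits = {"topography only": (-420.0, 3), "all predictors": (-395.0, 12)}
ranking = rank_models(example_fits, n=200)
```

In this toy case the larger model is ranked first despite its heavier parameter penalty, because its gain in likelihood outweighs the penalty.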

To assess variation in biogeography patterns among major groups of the Malagasy herpetofauna, we calculated mixed CAR models using the same methods for richness and endemism of four exemplar sub-clades: the leaf chameleons (Brookesia), tree frogs (Boophis), day geckos (Phelsuma), and Oplurus iguanas (including the monotypic iguana genus Chalarodon). The top contributors to the models were drastically different for several of these clades (Fig. 4). For instance, the topography variables had strong influences on Boophis richness, with a joint contribution of 24.5%, but contributed much less to explaining the patterns of most other groups. Further, the Sanctuary hypothesis had a strong influence on the Brookesia and iguana models, though it contributed very little to the predictions of endemism in Boophis and Phelsuma. Mid-domain Effects were apparent in most models, but the sign of the correlation and the contribution of each Mid-domain hypothesis varied considerably.

DISCUSSION

The results of this study clearly demonstrate that single-mechanism explanatory hypotheses of spatial patterning in Madagascar's herpetofauna (and presumably, other Malagasy vertebrates) are inadequate. Thus, we propose a novel method for examining and synthesizing spatial parameters such as species richness, endemism, and community similarity. In this framework, biogeographic hypotheses are explanatory variables. The resulting mixed-model geospatial approach to biogeography analyses is both more robust and more realistic. Our approach has the potential to reduce bias and subjectivity in the search for prevalent factors influencing the distribution of biodiversity, both in Madagascar and elsewhere. Currently, researchers typically approach such questions through univariate and sometimes narrative analyses that examine the fit of the observed patterns to only single explanatory models or mechanisms (e.g., in Madagascar [33,35,41]) or compare a limited number of competing variables in univariate approaches [37]. Such analyses are hampered, however, by spatial autocorrelation of biodiversity patterns and predictor variables, which inflates type I errors in traditional statistical tests [42,43]. Several solutions have been proposed for this problem. Some authors attempt to exclude spatial autocorrelation from models [44], whereas others incorporate spatial autocorrelation as a predictive parameter in geospatial models [45-47], as was done in this study.

The results obtained here for some sub-clades are in agreement with previous analyses, while others are not. For example, the high influence of the mid-domain effect on Boophis treefrogs, one of the most species-rich frog genera in Madagascar, agrees with a previous analysis performed by Colwell & Lees [48] for all Malagasy amphibians (with a high representation of Boophis). In contrast, the negative contributions of the mid-domain effects to the biodiversity patterns of the other genera in the analysis are unsurprising given that their centers of richness and endemism are in either southern or northern Madagascar, but not in central parts of the island. Previous studies postulated a high influence of topography on the diversification of leaf chameleons (Brookesia) [41,49], though this is not supported by our analysis. This latter example illustrates a dilemma of scale inherent in all comparisons of spatial data sets. In fact, the distribution of Brookesia is highly specific to certain mountain massifs in northern Madagascar, while the genus is largely absent from the equally topographically heterogeneous south-east. This absence is probably due to its evolutionary history, with diversification mainly in the north and limited capacity for range expansion [41]. This historical distribution pattern probably accounts for the low influence of the topographic hypotheses on Madagascar-wide Brookesia richness and endemism, while at a smaller spatial scale (northern Madagascar) these hypotheses might well have strong predictive value.

While patterns of richness and endemism of the Malagasy herpetofauna have been analyzed several times for various purposes based on partial data sets [8,35,37,41,48], the analysis of turnover of species composition and the definition of biogeographic regions following from such explicit analyses are still in their infancy. For reptiles, Angel's [50] proposal of biogeographic regions based on classical phytogeography, i.e., regions based on plant community composition [51], has usually been adopted [52]. Later, Schatz [53] refined this zonation of Madagascar based on explicit bioclimatic analyses, and Glaw & Vences [54] proposed a detailed geographical zonation based on the areas of endemism of Wilmé [33]. The GDM approach herein is the first explicit analysis of a large herpetofaunal dataset to geographically delimit regions distinguished by abrupt changes in their amphibian and reptile communities. This model turned out to agree remarkably well with classical bioclimatic and phytogeographic zonations of Madagascar [51,53] and was strongly correlated with climatic explanatory variables (Fig. 3). Especially in the 4-class GDM, the regions correspond almost perfectly with those proposed by Schatz [53] based on bioclimate, i.e., the eastern humid, central highland/montane, western arid, and southwestern subarid zones. Although the coincidence of the precise boundaries of these regions might be methodologically somewhat biased, as we interpolated community distribution using climate variables in the analysis, the model is still mainly based on real distributional information of species and thus provides convincing evidence that amphibian and reptile communities strongly differ among the major bioclimatic zones of Madagascar.

Several authors have suggested that the current distribution of biotic diversity in the tropics resulted from a complex interplay of a variety of diversification mechanisms [55,56]. This implies that no single hypothesis adequately explains the diversification of broad taxonomic groups; our results support this assumption. Richness, endemism, and turnover of large and heterogeneous groups, exemplified by the all-species amphibian and reptile data sets, were in all cases best explained by complex CAR models. These models have the advantage of incorporating most or all of the originally included explanatory variables.

Several alternative explanations may account for this outcome. Patterns of biodiversity may not be strongly correlated with any of the predictor mechanisms simply because none of them provides the causal mechanism underlying the diversification processes. As another consideration, spatial predictions of some of the biodiversity hypotheses may have been inaccurate, though we took great care to avoid such mistakes. In any event, improvements in these methods may result in different outcomes in future analyses.

Caveats aside, the results of this study almost certainly support a third explanation: that different clades of organisms are each predominantly influenced by a different set of diversification mechanisms. In turn, these are driven by intrinsic factors, such as morphological or physiological constraints, or by extrinsic factors, such as an initial diversification in an area characterized by a certain topography, climate, or biotic composition. This alternative is supported by the observation that the patterns of several of the smaller subgroups in our analysis were indeed best explained by opposing predominant variables, e.g., topographic heterogeneity and museum (Boophis endemism) vs. climate stability and sanctuary (Brookesia endemism). An overarching message is that the taxonomic scale of analysis is of extreme importance when attempting to derive global explanations of biodiversity distribution patterns. Including too many taxa will blur the existing differences among clades and lead to complex explanatory models, whereas patterns within specific clades may be best explained by simple models.

The method proposed herein allows for a more objective quantification of the influences of particular diversification mechanisms on biodiversity patterns than traditional univariate approaches. Further developments of the method should especially focus on including a phylogenetic dimension and, when appropriate (for predictor hypotheses), a temporal component. Geospatial analyses of biodiversity patterns typically treat species as equivalent and independent data points, though in reality they are entities with substantial variation in parameters such as evolutionary age, dispersal capacity, and population density, and with different degrees of relatedness depending on their position in the tree of life. This multilayered information can be included in various ways in the CAR/OTBC approach, e.g., by plotting richness and endemism of evolutionary history rather than taxonomic identity, calculating turnover only for sister species with adjacent ranges, or repeating the calculations for sets of species defined by particular nodes on a phylogenetic tree. This latter approach, iterating the analysis for successively more inclusive clades, appears particularly promising for identifying those moments in evolutionary history at which shifts in prevalent diversification mechanisms have occurred. Given this perspective, we can begin to tease apart the diversification histories of individual clades versus prevailing biogeoclimatic events that shape entire biotas.

MATERIALS AND METHODS

Biodiversity Estimates

Species Distribution Modeling

Species data consisted of 8362 occurrence records of 745 Malagasy amphibian and reptile species (325 and 420 species, respectively). Species distribution models (SDMs) were limited to species that had at least 3 unique occurrence points at the 30 arc-second spatial resolution (ca. 1 km). The reduced dataset represented 453 species (5440 training points of 248 reptile and 205 amphibian species), with a mean of 12 training points per species (max = 131). For 107 amphibian and 119 reptile species with only 1-2 occurrence records, a 10 km buffer was applied to point localities in place of an SDM. The species distribution models were generated in MaxEnt v3.3.3e [57] using the following parameters: random test percentage = 25, regularization multiplier = 1, maximum number of background points = 10000, replicates = 10, replicated run type = cross validate.

One limitation of presence-only SDM methods is the effect of sample selection bias, where some areas in the landscape are sampled more intensively than others [58]. MaxEnt requires an unbiased sample. To account for sampling biases, we used a bias file representing a Gaussian kernel density of all species occurrence localities. The bias file upweighted presence-only data points with fewer neighbors in the geographic landscape [59]. Species distributions were modeled for the current climate using the 19 standard bioclimatic variables derived from interpolation of climatic records between 1950 and 2000 from weather stations around the globe (WorldClim 1.4 [60]). The non-climatic variables geology, aspect, elevation, solar radiation, and slope were also included [61,62]. All layers were projected to the Africa Albers Equal-Area Cylindrical projection in ArcMap at a resolution of 0.91 km².
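The idea behind the kernel-density bias correction can be sketched in a few lines. This is only an illustration of how a Gaussian kernel density separates densely from sparsely sampled localities; the bandwidth, the coordinates, and the inverse-density rescaling are assumptions made for the example, and the sketch does not reproduce how MaxEnt itself consumes a bias file.

```python
import math


def gaussian_kernel_density(points, query, bandwidth_km):
    """Sum of Gaussian kernels centred on each occurrence, evaluated at query.

    points and query are (x, y) coordinates in km on a projected grid.
    """
    total = 0.0
    for x, y in points:
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth_km ** 2))
    return total


def relative_weights(points, bandwidth_km):
    """Inverse-density weights, rescaled so that they average 1."""
    # Each point contributes a kernel value of 1 to its own density,
    # so the density is never zero and the inverse is always defined.
    density = [gaussian_kernel_density(points, p, bandwidth_km) for p in points]
    inverse = [1.0 / d for d in density]
    mean_inverse = sum(inverse) / len(inverse)
    return [w / mean_inverse for w in inverse]


# Three clustered localities and one isolated locality (toy coordinates).
localities = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (50.0, 50.0)]
weights = relative_weights(localities, bandwidth_km=5.0)
```

The isolated locality receives the largest weight, which is the qualitative behaviour described above for points with fewer neighbors.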

Correcting SDMs for Over-prediction

To limit over-prediction of SDMs, a problem common when modeling distributions of Madagascar's biota [8,37], we clipped each SDM following the approach of Kremen et al. [8]. This method produces models that represent suitable habitat within an area of known occurrence (based on a buffered minimum convex polygon, MCP), excluding suitable habitat far outside the observed range. The size of the buffer was based on the area of the MCP. We used buffer distances of 20 km, 40 km, and 80 km for three MCP area classes: 0-200 km², 200-1000 km², and >1000 km², respectively. All corrected SDMs were proofed by taxonomic experts to ensure reliability; if a model did not closely match knowledge of areas where distributions were well documented, or if its expected distribution could not be evaluated because little prior information existed regarding the species' distribution or because its taxonomy was convoluted, the species was excluded from analyses (n = 71).
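The buffer rule described above amounts to a simple lookup, sketched here. The function name is hypothetical, and the treatment of the class boundaries (an MCP of exactly 200 or 1000 km² falling into the smaller class) is an assumption, since the text does not specify it.

```python
def buffer_distance_km(mcp_area_km2):
    """Buffer distance (km) applied around a species' MCP, by MCP area class."""
    if mcp_area_km2 < 0:
        raise ValueError("area must be non-negative")
    if mcp_area_km2 <= 200:       # class 0-200 km^2 (boundary handling assumed)
        return 20
    if mcp_area_km2 <= 1000:      # class 200-1000 km^2 (boundary handling assumed)
        return 40
    return 80                     # class >1000 km^2
```

For example, a species whose MCP covers 500 km² would have its SDM clipped to the MCP plus a 40 km buffer.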

Range Sizes, Species Richness and Corrected Weighted Endemism

350

For descriptive range-size statistics, distribution range-sizes were sampled for all species at 0.01 degrees

2

from corrected SDMs (or buffered point data where applicable) and a student’s

351 t-test with unequal variance was performed between amphibian and reptile species. To assess

352 differences in the frequency of microendemics among the two groups, we converted all

353 distributions that were > or < than 1000 km 2 to a value of 0 and 1, respectively. We then

354 calculated the mean frequency for both groups and ran a binomial test among both groups.

355 Species richness was calculated separately for amphibians and reptiles by summing the

356 respective corrected binary SDMs (based on a maximum training sensitivity plus specificity

357 threshold) and, for species with 1-2 occurrence records, buffered points in ArcGIS. This

358 provided a high resolution estimate of richness that is less affected by spatial scale and

359 incomplete sampling than traditional measurements based solely on occurrence records.
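As a minimal sketch of the richness calculation, the following assumes each species has a continuous suitability grid (a list of rows) and a species-specific threshold supplied by the modeling software. The function names and the toy values are illustrative only and are not taken from the study.

```python
def binarize(suitability, threshold):
    """Convert a continuous suitability grid to presence (1) / absence (0)."""
    return [[1 if value >= threshold else 0 for value in row] for row in suitability]


def species_richness(binary_maps):
    """Sum binary presence grids cell by cell; all grids must share one shape."""
    if not binary_maps:
        raise ValueError("at least one binary map is required")
    n_rows, n_cols = len(binary_maps[0]), len(binary_maps[0][0])
    richness = [[0] * n_cols for _ in range(n_rows)]
    for grid in binary_maps:
        if len(grid) != n_rows or any(len(row) != n_cols for row in grid):
            raise ValueError("all grids must have the same dimensions")
        for r in range(n_rows):
            for c in range(n_cols):
                richness[r][c] += grid[r][c]
    return richness


# Two hypothetical species on a 2 x 2 grid, each with its own threshold.
species_a = binarize([[0.9, 0.2], [0.6, 0.1]], threshold=0.5)
species_b = binarize([[0.4, 0.8], [0.7, 0.3]], threshold=0.6)
richness = species_richness([species_a, species_b])
```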

Measures of endemism are inherently dependent on spatial scale. We chose a grid scale of 82 x 63 km, separating Madagascar into 24 latitudinal rows and 8 longitudinal columns, to reduce problems associated with estimating endemism over areas that are too small or too large [12,35]. Endemism was measured as corrected weighted endemism (CWE), in which the proportion of endemics is inversely weighted by range size (species with smaller ranges are weighted more than those with large ranges [63]) and this value is divided by the local species richness [12]. CWE emphasizes areas that have a high proportion of animals with restricted ranges, but not necessarily high species richness. We calculated CWE separately for reptiles and amphibians using SDMtoolbox v1 (Brown in review).
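The CWE calculation can be sketched as follows, assuming range size is measured as the number of grid cells a species occupies and that each cell's weighted endemism (the sum of inverse range sizes) is divided by its species count. This is an illustrative reading of the definition above, not the SDMtoolbox implementation; the function name and input layout are hypothetical.

```python
from collections import Counter


def corrected_weighted_endemism(cell_species):
    """cell_species maps a cell id to the set of species present in that cell."""
    # Range size of each species = number of cells in which it occurs.
    range_size = Counter()
    for species in cell_species.values():
        range_size.update(set(species))

    cwe = {}
    for cell, species in cell_species.items():
        species = set(species)
        if not species:
            cwe[cell] = 0.0  # assumption: empty cells score zero
            continue
        weighted_endemism = sum(1.0 / range_size[s] for s in species)
        cwe[cell] = weighted_endemism / len(species)
    return cwe


# Species "y" is widespread; "x" and "z" are each restricted to one cell.
scores = corrected_weighted_endemism({"A": {"x", "y"}, "B": {"y"}, "C": {"y", "z"}})
```

In this toy case cells A and C score higher than B because half of their species are single-cell endemics, even though no cell is especially species rich.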

Generalized Dissimilarity Modeling

Generalized Dissimilarity Modeling (GDM) is a statistical technique extended from matrix regression and designed to accommodate the nonlinear data commonly encountered in ecological studies [38]. One use of GDM is to analyze and predict spatial patterns of turnover in community composition across large areas. In short, a GDM is fitted to available biological data (the absence or presence of species at each site, together with environmental and geographic data), and compositional dissimilarity is then predicted at unsampled localities throughout the landscape based on the environmental and geographic data in the model. The result is a matrix of predicted compositional dissimilarities (PCD) between pairs of locations throughout the focal landscape. To visualize the PCD, multidimensional scaling was applied, reducing the data to three ordination axes, and in a GIS each axis was assigned a separate RGB color (red, green or blue).
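To make the idea of a compositional dissimilarity matrix concrete, the sketch below computes pairwise dissimilarities from presence-absence data. The text does not name the index, so the Sørensen dissimilarity (the presence-absence form of Bray-Curtis) is used here purely as an assumption. The GDM fitting step itself, which models these values as nonlinear functions of environmental and geographic distance, is not reproduced.

```python
def sorensen_dissimilarity(site_a, site_b):
    """1 - 2 * shared / (richness_a + richness_b) for two species sets."""
    a, b = set(site_a), set(site_b)
    if not a and not b:
        return 0.0  # assumption: two empty sites are treated as identical
    shared = len(a & b)
    return 1.0 - (2.0 * shared) / (len(a) + len(b))


def dissimilarity_matrix(sites):
    """Symmetric matrix of pairwise dissimilarities for a list of species sets."""
    n = len(sites)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sorensen_dissimilarity(sites[i], sites[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


observed = dissimilarity_matrix([{1, 2, 3}, {2, 3, 4}, {7, 8}])
```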

Due to computational limitations associated with pairwise comparisons of large datasets, we could not predict compositional dissimilarities among all sites in our high-resolution Madagascar dataset. To address this, we randomly sampled 2500 points throughout Madagascar from a ca. 10 km² grid. We then recorded the absence or presence of each of the 679 species at each locality. We used the same high-resolution environmental and geographic data used in the SDMs. These 23 layers were reduced to nine vectors in a principal component analysis, which represented 99.4% of the variation in the original data. These data were sampled at the same 2500 localities. Both datasets (species presence and environmental data) were input into a generalized dissimilarity model using the R package GDM R Distribution Pack v1.1 (www.biomaps.net.au/gdm/GDM_R_Distribution_Pack_V1.1.zip). We then extrapolated the GDM into the high-resolution climate dataset by assigning ordination scores using k-nearest neighbor classification (k = 3, numeric Manhattan distance), calculating each ordination axis independently [38].
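The extrapolation step can be sketched as a k-nearest neighbor lookup in environmental space. The text gives k = 3 and Manhattan distance but does not say how the three neighbors are combined, so averaging their scores is an assumption made here. The function names and toy values are hypothetical.

```python
def manhattan(u, v):
    """Manhattan (city-block) distance between two environmental vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_axis_score(target_env, sampled_env, sampled_scores, k=3):
    """Score for one ordination axis at an unsampled cell.

    target_env: environmental vector of the unsampled cell.
    sampled_env: environmental vectors of the sampled localities.
    sampled_scores: ordination scores (one axis) at those localities.
    Assumption: the k nearest scores are averaged.
    """
    if k < 1 or k > len(sampled_env):
        raise ValueError("k must be between 1 and the number of sampled points")
    order = sorted(range(len(sampled_env)),
                   key=lambda i: manhattan(target_env, sampled_env[i]))
    nearest = order[:k]
    return sum(sampled_scores[i] for i in nearest) / k


env = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
axis1 = [0.0, 0.3, 0.6, 9.0]
score = knn_axis_score((0.2, 0.2), env, axis1, k=3)
```

Each of the three ordination axes would be processed with a separate call, which mirrors the statement that the axes were calculated independently.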

The continuous GDM was transformed into a model with four major classes, and each of these was then classified separately into 3-5 minor classes. The numbers of major and minor classes were based on hierarchical cluster analyses in SPSS v19 [64] using a “bottom-up” approach. The number of classes equaled the number of dendrogram nodes with relative distances (scaled from 0-1) at 0.71 and 0.63 for major and minor groups, respectively. The distance cut-off can be somewhat arbitrary; however, in our data there were obvious discontinuities (long dendrogram branches between nodes) at these two values. The resulting classified models were interpolated into high-resolution climate space using a k-nearest neighbor classification as described above.

Biogeography hypotheses

We examined which specific spatial predictions for the three biodiversity patterns in Madagascar, namely species richness, endemism and/or areas of endemism (AOE; the coincident restrictedness of taxa), could be derived from each of 12 biogeography hypotheses, and then translated these predictions into spatial models in a GIS.

In a GIS, spatially explicit predictions of the three biodiversity patterns (species richness, endemism or areas of endemism) were estimated for each biogeography hypothesis. For some of the hypotheses, not all three metrics of biodiversity were calculated because expectations were lacking or incomplete (e.g. not all hypotheses make predictions about AOE). Because of these incomplete biodiversity pattern predictions, comparisons among hypotheses are statistically non-trivial. This is in part because few diversification hypotheses capture all facets of biodiversity (species richness, endemism, AOE). Further, many estimates of biodiversity patterns rely on components of climate or geography; thus some are based on the same data and are not entirely independent of each other. Each hypothesis was generated at a spatial resolution of 30 arc-seconds (matching the resolution of the GDM and species richness estimates, later transformed to 0.91 km²). For the endemism analyses, each biogeography hypothesis was upscaled to match the resolution of the endemism analyses by averaging all values encompassed in each cell.
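The upscaling step amounts to a block average. The sketch below assumes a rectangular grid stored as a list of rows and a single integer aggregation factor; the coarse cells used in the study were not square (82 x 63 km), so a real implementation would use separate row and column factors. Cells marked as no-data are skipped, which is also an assumption.

```python
def upscale_mean(grid, factor, nodata=None):
    """Aggregate a fine grid to a coarser one by averaging factor x factor blocks."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, n_rows, factor):
        coarse_row = []
        for c0 in range(0, n_cols, factor):
            # Collect valid values inside the block (edge blocks may be smaller).
            values = [grid[r][c]
                      for r in range(r0, min(r0 + factor, n_rows))
                      for c in range(c0, min(c0 + factor, n_cols))
                      if grid[r][c] is not nodata]
            coarse_row.append(sum(values) / len(values) if values else nodata)
        coarse.append(coarse_row)
    return coarse


fine = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
coarse = upscale_mean(fine, factor=2)
```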

Spatial Statistics

The spatial predictions derived from the various biodiversity hypotheses resulted in either continuous or nominal categorical data. Conducting statistical tests between data types is nontrivial and, in some cases, illogical or impossible. Spatial data are represented in GIS in two different formats: raster and vector. Geospatial raster data are composed of equal-sized squares, tessellated in a grid, with each cell representing a value (often continuous data), such as elevation. Spatial vector data (commonly called ‘shapefiles’) can be represented by points, lines, or polygons, such as localities, roads and countries, respectively. Vector data are non-topological and represent discrete features. They are often used to depict nominal data, where the relationship of data categories to others is unknown or non-linear.

Raster data can be converted to vector data (and vice versa), and the data type (i.e. nominal or continuous) may or may not change when converted. For example, in some cases continuous data can be converted to ordered categories (ordinal data) when converted from raster to polygon. However, if the same data were converted back to a raster file, they would remain categorical data due to data loss in the first conversion. Regardless of GIS data format, statistical tests need to be chosen according to the data types; however, GIS data format remains equally important, as often a single data format is required to perform a spatial statistic of interest (i.e. software input limitations).

Analyses- Continuous Data

To assess a global measurement of correlation between continuous data, we calculated Pearson correlations following the unbiased correlation method of Dutilleul [40] and using the software Spatial Analysis in Macroecology [65].

Analyses- Nominal Categorical Data

Comparisons of nominal categorical spatial data (i.e. AOE predictions compared to the classified GDM) focused on the spatial distributions of the borders between the subunits. We used the following methods to measure similarities and significance: (1) border overlap, and (2) Pearson correlation coefficients (r) with Dutilleul’s spatial correlation (see above).

(1) Border overlap was calculated by sampling the landscape at 1 km resolution for the presence of a border. If a border was present, a point was placed. We then measured the spatial overlap of the sampled borders of two landscapes. In all analyses, a 10 km buffer was applied to the overlap calculation, and border points from the two datasets that lay within 10 km of each other were considered overlapping boundaries. To account for differences in the level of subdivision of layers, overlap was converted to a percentage and averaged for both layers being compared. The country outline was excluded from all comparisons and thus only intra-country boundaries were compared.

(2) To assess global correlation between two polygon shapefiles, each shapefile was converted to a distance raster, measuring the closest distance from any point in the landscape to a boundary. Using these layers we measured a Pearson correlation (unbiased correlation after Dutilleul [40]), where high correlation coefficients represent two landscapes that have congruent areas that are isolated from boundaries and other congruent areas that are adjacent to boundaries. Each distance landscape was evenly sampled by 2000 points and correlations were assessed on the values of these points.
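Both border comparisons can be sketched in a few lines, assuming borders have already been sampled as points in a projected coordinate system measured in kilometres. The functions below are hypothetical helpers. They compute the raw percentage overlap in each direction, its mean, and an ordinary Pearson r on distance-to-border values; the Dutilleul correction of the degrees of freedom, which the study applied to assess significance, is not implemented.

```python
import math


def percent_within(points_a, points_b, buffer_km=10.0):
    """Percentage of border points in A lying within buffer_km of any point in B."""
    if not points_a:
        return 0.0
    hits = sum(1 for p in points_a
               if any(math.dist(p, q) <= buffer_km for q in points_b))
    return 100.0 * hits / len(points_a)


def mean_border_overlap(points_a, points_b, buffer_km=10.0):
    """Average of the two directional overlap percentages."""
    return (percent_within(points_a, points_b, buffer_km)
            + percent_within(points_b, points_a, buffer_km)) / 2.0


def distance_to_border(sample, border_points):
    """Closest distance from a sample location to any border point."""
    return min(math.dist(sample, q) for q in border_points)


def pearson_r(x, y):
    """Plain (non-spatial) Pearson product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


border_a = [(0, 0), (0, 5), (0, 10), (0, 50)]
border_b = [(3, 0), (3, 5), (3, 10)]
overlap = mean_border_overlap(border_a, border_b)

samples = [(1, 0), (5, 0), (9, 0)]
dist_west = [distance_to_border(s, [(0, 0)]) for s in samples]
dist_east = [distance_to_border(s, [(10, 0)]) for s in samples]
r_opposed = pearson_r(dist_west, dist_east)
```

Averaging the two directional percentages is what keeps a finely subdivided layer from being rewarded simply for having more borders, which is the motivation given in the text.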

Analyses- Mixed Models of Continuous Data

To determine the influence of each biogeography hypothesis in predicting the observed biodiversity patterns, we integrated all continuous biogeography hypotheses into a single mixed conditional autoregressive (CAR) model using the software Spatial Analysis in Macroecology [65]. To normalize the predictor variables, Box-Cox transformations (Box and Cox 1964) were performed. The lambda parameter was estimated by maximizing the log-likelihood profile in the R package geoR [47]. A Gabriel connection matrix was used to describe the spatial relationship among sample points [66]. Using Gabriel networks, which form short connections between neighboring points, is preferable to (i.e. more conservative than [67]) using inverse-decaying distances because in most empirical datasets the residual spatial autocorrelation tends to be stronger at smaller distance classes [68].
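For readers unfamiliar with Gabriel networks, a minimal sketch of the connection rule follows. Two points are connected when no third point falls inside the closed disc whose diameter is the segment joining them. This brute-force version assumes distinct points and is only intended to illustrate the rule; it is not the routine used by Spatial Analysis in Macroecology.

```python
from itertools import combinations


def gabriel_edges(points):
    """Return index pairs (i, j) that are connected in the Gabriel graph."""

    def squared_distance(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    edges = []
    for i, j in combinations(range(len(points)), 2):
        d_ij = squared_distance(points[i], points[j])
        # A third point k lies in the closed disc with diameter ij
        # exactly when d(i,k)^2 + d(j,k)^2 <= d(i,j)^2.
        blocked = any(
            squared_distance(points[i], points[k])
            + squared_distance(points[j], points[k]) <= d_ij
            for k in range(len(points)) if k not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges


# The middle point blocks the long connection between the two outer points.
edges = gabriel_edges([(0, 0), (4, 0), (2, 1)])
```

Because long connections are blocked by intervening points, the resulting matrix only links near neighbors, which is the property the text relies on.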

The main goal of our mixed spatial analyses was to determine the combination of biogeography hypotheses that best predicts the observed biodiversity patterns. If each explanatory variable were incorporated natively, considerable multicollinearity would often cause only a few variables to account for the majority of the model. To estimate the true contribution of each hypothesis in the context of a mixed model (even if highly correlated with others), we developed a novel approach that removes collinearity from the explanatory variables (although explicit variable identity is temporarily lost in the process). The transformed explanatory variables are then run in a CAR analysis and the resulting standardized model contributions are transformed back into the original variable identities, reflecting the relative contribution of each in the model. This process is referred to here as Orthogonally Transformed Beta Coefficients (OTBCs).

Orthogonally Transformed Beta Coefficients

Each biogeography hypothesis was standardized from zero to one. This ensured that the component loadings reflected the relative contribution of each biogeography hypothesis. A principal component analysis was performed on the standardized biogeography hypotheses using a covariance matrix. All the resulting principal components (PCs) were extracted and then loaded as explanatory variables in the CAR model. The CAR analyses were run iteratively, starting with all PCs as explanatory variables and then excluding each PC that did not contribute significantly to the model (α = 0.05) until the final model included only PCs that contributed significantly. Because each PC represented a linearly uncorrelated variable, only the relevant, independent data were incorporated into the final CAR model. The resulting standardized beta coefficients ($\beta_j$ from the CAR analyses, Fig. 1 and Equation 1) were then multiplied by the value of the corresponding component loadings ($\alpha_{ij}$ from the PCA, see Equation 1). The absolute value of the product reflects the relative contributions of each biogeography hypothesis to each PC, which are weighted by the PC’s contribution in the CAR model (herein termed the weighted component loadings or $WCL_{ij}$, Equation 1). The weighted component loadings ($WCL_{ij}$, Equation 1) were then summed for each biogeography hypothesis across all PCs ($H_i$) and depict the contributions of each hypothesis in the CAR model. The $H_i$ value was then converted to a percentage ($HP_i$) to allow comparison among all CAR analyses. A positive or negative correlation was determined for each biogeography hypothesis by running a separate CAR analysis using the raw biogeography variable as a single explanatory variable (all other parameters were matched).

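Equation 1

\[
WCL_{ij} = |\beta_j| \times |\alpha_{ij}|, \qquad
H_i = \sum_{j} WCL_{ij}, \qquad
H_{All} = \sum_{i}\sum_{j} WCL_{ij}, \qquad
HP_i = \left(\frac{H_i}{H_{All}}\right) \times 100
\]

Here $i$ indexes the biogeography hypotheses and $j$ indexes the PCs retained in the final CAR model.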
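The arithmetic of Equation 1 can be followed with a short worked sketch. It assumes the standardized beta coefficients of the retained PCs and the matching PCA loadings are already available; the numbers used are invented for illustration and are not results from the study, and the function name is hypothetical.

```python
def otbc_percentages(betas, loadings):
    """Percent contribution of each hypothesis (HP_i in Equation 1).

    betas: standardized CAR coefficient of each retained PC (index j).
    loadings: loadings[i][j] is the loading of hypothesis i on PC j.
    """
    contributions = []
    for row in loadings:
        if len(row) != len(betas):
            raise ValueError("each hypothesis needs one loading per retained PC")
        # H_i = sum over PCs of |beta_j| * |alpha_ij|
        contributions.append(sum(abs(b) * abs(a) for b, a in zip(betas, row)))
    total = sum(contributions)  # H_All
    if total == 0:
        raise ValueError("all weighted component loadings are zero")
    return [100.0 * h / total for h in contributions]


# Two hypotheses, two retained PCs (illustrative values only).
betas = [0.5, -0.2]
loadings = [[0.8, 0.1],    # hypothesis 1
            [-0.6, 0.7]]   # hypothesis 2
percentages = otbc_percentages(betas, loadings)
```

Because absolute values are taken, the percentages describe the size of each contribution only; the direction of the effect has to come from the separate single-variable CAR analyses described in the text.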

Acknowledgments

We are grateful to numerous friends and colleagues who provided invaluable assistance during fieldwork and previous discussions of Madagascar's biogeography. We would particularly like to thank Franco Andreone, Parfait Bora, Christopher Blair, Lauren Chan, Sebastian Gehring, Frank Glaw, Steve M. Goodman, Jörn Köhler, Peter Larsen, David C. Lees, Brice P. Noonan, Maciej Pabijan, Ted Townsend, Krystal Tolley, Roger Daniel Randrianiaina, Fanomezana Ratsoavina, David R. Vieites, and Katharina C. Wollenberg. Fieldwork of MV was funded by the Volkswagen Foundation. JLB was supported by the National Science Foundation under Grant No. 0905905 and by Duke University start-up funds to ADY.

References


1 Kent, M. Biogeography and macroecology. Progress in Physical Geography 29, 256-264, doi:10.1191/0309133305pp447pr (2005).
2 Beck, J. et al. What's on the horizon for macroecology? Ecography 35, 673-683, doi:10.1111/j.1600-0587.2012.07364.x (2012).
3 Whittaker, R. H. Vegetation of the Siskiyou Mountains, Oregon and California. Ecol Monogr 30, 279-338, doi:10.2307/1943563 (1960).
4 Whittaker, R. H. Evolution and Measurement of Species Diversity. Taxon 21, 213-251, doi:10.2307/1218190 (1972).
5 Williams, P. H. Mapping variations in the strength and breadth of biogeographic transition zones using species turnover. Proceedings of the Royal Society B-Biological Sciences 263, 579-588, doi:10.1098/rspb.1996.0087 (1996).
6 Kreft, H. & Jetz, W. A framework for delineating biogeographical regions based on species distributions. Journal of Biogeography 37, 2029-2053, doi:10.1111/j.1365-2699.2010.02375.x (2010).
7 Holt, B. G. et al. An Update of Wallace's Zoogeographic Regions of the World. Science 339, 74-78, doi:10.1126/science.1228282 (2013).
8 Kremen, C. et al. Aligning conservation priorities across taxa in Madagascar with high-resolution planning tools. Science 320, 222-226, doi:10.1126/science.1155193 (2008).
9 Hoffmann, M. et al. The Impact of Conservation on the Status of the World's Vertebrates. Science 330, 1503-1509, doi:10.1126/science.1194442 (2010).
10 Platnick, N. I. On areas of endemism. Australian Systematic Botany 4, 2pp.-2pp. (1991).
11 Harold, A. S. & Mooi, R. D. Areas of endemism: definition and recognition criteria. Systematic Biology 43, 261-266, doi:10.2307/2413466 (1994).
12 Crisp, M. D., Laffan, S., Linder, H. P. & Monro, A. Endemism in the Australian flora. Journal of Biogeography 28, 183-198 (2001).
13 Terborgh, J. & Winter, B. A method for siting parks and reserves with special reference to Columbia and Ecuador. Biological Conservation 27, 45-58 (1983).
14 Ackery, P. R. & Vane-Wright, R. I. Milkweed butterflies: Their cladistics and biology. (British Museum of Natural History and Cornell University Press, 1984).
15 Hawkins, B. A. et al. Energy, water, and broad-scale geographic patterns of species richness. Ecology 84, 3105-3117, doi:10.1890/03-8006 (2003).
16 Jetz, W., Rahbek, C. & Colwell, R. K. The coincidence of rarity and richness and the potential signature of history in centres of endemism. Ecology Letters 7, 1180-1191, doi:10.1111/j.1461-0248.2004.00678.x (2004).
17 Carnaval, A. C., Hickerson, M. J., Haddad, C. F. B., Rodrigues, M. T. & Moritz, C. Stability Predicts Genetic Diversity in the Brazilian Atlantic Forest Hotspot. Science 323, 785-789, doi:10.1126/science.1166955 (2009).
18 Kozak, K. H., Graham, C. H. & Wiens, J. J. Integrating GIS-based environmental data into evolutionary biology. Trends in Ecology & Evolution 23, 141-148, doi:10.1016/j.tree.2008.02.001 (2008).
19 Lamoreux, J. F. et al. Global tests of biodiversity concordance and the importance of endemism. Nature 440, 212-214, doi:10.1038/nature04291 (2006).


20 Linder, H. P. et al. The partitioning of Africa: statistically defined biogeographical regions in sub-Saharan Africa. Journal of Biogeography 39, 1189-1205, doi:10.1111/j.1365-2699.2012.02728.x (2012).
21 Olivero, J., Márquez, A. L. & Real, R. Integrating fuzzy logic and statistics to improve the reliable delimitation of biogeographic regions and transition zones. Systematic Biology 62, 1-21 (2013).
22 Graham, C. H., Moritz, C. & Williams, S. E. Habitat history improves prediction of biodiversity in rainforest fauna. Proceedings of the National Academy of Sciences of the United States of America 103, 632-636, doi:10.1073/pnas.0505754103 (2006).
23 Vences, M., Wollenberg, K. C., Vieites, D. R. & Lees, D. C. Madagascar as a model region of species diversification. Trends in Ecology & Evolution 24, 456-465, doi:10.1016/j.tree.2009.03.011 (2009).
24 Goodman, S. M. & Benstead, J. P. The natural history of Madagascar. (University of Chicago Press, Chicago, 2003).
25 Yoder, A. D. & Nowak, M. D. Has vicariance or dispersal been the predominant biogeographic force in Madagascar? Only time will tell. Annual Review of Ecology, Evolution, and Systematics, 405-431 (2006).
26 Crottini, A. et al. Vertebrate time-tree elucidates the biogeographic pattern of a major biotic change around the K-T boundary in Madagascar. Proceedings of the National Academy of Sciences of the United States of America 109, 5358-5363, doi:10.1073/pnas.1112487109 (2012).
27 Samonds, K. E. et al. Spatial and temporal arrival patterns of Madagascar's vertebrate fauna explained by distance, ocean currents, and ancestor type. Proceedings of the National Academy of Sciences of the United States of America 109, 5352-5357, doi:10.1073/pnas.1113993109 (2012).
28 Yoder, A. D. et al. A multidimensional approach for detecting species patterns in Malagasy vertebrates. Proceedings of the National Academy of Sciences of the United States of America 102, 6587-6594, doi:10.1073/pnas.0502092102 (2005).
29 Pastorini, J., Thalmann, U. & Martin, R. D. A molecular approach to comparative phylogeography of extant Malagasy lemurs. Proceedings of the National Academy of Sciences of the United States of America 100, 5879-5884, doi:10.1073/pnas.1031673100 (2003).
30 Goodman, S. M. & Ganzhorn, J. U. Biogeography of lemurs in the humid forests of Madagascar: the role of elevational distribution and rivers. Journal of Biogeography 31, 47-55, doi:10.1111/j.1365-2699.2004.00953.x (2004).
31 Yoder, A. D. & Heckman, K. L. Mouse lemur phylogeography revises a model of ecogeographic constraint in Madagascar. (2006).
32 Dewar, R. E. & Richard, A. F. Evolution in the hypervariable environment of Madagascar. Proceedings of the National Academy of Sciences of the United States of America 104, 13723-13727, doi:10.1073/pnas.0704346104 (2007).
33 Wilme, L., Goodman, S. M. & Ganzhorn, J. U. Biogeographic evolution of Madagascar's microendemic biota. Science 312, 1063-1065, doi:10.1126/science.1122806 (2006).
34 Wollenberg, K. C., Vieites, D. R., Glaw, F. & Vences, M. Speciation in little: the role of range and body size in the diversification of Malagasy mantellid frogs. BMC Evolutionary Biology 11, doi:10.1186/1471-2148-11-217 (2011).


35 Wollenberg, K. C. et al. Patterns of endemism and species richness in Malagasy cophyline frogs support a key role of mountainous areas for speciation. Evolution; international journal of organic evolution 62, 1890-1907, doi:10.1111/j.1558-5646.2008.00420.x (2008).
36 Pabijan, M., Wollenberg, K. C. & Vences, M. Small body size increases the regional differentiation of populations of tropical mantellid frogs (Anura: Mantellidae). Journal of Evolutionary Biology 25, 2310-2324, doi:10.1111/j.1420-9101.2012.02613.x (2012).
37 Pearson, R. G. & Raxworthy, C. J. The evolution of local endemism in Madagascar: watershed versus climatic gradient hypotheses evaluated by null biogeographic models. Evolution; international journal of organic evolution 63, 959-967, doi:10.1111/j.1558-5646.2008.00596.x (2009).
38 Ferrier, S., Manion, G., Elith, J. & Richardson, K. Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity and Distributions 13, 252-264, doi:10.1111/j.1472-4642.2007.00341.x (2007).
39 Allnutt, T. F. et al. A method for quantifying biodiversity loss and its application to a 50-year record of deforestation across Madagascar. Conservation Letters 1, 173-181, doi:10.1111/j.1755-263X.2008.00027.x (2008).
40 Dutilleul, P., Clifford, P., Richardson, S. & Hemon, D. Modifying the t test for assessing the correlation between two spatial processes. Biometrics, 305-314 (1993).
41 Townsend, T. M., Vieites, D. R., Glaw, F. & Vences, M. Testing Species-Level Diversification Hypotheses in Madagascar: The Case of Microendemic Brookesia Leaf Chameleons. Systematic Biology 58, 641-656, doi:10.1093/sysbio/syp073 (2009).
42 Kreft, H. & Jetz, W. Global patterns and determinants of vascular plant diversity. Proceedings of the National Academy of Sciences of the United States of America 104, 5925-5930, doi:10.1073/pnas.0608361104 (2007).
43 Hoeting, J. A. The importance of accounting for spatial and temporal correlation in analyses of ecological data. Ecological Applications 19, 574-577, doi:10.1890/08-0836.1 (2009).
44 Ohlemuller, R., Walker, S. & Wilson, J. B. Local vs regional factors as determinants of the invasibility of indigenous forest fragments by alien plant species. Oikos 112, 493-501, doi:10.1111/j.0030-1299.2006.13887.x (2006).
45 Bacaro, G. & Ricotta, C. A spatially explicit measure of beta diversity. Community Ecology 8, 41-46 (2007).
46 Bacaro, G. et al. Geostatistical modelling of regional bird species richness: exploring environmental proxies for conservation purpose. Biodiversity and Conservation 20, 1677-1694 (2011).
47 Diggle, P. J. & Ribeiro, P. J. J. Model-based Geostatistics. (Springer, 2007).
48 Colwell, R. K. & Lees, D. C. The mid-domain effect: geometric constraints on the geography of species richness. Trends in Ecology & Evolution 15, 70-76, doi:10.1016/s0169-5347(99)01767-x (2000).
49 Raxworthy, C. J. & Nussbaum, R. A. Systematics, speciation and biogeography of the dwarf chameleons (Brookesia; Reptilia, Squamata, Chamaeleontidae) of northern Madagascar. Journal of Zoology 235, 525-558 (1995).
50 Angel, F. Les Lézards de Madagascar. (Academie Malgache, 1942).


51 Humbert, H. Les territoires phytogéographiques de Madagascar. Année Biologique 31, 439–448 (1955).
52 Glaw, F. & Vences, M. Amphibians and Reptiles of Madagascar. (Vences, M. and Glaw Verlags, F. GbR., 1994).
53 Schatz, G. E. in Diversity and Endemism in Madagascar (eds W.R. Lourenço & S.M. Goodman) 1–9 (Société de Biogéographie, MNHN, ORSTOM, 2000).
54 Glaw, F. & Vences, M. Field Guide to the Amphibians and Reptiles of Madagascar. Third Edition edn, (Vences and Glaw Verlag, 2007).
55 Bush, M. B. Amazonian speciation: a necessarily complex model. Journal of Biogeography 21, 5-17, doi:10.2307/2845600 (1994).
56 Oneal, E., Otte, D. & Knowles, L. L. Testing for biogeographic mechanisms promoting divergence in Caribbean crickets (genus Amphiacusta). Journal of Biogeography 37, 530-540, doi:10.1111/j.1365-2699.2009.02231.x (2010).
57 Phillips, S. J., Anderson, R. P. & Schapire, R. E. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190, 231-259, doi:10.1016/j.ecolmodel.2005.03.026 (2006).
58 Phillips, S. J. et al. Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data. Ecological Applications 19, 181-197, doi:10.1890/07-2153.1 (2009).
59 Elith, J. et al. A statistical explanation of MaxEnt for ecologists. Diversity and Distributions 17, 43-57, doi:10.1111/j.1472-4642.2010.00725.x (2011).
60 Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G. & Jarvis, A. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25, 1965-1978, doi:10.1002/joc.1276 (2005).
61 Moat, J. & Du Puy, D. (ed Kew Royal Botanic Gardens) (1997).
62 Jarvis, A., Reuter, H. I., Nelson, A. & Guevara, E. (CGIAR-CSI SRTM 90m Database, 2008).
63 Williams, P. H. Some properties of rarity scores for site-quality assessment. British Journal of Entomology and Natural History 13, 73-86 (2000).
64 IBM SPSS Statistics for Windows v. 19.0 (IBM Corporation, Armonk, NY, 2010).
65 Rangel, T. F., Diniz-Filho, J. A. F. & Bini, L. M. SAM: a comprehensive application for Spatial Analysis in Macroecology. Ecography 33, 46-50, doi:10.1111/j.1600-0587.2009.06299.x (2010).
66 Legendre, P. & Legendre, L. Numerical ecology. Vol. 2nd (Elsevier, 1998).
67 Griffith, D. A. in Practical handbook of spatial statistics (ed S. L. Arlinghaus) 65-82 (CRC Press, 1996).
68 Bini, L. M. et al. Coefficient shifts in geographical ecology: an empirical evaluation of spatial and non-spatial regression. Ecography 32, 193-204, doi:10.1111/j.1600-0587.2009.05717.x (2009).


Figure 1. Overview of work protocol and dataflow. Three types of original data were input into the analyses: (1) biogeography hypotheses, (2) geography and climate data and (3) species locality data. These data were used to predict the distributions of species, and the distribution models were used to calculate biodiversity patterns (species richness, corrected weighted endemism, turnover). We then tested for correlation of these biodiversity patterns with spatial predictions derived from biogeography hypotheses, and used a multivariate model to simultaneously test the influences of these hypotheses on the biodiversity patterns. *The explanatory variables constituted standardized principal components of the raw biogeography hypotheses. **The CAR models were iterated until only explanatory variables that contributed significantly to the model were included.

Figure 2. Observed Biodiversity Data. Reptile and amphibian species richness (A & E) measures the number of species present. Endemism (B & F), based on a corrected weighted calculation, reflects the proportion of unique species present within certain areas. The generalized dissimilarity model (GDM) analyzes compositional turnover of communities (here jointly for amphibians and reptiles) and predicts dissimilarity throughout the landscape by interpolation based on variation in climate and geographic data. The continuous GDM (G) can then be classified into 4 major and 15 minor areas of endemism (H & C).

Figure 3. Explanatory contribution of continuous biogeography hypotheses to a conditional autoregressive spatial model of each observed biodiversity measurement. Only hypotheses contributing >5% are shown (see Supplementary Table S4 for complete and detailed data). Mid-domain: I. latitude, II. longitude, III. distance. Climate-Gradient: I. PC1, II. PC2, III. PC3. Climate-Stability: I. Precipitation Stability, II. Climate Stability (temperature and precipitation). Topography: I. Topographic Heterogeneity, II. Disturbance-vicariance, III. Montane Species Pump. An asterisk marks hypotheses that contributed negatively to the mixed CAR model.

Figure 4. Contribution of continuous biogeography hypotheses to a conditional autoregressive spatial model of species richness and endemism for four focal groups. Only hypotheses contributing >5% are shown (see Supplementary Table S4 for complete and detailed data). Mid-domain: I. latitude, II. longitude, III. distance. Climate-Gradient: I. PC1, II. PC2, III. PC3. Climate-Stability: I. Precipitation Stability, II. Climate Stability (temperature and precipitation). Topography: I. Topographic Heterogeneity, II. Disturbance-vicariance, III. Montane Species Pump. An asterisk marks hypotheses that contributed negatively to the mixed CAR model. (1) Brookesia chameleons (number of species = 27, number of original distribution points = 178). (2) Boophis treefrogs (n of spp. = 77, n of pts. = 460). (3) Phelsuma day geckos (n of spp. = 28, n of pts. = 304). (4) Oplurid iguanas (Oplurus plus the monotypic Chalarodon; n of spp. = 7, n of pts. = 147). Subgroups and colors of pie charts as in Fig. 3. For all hypotheses (with the exception of the climate-gradient variables) a positive correlation with the biodiversity metrics was expected.

Table 1. Correlations of nominal biodiversity hypotheses to Generalized Dissimilarity Models (upper: 15-class; lower: 4-class transformation of the original model). The left three columns show calculations based on the percentage of overlapping cells between boundaries. High mean values depict high levels of shared boundaries among GDM and AOE (derived from the respective hypothesis). R-values reflect non-spatial Pearson product-moment correlation coefficients. To assess significance of raster data, we used an unbiased correlation following the method of Dutilleul (1993) that reduces the degrees of freedom according to the level of spatial autocorrelation between the two variables.

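Hypothesis correlation to Generalized Dissimilarity Model. Percent overlapping cells were calculated with a 10 km buffer.

GDM 15 classes

| Hypothesis | Overlap: GDM 15 classes | Overlap: Hypothesis | Overlap: Mean | Pearson's r | F-stat | df | p |
|---|---|---|---|---|---|---|---|
| Riverine - Major | 77.11% | 35.31% | 56.21% | 0.427 | 13.984 | 62.537 | <.001 |
| Riverine - Minor | 72.43% | 49.41% | 60.92% | 0.503 | 28.599 | 84.415 | <.001 |
| Gradient | 74.63% | 67.51% | 71.07% | 0.403 | 24.974 | 128.641 | <.001 |
| Watershed | 60.14% | 25.92% | 43.03% | 0.229 | 5.887 | 106.824 | 0.017 |
| GDM - 4 | 100.00% | 38.71% | 69.35% | 0.551 | 26.257 | 60.171 | <.001 |
| Riverine - Refuge | 71.00% | 20.69% | 45.85% | 0.440 | 11.450 | 47.693 | 0.001 |

GDM 4 classes

| Hypothesis | Overlap: GDM 4 classes | Overlap: Hypothesis | Overlap: Mean | Pearson's r | F-stat | df | p |
|---|---|---|---|---|---|---|---|
| Riverine - Major | 41.6% | 40.3% | 40.9% | 0.556 | 12.519 | 28.042 | 0.001 |
| Riverine - Minor | 39.4% | 58.1% | 48.7% | 0.51 | 16.345 | 46.550 | <.001 |
| Gradient | 39.6% | 69.0% | 54.3% | 0.378 | 13.012 | 77.926 | <.001 |
| Watershed | 33.2% | 33.3% | 33.2% | 0.213 | 3.775 | 79.701 | 0.056 |
| GDM - 15 | 52.1% | 100.0% | 76.1% | 0.551 | 26.257 | 60.171 | <.001 |
| Riverine - Refuge | 30.3% | 22.1% | 26.2% | 0.837 | 52.355 | 22.433 | <.001 |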


SUPPLEMENTARY MATERIALS

Hypotheses

Climate Stability Hypothesis (Fig S2.J)

Climate stability is thought to create greater climatic stratification across environmental gradients (Dynesius and Jansson 2000, Jansson and Dynesius 2002). In stable climates, orbitally forced species’ range dynamics (ORD) are low, allowing localized populations to persist, and thus become highly specialized and differentiated. Like the Gradient Hypothesis (following), this model also focuses on bioclimatic disparities, but additionally incorporates climate stability. This hypothesis states that areas of climate stability, particularly those with climatic stratification, should possess higher species richness and endemism.

To estimate this model for Madagascar, collinearity was measured among all BIOCLIM layers (Bio1–19) using Pearson correlation coefficients. If R² values exceeded 0.5, one layer of the pair was excluded. We preferentially selected layers based on raw data (e.g. selecting mean annual precipitation over seasonality). The following layers were excluded: Bio3, Bio7, Bio9, Bio13–18. For each remaining BIOCLIM layer we calculated the standard deviation of each cell across the four time periods for which climate data were available (0 kya, 6 kya, 21 kya, 120 kya). The resulting standard deviation of each BIOCLIM layer was standardized to 1 to account for different units in the raw data. All standardized stability layers were summed to create the final climate stability layer, with lower values representing higher climate stability through time.
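The layer construction described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' GIS workflow: rasters are flattened to lists of cell values, the names `drop_collinear` and `stability_layer` are hypothetical, layer preference is expressed simply by dictionary order, the population standard deviation is used, and each layer is rescaled to the 0-1 range (the text says only that layers were standardized). All numbers are toy values.

```python
from statistics import pstdev

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def drop_collinear(layers, max_r2=0.5):
    """Keep layers in order, skipping any whose R^2 with a kept layer exceeds max_r2."""
    kept = []
    for name, cells in layers.items():
        if all(pearson_r(cells, layers[k]) ** 2 <= max_r2 for k in kept):
            kept.append(name)
    return kept

def stability_layer(periods_by_layer):
    """Sum, over layers, the per-cell SD through time after rescaling it to 0-1.

    Lower totals mean a more stable climate (assumed min-max rescaling).
    """
    n_cells = len(next(iter(periods_by_layer.values()))[0])
    total = [0.0] * n_cells
    for periods in periods_by_layer.values():
        # zip(*periods) yields, for each cell, its values across the time periods
        sd = [pstdev(through_time) for through_time in zip(*periods)]
        lo, hi = min(sd), max(sd)
        span = (hi - lo) or 1.0  # avoid division by zero for constant layers
        total = [t + (s - lo) / span for t, s in zip(total, sd)]
    return total

# Toy present-day layers (4 cells): bio5 is nearly a multiple of bio1 and is dropped.
current = {"bio1": [1, 2, 3, 4], "bio5": [2, 4, 6, 8.5], "bio12": [9, 1, 7, 3]}
kept = drop_collinear(current)

# Toy values for four time periods (rows) over three cells (columns).
periods_by_layer = {
    "bio1": [[10, 10, 10], [10, 12, 14], [10, 10, 18], [10, 12, 14]],
    "bio12": [[100, 100, 100], [100, 100, 140], [100, 100, 100], [100, 100, 140]],
}
stability = stability_layer(periods_by_layer)
```

In this toy case the first cell never changes and scores 0 (most stable), while the third cell varies most in both layers and receives the maximum possible score of 2.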

Disturbance-Vicariance Hypothesis (Fig S2.I)

Under this model, the major factor contributing to diversification was temperature fluctuation (Colinvaux 1993, Bush 1994, Haffer 1997), rather than fluctuations in precipitation and forest fragmentation (as in the Refuge and Riverine Barrier hypotheses). This hypothesis states that cyclic fluctuations of temperature during the Quaternary caused recurring displacement of taxa towards lower and higher altitudes (during cool and warm periods, respectively). Range retractions of taxa into the highlands occasionally resulted in allopatric divergence into new species. The gradual displacement of temperature-specialized taxa would cause recurring invasions and counter-invasions of heterogeneous landscapes by both montane and lowland taxa.

This hypothesis predicts that species richness and endemism would be highest in areas of topographic heterogeneity and temperature instability.

This hypothesis was generated by measuring collinearity among all BIOCLIM layers corresponding to temperature (Bio1–Bio11; see Climate Stability above for details). The following layers were excluded: Bio3, Bio7 and Bio11. For each remaining BIOCLIM layer we calculated the standard deviation of each cell across the four time periods for which climate data were available (0 kya, 6 kya, 21 kya, 120 kya). Each layer was standardized from 0 to 1 to account for different units between layers. All standardized stability layers were summed to create the final temperature stability layer, with lower values representing higher temperature stability. This layer was inverted and multiplied by a standardized version of the final topographic heterogeneity layer (see Topographic Heterogeneity below). Higher values represented areas with high temperature instability through time and high topographic heterogeneity.

Gradient Hypothesis (Fig S1.E)

According to the Gradient Hypothesis, diversification of Malagasy taxa was driven by bioclimatic disparities throughout the island (i.e. between the east and west), causing parapatry of populations along environmental gradients. This hypothesis was first formulated by Endler (1982); more recently, it was adapted and formulated in detail for Madagascar by Vences et al. (2010). Vences et al. focused on the longitudinal climatic stratification between the dry west and humid east of Madagascar, terming this special case the Ecogeographic hypothesis. Under the Gradient hypothesis, populations adapt to local ecotones, diverging from a generalist ancestor (or one of broader ecological tolerance). Due to local specialization, gene flow among ecologically similar sites is higher than gene flow among ecologically different sites across a gradient. The subdivision of populations creates a scenario in which drift or selection can override gene flow among ecogeographic subpopulations (Fisher 1930) and daughter species occupy separate, adjacent niches. The exact mechanism of speciation is controversial (e.g. prezygotic isolation, behavioral isolation or reproductive barriers) and beyond the focus of this study. This hypothesis invokes no barriers or mechanism of allopatry and predicts species richness and endemism to be highest in areas of high bioclimatic stratification. This GIS prediction was obtained from Pearson & Raxworthy (2009). In our continuous CAR-OBTC analyses, this hypothesis was represented by the first 3 PCs from a PCA of the 19 current BIOCLIM layers for Madagascar (Hijmans et al. 2005).


Mid-domain Effect Hypothesis (Fig S2. A-C)

Initially described to explain latitudinal trends in species richness, this mathematical hypothesis demonstrates that if species’ ranges are distributed randomly between northern and southern geographic limits, the highest overlap of species ranges will be in the middle (Lees 1996, Lees and Colwell 2007). For Madagascar, this hypothesis (in two dimensions) predicts increased overlap of species toward the center of the country, over the Ankaratra highlands, resulting from the sum of randomly overlapping species ranges. We explore four variants of this hypothesis: three in one dimension (the mid-domain of the latitudinal, altitudinal or longitudinal gradient) and one in two dimensions, as described above (the mid-domain of both latitudinal and longitudinal gradients). Note that the mid-domain of altitude is the same GIS calculation as the Museum hypothesis. Thus, throughout the manuscript this hypothesis is referred to as the Museum hypothesis rather than the mid-domain altitude hypothesis. In one dimension, the latitude hypothesis predicts that species richness would be highest around 18° S; in two dimensions, it would be centered around 18° S and 46.5° E. The mid-domain hypotheses are included as null hypotheses for spatial variation in species richness; they invoke no barriers or ecological/habitat heterogeneity and rely solely on the random distribution of species in a defined geographic space.
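The mid-domain argument can be illustrated with a small simulation. The sketch below is only a one-dimensional toy under assumed settings (21 latitudinal cells, 2000 hypothetical species, range endpoints drawn uniformly at random); the function name `mid_domain_richness` is introduced here for illustration and does not correspond to the GIS layers used in the study.

```python
import random

def mid_domain_richness(n_cells, n_species, seed=0):
    """Place random contiguous ranges on a bounded 1-D domain and count overlaps."""
    rng = random.Random(seed)  # seeded so the toy run is repeatable
    richness = [0] * n_cells
    for _ in range(n_species):
        # Two random endpoints inside the domain define one species' range.
        start, end = sorted((rng.randrange(n_cells), rng.randrange(n_cells)))
        for cell in range(start, end + 1):
            richness[cell] += 1
    return richness

richness = mid_domain_richness(n_cells=21, n_species=2000)
middle = len(richness) // 2
print(richness[0], richness[middle], richness[-1])
```

With no ecological input at all, far more ranges overlap the central cells than the cells at either domain limit, which is why the mid-domain layers serve as a null expectation.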

Montane Species Pump Hypothesis (Fig S2.M)

This hypothesis predicts that montane regions have higher species richness because of their topographic complexity and climatic zonation, both of which increase opportunities for allopatric and parapatric speciation (e.g., Moritz et al. 2000; Rahbek and Graves 2001; Hall 2005; Fjeldså and Rahbek 2006; Kozak and Wiens 2007). Speciation should be highest in areas of high topographic and climatic heterogeneity and, within those habitats, rates of speciation should be highest at mid-elevations. This hypothesis therefore predicts that species richness and endemism would be highest in areas of topographic and climatic heterogeneity.

To create a spatially explicit model of this hypothesis, we calculated the mean standard deviation of each of the first three climate PCs (based on current BIOCLIM data) within a ca. 10 km² square neighborhood of each cell. The three resulting layers were summed and the sum was standardized from 0 to 1. This layer was then multiplied by a standardized (0–1) topographic heterogeneity layer. High values depict areas of high topographic and climatic heterogeneity.
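The combination step can be sketched as follows. This is a minimal illustration that assumes the neighborhood standard deviations of the three climate PCs and of elevation have already been computed and flattened to lists; `rescale01` and `montane_species_pump` are hypothetical names, min-max rescaling is assumed for the 0-1 standardization, and the values are toy data.

```python
def rescale01(values):
    """Min-max rescale a list to the 0-1 range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # constant layers map to 0 instead of dividing by zero
    return [(v - lo) / span for v in values]

def montane_species_pump(pc_sd_layers, topo_heterogeneity):
    """Multiply rescaled climatic heterogeneity by rescaled topographic heterogeneity."""
    summed = [sum(cell) for cell in zip(*pc_sd_layers)]  # add the three PC layers
    climate = rescale01(summed)
    topo = rescale01(topo_heterogeneity)
    return [c * t for c, t in zip(climate, topo)]

# Toy neighborhood SDs for PC1-PC3 and elevation over three cells.
pc_sd_layers = [[0.1, 0.5, 0.9], [0.2, 0.4, 1.0], [0.0, 0.6, 0.8]]
topo_sd = [50.0, 150.0, 250.0]
pump = montane_species_pump(pc_sd_layers, topo_sd)
```

Because the two rescaled layers are multiplied, a cell scores highly only when both climatic and topographic heterogeneity are high; the intermediate toy cell scores about 0.25 rather than 0.5.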

Museum Hypothesis (Fig S2.D)

According to this hypothesis, speciation occurred in montane habitats. Under the Montane Museum hypothesis, more species exist at intermediate elevations simply because these elevations have been occupied the longest; there has therefore been more time for speciation and the accumulation of species in these habitats than in habitats at lower and higher elevations (Stebbins 1974; Stephens and Wiens 2003). This hypothesis predicts that levels of endemism and richness will be highest at middle elevations. Note that in execution, the prediction for this hypothesis is identical to a mid-domain hypothesis of altitude. The GIS representation of this hypothesis for Madagascar is based on the median elevation (378 m), where the overlap of species should peak. This elevation was given a value of one, and values declined linearly from it to zero at the maximum and minimum elevations.
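The linear transition described above amounts to a tent-shaped function of elevation. A minimal sketch, assuming a flattened list of cell elevations and using the hypothetical name `museum_layer`; the five elevations are toy values chosen so that the median equals 378 m.

```python
from statistics import median

def museum_layer(elevations):
    """Score 1 at the median elevation, falling linearly to 0 at the min and max."""
    peak = median(elevations)
    lo, hi = min(elevations), max(elevations)
    scores = []
    for z in elevations:
        if z <= peak:
            # Rising limb from the minimum elevation up to the median.
            scores.append((z - lo) / (peak - lo) if peak > lo else 1.0)
        else:
            # Falling limb from the median up to the maximum elevation.
            scores.append((hi - z) / (hi - peak))
    return scores

elevations = [0, 189, 378, 1627, 2876]  # toy cells; the median is 378
layer = museum_layer(elevations)
```

Because the two limbs have different lengths, a cell 189 m below the median and a cell 1249 m above it both receive a score of 0.5 in this example.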

Paleogeographic Hypothesis

This hypothesis states that vicariant differentiation of Malagasy lineages is associated with the formation of geologic barriers to dispersal. Each hypothesis is specific to the focal paleogeographic event. Two major classes of geological events have occurred since the separation from India (ca. 65 Mya). The first, a direct result of the separation from India, is the orogeny of the eastern escarpment. This hypothesis states that deep lineages, pre-65 Mya, should exist between the eastern and western species. A second, more local barrier is the volcanism of several regions: Tsaratanana, Manongarivo, Ampasindava (ca. 50 Mya), Ankaratra (phase 1 ca. 28 Mya, phase 2 ca. 15 Mya, phase 3 ca. 2 Mya) and Ambre (ca. 2 Mya, Krause 2007). Clades should exhibit breaks around areas of volcanic activity, though, given the localization of these barriers, biota in most cases should have been able to disperse around them, perhaps while experiencing reduced gene flow. All paleogeographic hypotheses are difficult to test because, more strongly than other hypotheses, they depend on the location of ancestral populations and the timing of geological events relative to a clade's diversification. Given the variation in ages of origin among Madagascar's vertebrates, it is likely that some clades were affected by most of the major paleogeographical events while others were affected by only some. Because of these factors, it is unlikely that species exhibit congruent biogeographic patterns. Thus, in this paper we were unable to test this hypothesis.

Refuge Hypothesis (Fig S2.D)

This hypothesis holds that episodic fragmentation of forests resulted in isolated patches of wet forest, and that this caused vicariant differentiation between adjacent patches. These transitions were driven by periodic changes (every 20–100 ky) in insolation associated with Milankovitch cycles (aberrations of the orbit of the earth around the sun due to the slightly asymmetrical shape of the earth). Recurrent changes in insolation caused dry periods followed by humid periods, particularly pronounced at tropical latitudes. The evolutionary consequences of paleoecological changes in climate likely depend on the regional topography, predominant weather patterns and the impact on the ecosystem (for example, a slight reduction in rain may have different biological consequences in a spiny forest versus a rainforest). For Amazonia, it has been argued that during dry climatic periods of the late Tertiary and Quaternary, extensive humid forests survived due to subtle topographic variation that facilitated rainfall gradients adjacent to the Andes, Guianan highlands, Rondonia and hilly areas east of Pará (Haffer 1969; Vanzolini 1970, 1973; Brown et al. 1974; Prance 1982, 1996).

In addition to changes in insolation associated with Milankovitch cycles, the gradual drift of Madagascar northward towards the equator likely led to habitat changes, presenting another source of refugial isolation. Thus, it is plausible that refugia persisted and that species with narrow ecological niches became isolated and diverged into separate species. To create a continuous spatial representation of this hypothesis for Madagascar, collinearity was first measured among the BIOCLIM precipitation layers (Bio12–Bio19) using Pearson correlations. If R² values exceeded 0.5, one layer of the pair was excluded. This resulted in the exclusion of Bio13, Bio15, Bio16 and Bio17. We preferentially selected layers based on raw data (e.g. selecting mean precipitation over seasonality). For each remaining BIOCLIM layer, we calculated the standard deviation of each cell across the four time periods for which climate data were available (0 kya, 6 kya, 21 kya, 120 kya). Each layer was standardized from 0 to 1 to account for different units and decimal places in the raw data between layers. All standardized stability layers were summed to create the final precipitation stability layer, with lower values representing higher precipitation stability.

Riverine Barrier Hypothesis (Fig S1.A, B)

Malagasy rivers are thought to have acted as barriers separating populations, resulting in intra-riverine species and subspecies. As the geology of Madagascar has remained relatively stable since absolute isolation, major rivers should have persisted (at least seasonally). Under this hypothesis, we expect vicariant differentiation of Malagasy lineages associated with large tributaries. There have been several criticisms of this hypothesis: rivers frequently change course, causing land and its inhabitants to be passively transferred across the barrier; rivers cease to act as barriers at headwaters because of the lack of geographic separation there (Wallace 1852); and temporal fluctuations of climate cause rivers to change in size (e.g. the sizes of many rivers were dramatically reduced during the Pleistocene).

To create a spatial prediction for this hypothesis, we selected all major permanent rivers with headwaters above 1000 m and created polygons from the lowland areas between these rivers and the 1000 m contour line. A second calculation focused on major riverine units, composed of areas between rivers with headwaters above 2000 m, and created biogeographic units from the lowland areas between these rivers and the 1000 m contour line.

River-Refuge Hypothesis (Fig S1.C)

The River-Refuge hypothesis, initially described by Haffer (1992, 1993) for the Neotropics, was more recently proposed as a model of Malagasy diversification (Craul et al. 2007). This hypothesis combines aspects of the Riverine Barrier and Refuge hypotheses: it states that lowland vicariant speciation occurs in refugia separated by broad lowland rivers and by considerable unsuitable terrain in the headwaters.

To estimate this model, we combined the Riverine hypothesis with a binary refuge hypothesis. The continuous refuge layer was converted to a binary model by converting the top quartile to 1 and all other values to 0. To account for differences in precipitation stability between arid and wet areas, we excluded areas with less than 50 cm in any of the time periods. Areas above 1000 m were also excluded. Regions of the Riverine layer and the binary refuge layer were then combined into smaller river-refuge subunits.
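The binary conversion and masking steps can be sketched as below. Several assumptions are made loudly here: the input score is taken to be oriented so that larger values mean more stable precipitation (the summed standard deviation layer described earlier runs the other way), the 50 cm threshold is assumed to apply to precipitation in each period, the upper-quartile cut point comes from Python's `statistics.quantiles` with its default method, and only two toy time periods are shown instead of four. The names `binary_refuge` and `river_refuge_units` are hypothetical.

```python
from statistics import quantiles

def binary_refuge(stability, precip_by_period, elevation,
                  min_precip_cm=50.0, max_elev_m=1000.0):
    """Flag top-quartile cells that stay wet enough and lie at or below max_elev_m."""
    q3 = quantiles(stability, n=4)[2]  # upper-quartile cut point
    flags = []
    for i, score in enumerate(stability):
        wet = all(period[i] >= min_precip_cm for period in precip_by_period)
        low = elevation[i] <= max_elev_m
        flags.append(1 if score > q3 and wet and low else 0)
    return flags

def river_refuge_units(river_unit, refuge):
    """Keep the riverine unit label only where the binary refuge layer is 1."""
    return [unit if flag else None for unit, flag in zip(river_unit, refuge)]

# Twelve toy cells: the last three fall in the top quartile of the stability score.
stability = list(range(1, 13))
precip_now = [120] * 12
precip_lgm = [120] * 9 + [40] + [120] * 2   # cell 9 is too dry in one period
elevation = [300] * 10 + [1400] + [300]     # cell 10 lies above 1000 m
refuge = binary_refuge(stability, [precip_now, precip_lgm], elevation)
units = river_refuge_units(["A"] * 6 + ["B"] * 6, refuge)
```

Of the three top-quartile cells, one is removed by the precipitation mask and one by the elevation mask, leaving a single refuge cell inside riverine unit B.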

Sanctuary Hypothesis (Fig S2.L)

Past climate changes greatly altered the distributions of organisms through time, causing local extinctions, bottlenecks, isolation, and range expansion and contraction of populations. Sanctuaries represent specific areas of habitat stability that have remained present through time; they differ from refugia, which do not invoke geospatial consistency and in which species track suitable habitat across geography (Recuero & García-París 2011). We estimated sanctuaries in Madagascar by calculating SDMs for 453 species (a subset of the GDM dataset, using species with three or more unique occurrence localities). Each SDM was projected into three historic time periods: LGM, 120 kya and 6 kya. All SDMs were converted to binary models using the maximum training sensitivity plus specificity threshold. Across species, the models had the following values: threshold, 0.080 (SD ± 0.047); area, 0.316 (SD ± 0.196); training omission, 0.005 (SD ± 0.017); number of training samples, mean 12.321 (max 137, min 3, SD 14.414). For each species, all four binary SDMs (the current model plus the three historic projections) were summed. The resulting layer was reclassified so that values of 3 and below were converted to 0 and values of 4 were converted to 1. Under this classification, areas where the species was present in all four time periods were considered ‘sanctuaries’. This was repeated for all species, and all sanctuaries were summed to estimate areas of higher species richness and endemism.
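The thresholding, summing and reclassification steps can be sketched in a few lines. This is an illustration only: the SDM outputs are toy suitability values over three cells, the two species are hypothetical, the 0.08 threshold is used purely as an example value, and the function names are introduced here rather than taken from any SDM software.

```python
def to_binary(suitability, threshold):
    """Convert continuous suitability to presence (1) or absence (0)."""
    return [1 if value >= threshold else 0 for value in suitability]

def species_sanctuary(binary_by_period):
    """Mark cells where the species is predicted present in every time period."""
    n_periods = len(binary_by_period)
    return [1 if sum(cell) == n_periods else 0 for cell in zip(*binary_by_period)]

def sanctuary_layer(binary_by_species):
    """Sum the per-species sanctuaries into one richness-like layer."""
    per_species = [species_sanctuary(p) for p in binary_by_species.values()]
    return [sum(cell) for cell in zip(*per_species)]

# Toy suitability for one species in four periods (rows) over three cells.
species_a = [to_binary(s, 0.08) for s in (
    [0.50, 0.20, 0.01],
    [0.40, 0.10, 0.02],
    [0.30, 0.05, 0.03],
    [0.60, 0.30, 0.20],
)]
# A second species supplied directly as binary maps.
species_b = [[1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 1, 1]]

sanctuaries = sanctuary_layer({"species_a": species_a, "species_b": species_b})
```

Requiring the per-cell sum to equal the number of periods is equivalent to the reclassification in the text, where only a sum of 4 is converted to 1.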


Topographic Heterogeneity (Fig S2.E)

In several studies, the level of topographic variation has been observed to be positively correlated with species richness patterns and centers of endemism (Kerr & Packer 1997; Rahbek and Graves 2001; Jetz & Rahbek 2002; Jetz et al. 2004). To characterize this hypothesis for Madagascar, we measured the standard deviation of elevation within ca. 10 km² of each pixel.
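A neighborhood standard deviation of this kind can be sketched with a moving window. The example assumes a small elevation grid stored as nested lists, a square window given as a radius in cells (how that maps onto ca. 10 km² depends on the raster resolution, which is not stated here), windows truncated at the grid edge, and the population standard deviation; `local_sd` is a hypothetical name.

```python
from statistics import pstdev

def local_sd(grid, radius=1):
    """Standard deviation of values in a square window around each cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Collect the window, clipped where it would run off the grid.
            window = [grid[i][j]
                      for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(n_cols, c + radius + 1))]
            row.append(pstdev(window))
        result.append(row)
    return result

# Toy elevation grid (m): flat lowland with a single 900 m peak on the right edge.
dem = [
    [100, 100, 100, 100],
    [100, 100, 100, 900],
    [100, 100, 100, 100],
]
heterogeneity = local_sd(dem)
```

Cells whose window contains only lowland score 0, and scores rise toward the peak, so the output highlights rugged terrain and not high terrain as such.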

Watershed Hypothesis (Fig S1.D)

One of the more recent diversification hypotheses is the Watershed hypothesis (Wilmé et al. 2006). According to this model, climatic changes caused forests to retract to the areas surrounding major rivers. If the headwaters of adjacent rivers were at lower elevations, the intervening areas between rivers (the watersheds) became arid and forest populations became isolated, serving as areas of speciation. By contrast, if the headwaters of rivers were at higher elevations, the watersheds served as areas of retreat and forest refugia remained connected among rivers. These watersheds are expected to contain proportionally much lower diversity and endemism. This GIS prediction was obtained from Wilmé et al. (2006).

Additional references.

Brown Jr., K.S., Sheppard, P.M., Turner, J.R.G. 1974. Quaternary refugia in tropical America: evidence from race formation in Heliconius butterflies. Proc. R. Soc. Lond. B 187:369–378.

Bush, M.B. 1994. Amazonian speciation: a necessarily complex model. Journal of Biogeography 21:5–17.

Colinvaux, P.A. 1993. Pleistocene biogeography and diversity in tropical forests of South America. In Goldblatt, P. (Ed.). Biological Relationships between Africa and South America. Yale University Press, New Haven, CT.

Craul, M., Zimmermann, E., Rasoloharijaona, S., Randrianambinina, B., Radespiel, U. 2007. Unexpected species diversity of Malagasy primates (Lepilemur spp.) in the same biogeographical zone: a morphological and molecular approach with the description of two new species. BMC Evolutionary Biology 7:83.

Dynesius, M., Jansson, R. 2000. Evolutionary consequences of changes in species' geographical distributions driven by Milankovitch climate oscillations. Proc. Natl Acad. Sci. USA 97:9115–9120.

Endler, J. 1982. Pleistocene forest refuges: fact or fancy? In Prance, G.T. (Ed.). Biological Diversification in the Tropics. Columbia University Press, New York, p. 179–200.

Fisher, R.A. 1930. The Genetical Theory of Natural Selection. Clarendon Press.

Fjeldså, J., Rahbek, C. 2006. Diversification of tanagers, a species-rich bird group, from the lowlands to montane regions of South America. Integr. Comp. Biol. 46:72–81. (doi:10.1093/icb/icj009)

Goodman, S.M., Benstead, J.P. 2003. The Natural History of Madagascar. University of Chicago Press, Chicago.

Haffer, J. 1969. Speciation in Amazonian forest birds. Science 165:131–137.

Haffer, J. 1992. On the “river effect” in some forest birds of southern Amazonia. Bol. Mus. Para. Emilio Goeldi, sér. Zool. 8:217–245.

Haffer, J. 1993. Time’s cycle and time’s arrow in the history of Amazonia. Biogeographica 69:15–45.

Haffer, J. 1997. Alternative models of vertebrate speciation in Amazonia: an overview. Biodiversity and Conservation 6:451–476.

Hall, J.P. 2005. Montane speciation patterns in Ithomiola butterflies (Lepidoptera: Riodinidae): are they consistently moving up in the world? Proc. R. Soc. B 272:2457–2466. (doi:10.1098/rspb.2005.3254)

Jansson, R., Dynesius, M. 2002. The fate of clades in a world of recurrent climatic change: Milankovitch oscillations and evolution. Annu. Rev. Ecol. Syst. 33:741–777.

Jetz, W., Rahbek, C. 2002. Geographic range size and determinants of avian species richness. Science 297:1548–1551.

Jetz, W., Rahbek, C., Colwell, R.K. 2004. The coincidence of rarity and richness and the potential signature of history in centers of endemism. Ecol. Lett. 7:1180–1191.

Kerr, J.T., Packer, L. 1997. Habitat heterogeneity determines mammalian species richness in high energy environments. Nature 385:252–254.

Kozak, K.H., Wiens, J.J. 2007. Climatic zonation drives latitudinal variation in speciation mechanisms. Proc. R. Soc. B 274:2995–3003. (doi:10.1098/rspb.2007.1106)

Krause, D.W. 2003. Late Cretaceous vertebrates of Madagascar: a window into Gondwanan biogeography at the end of the Age of Dinosaurs. Pp. 40–47 in Goodman, S.M., Benstead, J.P. (Eds.). The Natural History of Madagascar. University of Chicago Press, Chicago.

Lees, D.C. 1996. The Périnet effect? Diversity gradients in an adaptive radiation of butterflies in Madagascar (Satyrinae: Mycalesina) compared with other rainforest taxa. Pp. 479–490 in Lourenço, W.R. (Ed.). Biogéographie de Madagascar. Editions de l'ORSTOM, Paris.

Lees, D.C., Colwell, R.K. 2007. A strong Madagascan rainforest MDE and no equatorward increase in species richness: re-analysis of 'The missing Madagascan mid-domain effect', by Kerr J.T., Perring M. & Currie D.J. (Ecology Letters 9:149–159, 2006). Ecology Letters 10:E4–E8.

Moritz, C., Patton, J.L., Schneider, C.J., Smith, T.B. 2000. Diversification of rainforest faunas: an integrated molecular approach. Annu. Rev. Ecol. Syst. 31:533–563. (doi:10.1146/annurev.ecolsys.31.1.533)

Pearson, R.G., Raxworthy, C.J. 2009. The evolution of local endemism in Madagascar: watershed versus climatic gradient hypotheses evaluated by null biogeographic models. Evolution 63:959–967.

Prance, G.T. 1982. Biological Diversification in the Tropics. Columbia University Press, New York.

Prance, G.T. 1996. Islands in Amazonia. Phil. Trans. R. Soc. Lond. B 351:823–833.

Rahbek, C. 1997. The relationship among area, elevation, and regional species richness in Neotropical birds. Am. Nat. 149:875–902.

Rahbek, C., Graves, G.R. 2001. Multiscale assessment of patterns of avian species richness. Proc. Natl Acad. Sci. USA 98:4534–4539. (doi:10.1073/pnas.071034898)

Recuero, E., García-París, M. 2011. Evolutionary history of Lissotriton helveticus: multilocus assessment of ancestral vs. recent colonization of the Iberian Peninsula. Molecular Phylogenetics and Evolution 60:170–182.

Stebbins, G.L. 1974. Flowering Plants: Evolution above the Species Level. Harvard University Press, Cambridge, Mass.

Stephens, P.R., Wiens, J.J. 2003. Explaining species richness from continents to communities: the time-for-speciation effect in emydid turtles. American Naturalist 161:112–128.

Vanzolini, P.E. 1970. Zoología sistemática, geografía e a origem das espécies. Instituto Geográfico de São Paulo, São Paulo. 56 p.

Vanzolini, P.E. 1973. Paleoclimates, relief, and species multiplication in equatorial forests. In Meggers, B.J., Ayensu, E.S., Duckworth, W.D. (Eds.). Tropical Forest Ecosystems in Africa and South America: A Comparative Review. Smithsonian Institution, Washington, p. 255–258.

Wallace, A.R. 1852. On the monkeys of the Amazon. Proc. Zool. Soc. London 1852:107–110.

Wilmé, L., Goodman, S.M., Ganzhorn, J.U. 2006. Biogeographic evolution of Madagascar's microendemic biota. Science 312:1063–1065.

Supplementary Figure S1. Nominal biogeography hypotheses.

These hypotheses comprise non-ordered categories for which the relationships among the contained data are unknown or non-linear. Four hypotheses fell into this category: Riverine (major and minor), River-Refuge, Watershed and Gradient. See Table S1 for a summary of each hypothesis.

Supplementary Figure S2. Continuous biogeography hypotheses.

Nine hypotheses fell into this category: Mid-domain (longitude, latitude and distance), Museum, Topographic Heterogeneity, Gradient* (PC1–PC3), Disturbance-vicariance, Climate Stability, Precipitation Stability, Sanctuary and Montane Species Pump (see Table S1 for a summary of each hypothesis).

Inset tables give the percent contribution of each corresponding hypothesis to the CAR model with the lowest AICc for each observed biodiversity measurement. *The values of the three climate principal components are not necessarily assumed to reflect a positive correlation with endemism and species richness; however, we assume that each reflects a prediction of a linear correlation (either positive or negative). The inclusion of the three climate principal components is made possible by the power of CAR models and their ability to include multiple explanatory variables.

Due to the statistical limitations associated with nominal hypotheses, any hypothesis that could be depicted by continuous data (even if it required several variables) was converted to this format and incorporated into the CAR. For example, if nominal data represented classified continuous data, we included the continuous data (such as the 3 PCs of climate data).

Supplementary Table S1. Major Biogeographic Predictions Relevant to Madagascar.

Hypothesis

Climate Stability Climate stability creates

(Fig S2.J) greater climatic stratification across environmental gradients

Disturbancevicariance

(Fig S2.L)

(Fig S1.E)

Gradient

Description

Allopatry results from altitudinal range retractions caused by temperature fluctuations.

Parapatry of populations due to environmental gradients

Key Factors

Stable climate; both seasonally and through geologic time; no barrier is necessary

Effects

In stable climates, orbitally forced species’ range dynamics

(ORD) are low, allowing localized populations to persist, and thus become highly specialized and differentiated.

Divergent selection and an environmental

Parapatric speciation across climate space

Predictions for Reptiles and Amphibians

Higher levels of endemism in areas of climatic stability

GIS Prediction

Use GIS and spatiotemporal explicit climate data to estimate climate stability; stable areas should harbor higher species richness and endemism.

Temperature fluctuations, changes in CO

2 and habitat heterogeneity

(usually associated with changes in altitude)

Decreased temperatures allow cool adapted species to disperse

Diversity with south. Cyclic fluctuations in temperature cause populations to region. Most common habitat track attitudinally, when associated with a single ancestor of sister clades

Estimate areas of high monophyletic lineages are temperate fluctuations adjacent areas of slope; higher values reflect higher species richness temperatures are at their highest, date to Pleistocene. populations become isolated on sky islands

Sister taxa are found in different habitats along an current climate data to environmental gradient

Cluster analyses of all estimate areas of endemism.

Temporal

Scope

None

Key Citations

Dynesius & Jansson,

2000; 2002

Quaternary Colinvaux 1993,

Bush 1994, Haffer

1997; Raxworthy

Nussbaum 1995.

None Endler 1982

53

Mid-domain

Effect

(Fig S2. A-C )

Montane Species

Pump

(Fig S2. M) gradient; no barrier is necessary

Species’ ranges are distributed randomly between

Geographic space northern and southern geographic limits, the highest overlap of species ranges would be in the middle.

Topographic complexity and climatic zonation of mountains increase opportunities for allopatric and parapatric speciation

Topographic and climatic heterogeneity

No mechanism invoked

Allopatric and parapatric speciation across elevations

Species richness should be Richness is highest in the None highest in mid-domains mid-domain of latitude, longitude and elevation.

Sister taxa share common ancestry with montane ancestor. Extant montane species display higher levels of intraspecific genetic variation.

Estimate areas of topographic and climatic

None heterogeneity- high values reflect centers of high endemism and species richness

Museum

( Fig S2.D)

More species occur at intermediate elevations simply because these elevations were occupied the longest and there has been more time for speciation and

Extended time occupying in midelevations

Increased differentiation at midelevations

More species occur at intermediate elevations

Use GIS to calculate the median elevation of because these elevations Madagascar which were occupied the longest should possess the highest species richness and endemism

None

Lees 1996, Lees and

Colwell 2007

Moritz et al. 2000;

Rahbek and Graves

2001; Hall 2005;

Fjeldsa° and Rahbek

2006; Kozak and

Wiens 2007, Roy et al 1997, Fjeldsa et al

1999

Stephens and Wiens

2003

54

the accumulation of species in these habitats relative to those at lower and higher elevations

Paleogeographical Vicariant differentiation of

Malagasy lineages is associated with formation of geologic barriers to dispersal.

Each hypothesis is specific to the focal paleogeographic event.

Geological changes Vicariant differentiation across resulting in vicariant barriers events

Distinct east and west lineages associated with the central mountains, and between the southern/central/northern massifs, tsingys.

Not Calculated Specific to each geological event

Refuge

(Fig S2. K)

Allopatry due to retraction of wet habitats

Riverine (Fig Allopatry due to rivers acting

S1. A, B) as barriers.

Repeated cycles of drastic fluctuation in forests resulting in isolated precipitation

Episodic fragmentation of patches of wet forest causing

Evolutionary lineages associated with refugia vicariant differentiation between precipitation relative to adjacent patches

(areas of continued regional mosaic of habitats)

Permanent large rivers

Vicariant differentiation of

Malagasy lineages associated with large tributaries

Use GIS and spatiotemporal explicit

Reciprocal monophyly of clades on opposite sides of areas which are areas of river

Measured inter-riverine endemism

Cenozoic

(Tertiary climate data to estimate and stable wet habitats, these Quaternary

) areas reflect centers of endemism

None

Goodman &

Benstead 2003

Haffer 1969m 1990,

1999, Endler (1982),

Brown (1987) Nores

(1999)

Wallace 1853,

Capparella 1991,

Patton et al.

1994,

55

River-Refuge (Fig S1.C)

Similar to Riverine Hypothesis; fragmental faunal distributions into intra-riverine corridors, isolation is associated with increased aridity adversely affecting habitat suitability at headwater regions

Reduced precipitation, maintenance of permanent rivers

Allopatry due to the restriction of wet habitats to lower elevations; higher elevation habitats and intervening rivers act as barriers

Sister taxa are found in adjacent intra-riverine corridors

Combine the Riverine Hypothesis subunits and a binary precipitation stability/low elevation layer. Resulting areas depict areas of high predicted endemism.

Late Tertiary (Post-Miocene)

Goodman & Ganzhorn 2004; Haffer 1992, 1993; Craul et al. 2007

Sanctuary (Fig S2.L)

Areas of niche stability accumulate species through time.

Climate fluctuations through time across heterogeneous landscapes

Extinction occurs more often in unstable habitats; thus stable areas accumulate species.

Areas of niche stability (specific aspects of climate vs. overall climate in Climate stability) provide sanctuary for species through time.

Use GIS and SDM to estimate stable areas in each species distribution through time; areas of highest stability should harbor higher species richness and endemism

None

Recuero & García-Paris 2011

Topographic Heterogeneity (Fig S1.D)

The level of topographic variation has been observed to be positively correlated to species richness patterns and centers of endemism

Topography

No mechanism invoked

Species richness should be highest in areas of high topographic heterogeneity

Areas of high topographic heterogeneity harbor higher species richness

None

Kerr & Packer 1997; Rahbek and Graves 2001; Jetz & Rahbek 2002; Jetz et al. 2004


Watershed (Fig S1.D)

Dispersal during warm-humid periods allows lowland species to disperse altitudinally, across headwater habitat (previously too arid to occupy). Allopatry occurs as climate cools and the species descends into lowlands, and lowland rivers prevent gene flow between populations, now occurring on both sides of river.

Repeated cycles of drastic simultaneous increases of both temperature and precipitation. Large permanent rivers and mountains adjacent to lowlands.

Allopatry results from altitudinal range retractions caused by simultaneous decreases in temperature and precipitation. Lower elevation rivers act as barriers.

Sister taxa are found in adjacent intra-riverine corridors. Most common ancestor of sister clades dates to Pleistocene.

See Wilmé et al. 2006.

Quaternary

Wilmé et al. 2006


Supplementary Table S3. Correlations of biodiversity hypotheses to observed biodiversity patterns. R-values reflect non-spatial Pearson product-moment correlation coefficients. To assess significance of raster data, we used an unbiased correlation following the method of Dutilleul (1993). This method reduced the degrees of freedom according to the level of spatial autocorrelation between the two variables.

Correlation to Observed Endemism

| Hypothesis | Reptile r | Reptile F-stat | Reptile df | Reptile p | Amphibian r | Amphibian F-stat | Amphibian df | Amphibian p |
|---|---|---|---|---|---|---|---|---|
| Mid-domain: Distance | -0.148 | 0.539 | 24.037 | 0.470 | -0.026 | 0.018 | 26.753 | 0.894 |
| Topographic Heterogeneity | 0.343 | 5.425 | 40.738 | 0.025* | 0.662 | 21.664 | 27.724 | <.001* |
| Refuge | 0.036 | 0.04 | 32.328 | 0.848 | 0.154 | 0.552 | 22.722 | 0.465 |
| Montane Species Pump | 0.303 | 2.639 | 48.488 | 0.107 | 0.613 | 13.852 | 22.990 | 0.001* |
| Disturbance-vicariance | 0.379 | 4.101 | 22.413 | 0.055 | 0.616 | 15.758 | 22.829 | <.001* |
| Climate stability | 0.241 | 0.62 | 14.580 | 0.444 | 0.356 | 2.033 | 15.124 | 0.174 |
| Sanctuary | 0.296 | 2.438 | 25.313 | 0.131 | 0.606 | 7.611 | 13.080 | 0.016* |
| Museum | 0.285 | 4.971 | 56.429 | 0.030* | 0.335 | 17.709 | 49.836 | 0.015* |
| River-Refuge (binary) | 0.207 | 3.87 | 86.869 | 0.052 | 0.066 | 0.954 | 216.959 | 0.330 |

Correlation to Observed Richness

| Hypothesis | Reptile r | Reptile F-stat | Reptile df | Reptile p | Amphibian r | Amphibian F-stat | Amphibian df | Amphibian p |
|---|---|---|---|---|---|---|---|---|
| Mid-domain: Distance | -0.480 | 10.265 | 34.353 | 0.003* | -0.204 | 1.072 | 24.698 | 0.310 |
| Topographic Heterogeneity | 0.140 | 2.398 | 119.852 | 0.124 | 0.307 | 6.769 | 64.952 | 0.011* |
| Refuge | -0.101 | 0.332 | 31.948 | 0.598 | -0.093 | 0.216 | 24.605 | 0.646 |
| Montane Species Pump | 0.056 | 0.507 | 160.576 | 0.477 | 0.296 | 5.506 | 57.196 | 0.022* |
| Disturbance-vicariance | 0.019 | 2.297 | 62.251 | 0.135 | 0.303 | 5.286 | 41.554 | 0.027* |
| Climate stability | 0.233 | 1.506 | 26.191 | 0.231 | 0.210 | 0.723 | 15.672 | 0.408 |
| Sanctuary | 0.419 | 10.079 | 47.267 | 0.003* | 0.818 | 34.069 | 16.885 | <.001* |
| Museum | 0.234 | 4.037 | 69.472 | 0.048* | 0.250 | 4.481 | 66.989 | 0.038* |
| River-Refuge (binary) | 0.108 | 2.378 | 201.187 | 0.125 | 0.234 | 4.94 | 85.022 | 0.029* |
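As a rough consistency check on Table S3, the sketch below relates a correlation coefficient and its autocorrelation-adjusted degrees of freedom to an F-statistic using the standard test for a correlation, F = r² · df / (1 − r²). This formula is an assumption here: the table does not state how F was computed, and although it reproduces the two rows used below to within rounding, other rows deviate, so it should not be read as the authors' exact procedure. The function name `f_from_r` is hypothetical.

```python
def f_from_r(r: float, df: float) -> float:
    """F-statistic (1 and df degrees of freedom) implied by a Pearson r.

    Assumes the standard relationship F = r^2 * df / (1 - r^2), with df
    taken as the effective (autocorrelation-adjusted) degrees of freedom.
    """
    r2 = r * r
    return r2 * df / (1.0 - r2)


# (r, effective df, F-stat as reported in Table S3)
rows = {
    "Topographic Heterogeneity, reptile endemism": (0.343, 40.738, 5.425),
    "Mid-domain: Distance, reptile richness": (-0.480, 34.353, 10.265),
}

for label, (r, df, f_reported) in rows.items():
    print(f"{label}: implied F = {f_from_r(r, df):.3f}, reported F = {f_reported}")
```

Because df is reduced when spatial autocorrelation is strong, the same r can be significant for one hypothesis and not for another, which is why r and p do not rank the hypotheses identically in the table.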


Supplementary Table S4. Mixed CAR spatial models of observed biodiversity data. A principal component analysis was performed on the standardized biogeography hypotheses. All resulting principal components (PCs) were extracted and loaded as explanatory variables. The CAR analyses were run iteratively, starting with all PCs as explanatory variables and then excluding each PC that did not contribute significantly to the model (α = 0.05), until the final model included only PCs that contributed significantly. The standardized beta coefficients (β) were then used to calculate the contribution of each biogeography hypothesis (see methods on OTBCs) in the final CAR analysis. To compare the contributions of each biogeography hypothesis among models of observed biodiversity patterns (richness, endemism, GDM), β coefficients from each OTBC/CAR analysis were converted to percentage contributions. *The mean of the loadings of the three MDS vectors was calculated and contributed as a single value to the total mean.

Percent Contribution to CAR

| Hypothesis | Endemism: Amphibian | Endemism: Reptile | Richness: Amphibian | Richness: Reptile | GDM: MDS-D1 | GDM: MDS-D2 | GDM: MDS-D3 | GDM: Mean* | Mean contribution to all observed biodiversity models* |
|---|---|---|---|---|---|---|---|---|---|
| Mid-domain- Latitude | 0.0% | 27.7% | 3.5% | 3.6% | 5.3% | 20.7% | 1.6% | 9.2% | 8.8% |
| Mid-domain- Longitude | 8.1% | 8.0% | 6.7% | 9.2% | 10.9% | 10.2% | 8.1% | 9.7% | 8.4% |
| Mid-domain- Distance*** | 15.1% | 9.7% | 7.7% | 20.0% | 14.4% | 5.7% | 15.8% | 11.9% | 12.9% |
| Climate- PC1 | 15.4% | 0.1% | 11.4% | 14.5% | 15.1% | 4.5% | 13.8% | 11.1% | 10.5% |
| Climate- PC2 | 10.1% | 12.7% | 8.5% | 11.0% | 10.1% | 11.7% | 7.4% | 9.7% | 10.4% |
| Climate- PC3 | 2.2% | 12.8% | 4.6% | 1.8% | 4.4% | 11.9% | 3.9% | 6.7% | 5.6% |
| Refuge | 1.4% | 4.7% | 6.9% | 0.0% | 3.8% | 10.5% | 0.0% | 4.8% | 3.5% |
| Climate Stability*** | 4.7% | 4.6% | 7.4% | 3.0% | 0.0% | 1.0% | 3.8% | 1.6% | 4.2% |
| Topographic Heterogeneity | 6.5% | 1.7% | 7.1% | 3.7% | 3.9% | 0.5% | 8.0% | 4.2% | 4.6% |
| Disturbance-vicariance | 5.9% | 4.8% | 8.1% | 3.1% | 3.4% | 2.2% | 6.8% | 4.1% | 5.2% |
| Montane Species Pump | 5.0% | 0.0% | 7.1% | 2.3% | 2.9% | 0.0% | 7.2% | 3.4% | 3.6% |
| Sanctuary | 10.9% | 0.3% | 13.7% | 12.6% | 11.7% | 3.1% | 4.7% | 6.5% | 8.8% |
| Museum | 14.8% | 13.0% | 7.1% | 15.1% | 14.3% | 18.1% | 19.0% | 17.1% | 13.4% |

CAR Model Summary

| Summary statistic | Endemism: Amphibian | Endemism: Reptile | Richness: Amphibian | Richness: Reptile | GDM: MDS-D1 | GDM: MDS-D2 | GDM: MDS-D3 | GDM: Mean* |
|---|---|---|---|---|---|---|---|---|
| Explained by Predictor Variables: r² (AICc) | 0.482 (-372.3) | 0.303 (-468.5) | 0.624 (7861.9) | 0.309 (21849.9) | 0.869 (-4037.8) | 0.435 (-1084.2) | 0.659 (-3327.7) | |
| Total Explained (Predictor and Space): r² (AICc) | 0.646 (-426.0) | 0.626 (-556.4) | 0.849 (6349.3) | 0.745 (19364.2) | 0.954 (-6659.7) | 0.942 (-6760.4) | 0.868 (-5700.9) | |
| Model significance: n, F, p-val | 141, 42.5, <0.001 | 141, 9.7, <0.001 | 2501, 303.5, <0.001 | 2501, 278.1, <0.001 | 2501, 1265.9, <0.001 | 2501, 147.2, <0.001 | 2501, 396.8, <0.001 | NA |
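The two averaging steps described in the Table S4 caption can be sketched as follows: the three MDS vectors are first averaged into a single GDM value, which then enters the overall mean alongside the four richness and endemism models. The helper names are hypothetical. The `percent_contribution` normalization (absolute weights rescaled to sum to 100) is an assumption about how β-derived contributions become percentages; the OTBC calculation itself is described in the main-text methods and is not reproduced here.

```python
from statistics import mean
from typing import List, Tuple


def percent_contribution(weights: List[float]) -> List[float]:
    """Rescale absolute weights so they sum to 100 (assumed normalization)."""
    total = sum(abs(w) for w in weights)
    return [100.0 * abs(w) / total for w in weights]


def overall_mean(
    end_amph: float,
    end_rept: float,
    rich_amph: float,
    rich_rept: float,
    mds: List[float],
) -> Tuple[float, float]:
    """Return (GDM mean, mean across all models).

    The three MDS contributions are averaged first, so GDM counts once
    in the overall mean rather than three times.
    """
    gdm = mean(mds)
    return gdm, mean([end_amph, end_rept, rich_amph, rich_rept, gdm])


# Mid-domain- Latitude row of Table S4 (percent contributions).
gdm, overall = overall_mean(0.0, 27.7, 3.5, 3.6, [5.3, 20.7, 1.6])
print(round(gdm, 1), round(overall, 1))
```

For this row the two results round to 9.2 and 8.8, the values reported in the GDM mean and overall mean columns.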


Supplementary Table S5. Mixed CAR spatial models of focal subgroups. A principal component analysis was performed on the standardized biogeography hypotheses. All resulting principal components (PCs) were extracted and loaded as explanatory variables. The CAR analyses were run iteratively, starting with all PCs as explanatory variables and then excluding each PC that did not contribute significantly to the model (α = 0.05), until the final model included only PCs that contributed significantly. The standardized beta coefficients (β) were then used to calculate the contribution of each biogeography hypothesis (see methods on OTBCs) in the final CAR analysis. To compare the contributions of each biogeography hypothesis among models of observed biodiversity patterns (richness, endemism, GDM), β coefficients from each OTBC/CAR analysis were converted to percentage contributions.

*The mean of the loadings of the three MDS vectors was calculated and contributed as a single value to the total mean.

Percent Contribution to CAR

| Hypothesis | Endemism: Boophis | Endemism: Brookesia | Endemism: Oplurus* | Endemism: Phelsuma | Richness: Boophis | Richness: Brookesia | Richness: Oplurus* | Richness: Phelsuma |
|---|---|---|---|---|---|---|---|---|
| Mid-domain- Latitude | 3.0% | 4.0% | 2.5% | 17.0% | 0.0% | 5.1% | 2.3% | 4.2% |
| Mid-domain- Longitude | 8.4% | 9.9% | 10.6% | 11.8% | 6.1% | 8.3% | 6.3% | 10.2% |
| Mid-domain- Distance*** | 16.0% | 9.9% | 6.3% | 6.4% | 10.6% | 12.0% | 11.0% | 20.4% |
| Climate- PC1 | 12.2% | 12.2% | 13.5% | 5.8% | 14.4% | 11.7% | 15.9% | 13.1% |
| Climate- PC2 | 6.0% | 14.1% | 13.2% | 7.0% | 13.0% | 11.2% | 8.9% | 9.7% |
| Climate- PC3 | 5.2% | 4.3% | 2.8% | 18.3% | 1.6% | 7.9% | 0.0% | 3.9% |
| Refuge | 0.0% | 10.5% | 11.3% | 2.3% | 9.9% | 4.5% | 3.3% | 0.0% |
| Climate Stability*** | 3.6% | 8.8% | 11.2% | 5.9% | 8.9% | 3.6% | 5.8% | 1.8% |
| Topographic Heterogeneity | 9.0% | 2.2% | 1.0% | 5.1% | 2.8% | 2.1% | 6.5% | 4.1% |
| Disturbance-vicariance | 7.0% | 4.7% | 4.1% | 7.2% | 5.8% | 2.5% | 6.8% | 2.7% |
| Montane Species Pump | 8.4% | 0.6% | 0.0% | 4.3% | 2.7% | 0.0% | 4.5% | 2.7% |
| Sanctuary | 0.1% | 18.9% | 22.6% | 0.0% | 20.3% | 17.2% | 23.6% | 7.9% |
| Museum | 21.1% | 0.0% | 0.8% | 8.8% | 3.8% | 13.9% | 5.2% | 19.3% |

CAR Model Summary

| Summary statistic | Endemism: Boophis | Endemism: Brookesia | Endemism: Oplurus* | Endemism: Phelsuma | Richness: Boophis | Richness: Brookesia | Richness: Oplurus* | Richness: Phelsuma |
|---|---|---|---|---|---|---|---|---|
| Explained by Predictor Variables: r² (AICc) | 0.251 (-252.2) | 0.303 (-176.1) | 0.432 (-317.9) | 0.209 (-253.4) | 0.569 (9710.5) | 0.444 (5392.1) | 0.178 (9789.1) | 0.559 (6731.4) |
| Total Explained (Predictor and Space): r² (AICc) | 0.524 (-316.0) | 0.685 (-235.3) | 0.756 (-436.8) | 0.759 (-354.6) | 0.863 (6851.0) | 0.812 (2680.8) | 0.504 (8530.6) | 0.794 (4837.3) |
| Model significance: n, F, p-val | 141, 46.6, <0.001 | 141, 49.7, <0.001 | 141, 20.5, <0.001 | 141, 20.1, <0.001 | 2501, 364.7, <0.001 | 2501, 220.6, <0.001 | 2501, 550.1, <0.001 | 2501, 632.4, <0.001 |
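The iterative model reduction described in the captions of Tables S4 and S5 (start with all PCs, drop those that do not contribute significantly at α = 0.05, stop when every remaining PC is significant) can be illustrated with a generic backward-elimination loop. This is a minimal sketch under stated assumptions, not the authors' implementation: it removes one PC per iteration (the least significant), which the captions do not specify, and the model fit is abstracted behind a hypothetical `fit_p_values` callable because the CAR fitting itself is not reproduced here.

```python
from typing import Callable, Dict, List


def backward_eliminate(
    predictors: List[str],
    fit_p_values: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.05,
) -> List[str]:
    """Refit repeatedly, dropping the least significant predictor each time,
    until every remaining predictor has p < alpha (or none remain)."""
    kept = list(predictors)
    while kept:
        p_values = fit_p_values(kept)  # refit with the current predictor set
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break  # all remaining predictors are significant
        kept.remove(worst)
    return kept


# Toy stand-in for a model fit: fixed, made-up p-values that do not change
# when other terms are removed (a real fit would recompute them each time).
toy_p = {"PC1": 0.001, "PC2": 0.20, "PC3": 0.03, "PC4": 0.60}
selected = backward_eliminate(
    list(toy_p), lambda names: {name: toy_p[name] for name in names}
)
print(selected)  # ['PC1', 'PC3']
```

In a real analysis `fit_p_values` would wrap the spatial model fit, and the retained PCs would then be translated back into per-hypothesis contributions as reported in the tables.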
