Supporting Information (SI)
Molecular Ecology

Fine-grained adaptive divergence in an amphibian: genetic basis of phenotypic divergence and the role of non-random gene flow in restricting effective migration among wetlands

Alex Richter-Boix; María Quintela; Marcin Kierczak; Marc Franch; Anssi Laurila

MATERIAL INCLUDED

Section 1. PREDATOR ABUNDANCE ESTIMATION DETAILS AND PCR INFORMATION
Table S1. Environmental description of localities
Figure S1. Map and localization of the study sites

Section 2. COMPONENT ANALYSES OF ENVIRONMENTAL VARIABLES AND SPATIAL DISTRIBUTION
Table S2. Component analysis for mixtures of quantitative and qualitative variables
Table S3. Correlogram results from habitat parameters (CS1 and CS2)
Figure S2. Direction of environmental traits across CS1 and CS2
Figure S3. Correlogram results from habitat parameters (CS1 and CS2)

Section 3. LANDSCAPE RESISTANCE MODELS: TESTING ISOLATION-BY-RESISTANCE
Table S4. Conductance values for each land-cover type
Table S5. Correlation coefficients of the different IBR models
Figure S4. Map of the study area showing the basic land-cover types and sampling sites
Figure S5. Resistance map with the best IBR model
Figure S6. Mean ±2SE of (a) mass at metamorphosis, (b) larval period, and (c) growth rate

Section 4. TRβ HAPLOTYPES DESCRIPTION AND GENOTYPIC INFORMATION PER POPULATION
Table S6. Base composition of TRβ haplotypes detected and frequencies
Table S7. Summary statistics for TRβ gene variation in the 17 Rana arvalis populations
Table S8. FST TRβ genetic differentiation between pairs of populations
Table S9. Results from the machine learning approach for the phenotype-TRβ gene relationship

SECTION 1. PREDATOR ABUNDANCE ESTIMATION DETAILS AND PCR AMPLIFICATION DETAILS

Predator abundance estimation details. Macroinvertebrate predators and newts were sampled in mid-June and early July.
Sweep net (diameter: 32 cm; mesh size: 0.08 cm) samples were taken by holding the net in a vertical position and moving it from the water surface through the vegetation and water column five times over a distance of 3-5 m (Cheal et al. 1993). Five sites were selected in each wetland, resulting in 25 samples for each pond. Sampling was restricted to the littoral region and to depths of 0.5-1.5 m. Invertebrates and newts were identified, classified as potential predators, counted and then returned to the water. Three types of insects were considered as predators: large dragonflies (aeshnid and libellulid naiads), aquatic heteropterans (notonectids) and diving beetles (larvae or adults). Predator densities were determined from the volume of water sampled, calculated as the dimensions of the net frame multiplied by the distance walked.

PCR information. Multiplexed PCR amplifications were performed in a final volume of 10 µL containing 50 ng DNA template, 5 µL PCR mix (QIAGEN Multiplex PCR Kit), 1 µL 10x primer mix, and 3 µL RNase-free water in an Applied Biosystems GeneAmp PCR System 2700 thermal cycler. PCR profiles consisted of 3 min denaturation at 94°C followed by 23-25 cycles of 30 s denaturation at 94°C, 30 s annealing at 50-55°C and 30 s extension at 72°C, with two final steps of 2 min at 48°C and 5 min at 72°C. PCR products were genotyped on a MegaBACE 1000 automatic sequencer (Amersham Biosciences, Buckinghamshire, UK) and scoring was performed in a blind manner using the software Fragment Profiler (Fragment Profiler 1.2, Amersham Biosciences, 2003).

Figure S1. Map with the spatial distribution of ponds and potential connectivity among ponds within the study area. Red dots represent the 17 ponds used in the present study; small blue circles are other wetlands where reproduction of Rana arvalis has been observed. The X and Y axes are expressed in UTM coordinates. Each tick corresponds to five kilometres.
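The density calculation described under "Predator abundance estimation details" can be illustrated with a short sketch. It assumes a circular net frame, so that the sampled volume is the frame area multiplied by the distance swept and by the number of sweeps. The function names, the 4 m sweep distance and the predator count are hypothetical and are not taken from the study.

```python
import math


def swept_volume_m3(net_diameter_m, distance_m, n_sweeps=1):
    """Volume of water sampled, assuming a circular net frame (an assumption)."""
    frame_area_m2 = math.pi * (net_diameter_m / 2.0) ** 2
    return frame_area_m2 * distance_m * n_sweeps


def predator_density(count, net_diameter_m, distance_m, n_sweeps=1):
    """Predators per cubic metre of water sampled."""
    return count / swept_volume_m3(net_diameter_m, distance_m, n_sweeps)


# Hypothetical example: a 32 cm net swept five times over 4 m, 3 predators counted.
vol = swept_volume_m3(0.32, 4.0, n_sweeps=5)
density = predator_density(3, 0.32, 4.0, n_sweeps=5)
print(f"sampled volume = {vol:.3f} m^3, density = {density:.2f} predators per m^3")
```

Dividing counts by sampled volume keeps densities comparable between sites where the swept distance differed within the 3-5 m range.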
Table S1. Ecological parameters per site: geographical coordinates (GPS), breeding time (Day, counted from 1 January), habitat category (Hab: M = marshes, P = ponds), water permanency category (Per: T = temporary, P = permanent), mean water temperature (WTemp), percentage of forest canopy cover (Canopy), relative density of potential predators (Predators), percentage of emergent vegetation (EmVeg), and number of clutches used in genetic analyses (N clutches).

Site  GPS                              Day  Hab  Per  WTemp  Canopy  Predators  EmVeg  N clutches
d1    59°43'54.12"N 16°59'9.24"E       92   M    T    11.15  60      0.88       83     27
d2    59°43'39.78"N 16°49'52.14"E      92   M    T    13.16  60      1.31       24     26
d3    59°44'28.92"N 16°50'22.32"E      87   P    P    13.94  0       3.12       29     30
d4    59°45'17.34"N 17°2'6.72"E        95   P    P    12.24  40      1.54       20     30
d5    59°51'10.17"N 17°28'21.31"E      94   P    P    12.86  0       2.38       46     25
d6    59°37'28.55"N 17°13'3.27"E       99   M    T    12.86  0       3.81       65     25
d7    59°45'14.04"N 17°24'37.50"E      100  M    T    13.62  0       1.44       22     30
d9    59°38'24.60"N 16°54'16.20"E      96   P    T    12.81  100     2.27       27     7
d10   59°36'57.30"N 16°55'35.64"E      101  P    P    11.35  80      3.09       42     30
d11   59°42'53.76"N 17°4'15.60"E       102  M    P    10.82  40      3.83       43     22
d12   59°46'44.88"N 17°1'24.90"E       103  M    T    11.89  40      0.44       61     30
d13   59°47'56.46"N 16°52'12.90"E      103  M    T    11.16  20      0.75       85     20
d14   59°50'29.46"N 17°21'48.18"E      107  M    T    9.49   80      0.86       59     24
d15   59°43'41.52"N 16°53'58.92"E      108  M    T    11.27  40      1.42       78     28
d16   59°45'42.84"N 16°54'8.94"E       108  P    P    14.03  20      1.78       38     18
d17   59°52'40.98"N 17°12'8.40"E       109  M    T    8.94   100     1.03       77     26
d18   59°55'53.64"N 17°17'49.74"E      105  P    P    13.45  60      1.23       39     29

Section 2. COMPONENT ANALYSES OF ENVIRONMENTAL VARIABLES AND SPATIAL DISTRIBUTION

Table S2. Results of the analysis of a mixture of numeric variables and factors using the “dudi.mix” function in the ade4 R library. temp = temperature; pred = predator abundance; eveg = emergent vegetation; cate = category (marsh or pond); cano = forest canopy; perm = permanency (temporary or permanent pond); dept = water depth.
Variable               CS1      CS2       CS3
Temperature            0.405    0.339    -0.006
Forest canopy         -0.281   -0.555    -0.013
Depth                  0.336   -0.467    -0.047
Predator abundance     0.452    0.141    -0.020
Emergent vegetation   -0.450   -0.155     0.019
Category (M vs Po)     0.344   -0.3856    0.748
Permanency (T vs P)    0.341   -0.405    -0.660
Eigenvalues            4.6702   2.0211    0.2345
Proportion variance    0.6671   0.2887    0.0335
Cumulative variance    0.6671   0.9559    0.9894

[Figure S2 plot: positions of temp, pred, eveg, cate, perm, dept and cano on CS1 (66.71%, x-axis) versus CS2 (28.87%, y-axis); both axes range from -0.8 to 0.8.]

Figure S2. Analysis of a mixture of numeric variables and factors using the “dudi.mix” function in the ade4 R library, showing the weight and direction of the parameters across the first two components (together explaining 95.6% of the variance). temp = temperature; pred = predator abundance; eveg = emergent vegetation; cate = category (marsh or pond); cano = forest canopy; perm = permanency (temporary or permanent pond); dept = water depth.

Table S3. Correlogram results from habitat parameters (CS1 and CS2), assuming randomly distributed data and inverse distance weighting, with permutation tests based on 2000 permutations.
Min        Max        Pairs  CS1 I    CS1 Prob  CS1 RandProb  CS2 I    CS2 Prob  CS2 RandProb
0.00000    5771.17    14     -0.066   0.976     0.780          0.013   0.777     0.956
5771.17    9809.48    16      0.002   0.788     0.989          0.086   0.521     0.701
9809.48    11863.11   15      0.295   0.134     0.221         -0.005   0.821     0.983
11863.11   14895.78   15      0.162   0.313     0.451         -0.139   0.714     0.509
14895.78   18097.95   15     -0.042   0.941     0.838          0.173   0.299     0.426
18097.95   21223.42   16     -0.171   0.607     0.440         -0.819   0.0005    0.002
21223.42   24950.84   15     -0.151   0.690     0.488         -0.104   0.843     0.671
24950.84   28245.22   15     -0.255   0.395     0.259          0.111   0.460     0.640
28245.22   32700.17   16     -0.157   0.656     0.451         -0.221   0.459     0.331
32700.17   40860.96   16     -0.247   0.365     0.220          0.281   0.102     0.188

[Figure S3 plot: two correlogram panels with Moran's I on the y-axis (PCA1_pond and PCA2_pond; range -1.0 to 1.0) and distance on the x-axis (0 to 50000).]

Figure S3. Correlogram results from habitat parameters (CS1 and CS2), assuming randomly distributed data and inverse distance weighting, with permutation tests based on 2000 permutations.

Section 3. LANDSCAPE RESISTANCE MODELS: TESTING ISOLATION-BY-RESISTANCE

Landscape traits included in the resistance models. The input for Circuitscape is a raster map in which each cell is assigned a conductance value corresponding to the relative probability of R. arvalis moving through the habitat type encoded by the cell. To derive the grids, we used landscape variables that we predicted would be relevant to habitat selection, vagility and gene flow, based on the literature and expert opinion. The resistance value assigned to each habitat type was based on previous studies on this species (Arens et al. 2007; Loman and Lardner 2009), which demonstrated the negative impact of crops and human infrastructure (roads and urban settlements) on frog movements and gene flow, and on empirical studies with other forest amphibian species, which showed the negative effects of forest clearcuts and open-canopy areas on vagility (e.g. Semlitsch et al. 2008; Popescu and Hunter 2011).
Because of the absence of hills or mountains in our study area, we did not consider slopes and ridges in our models. Soil texture has been considered an important variable for landscape resistance because it reflects the presence of subterranean refugia and the risk of dehydration for amphibians (Cosentino et al. 2011). As we did not have information about soil texture, this factor was not included in the models.

To generate the land cover data sets, we acquired a 5-m-resolution digital land cover map of the study area (http://www.metria.se/), which contains hundreds of detailed land classes. We simplified these classes into eight general categories representing basic cover types that previous studies have shown to influence R. arvalis dispersal (Figure S4 and Table S4). To generate the land cover input file, we assigned a conductance value to each land cover category. A low conductance value indicates a high resistance to frog movement: big water bodies = 1; urban and industrial areas = 1; low urban areas = 3; grassland and agriculture = 5; agriculture-forest transition = 6; recently clearcut forest = 6; forest = 8; wetlands = 10. These ranks were chosen to reflect the relative order of conductance of these landscape features based on previous empirical and non-empirical studies on amphibians. The magnitudes of these ranks, although reflecting expert opinion, were arbitrary, and assigning resistance values to different land cover categories can be problematic (Spear et al. 2010). We therefore created several conductance maps and models to evaluate the extent to which the results were influenced by the magnitude of these rankings (Table S4). Grids were exported from ArcGIS for Circuitscape analysis using the “Export to Circuitscape” tool (http://www.circuitscape.org/Circuitscape/ArcGIS.html).

Table S4. Conductance values for each land-cover type for calculating resistance distances between breeding ponds.
We created four resistance models. IBD is the control Euclidean distance model, in which conductance is maximal and equal for every land-cover type; this model does not consider landscape heterogeneity. IBRlow, IBRm and IBRhigh are three resistance models that consider landscape heterogeneity, ordered from low to high resistance.

Land class                       IBD   IBRlow  IBRm  IBRhigh
Big water bodies                 10    1       1     1
Urban and industrial areas       10    2       1     1
Low-density urban areas          10    5       3     2
Grassland and agriculture        10    6       5     3
Agriculture-forest transition    10    7       6     4
Forest recently clearcut         10    8       6     5
Forest                           10    9       8     8
Wetlands                         10    10      10    10

Figure S4. The study area in Sweden showing the basic land-cover types and the sampled breeding sites. Yellow: crops and agricultural areas. Light green: forest. Dark green: recently clearcut forest. Red: semiurban areas. Grey: urban and industrial areas. Blue: big water bodies.

The relative importance of various landscape features in predicting population genetic structure across the study area was analyzed by conducting a series of Mantel tests examining the correlation between pairwise genetic structure (FST/(1-FST)) and models of pairwise landscape resistance distance: isolation-by-resistance (McRae 2006). We used Circuitscape 2.2 to generate pairwise matrices of landscape resistance from the conductance grids. We used simple Mantel tests and significance testing to compare the models with the software PASSaGE 2 (Rosenberg and Anderson 2011). Significance of Mantel correlations was assessed based on 200000 random permutations of the data. The Mantel correlations obtained for each model are summarized in Table S5. Two of the three resistance models considering landscape heterogeneity (IBRm and IBRhigh) showed a higher correlation coefficient than the simple Euclidean model (IBD), whereas IBRlow showed a slightly lower one. As IBRm showed the highest r-value, it was the model used in the main text and the other analyses.
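The simple Mantel test described above can be sketched as follows. This is a minimal pure-Python illustration and not the PASSaGE 2 implementation used in the study; the function names, the toy 4 x 4 matrices and the one-tailed permutation p-value are assumptions made for the example.

```python
import math
import random


def linearized_fst(fst):
    """FST / (1 - FST), the genetic distance transformation used in the text."""
    return fst / (1.0 - fst)


def _lower_triangle(matrix, order):
    """Strict lower triangle of a symmetric matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i)]


def _pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, resistance, n_perm=999, seed=0):
    """Return (Mantel r, one-tailed permutation p-value).

    Populations of the genetic matrix are permuted (rows and columns together);
    the observed value counts as one of the permutations.
    """
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    y = _lower_triangle(resistance, identity)
    r_obs = _pearson(_lower_triangle(genetic, identity), y)
    hits = 1
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        if _pearson(_lower_triangle(genetic, perm), y) >= r_obs:
            hits += 1
    return r_obs, hits / (n_perm + 1)


# Hypothetical toy data for four populations (not values from the study).
fst = [[0.00, 0.02, 0.05, 0.07],
       [0.02, 0.00, 0.03, 0.06],
       [0.05, 0.03, 0.00, 0.01],
       [0.07, 0.06, 0.01, 0.00]]
resistance = [[0.0, 1.0, 2.5, 3.0],
              [1.0, 0.0, 1.5, 2.8],
              [2.5, 1.5, 0.0, 0.9],
              [3.0, 2.8, 0.9, 0.0]]
genetic = [[linearized_fst(v) for v in row] for row in fst]
r, p = mantel_test(genetic, resistance, n_perm=999, seed=1)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Permuting population labels rather than individual matrix cells preserves the dependence among pairwise distances that share a population, which is why the rows and columns are reordered together.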
Figure S5 illustrates the resistance map between the populations within the study area, created with the IBRm land conductance values.

Table S5. Correlation coefficients and corresponding significance for Mantel tests comparing neutral genetic distance (FSTN) with geographical distance, assuming either the Euclidean model (IBD, not considering landscape heterogeneity) or one of the three resistance distance models. The asterisk marks the model with the highest correlation coefficient, which is the one used in the other analyses included in the main text.

Matrix          Mantel r   P-value
FSTN-IBD        0.0790     0.2364
FSTN-IBRlow     0.0704     0.2445
FSTN-IBRm *     0.0946     0.2017
FSTN-IBRhigh    0.0847     0.2388

Figure S5. Resistance map between the sampled breeding ponds within the study area. Blue shows areas of low conductance, which are expected to have low densities of dispersing frogs; yellow and red show well-connected areas with low resistance to animal movements.

REFERENCES

Arens P, van der Sluis T, van't Westende W, et al. (2007) Genetic population differentiation and connectivity among fragmented Moor frog (Rana arvalis) populations in The Netherlands. Landscape Ecology 22, 1489-1500.
Cosentino BJ, Schooley RL, Phillips CA (2011) Connectivity of agroecosystems: dispersal costs can vary among crops. Landscape Ecology 26, 371-379.
Loman J, Lardner B (2009) Does landscape and habitat limit the frogs Rana arvalis and Rana temporaria in agricultural landscapes? A field experiment. Applied Herpetology 6, 227-236.
Murphy MA, Dezzani R, Pilliod DS, Storfer A (2010) Landscape genetics of high mountain frog metapopulations. Molecular Ecology 19, 3634-3649.
Popescu VD, Hunter ML (2011) Clear-cutting affects habitat connectivity for a forest amphibian by decreasing permeability to juvenile movements. Ecological Applications 21, 1283-1295.
Rosenberg MS, Anderson CD (2011) PASSaGE: Pattern Analysis, Spatial Statistics and Geographic Exegesis. Version 2.
Methods in Ecology and Evolution 2, 229-232.
Semlitsch RD, Conner CA, Hocking DJ, Rittenhouse TAG, Harper EB (2008) Effects of timber harvesting on pond-breeding amphibian persistence: testing the evacuation hypothesis. Ecological Applications 18, 283-289.
Spear SF, Balkenhol N, Fortin M-J, McRae BH, Scribner K (2010) Use of resistance surfaces for landscape genetic studies: considerations for parameterization and analysis. Molecular Ecology 19, 3576-3591.
Spear SF, Peterson CR, Matocq MD, Storfer A (2005) Landscape genetics of the blotched tiger salamander (Ambystoma tigrinum melanostictum). Molecular Ecology 14, 2553-2564.

Figure S6. Mean ±2SE of (a) mass at metamorphosis, (b) larval period, and (c) growth rate from each locality (d1, d2, […], d17) in the common-garden experiment. Localities are ordered by breeding time, with earlier-breeding localities at the beginning of the x-axis and later-breeding localities at the end.

Section 4. TRβ HAPLOTYPES DESCRIPTION AND GENOTYPIC INFORMATION PER POPULATION

Number of loci detected: 13
Position of loci: 63 71 79 121 124 160 183 230 321 537 693 723 759
Synonymous changes (7 of 13): 63 183 321 537 693 723 759
Non-synonymous changes (6 of 13): 71 79 121 124 160 230
Haplotypes parsimony informative sites > 1%

Table S6. Base composition of haplotypes detected in the families analysed.

No. haplotype   Base composition   Frequency
HTRβ 1          GCTAGGAGGTGGC      5%
HTRβ 2          GCTAGGGGGTGGC      1%
HTRβ 3          GCTAGGGGGTGAC      3%
HTRβ 4          GCTAGGGGACGGC      1%
HTRβ 5          GCTTGGAGGTGGC      40%
HTRβ 6          GCTTGGAGGTGAC      1%
HTRβ 7          GCTTGGAGGCGGC      2%
HTRβ 8          GCTTGGAGATGGC      1%
HTRβ 9          GCTTGGGGGTGGC      42%
    Fast development = short larval period. Loci 4 and 7 are different from HTRβ 1.
HTRβ 10         GCTTGGGGGTGAC      10%
    Slow development = long larval period. Locus 12 is different from HTRβ 1, 9 and 13.
HTRβ 11         GCTTGGGGGCGGC      7%
HTRβ 12         GCTTGGGGGCGAC      >1%
HTRβ 13         GCTTGGGGATGGC      8%
    Slow development = long larval period. Locus 9 is different from HTRβ 1, 9 and 10.
HTRβ 14         GCTTGGGGATGAC      3%
HTRβ 15         GCTTGGGGACGGC      27%
HTRβ 16         GCTTGGGGACGAC      1%
    Fast development = short larval period. Loci 4 and 7 are different from HTRβ 9, 10 and 13.
HTRβ 17         GCTTGGGCGCCGA      1%
HTRβ 18         GCTTGAAGGTGGC      >1%
HTRβ 19         GCTTGAGGGTGAC      1%
HTRβ 20         GCTTGAGGATGGC      >1%
HTRβ 21         GCTTGAGGACGGC      >1%
HTRβ 22         GCTTAGGGGTGGC      >1%
HTRβ 23         GCTTAGGGGTGAC      1%
HTRβ 24         GCTTAGGGGCGGC      1%
HTRβ 25         GCTTAGGGATGGC      >1%
HTRβ 26         GCTTAGGGACGGC      1%
HTRβ 27         GCATGGAGGTGGC      1%
HTRβ 28         GCATGGGGGTGGC      1%
HTRβ 29         GCATGGGGGCGGC      >1%
HTRβ 30         GCATGAGGGTGGC      >1%
HTRβ 31         GATTGGGGATGGC      1%
HTRβ 32         GATTGGGGACGGC      1%
HTRβ 33         GATTAGGGACGGC      1%
HTRβ 34         ACTTGGAGGTGGC      5%
HTRβ 35         ACTTGGAGACGGC      >1%
HTRβ 36         ACTTGGGGGTGGC      1%

Table S7. Summary statistics for TRβ gene variation in the 17 Rana arvalis wetlands analyzed.

Statistic                        d1     d2     d3     d4     d5     d6     d7     d9    d10    d11    d12    d13    d14    d15    d16    d17    d18
No of bp                        882    882    882    882    882    882    882    882    882    882    882    882    882    882    882    882    882
Singleton sites                   5      1      5      4      0      1      4      1      3      2      4      4      3      2      1      0      1
Parsimony informative sites       7      9      5      9      6      5      9      3      8      4      6      5      4     12      4      4      9
No of mutations                  12     10     10     11      6      6      8      4     11      6     10      9      7     13      5      4     10
Syn                               5      8      5      5      4      4      5      3      6      4      7      7      4     10      4      4      7
Non-syn                           7      2      5      6      2      2      3      1      5      2      3      2      3      3      1      0      3
No haplotypes                    15     13     15     18     14     11     12      5     21      7     22     14      8     14      8      8     12
Haplotype diversity           0.831  0.879  0.875  0.895  0.877  0.816  0.883  0.758  0.845  0.775  0.877  0.813  0.717  0.904  0.782  0.828  0.828

Table S8. FST genetic distances between pairs of wetlands for the TRβ gene. Bold indicates significant FST values (q-values < 0.05) using an FDR approach.

          d1      d2      d3      d4      d5      d6      d7      d9     d10     d11     d12     d13     d14     d15     d16     d17     d18
d1        --
d2     0.071      --
d3     0.026   0.015      --
d4     0.051   0.027   0.013      --
d5     0.042   0.021   0.015   0.001      --
d6     0.016   0.038   0.020   0.018  -0.003      --
d7     0.045   0.033  -0.006   0.026   0.032   0.047      --
d9    -0.001   0.022   0.000  -0.005  -0.014  -0.040   0.032      --
d10    0.037   0.106   0.047   0.033   0.042   0.043   0.058   0.016      --
d11    0.008   0.043  -0.006   0.035   0.024   0.014   0.017  -0.014   0.039      --
d12    0.025   0.019  -0.005   0.004   0.005   0.007   0.017  -0.010   0.024   0.008      --
d13    0.000   0.103   0.057   0.081   0.065   0.046   0.062   0.014   0.054   0.023   0.062      --
d14    0.025   0.062   0.012   0.025   0.008  -0.010   0.040  -0.037   0.019  -0.005   0.007   0.028      --
d15    0.084   0.045   0.026   0.037   0.049   0.061   0.047   0.037   0.046   0.037   0.022   0.114   0.050      --
d16   -0.011   0.036   0.008   0.034   0.011  -0.004   0.028  -0.034   0.044  -0.012   0.014  -0.007  -0.008   0.065      --
d17    0.095  -0.004   0.005   0.022   0.023   0.066   0.007   0.060   0.099   0.045   0.022   0.124   0.079   0.041   0.057      --
d18    0.015   0.031   0.013   0.022   0.003   0.011   0.020  -0.012   0.045   0.006   0.021   0.011   0.009   0.052  -0.012   0.031      --

Table S9. Detailed results obtained from the machine learning approach to establish a relationship between phenotypic traits and TRβ gene sequence.
Statistic                          Larval period   Growth rate    Mass at metamorphosis
Correctly classified instances     120 (64.86%)    107 (57.83%)   77 (41.62%)
Incorrectly classified instances   65 (35.13%)     78 (42.16%)    108 (58.37%)
Kappa statistic                    0.1745          0.2891         0.2176
Mean absolute error                0.1679          0.1772         0.1954
Root mean squared error            0.3145          0.3278         0.3459
Relative absolute error            82.63%          82.53%         88.71%
Root relative squared error        99.46%          100.55%        104.46%
Coverage of cases (0.95 level)     87.56%          89.72%         82.16%

Larval period
Class               ROC area
-inf to 37          0.626
37 to 38.4          0.818
38.4 to 39.8        0.589
39.8 to 41.2        0.601
41.2 to inf         0.773
Weighted average    0.723

Growth rate
Class               ROC area
-inf to 0.0096      0.185
0.0096 to 0.0103    0.942
0.0103 to 0.0109    0.769
0.0109 to 0.0116    0.651
0.0116 to 0.0122    0.454
0.0122 to inf       0.775
Weighted average    0.717

Mass at metamorphosis
Class               ROC area
-inf to 0.416       0.418
0.416 to 0.442      0.781
0.442 to 0.468      0.606
0.468 to 0.494      0.641
0.494 to 0.520      0.549
0.520 to 0.546      0.516
0.546 to inf        0.748
Weighted average    0.652
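As a quick consistency check on Table S9, the classification percentages can be recomputed from the instance counts. The sketch below assumes the counts pair up by trait in the order given (120 and 65 for larval period, 107 and 78 for growth rate, 77 and 108 for mass at metamorphosis); the function name is hypothetical.

```python
def classification_rates(n_correct, n_incorrect):
    """Return (total instances, percent correct, percent incorrect)."""
    total = n_correct + n_incorrect
    return total, 100.0 * n_correct / total, 100.0 * n_incorrect / total


# Counts as read from Table S9; the pairing by trait is an assumption.
counts = {
    "Larval period": (120, 65),
    "Growth rate": (107, 78),
    "Mass at metamorphosis": (77, 108),
}

for trait, (n_ok, n_bad) in counts.items():
    total, pct_ok, pct_bad = classification_rates(n_ok, n_bad)
    print(f"{trait}: n = {total}, correct = {pct_ok:.2f}%, incorrect = {pct_bad:.2f}%")
```

Under this pairing every trait sums to the same number of instances, and the recomputed percentages agree with the tabulated ones to within rounding in the second decimal place.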