Additional file 14

Phylogenetic analysis (Antirrhinum matrix)

Phylogenetic analyses for 190 trnS-trnG/trnK-matK concatenated sequences were conducted using Bayesian inference (BI), maximum likelihood (ML) and maximum parsimony (MP). In addition, Bayesian phylogenetic analyses were performed on the separate matrices to examine plastid gene tree congruence. Gambelia speciosa and Misopates orontium were selected as the outgroup based on previous phylogenetic evidence [1], and gaps were treated as missing data. The MP analysis was performed in TNT 1.1 [2] using a heuristic search with 10,000 replicates, saving two most parsimonious trees per replicate, followed by a second heuristic search that retained all best trees and used the trees obtained in the previous 10,000 replicates as starting trees. Bootstrap support (MP-BS) of clades was assessed using 1000 standard replicates. For ML and BI analyses, the simplest model of sequence evolution that best fits the sequence data was determined under the Akaike Information Criterion (AIC) in jModeltest 0.1.1 [3]. The General Time Reversible model incorporating invariant sites and a gamma distribution (GTR+I+G) was selected for the two plastid DNA regions. ML was implemented in PhyML 3.0 [4] with 500 non-parametric bootstrap replicates (ML-BS). BI was performed in MrBayes v3.1.2 [5]. Two identical searches with 10 million generations each and a sample frequency of 1000 were performed. Chain convergence was assessed with Tracer 1.5 [6], and a 50% majority rule consensus tree with Bayesian posterior probabilities (PP) of clades was calculated, using the sumt command, to yield the final Bayesian estimate of phylogeny after removing the first 10% of generations as burn-in. Trees were visualized using FigTree 1.3.1 [7].
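As a rough illustration of the AIC comparison used for model selection above, the following Python sketch ranks a few candidate substitution models by AIC = 2k - 2lnL. The log-likelihood values are hypothetical placeholders, not results from this study, and the parameter counts cover only substitution-model parameters (branch lengths are assumed to be shared by all candidates and are left out). It is not the jModeltest implementation.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 lnL (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical (lnL, k) pairs for illustration only.
# k counts free substitution-model parameters:
# JC = 0, HKY = 4, GTR = 8, plus 1 each for +I and +G.
candidates = {
    "JC": (-5200.0, 0),
    "HKY": (-5100.0, 4),
    "GTR": (-5080.0, 8),
    "GTR+G": (-5050.0, 9),
    "GTR+I+G": (-5040.0, 10),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)

# Print models from best (lowest AIC) to worst, with the AIC difference.
for name in sorted(scores, key=scores.get):
    delta = scores[name] - scores[best]
    print(f"{name:8s} AIC = {scores[name]:10.1f}  delta = {delta:6.1f}")
```

With these placeholder values the richest model wins because its gain in likelihood outweighs the penalty of two extra parameters; with real data the ranking depends entirely on the fitted likelihoods.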
Ancestral area reconstructions (Antirrhinum matrix)

A discrete phylogeographic analysis (DPA) that uses standard MCMC sampling implemented in BEAST [8] was performed to assess the probability distribution of the geographic locations at each node of the maximum clade credibility tree. A total of 14 discrete areas were delimited: (i) the four Iberian quadrants (northeastern Iberia, NE; northwestern Iberia, NW; southeastern Iberia, SE; southwestern Iberia, SW), as divided by the geographical coordinates 40°N/5°W [see 9]; (ii) Eastern, Central and Western Pyrenees, as the three recognized biogeographic regions within the Pyrenees (see below); (iii) the other two northern areas sampled near the Pyrenees (Southern French basin and Southwestern Alps); and (iv) the remaining five regions sampled across the Mediterranean basin (Morocco, Sicily, Sardinia, Italy and Turkey). Statistical significance for the rates of the dispersal events was assessed with a Bayes factor (BF) test as described by Lemey et al. [8]. Dispersal rates were allowed to be zero with some probability in the framework of Bayesian stochastic search variable selection (BSSVS). The analysis consisted of two independent runs of 100 million generations each, sampling every 10,000 generations. Chain convergence was examined in Tracer 1.5 [10]. The two runs were combined in LogCombiner 1.6.2 after discarding the first 10% of sampled generations as burn-in. A consensus chronogram with the maximum sum of clade credibilities (MCC) was obtained with TreeAnnotator v.1.6.2 and visualized in FigTree 1.3.1 [7]. Well-supported rates of dispersal (BF>3) were visualized in Google Earth using the RateIndicatorBF tool added to the BEAST code.

Additional ancestral range reconstructions were conducted using the Bayesian time-calibrated molecular phylogeny to discriminate between a northern and a southern origin of Pyrenean lineages.
For this purpose, only four areas were delimited: (i) Iberian Peninsula, (ii) Pyrenees and adjacent areas, (iii) South-western Alps, and (iv) samples from the Mediterranean basin (excluding Iberia). Ancestors were allowed to be present in all of them. Distribution ranges of sequences (haplotypes) rather than of species were used [see 11]. Two alternative reconstruction methods were used: (a) statistical dispersal-vicariance analysis (S-DIVA) implemented in the program RASP 1.1 [12], a parsimony-based approach (DIVA; [13]) that determines the probability of each geographical region for each node, accounting for the uncertainty of the Bayesian phylogenetic analysis [14]; and (b) dispersal-extinction-cladogenesis analysis (DEC) implemented in the software package Lagrange v2.0.1 [15], a parametric likelihood-based approach that estimates the most likely geographic distribution of two daughter lineages following a speciation event. Whereas the first method estimates the actual state at the node, the second estimates the states of the branches emanating from a given node. For the S-DIVA analysis we followed the method of Harris & Xiang [16]. Two hundred trees randomly sampled after the burn-in period from the BEAST run were selected, and the single MCC tree was used as the final tree (after pruning outgroup taxa). For the DEC analysis we used the pruned MCC tree. Symmetric dispersal between areas and constant dispersal rates through time were set.

Genetic diversity and geographic structure (Pyrenees matrix)

An analysis of genetic diversity was carried out across the three recognized biogeographic regions into which the Pyrenean range is divided (Eastern, Central and Western Pyrenees) (see Fig. 3a). The boundaries of these three biogeographic areas have traditionally been established, with slight differences, by both geologists [17, 18] and phytogeographers [19-24] on the basis of geologic, climatic and floristic data.
Haplotype frequencies and molecular diversity indices for each biogeographic area were calculated using DnaSP v5 [25]. In addition, to identify potential hotspots of genetic diversity across the Pyrenees, individuals were geographically grouped by means of a 10 × 10 km grid. Charts representing haplotype frequencies were constructed for each grid cell, which was labelled with a generic letter-number code (Fig. 3).

To infer the spatial genetic structure we used a Bayesian model-based approach, implemented in the BAPS software, version 5.3 [26, 27]. This software assigns the genotypes to genetically structured groups (K) and can account for the dependence due to linkage between the sites within aligned sequences. Five iterations of K, for Kmax values of 5, 10 and 20 potential populations, were conducted to determine the optimal number of genetically homogeneous groups. The ‘Clustering of groups with linked loci’ analysis was chosen, and the groups were defined by natural sampled populations. Admixture analyses [26] were run with 100 iterations to estimate admixture coefficients for individuals, 200 reference individuals from each population and 20 iterations to estimate admixture coefficients for reference individuals.

To identify genetic subdivisions among the Eastern, Central and Western Pyrenees, we performed an analysis of molecular variance (AMOVA) [28], which compares haplotype variation within and between groups. Pairwise FST statistics were also calculated to estimate genetic distances. Both analyses were performed in ARLEQUIN [29]. Additionally, an AMOVA was performed to assess the partitioning of variance between Lineage E (primarily distributed in the eastern part of the Pyrenees) and the remaining lineages (see below).
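The grid-based summary of haplotype frequencies and diversity described above can be sketched in Python as follows. This is an illustrative sketch only: the coordinates, haplotype labels and the letter-number cell naming are hypothetical, coordinates are assumed to be already projected to kilometres, and haplotype diversity is computed with Nei's unbiased estimator, h = n/(n-1) × (1 - Σp²), which is assumed here rather than taken from the DnaSP output.

```python
from collections import Counter, defaultdict
import string


def grid_code(x_km, y_km, cell_km=10.0):
    """Hypothetical letter-number label: letter for column, number for row.

    Only 26 columns are supported in this sketch.
    """
    col = int(x_km // cell_km)
    row = int(y_km // cell_km)
    return f"{string.ascii_uppercase[col]}{row + 1}"


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        return 0.0
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical samples: (x in km, y in km, haplotype label).
samples = [
    (3.2, 4.1, "H1"), (7.9, 8.8, "H1"), (5.5, 1.0, "H2"), (9.0, 9.0, "H3"),
    (12.4, 3.3, "H1"), (18.7, 6.0, "H1"),
]

# Group individuals by 10 x 10 km cell.
cells = defaultdict(list)
for x, y, hap in samples:
    cells[grid_code(x, y)].append(hap)

# Per-cell haplotype frequencies and diversity.
freqs = {cell: Counter(haps) for cell, haps in cells.items()}
hd = {cell: haplotype_diversity(haps) for cell, haps in cells.items()}

for cell in sorted(cells):
    print(cell, dict(freqs[cell]), round(hd[cell], 3))
```

Cells holding several haplotypes at even frequencies score close to 1, whereas cells fixed for a single haplotype score 0, which is the contrast used to flag candidate diversity hotspots.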
To evaluate the optimal grouping of the sampled sites without a priori assumptions, a spatial analysis of molecular variance (SAMOVA) implemented in the software package SAMOVA 1.0 [30] was also performed. This analysis uses a simulated annealing approach based on genetic and geographical data to identify groups of related populations. The program was run for K = 2 to 20 groups, from 100 initial conditions, and the most likely structure was identified from the highest values of FCT (the proportion of genetic variation between groups of populations), excluding any groups of a single population.

References

1. Vargas P, Rosselló JA, Oyama R, Güemes J: Molecular evidence for naturalness of genera in the tribe Antirrhineae (Scrophulariaceae) and three independent evolutionary lineages from the New World and the Old. Plant Systematics and Evolution 2004, 249(3-4):151-172.
2. Goloboff PA, Farris JS, Nixon KC: TNT, a free program for phylogenetic analysis. Cladistics 2008, 24(5):774-786.
3. Posada D: jModelTest: Phylogenetic model averaging. Molecular Biology and Evolution 2008, 25(7):1253-1256.
4. Guindon S, Gascuel O: A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Systematic Biology 2003, 52(5):696-704.
5. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003, 19(12):1572-1574.
6. Rambaut A, Drummond A: Tracer ver. 1.5. Program available at http://beast.bio.ed.ac.uk/Tracer. 2009.
7. Rambaut A: FigTree version 1.3.1. Available at http://tree.bio.ed.ac.uk/software/figtree/. 2009.
8. Lemey P, Rambaut A, Drummond AJ, Suchard MA: Bayesian phylogeography finds its roots. PLoS Computational Biology 2009, 5(9).
9. Vargas P, Carrió E, Guzmán B, Amat E, Güemes J: A geographical pattern of Antirrhinum (Scrophulariaceae) speciation since the Pliocene based on plastid and nuclear DNA polymorphisms. Journal of Biogeography 2009, 36(7):1297-1312.
10. Drummond AJ, Rambaut A: BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 2007, 7(1).
11. Fernández-Mazuecos M, Vargas P: Historical isolation versus recent long-distance connections between Europe and Africa in bifid toadflaxes (Linaria sect. Versicolores). PLoS ONE 2011, 6(7).
12. Yu Y, Harris A, He X: RASP (Reconstruct Ancestral State in Phylogenies) 2.0 beta. Available at http://mnh.scu.edu.cn/soft/blog/RASP. 2011.
13. Ronquist F: Dispersal-Vicariance Analysis: A New Approach to the Quantification of Historical Biogeography. Systematic Biology 1997, 46(1):195-203.
14. Nylander JA, Olsson U, Alström P, Sanmartín I: Accounting for phylogenetic uncertainty in biogeography: a Bayesian approach to dispersal-vicariance analysis of the thrushes (Aves: Turdus). Systematic Biology 2008, 57(2):257-268.
15. Ree RH, Smith SA: Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Systematic Biology 2008, 57(1):4-14.
16. Harris AJ, Xiang QY: Estimating ancestral distributions of lineages with uncertain sister groups: A statistical approach to dispersal-vicariance analysis and a case using Aesculus L. (Sapindaceae) including fossils. Journal of Systematics and Evolution 2009, 47(5):349-368.
17. Souquet P, Bilotte M, Canerot J, Debroas E, Peybernés B, Rey J: Nouvelle interprétation de la structure des Pyrénées. Comptes Rendus de l’Académie des Sciences de Paris 1975, 281:609-612.
18. Souquet P, Mediavilla F: Nouvelle hypothèse sur la formation des Pyrénées. CR Acad Sci 1976, 282:2139-2142.
19. Rivas Martínez S: Memoria del mapa de series de vegetación de España 1:400000. ICONA, Madrid 1987.
20. Rivas Martínez S, Báscones Carretero JC, Díaz González TE, Fernández González F, Loidi Arregui J: Vegetación del Pirineo occidental y Navarra: VI Excursión Internacional de Fitosociología (AEFA). Itinera Geobotanica 1991(5):5-456.
21. Montserrat P: L'exploration floristique des Pyrenees occidentales (The floristic exploration of West Pyrenees). Bol Soc Brot 1974, 47:227-241.
22. Villar L: La vegetación del Pirineo occidental: estudio de Geobotánica Ecológica. In: Príncipe de Viana Suplemento de Ciencias. 1982: 263-434.
23. Vigo J, Ninot J: Los Pirineos. In: La vegetación de España. Edited by Peinado Lorea M, Rivas Martínez S: Col. Aula Abierta. Publ. Univ. Alcalá de Henares; 1987: 351-384.
24. Izard M: Le climat. In: La Végétation des Pyrénées. Edited by Dupias G. Editions du CNRS, Paris; 1985: 17-36.
25. Librado P, Rozas J: DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics 2009, 25(11):1451-1452.
26. Corander J, Marttinen P, Sirén J, Tang J: Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations. BMC Bioinformatics 2008, 9:539.
27. Corander J, Tang J: Bayesian analysis of population structure based on linked molecular information. Mathematical Biosciences 2007, 205(1):19-31.
28. Excoffier L, Smouse PE, Quattro JM: Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 1992, 131(2):479-491.
29. Excoffier L, Laval G, Schneider S: Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 2005, 1:47.
30. Dupanloup I, Schneider S, Excoffier L: A simulated annealing approach to define the genetic structure of populations. Molecular Ecology 2002, 11(12):2571-2581.