Electronic Supplementary Material 1

Materials and methods

DNA amplifications

PCR amplifications were conducted in 25 µl reaction mixtures containing 1X enzyme buffer supplied by the manufacturer (Qiagen®; containing 1.5 mM MgCl2), 0.6 units of Taq polymerase, 17.5 pmol of each primer, 25 nM of each dNTP and 4 µl of DNA extract. After an initial denaturation step at 94°C for 3 min, samples were subjected to 35 cycles of 30 s at 94°C, 1 min at an annealing temperature between 50°C and 56°C, depending on the fragment amplified, and 1 min at 72°C. PCR products were then sent to a sequencing service (Macrogen, South Korea). The amplification primers were also used as sequencing primers. Sequences were cleaned and assembled using Seqscape v2.5 software (Applied Biosystems).

Obtaining ultrametric trees for the species delimitation method: Multidivtime procedure

Parameters of the substitution model used by Estbranches (F84 + Γ) were estimated with the baseml program of the PAML package (Yang 1997) for each locus separately. The output from baseml was then used for the first step of the multidistribute package: paml2modelinf was run to convert these outputs into data usable by Estbranches. Estbranches produces maximum likelihood (ML) estimates of branch lengths within the optimal tree topology estimated from the combined data, together with a variance–covariance matrix for each locus. These output files were then used in Multidivtime to estimate divergence times. We used the default settings for the Markov chain Monte Carlo analyses (100000 cycles in which the Markov chain was sampled 10000 times every 100th cycle following burn-in). Although several aphid fossils have been described (Heie 1967; Heie 2004), none of them is recent enough to calibrate the Brachycaudus phylogenetic tree, and there are no fossils for Buchnera. As our aim was simply to obtain ultrametric trees for the species delimitation analysis, we arbitrarily assigned prior ages of 1.0 (SD = 1) to both lineages (see Hughes et al. 2007 for a similar approach).
Following the recommendation in the manual, rtrate (the mean rate of molecular evolution at the ingroup root node) was estimated by calculating the median of the branch lengths from the root to the ingroup tips.

Buchnera phylogenetic reconstruction

The results of the maximum parsimony (MP) analysis were used to determine the most suitable evolutionary model for ML analysis and Bayesian inference (BI). We first performed MP analyses with PAUP* v. 4.0b10 (Swofford 2003) on the combined DNA dataset. We conducted heuristic searches with the tree bisection–reconnection branch-swapping algorithm, 500 random addition sequences and a MaxTrees value of 10000. Gaps were treated as missing data. Character congruence between the three DNA partitions was then tested using the incongruence length difference test (ILD; Farris et al. 1995), by performing 500 replicate MP searches on the randomly partitioned dataset with all invariant characters excluded (Cunningham 1997).

For ML reconstructions, the model of nucleotide substitution was selected in Modeltest v. 3.7 (Posada & Crandall 1998). The MP tree with the highest log-likelihood (ln L) score was used to estimate the model parameters (gamma shape, base frequencies and substitution matrix). An ML heuristic search, using a starting tree obtained by MP, was then conducted in PhyML (Guindon & Gascuel 2003) with the selected model. For both MP and ML analyses, node support was assessed by bootstrapping with 500 replicates.

Bayesian phylogenetic analyses were conducted in MRBAYES v. 3.1.2 (Ronquist & Huelsenbeck 2003). Different partitioning schemes were compared to optimize the fit of evolutionary models to the sequence data (Nylander et al. 2004; see Table S3 of the electronic supplementary material). We used the GTR+I+G model, which was identified as the best-fit model for all DNA fragments. The parameters of the model were treated as unknown variables with uniform prior probabilities and were estimated during the analysis; they were allowed to vary across partitions.
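As an illustration of how partitioning strategies can be compared, the short Python sketch below recomputes Bayes factors from the harmonic-mean values listed in Table S3 and checks them against the critical χ2 values given there. It assumes that the tabulated harmonic means are negative log marginal likelihoods, so that the Bayes factor statistic is twice the difference between the value for the simpler strategy and the value for the more complex one. The dictionary and function names are hypothetical, and only some of the pairwise comparisons are included.

```python
# Harmonic-mean values from Table S3, read here (an assumption) as
# negative log marginal likelihoods for partitioning strategies P1 to P5.
NEG_LN_MARGINAL = {
    "P1": 13994.34,
    "P2": 13812.27,
    "P3": 13636.43,
    "P4": 13641.26,
    "P5": 13587.00,
}

# Critical chi-square values (P < 0.001) from Table S3, keyed by
# (simpler strategy, more complex strategy). Subset only.
CHI2_CRITICAL = {
    ("P1", "P2"): 36.123,
    ("P3", "P4"): 34.528,
    ("P4", "P5"): 34.528,
}


def two_ln_bayes_factor(simpler, more_complex):
    """Twice the log Bayes factor in favour of the more complex strategy."""
    return 2.0 * (NEG_LN_MARGINAL[simpler] - NEG_LN_MARGINAL[more_complex])


def favours_more_complex(simpler, more_complex):
    """True if the Bayes factor statistic exceeds the tabulated critical value."""
    critical = CHI2_CRITICAL[(simpler, more_complex)]
    return two_ln_bayes_factor(simpler, more_complex) > critical


if __name__ == "__main__":
    for pair in CHI2_CRITICAL:
        stat = two_ln_bayes_factor(*pair)
        print(pair, round(stat, 2), favours_more_complex(*pair))
```

Under these assumptions, the comparison of P1 with P2 reproduces the value 364.14 reported in Table S3, and the negative value for P3 against P4 indicates that splitting TrpB by codon position without separating the two introns does not improve on P3.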
Two replicate analyses were run for three million generations. We ran one cold chain and three heated chains of the Markov chain Monte Carlo simulation, using a random starting point and sampling trees every 100 generations. Stationarity was considered to have been reached at the point at which the distribution of likelihoods reached a plateau, and trees preceding this point (2000–3000 trees, depending on the DNA partition) were discarded. The remaining trees were used to generate 50 per cent majority-rule consensus trees. Posterior probabilities (pp) were summarized accordingly.

(i) Reconciliation analyses (Page 1994)

This topology-based method, implemented in TREEMAP v. 1 and TREEMAP v. 2.02b, aims to identify optimal reconstructions of the history of a host–parasite association by mapping the parasite tree onto the host tree and maximizing the number of cospeciation events. Heuristic searches are generally used to find optimal solutions in TREEMAP v. 1, whereas TREEMAP v. 2.02b uses the Jungle algorithm (Charleston 1998). This algorithm explores all possible mappings of one tree onto another, assigning different costs to diversification events (cospeciation, host switching, lineage sorting and duplication), and finds the optimal (i.e. minimum-cost) solutions. We used the default cost settings for analyses. The probability of obtaining the observed number of cospeciation events is then estimated by randomizing the parasite trees to generate a null distribution of the number of cospeciation events.

(ii) ParaFit (Legendre et al. 2002)

This distance-based method tests the null hypothesis that the diversification of hosts and parasites has been independent, using distance matrices rather than tree topologies. The null hypothesis is tested by permuting a host–parasite association matrix. Each individual host–parasite association can also be tested. ParaFit tests were carried out with ML trees, using Copycat (Meier-Kolthoff et al. 2007).
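The randomization tests used for the reconciliation and ParaFit analyses share one piece of logic: a statistic computed on the observed host–parasite associations is compared with its distribution when the associations are shuffled. The Python sketch below illustrates only that logic. It is neither ParaFit nor TREEMAP: the statistic is a simple Mantel-style cross-product of host and parasite distances, used as a stand-in, and the toy distance matrix, the link list and all function names are assumptions made for illustration.

```python
import random


def cross_product_stat(host_dist, para_dist, links):
    """Sum, over all pairs of links, of host distance times parasite distance."""
    total = 0
    for i in range(len(links)):
        for j in range(i + 1, len(links)):
            (h1, p1), (h2, p2) = links[i], links[j]
            total += host_dist[h1][h2] * para_dist[p1][p2]
    return total


def permutation_p_value(host_dist, para_dist, links, n_perm=9999, seed=0):
    """One-sided permutation test: shuffle which parasite goes with which host."""
    rng = random.Random(seed)
    observed = cross_product_stat(host_dist, para_dist, links)
    hosts = [h for h, _ in links]
    parasites = [p for _, p in links]
    n_extreme = 0
    for _ in range(n_perm):
        shuffled = parasites[:]
        rng.shuffle(shuffled)
        permuted_links = list(zip(hosts, shuffled))
        if cross_product_stat(host_dist, para_dist, permuted_links) >= observed:
            n_extreme += 1
    # The observed arrangement counts as one of the possible outcomes.
    return observed, (n_extreme + 1) / (n_perm + 1)


# Toy example (hypothetical data): six taxa placed on a line, with hosts and
# parasites sharing the same distance matrix and linked one to one, i.e.
# perfectly congruent histories.
POSITIONS = [0, 1, 3, 7, 15, 31]
TOY_DIST = [[abs(a - b) for b in POSITIONS] for a in POSITIONS]
TOY_LINKS = [(i, i) for i in range(len(POSITIONS))]

if __name__ == "__main__":
    stat, p = permutation_p_value(TOY_DIST, TOY_DIST, TOY_LINKS)
    print("observed statistic:", stat, "p-value:", p)
```

With the observed arrangement counted among the outcomes, the smallest attainable p-value is 1/(n_perm + 1), which is 0.0001 for 9999 permutations.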
Tests of random association were performed with 9999 permutations.

(iii) Likelihood ratio tests

This method tests the null hypothesis that the likelihoods of host and symbiont datasets do not differ significantly under the same model (including tree topology). If the null hypothesis is rejected, it is assumed that diversification events, such as host switching in the symbiont, caused the observed incongruence. We first used the Shimodaira–Hasegawa (SH) test, as described by Peek et al. (1998) and Clark et al. (2000), to compare the likelihood score of the best ML topology for the Buchnera combined dataset with the score for the best ML topology obtained with the Brachycaudus dataset. Similarly, the score of the best host tree was compared with that of the alternative Buchnera tree based on the aphid dataset. The trees and datasets compared excluded specimens for which sequences were not obtained for all aphid and all Buchnera DNA fragments, and a single outgroup sequence was kept. The SH tests were conducted in PAUP* v. 4.0b10 with resampling estimated log-likelihood (RELL) optimization and 10000 bootstrap replicates. We optimized the model parameters for each dataset constrained to each alternative tree.

We then used the likelihood ratio test (LRT) proposed by Huelsenbeck & Bull (1996) to test for heterogeneity of trees obtained with different data partitions, to assess the conflict between Buchnera loci and the combined Brachycaudus dataset (see Huelsenbeck et al. 1997; Clark et al. 2000; Hughes et al. 2007 for applications of the LRT to cospeciation studies). Again, the best topology for a given dataset was compared with the alternative topologies obtained with other datasets. We used the SH test to conduct pairwise comparisons between scores for alternative Buchnera topologies obtained with alternative loci and the combined Brachycaudus dataset.
We then calculated the statistic Δ = 2(ln L1 − ln L0), which measures the difference in likelihood between allowing each dataset to have its own topology (L1) and constraining all datasets to share the same topology (L0). Under the null hypothesis of a common topology underlying all datasets, the topology chosen to establish L0 is that with the highest summed likelihood across datasets. As the tested hypotheses were not nested, the significance of Δ was assessed by generating a distribution of Δ under the null hypothesis that the datasets have the same topology. Likelihood parameters and branch lengths for each Buchnera locus and the Brachycaudus combined dataset were optimized under the assumption of a shared topology (that with the highest summed likelihood across datasets). One hundred sequence datasets were simulated for each Buchnera locus and for the aphid combined dataset, using SEQGEN v. 1.3.2 (Rambaut & Grassly 1997) with the graphical interface SG Runner v. 2.0 (T. P. Wilcox, http://homepage.mac.com/tpwilcox/SGRUNNER/FileSharing15.htm), with these new parameter estimates, the length and nucleotide composition of the original dataset, and the constrained topology and branch lengths. The statistic Δ was calculated for each of the 100 simulated datasets. We also examined the contribution of individual Buchnera loci to the heterogeneity of the observed dataset, by excluding individual genes from the calculation of Δ.

Reconciliation analyses and ParaFit analyses were conducted on both specimen-based phylogenies (including 56 samples) and the different species-based phylogenies obtained with species delimitation methods. The LRT method was used only for specimen-based phylogenies. The main advantage of this method is that it can detect heterogeneity between data partitions, and this property should not be affected when phylogenies include fewer sequences.

References

Charleston, M. A. 1998 Jungles: a new solution to the host/parasite phylogeny reconciliation problem. Math. Biosci. 149, 191–223. (doi:10.1016/S00255564(97))
Cunningham, C. W. 1997 Can three incongruence tests predict when data should be combined? Mol. Biol. Evol. 14, 733–740.
Farris, J. S., Källersjö, M., Kluge, A. G. & Bult, C. 1995 Constructing a significance test for incongruence. Syst. Biol. 44, 570–572. (doi:10.2307/2413663)
Gomez-Valero, L., Silva, F. J., Simon, J. C. & Latorre, A. 2007 Genome reduction of the aphid endosymbiont Buchnera aphidicola in a recent evolutionary time scale. Gene 389, 87–95.
Guindon, S. & Gascuel, O. 2003 A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704. (doi:10.1080/10635150390235520)
Heie, O. E. 1967 Studies on fossil aphids (Homoptera: Aphidoidea). Spolia Zool. Musei Hauniensis 26, 1–273.
Heie, O. E. 2004 The history of the studies on aphid palaeontology and their bearing on the evolutionary history of aphids. In Aphids in a new millennium (ed. J.-C. Simon, C.-A. Dedryver, C. Rispe & M. Hullé), pp. 151–158. Paris: INRA Editions.
Huelsenbeck, J. P. & Bull, J. J. 1996 A likelihood ratio test to detect conflicting phylogenetic signals. Syst. Biol. 45, 92–98. (doi:10.2307/2413514)
Huelsenbeck, J. P., Rannala, B. & Yang, Z. 1997 Statistical tests of host–parasite cospeciation. Evolution 51, 410–419. (doi:10.2307/2411113)
Hughes, J., Kennedy, M., Johnson, K. P., Palma, R. L. & Page, R. D. M. 2007 Multiple cophylogenetic analyses reveal frequent cospeciation between pelecaniform birds and Pectinopygus lice. Syst. Biol. 56, 232–251.
Johnson, K. P. & Clayton, D. H. 2004 Untangling coevolutionary history. Syst. Biol. 53, 92–94. (doi:10.1080/10635150490264824)
Kergoat, G. J., Silvain, J. F., Delobel, A., Tuda, M. & Anton, K. W. 2007 Defining the limits of taxonomic conservatism in host-plant use for phytophagous insects: molecular systematics and evolution of host-plant associations in the seed-beetle genus Bruchus Linnaeus (Coleoptera: Chrysomelidae: Bruchinae). Mol. Phyl. Evol. 43, 251–269.
Meier-Kolthoff, J. P., Auch, A. F., Huson, D. H. & Göker, M. 2007 Copycat: cophylogenetic analysis tool. Bioinformatics 23, 898–900. (doi:10.1093/bioinformatics/btm027)
Nylander, J. A. A., Ronquist, F., Huelsenbeck, J. P. & Nieves-Aldrey, J.-L. 2004 Bayesian phylogenetic analysis of combined data. Syst. Biol. 53, 47–67. (doi:10.1080/10635150490264699)
Page, R. D. M. Tangled trees: phylogeny, cospeciation and coevolution. Chicago, IL: University of Chicago Press.
Peek, A. S., Feldman, R. A., Lutz, R. A. & Vrijenhoek, R. C. 1998 Cospeciation of chemoautotrophic bacteria and deep sea clams. Proc. Natl Acad. Sci. USA 95, 9962–9966. (doi:10.1073/pnas.95.17.9962)
Posada, D. & Crandall, K. A. 1998 MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818. (doi:10.1093/bioinformatics/14.9.817)
Rambaut, A. & Grassly, N. C. 1997 Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comp. Appl. Biosci. 13, 235–238.
Ronquist, F. & Huelsenbeck, J. P. 2003 MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574. (doi:10.1093/bioinformatics/btg180)
Swofford, D. L. 2003 PAUP*. Phylogenetic analysis using parsimony (*and Other
Yang, Z. 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13, 555–556.

Table S1: Sample information

| Species | Voucher | Collectors | Collection site | Host plant |
| B. aconiti (Mordvilko, 1928) | 1790 | Coeur. & Jous. | France, Ariège (09), Mijanes, Col de Pailhères | Aconitum sp. |
| B. amygdalinus (Schouteden, 1905) | 1688 | Coeur. & Jous. | France, Var (83), Fayence | Prunus dulcis |
|  | 1694 | Coeur. & Jous. | France, Bouches-du-Rhône (13), St-Martin-de-Crau | Prunus dulcis |
|  | 1710 | Coeur. & Jous. | France, Gers (32), Saint-Clar | Prunus dulcis |
| B. ballotae (Passerini, 1860) | s338 | G. Cocuzza | Germany, Berlin | Ballota nigra |
| B. bicolor (Nevsky, 1929) | 1458 | Coeur d'Acier | Greece, Lakonia, Lagada | Boraginaceae |
| B. cardui | 1709 | Coeur. & Jous. | France, Haute-Garonne (31), Grenade | Asteraceae sp. |
|  | 1746 | Coeur. & Jous. | France, Haut-Rhin (68), Colmar | Prunus domestica |
|  | 1765 | Coeur. & Jous. | France, Gard (30), Le Vigan | Arctium sp. |
| B. cerinthis | 1772 | Coeur. & Jous. | France, Hautes-Alpes (05), Villar-d'Arene | Cerinthe glabra |
| B. divaricatae (Shaposhnikov, 1956) | s242 | G. Cocuzza | Lithuania, Vilnius, Bratoskies | Prunus divaricata |
| B. helichrysi (Kaltenbach, 1843) | 1608 | Coeur d'Acier | Greece, Ahaia, Kalavrita | Achillea sp. |
|  | 1600 | Coeur. & Jous. | Greece |  |
|  | 1716 | Coeur. & Jous. | France, Tarn-et-Garonne (82), Gramont | Prunus domestica |
|  | 1809 | Coeur. & Jous. | Australia, Western Australia, Denison | Helianthus annuus |
|  | 1681 | Coeur. & Jous. | France |  |
|  | 1749 | Coeur. & Jous. | France |  |
| B. jacobi Stroyan, 1957 | s145 | G. Cocuzza | Italy, Sicily, Itala | Myosotis sylvatica |
| B. klugkisti (Börner, 1942) | 1290 | Coeur d'Acier | France, Creuse (23), Peyrat-la-Noniere | Silene sp. |
|  | 1747 | Coeur. & Jous. | France, Haut-Rhin (68), Ste-Marie-Aux-Mines | Silene dioica |
|  | 2063 | Jousselin | France, Haute-Savoie (74), La Roche sur Foron | Silene dioica |
|  | 2064 | Coeur. & Jous. | France, Pyrénées-Orientales | Silene dioica |
| B. lamii (Koch, 1854) | s328 | G. Cocuzza | Italy, Sicily, Montalbano Elicona | Lamium flexuosum |
| B. lateralis (Walker, 1848) | 1027 | Coeur d'Acier | France, Finistère (29), Cleden-Cap-Sizun | Senecio jacobaea |
|  | 1741 | Coeur. & Jous. | France, Drôme (26), St-Marcel-les-Valence | Senecio sp. |
|  | 1751 | Coeur. & Jous. | France, Haut-Rhin (68), Colmar | Arctium sp. |
|  | s117 | G. Cocuzza | Italy, Sicily, Salina | Chrysanthemum coronarium |
|  | 1794 | Coeur. & Jous. | France, Lozère (48), La Bastide-Puylaurent | Senecio sp. |
| B. linariae (Stroyan, 1950) | 1938 | Coeur. & Jous. | France, Hérault (34), Prades le lez, CBGP | Linaria repens |
|  | 2047 | Coeur. & Jous. | Italy, Sicily, Zafferana | Linaria purpurea |
| B. lucifugus (Müller, 1955) | s249 | G. Cocuzza | Italy, Trentino Alto Adige, Ala | Plantago lanceolata |
| B. lychnicola (Hille Ris Lambers, 1966) | s317 | G. Cocuzza | Czech Republic, South Bohemia, Lužanská Udolí | Silene flos-cuculi |
| B. lychnidis (Linnaeus, 1758) | 1324 | Coeur d'Acier | France, Morbihan (56), Saint-Pierre-Quiberon | Silene sp. |
|  | 1698 | Coeur. & Jous. | France, Hérault (34), St-Guilhem-le-Desert | Silene latifolia |
|  | 1752 | Coeur. & Jous. | France, Haut-Rhin (68), Colmar | Silene latifolia |
|  | 1762 | Coeur. & Jous. | France, Gard (30), Le Vigan, Col de Faubel | Silene dioica |
| B. malvae (Shaposhnikov, 1964) | s125 | G. Cocuzza | Italy, Lazio, Roma | Malva sylvestris |
| B. mordvilkoi (Hille Ris Lambers, 1931) | s248 | G. Cocuzza | Italy, Trentino Alto Adige, Ala | Echium vulgare |
| B. napelli (Schrank, 1801) | s316 | G. Cocuzza | Czech Republic, South Bohemia, Lužanská Udolí | Aconitum callybotrium |
| B. persicae (Passerini, 1860) | 1077 | Coeur d'Acier | France, Aude (11), Quillan, La Forge | Prunus spinosa |
|  | 1696 | Coeur. & Jous. | France | Prunus sp. |
|  | 1736 | Coeur. & Jous. | France, Drôme (26), St-Marcel-les-Valence | Prunus sp. |
| B. populi (del Guercio, 1911) | 1483 | Coeur d'Acier | Greece, Lakonia, Mystra | Silene vulgaris |
|  | 1760 | Coeur. & Jous. | France, Gard (30), Le Vigan, Col de Faubel | Silene vulgaris |
| B. prunicola (Kaltenbach, 1843) | 1267 | Coeur d'Acier | France, Creuse (23), Vallieres, La Prades | Prunus sp. |
| B. rumexicolens (Patch, 1917) | 1764 | Coeur. & Jous. | France, Gard (30), Le Vigan, Col de Faubel | Rumex acetosella |
|  | 1982 | Coeur. & Jous. | Italy, Sicily, Linguaglossa | Rumex acetosella |
| B. salicinae (Börner, 1939) | s307 | G. Cocuzza | Czech Republic, South Bohemia, Český Krumlov | Inula salicina |
| B. schwartzi (Börner, 1931) | 1717 | Coeur. & Jous. | France, Tarn-et-Garonne (82), Gramont, Hameau de Géran | Prunus persica |
|  | 1730 | Jousselin | France, Centre, Loiret (45), Germigny-Des-Pres | Prunus persica |
|  | 1738 | Coeur. & Jous. | France, Drôme (26), St-Marcel-les-Valence | Prunus persica |
| B. spiraeae (Börner, 1932) | 1775 | Coeur. & Jous. | France, Hautes-Alpes (05), La Grave | Spiraea sp. |
|  | 2143 | Coeur. & Jous. | Scotland, Kinlochewe | Spiraea salicifolia |
| B. tragopogonis (Kaltenbach, 1843) | 1378 | Coeur d'Acier | Greece, Korinthia, Némea | Tragopogon sp. |
|  | 1715 | Coeur. & Jous. | France, Tarn-et-Garonne (82), Gramont, Hameau de Géran | Tragopogon sp. |
|  | 1773 | Coeur. & Jous. | France, Hautes-Alpes (05), Villar-d'Arene | Tragopogon sp. |
| Outgroups |  |  |  |  |
| Myzus persicae (Sulzer, 1776) | 1948 | Coeur. & Jous. | France |  |
| Myzus persicae (Sulzer, 1776) | 1956 | Coeur. & Jous. | France |  |

Table S2: Names, sequences and references of primers used for Buchnera PCR and sequencing.

| DNA fragment | Name of primer | Sequence of primer | Reference |
| TrpB | TrpBF | ACWGGHGCTGGWCAACATGGWGT | This study |
|  | TrpBRlg | CAACCAAGCATGTTCAGGACCA | This study |
| HupA rpoC intron | hupAF | DTTAATTAATTGAGTTTTATTCAT | Gomez-Valero et al. 2007 |
|  | rpoC | ACWGGATATGCATATCAYAAARAACG | Gomez-Valero et al. 2007 |
| Sbb-dnaB intron | sbbF | CGAACWTCVGGATCTTGWC | Carletto et al. unp. |
|  | dnaB R | ATCCCATTGTTCATTATCTAACAT | Carletto et al. unp. |

Table S3: We chose to partition the combined dataset according to DNA fragment identity, coding (i.e. TrpB) and non-coding regions, and codon position in the coding region. We compared partitioning strategies using Bayes factors (Kergoat et al. 2007). The Bayes factors (2 ln(Bp)) are shown below the diagonal of the matrix. Critical values of the χ2 distribution (P < 0.001) are given above the diagonal; df is the number of additional parameters required by the more complex of the two strategies being compared.

| Partitioning strategy | Harmonic mean | P1 | P2 | P3 | P4 | P5 |
| P1: Non-partitioned dataset | 13994.34 | - | 36.123 (df = 14) | 55.476 (df = 27) | 73.402 (df = 40) | 90.573 (df = 53) |
| P2: TrpB + (introns) | 13812.27 | 364.14 | - | 34.528 (df = 13) | 54.052 (df = 26) | 72.055 (df = 39) |
| P3: TrpB + intron 1 + intron 2 | 13636.43 | 715.82 | 351.68 | - | 34.528 (df = 13) | 54.052 (df = 26) |
| P4: TrpB codon 1, 2, 3 + intron | 13641.26 | 706.16 | 342.02 | -9.66 | - | 34.528 (df = 13) |
| P5: TrpB codon 1, 2, 3 + intron 1 + intron 2 | 13587.00 | 814.68 | 450.54 | 98.86 | 108.52 | - |

Table S4: Results of cospeciation tests between aphid and Buchnera species trees. Maximum numbers of cospeciation events are given for TreeMap analyses and numbers of significant links are given for ParaFit analyses.

| Brachycaudus tree (no. of species) | Buchnera tree (no. of species) | TreeMap 1 | TreeMap 2.02b | ParaFit |
| Taxonomic species (27) | Phylogenetic species, clustering method (21) | 13 (P < 0.001) | 30 (P < 0.01) | 28 (all) (P < 0.001) |
| Phylogenetic species, clustering method (21) | Phylogenetic species, clustering method (21) | 14 (P < 0.001) | 34 (P < 0.01) | 22 (all) (P < 0.001) |
| Phylogenetic species, Pons et al. method (22) | Phylogenetic species, Pons et al. method (24) | 16 (P < 0.001) | 34 (P < 0.01) | 24 (all) (P < 0.001) |