Table S1 : Sample information - Proceedings of the Royal Society B

Electronic Supplementary Material 1
Material and methods
DNA amplifications
PCR amplifications were conducted in 25 µl reaction mixtures containing 1X enzyme
buffer (Qiagen®; containing 1.5 mM MgCl2), 0.6 unit of Taq
polymerase, 17.5 pmol of each primer, 25 nM of each dNTP and 4 µl of DNA extract.
After an initial denaturation step at 94°C for 3 min, samples were subjected to 35
cycles of 30 s at 94°C, 1 min at an annealing temperature between 50°C and 56°C, depending on the fragment
amplified, and 1 min at 72°C. PCR products were then sent to sequencing services
(Macrogen, South Korea). The amplification primers were also used as sequencing
primers. Sequences were cleaned and assembled using Seqscape v2.5 software
(Applied Biosystems).
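As a quick check on the reaction conditions above, the following sketch converts the primer amount into a molar concentration and sums the programmed hold times of the thermal profile. The function names are illustrative only; ramp times and any final extension step (none is stated above) are ignored.

```python
def pmol_per_ul_to_micromolar(amount_pmol, volume_ul):
    """pmol per µl is numerically equal to µmol per litre (µM)."""
    return amount_pmol / volume_ul


def programmed_time_s(initial_s, cycles, step_durations_s):
    """Initial hold plus the summed hold times of all cycles, in seconds."""
    return initial_s + cycles * sum(step_durations_s)


# 17.5 pmol of each primer in a 25 µl reaction
primer_um = pmol_per_ul_to_micromolar(17.5, 25.0)

# 3 min at 94°C, then 35 cycles of 30 s / 1 min / 1 min
total_s = programmed_time_s(3 * 60, 35, [30, 60, 60])

print(f"{primer_um:.2f} µM of each primer")
print(f"{total_s / 60:.1f} min of programmed hold time")
```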
Obtaining ultrametric trees for the species delimitation method:
Multidivtime procedure
Parameters of the substitution model used by Estbranches (F84 + Γ) were estimated
with the baseml program of the PAML package (Yang 1997) for each locus
separately. The output from baseml was then used for the first step of the
multidistribute package: paml2modelinf was run to convert these outputs into data
usable by Estbranches. Estbranches produces ML estimates of branch lengths
within the optimal tree topology estimated from the combined data and a variance-covariance matrix for each locus. These output files are then employed in
Multidivtime to estimate divergence times. We used the default settings for the Markov
chain Monte Carlo analyses (100000 cycles in which the Markov chain was sampled
10000 times every 100th cycle following burn-in).
Although several aphid fossils have been described (Heie 1967; Heie 2004), none of
them are recent enough to calibrate the Brachycaudus phylogenetic tree. There are
obviously no fossils for Buchnera. As our aim was simply to obtain ultrametric trees
for the species delimitation analysis, we arbitrarily assigned prior ages of 1.0 (SD = 1)
to both lineages (see Hughes et al. 2007 for a similar approach). Following the
recommendation of the manual, rtrate (the mean rate of molecular evolution at the
ingroup root node) was estimated by calculating the median of the branch lengths
from root to ingroup tips.
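The rtrate calculation described above can be sketched as follows. The tree and its branch lengths are hypothetical, and dividing the median root-to-tip length by the prior root age is an assumption made here for illustration; with a prior age of 1.0 it leaves the median unchanged.

```python
from statistics import median

# Hypothetical ingroup tree. Each node is (branch_length_to_parent, children),
# with children set to None for a tip.
tree = (0.0, [
    (0.10, [(0.05, None), (0.07, None)]),
    (0.04, [(0.12, None), (0.02, [(0.09, None), (0.11, None)])]),
])


def root_to_tip_lengths(node, so_far=0.0):
    """Return the summed branch length from the root to every tip."""
    length, children = node
    total = so_far + length
    if children is None:
        return [total]
    paths = []
    for child in children:
        paths.extend(root_to_tip_lengths(child, total))
    return paths


paths = root_to_tip_lengths(tree)
root_prior_age = 1.0  # arbitrary prior age assigned to the ingroup root
rtrate = median(paths) / root_prior_age
print(f"{len(paths)} tips, rtrate = {rtrate:.3f}")
```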
Buchnera phylogenetic reconstruction
The results of the maximum parsimony (MP) analysis were used to determine the most suitable evolutionary
model for the maximum likelihood (ML) analysis and Bayesian inference (BI).
We first performed MP analyses with PAUP* v. 4.0b10 (Swofford 2003), on the
combined DNA dataset. We conducted heuristic searches with the tree bisection–
reconnection branch swapping algorithm, 500 random addition sequences and a
Maxtrees value of 10000. Gaps were treated as missing data. Character congruence
between the three DNA partitions was then tested using the incongruence length
difference test (ILD; Farris et al. 1995), by performing 500 replicate MP searches on
the randomly partitioned dataset with all invariant characters excluded (Cunningham
1997).
For ML reconstructions, the model of nucleotide substitution was selected in
Modeltest v. 3.7 (Posada & Crandall 1998). The MP tree with the highest Ln score
was used to estimate the model parameters (gamma shape, base frequencies and
substitution matrix). An ML heuristic search, using a starting tree obtained by MP, was
then conducted in PhyML (Guindon & Gascuel 2003), using the selected model.
For both MP and ML analyses, node support was assessed with the bootstrap
technique, using 500 replicates.
Bayesian phylogenetic analyses were conducted in MRBAYES v. 3.1.2 (Ronquist &
Huelsenbeck 2003). Different partition schemes were compared to optimize the fit of
evolutionary models to the sequence data (Nylander et al. 2004; see table S3 of the
electronic supplementary material). We used the GTR+I+G model, which was
identified as the best-fit model for all DNA fragments. The parameters of the model
were treated as unknown variables with uniform prior probabilities and were
estimated during the analysis; they were allowed to vary across partitions. Two
replicate analyses were run for three million generations. We ran one cold chain and
three heated chains of the Markov chain Monte Carlo simulation, using a random starting
point and sampling trees every 100 generations. The point of stationarity was
determined as the point at which the distribution of likelihoods reached a plateau and
trees preceding this point (2000–3000 trees, depending on the DNA partition) were
discarded. The remaining trees were used to generate 50 per cent majority rule
consensus trees. Posterior probabilities (pp) were summarized accordingly.
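The burn-in and consensus step described above can be illustrated with a minimal sketch. It is not the MRBAYES implementation: each sampled tree is represented simply as a set of rooted clades, and the toy sample and the function names are assumptions made for the example.

```python
from collections import Counter


def clade(*tips):
    """A clade is identified by the set of tips it contains."""
    return frozenset(tips)


def majority_rule(sampled_trees, burnin, threshold=0.5):
    """Discard the burn-in trees, then return the clades found in more than
    `threshold` of the remaining trees, with their frequency (pp)."""
    kept = sampled_trees[burnin:]
    counts = Counter(c for tree in kept for c in tree)
    support = {c: n / len(kept) for c, n in counts.items()}
    return {c: pp for c, pp in support.items() if pp > threshold}


# Toy sample of six trees over tips A-D; the first two are treated as burn-in.
t1 = {clade("A", "B"), clade("C", "D")}
t2 = {clade("A", "C"), clade("B", "D")}
sample = [t2, t2, t1, t1, t1, t2]

consensus = majority_rule(sample, burnin=2)
for c, pp in sorted(consensus.items(), key=lambda item: sorted(item[0])):
    print(sorted(c), pp)
```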
(i) Reconciliation analyses (Page 1994)
This topology-based method, implemented in TREEMAP v. 1 and TREEMAP v. 2.02b,
aims to identify optimal reconstructions of the history of a host–parasite association
by mapping the parasite tree onto the host tree and maximizing cospeciation events.
Heuristic searches are generally used to find optimal solutions in TREEMAP v. 1,
whereas TREEMAP v. 2.02 uses the Jungle algorithm (Charleston 1998). This
algorithm explores all possible mappings of one tree onto another, assigning different
costs to diversification events (cospeciation, host switching, lineage sorting and
duplication) and finds optimal (i.e. yielding minimal costs) solutions. We used the
default cost settings for analyses. The probability of obtaining the observed number
of cospeciation events is then estimated by randomizing the parasite trees and
generating a null distribution of the number of cospeciation events.
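The randomization logic can be sketched as below. This is not the TREEMAP algorithm: the congruence score is a crude proxy for the number of cospeciation events (parasite clades whose hosts form a host clade), the parasite tree is randomized by shuffling its tip labels rather than by generating random trees, and all trees and names are hypothetical.

```python
import random


def shared_clades(host_clades, parasite_clades, host_of):
    """Crude congruence score: parasite clades whose hosts form a host clade."""
    mapped = {frozenset(host_of[p] for p in c) for c in parasite_clades}
    return len(mapped & host_clades)


def randomization_p_value(observed, null_values):
    """Share of null values at least as large as the observed value,
    counting the observed value itself in the reference set."""
    hits = sum(1 for v in null_values if v >= observed)
    return (hits + 1) / (len(null_values) + 1)


host_clades = {frozenset("AB"), frozenset("CD"), frozenset("ABCD"), frozenset("EF")}
parasite_clades = {frozenset("ab"), frozenset("cd"), frozenset("abcd"), frozenset("ef")}
tips = list("abcdef")
host_of = {p: p.upper() for p in tips}  # parasite a lives on host A, and so on

observed = shared_clades(host_clades, parasite_clades, host_of)

rng = random.Random(1)
null = []
for _ in range(999):
    shuffled = tips[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(tips, shuffled))  # randomize the parasite tip labels
    random_clades = {frozenset(relabel[p] for p in c) for c in parasite_clades}
    null.append(shared_clades(host_clades, random_clades, host_of))

p_value = randomization_p_value(observed, null)
print(observed, p_value)
```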
(ii) ParaFit (Legendre et al. 2002)
This distance-based method tests the null hypothesis that the diversification of hosts
and parasites has been independent, using distance matrices rather than tree
topologies. The null hypothesis is tested by permuting a host–parasite association
matrix. Each individual host–parasite association can also be tested. ParaFit tests
were carried out with ML trees, using Copycat (Meier-Kolthoff et al. 2007). Tests of
random association were performed with 9999 permutations.
(iii) Likelihood ratio tests
This method tests the null hypothesis that the likelihoods of host and symbiont
datasets do not differ significantly under the same model (including tree topology). If
the null hypothesis is rejected, it is assumed that diversification events, such as host
switching in the symbiont, caused the observed incongruence.
We first used the Shimodaira–Hasegawa (SH) test, as described by Peek et al. (1998)
and Clark et al. (2000), to compare the likelihood score of the best ML topology for
the Buchnera combined dataset with the score for the best ML topology obtained
with the Brachycaudus dataset. Similarly, the score of the best host tree was
compared with that of the alternative Buchnera tree based on the aphid dataset. The
trees and datasets compared excluded specimens for which sequences were not
obtained, for all aphids and all Buchnera DNA fragments, and a single outgroup
sequence was kept. The SH tests were conducted in PAUP* v. 4.0b10 with
resampling estimated log-likelihood optimization and 10000 bootstrap replicates. We
optimized the model parameters for each dataset constrained to each alternative tree.
We then used the LRT proposed by Huelsenbeck & Bull (1996) to test for
heterogeneity of trees obtained with different data partitions, to assess the conflict
between Buchnera loci and the combined Brachycaudus dataset (see Huelsenbeck et
al. 1997; Clark et al. 2000; Hughes et al. 2007 for applications of the LRT to
cospeciation studies). Again, the best topology for a given dataset was compared with
the alternative topologies obtained with other datasets. We used the SH test to
conduct pairwise comparisons between scores for alternative Buchnera topologies
obtained with alternative loci and the combined Brachycaudus dataset. We then
calculated the statistic Δ = 2(ln L1 − ln L0), which measures the likelihood
difference between each dataset being allowed to have a different topology and all
datasets being constrained to have the same topology. Under the null hypothesis of a
common topology underlying all datasets, the topology chosen to establish L0 is that
with the highest summed likelihood across datasets. As the tested hypotheses were
not nested, the significance of Δ was assessed by generating a distribution of Δ under
the null hypothesis that datasets have the same topology. Likelihood parameters and
branch lengths for each Buchnera locus and the Brachycaudus combined dataset
were optimized under the assumption of shared topology (that with the highest
summed likelihood across datasets). One hundred sequence datasets were simulated
using SEQGEN v. 1.3.2 (Rambaut & Grassly 1997) with the graphical interface SG
Runner v. 2.0 (T. P. Wilcox,
http://homepage.mac.com/tpwilcox/SGRUNNER/FileSharing15.htm) for each
Buchnera locus and the aphid combined dataset with these new parameter estimates,
the length and nucleotide composition of the original dataset and the constrained
topology and branch lengths. The statistic Δ was calculated for each of the 100
simulated datasets. We also examined the contribution of individual Buchnera loci to
the heterogeneity of the observed dataset, by excluding individual genes from the
calculation of Δ.
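The calculation of Δ and of its significance can be sketched as follows. The log-likelihoods and the simulated Δ values are hypothetical placeholders; in the procedure described above they would come from the ML optimizations and from the 100 simulated datasets. Taking the P value as the proportion of simulated Δ values at least as large as the observed one is an assumption of this sketch.

```python
def lrt_delta(lnl):
    """lnl[dataset][topology] is the log-likelihood of a dataset on a topology.
    ln L1: every dataset is scored on its own best topology.
    ln L0: all datasets share the topology with the highest summed likelihood."""
    topologies = list(next(iter(lnl.values())))
    ln_l1 = sum(max(scores.values()) for scores in lnl.values())
    ln_l0 = max(sum(scores[t] for scores in lnl.values()) for t in topologies)
    return 2.0 * (ln_l1 - ln_l0)


def simulation_p_value(observed, simulated):
    """Proportion of simulated statistics at least as large as the observed one."""
    return sum(1 for d in simulated if d >= observed) / len(simulated)


# Hypothetical log-likelihoods for two loci and the host dataset on two topologies.
lnl = {
    "locus_a": {"T1": -1000.0, "T2": -1012.0},
    "locus_b": {"T1": -2010.0, "T2": -2001.0},
    "host": {"T1": -3000.0, "T2": -3004.0},
}
delta = lrt_delta(lnl)

# Hypothetical Δ values from datasets simulated under a shared topology.
simulated = [2.0, 5.5, 1.0, 19.0, 3.2]
p_value = simulation_p_value(delta, simulated)
print(delta, p_value)
```

Dropping one locus from `lnl` and recomputing Δ mirrors the exclusion of individual genes described above.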
Reconciliation analyses and ParaFit analyses were conducted on both specimen-based phylogenies (including 56 samples) and the different species-based
phylogenies obtained with species delineation methods. The LRT method was used
only for specimen-based phylogenies. The main advantage of this method is that it
makes it possible to detect heterogeneity between data partitions and this property
should not be affected by phylogenies including fewer sequences.
References
Charleston, M. A. 1998 Jungles: a new solution to the host/parasite phylogeny
reconciliation problem. Math. Biosci. 149, 191–223 (doi:10.1016/S00255564(97))
Cunningham, C. W. 1997 Can three incongruence tests predict when data should be
combined? Mol. Biol. Evol. 14, 733–740.
Farris, J. S., Källersjö, M., Kluge, A. G. & Bult, C. 1995 Constructing a
significance test for incongruence. Syst. Biol. 44, 570–572.
(doi:10.2307/2413663)
Gomez-Valero, L., Silva, F. J., Simon, J. C. & Latorre, A. 2007 Genome reduction of
the aphid endosymbiont Buchnera aphidicola in a recent evolutionary time
scale. Gene 389, 87-95.
Guindon, S. & Gascuel, O. 2003 A simple, fast, and accurate algorithm to estimate
large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.
(doi:10.1080/10635150390235520)
Heie, O. E. 1967 Studies on fossil aphids (Homoptera: Aphidoidea). Spolia Zool.
Musei Hauniensis 26, 1-273.
Heie, O. E. 2004 The history of the studies on aphid palaeontology and their bearing
on the evolutionary history of aphids. In Aphids in a new millennium (ed. J.-C.
Simon, C.-A. Dedryver, C. Rispe & M. Hullé), pp. 151–158. Paris: INRA
Editions.
Huelsenbeck, J. P. & Bull, J. J. 1996 A likelihood ratio test to detect conflicting
phylogenetic signals. Syst. Biol. 45, 92–98. (doi:10.2307/2413514)
Huelsenbeck, J. P., Rannala, B. & Yang, Z. 1997 Statistical tests of host–parasite
cospeciation. Evolution 51, 410–419. (doi:10.2307/2411113)
Hughes, J., Kennedy, M., Johnson, K. P., Palma, R. L. & Page, R. D. M. 2007
Multiple Cophylogenetic Analyses Reveal Frequent Cospeciation between
Pelecaniform Birds and Pectinopygus Lice. Syst. Biol. 56, 232-251.
Johnson, K. P. & Clayton, D. H. 2004 Untangling coevolutionary history. Syst. Biol.
53, 92–94. (doi:10.1080/10635150490264824)
Kergoat, G. J., Silvain, J. F., Delobel, A., Tuda, M. & Anton, K. W. 2007 Defining
the limits of taxonomic conservatism in host-plant use for phytophagous
insects: Molecular systematics and evolution of host-plant associations in the
seed-beetle genus Bruchus Linnaeus (Coleoptera : Chrysomelidae : Bruchinae).
Mol. Phyl. Evol. 43, 251-269.
Meier-Kolthoff, J. P., Auch, A. F., Huson, D. H. & Göker, M. 2007 Copycat:
cophylogenetic analysis tool. Bioinformatics 23, 898–900.
(doi:10.1093/bioinformatics/btm027)
Nylander, J. A. A., Ronquist, F., Huelsenbeck, J. P. & Nieves-Aldrey, J.-L. 2004
Bayesian phylogenetic analysis of combined data. Syst. Biol. 53, 47–67.
(doi:10.1080/10635150490264699)
Page, R. D. M. Tangled trees: phylogeny, cospeciation and coevolution. Chicago, IL:
University of Chicago Press.
Peek, A. S., Feldman, R. A., Lutz, R. A. & Vrijenhoek, R. C. 1998 Cospeciation of
chemoautotrophic bacteria and deep sea clams. Proc. Natl Acad. Sci. USA 95,
9962–9966. (doi:10.1073/pnas.95.17.9962)
Posada, D. & Crandall, K. A. 1998 MODELTEST: testing the model of DNA
substitution. Bioinformatics. 14, 817–818.
(doi:10.1093/bioinformatics/14.9.817)
Rambaut, A. & Grassly, N. C. 1997 Seq-Gen: an application for the Monte Carlo
simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl.
Biosci. 13, 235–238.
Ronquist, F. & Huelsenbeck, J. P. 2003 MRBAYES 3: Bayesian phylogenetic inference
under mixed models. Bioinformatics. 19, 1572–1574.
(doi:10.1093/bioinformatics/btg180)
Swofford, D. L. 2003 PAUP*. Phylogenetic analysis using parsimony (*and Other
Yang, Z. 1997 PAML: a program package for phylogenetic analysis by maximum
likelihood. CABIOS 13, 555–556.
Table S1: Sample information
| Species | Voucher | Collectors | Collection site | Host plant |
| B. aconiti (Mordvilko, 1928) | 1790 | Coeur.& Jous. | France, Ariège (09), Mijanes, Col de Pailhères | Aconitum sp. |
| B. amygdalinus (Schouteden, 1905) | 1688 | Coeur.& Jous. | France, Var (83), Fayence | Prunus dulcis |
| | 1694 | Coeur.& Jous. | France, Bouches-du-Rhône (13), St-Martin-de-Crau | Prunus dulcis |
| | 1710 | Coeur.& Jous. | France, Gers (32), Saint-Clar | Prunus dulcis |
| B. ballotae (Passerini, 1860) | s338 | G. Cocuzza | Germany, Berlin | Ballota nigra |
| B. bicolor (Nevsky, 1929) | 1458 | Coeur d'Acier | Greece, Lakonia, Lagada | Boraginaceae |
| B. cardui | 1709 | Coeur.& Jous. | France, Haute-Garonne (31), Grenade | Asteraceae sp. |
| | 1746 | Coeur.& Jous. | France, Haut-Rhin (68), Colmar | Prunus domestica |
| | 1765 | Coeur.& Jous. | France, Gard (30), Le Vigan | Arctium sp. |
| B. cerinthis | 1772 | Coeur.& Jous. | France, Hautes-Alpes (05), Villar-d'Arene | Cerinthe glabra |
| B. divaricatae (Shaposhnikov, 1956) | s242 | G. Cocuzza | Lithuania, Vilnius, Bratoskies | Prunus divaricata |
| B. helichrysi (Kaltenbach, 1843) | 1608 | Coeur d'Acier | Greece, Ahaia, Kalavrita | Achillea sp. |
| | 1600 | Coeur.& Jous. | Greece | |
| | 1716 | Coeur.& Jous. | France, Tarn-et-Garonne (82), Gramont | Prunus domestica |
| | 1809 | Coeur.& Jous. | Australia, Western Australia, Denison | Helianthus annuus |
| | 1681 | Coeur.& Jous. | France | |
| | 1749 | Coeur.& Jous. | France | |
| B. jacobi Stroyan, 1957 | s145 | G. Cocuzza | Italy, Sicily, Itala | Myosotis sylvatica |
| B. klugkisti (Börner, 1942) | 1290 | Coeur d'Acier | France, Creuse (23), Peyrat-la-Noniere | Silene sp. |
| | 1747 | Coeur.& Jous. | France, Haut-Rhin (68), Ste-Marie-Aux-Mines | Silene dioica |
| | 2063 | Jousselin | France, Haute-Savoie (74), La Roche sur Foron | Silene dioica |
| | 2064 | Coeur.& Jous. | France, Pyrénées-Orientales | Silene dioica |
| B. lamii (Koch, 1854) | s328 | G. Cocuzza | Italy, Sicily, Montalbano Elicona | Lamium flexuosum |
| B. lateralis (Walker, 1848) | 1027 | Coeur d'Acier | France, Finistère (29), Cleden-Cap-Sizun | Senecio jacobaea |
| | 1741 | Coeur.& Jous. | France, Drôme (26), St-Marcel-les-Valence | Senecio sp. |
| | 1751 | Coeur.& Jous. | France, Haut-Rhin (68), Colmar | Arctium sp. |
| | s117 | G. Cocuzza | Italy, Sicily, Salina | Chrysanthemum coronarium |
| | 1794 | Coeur.& Jous. | France, Lozère (48), La Bastide-Puylaurent | Senecio sp. |
| B. linariae (Stroyan, 1950) | 1938 | Coeur.& Jous. | France, Hérault (34), Prades le lez, CBGP | Linaria repens |
| | 2047 | Coeur.& Jous. | Italy, Sicily, Zafferana | Linaria purpurea |
| B. lucifugus (Müller, 1955) | s249 | G. Cocuzza | Italy, Trentino Alto Adige, Ala | Plantago lanceolata |
| B. lychnicola (Hille Ris Lambers, 1966) | s317 | G. Cocuzza | Czech Republic, South Bohemia, Lužanská Udolí | Silene flos-cuculi |
| B. lychnidis (Linnaeus, 1758) | 1324 | Coeur d'Acier | France, Morbihan (56), Saint-Pierre-Quiberon | Silene sp. |
| | 1698 | Coeur.& Jous. | France, Hérault (34), St-Guilhem-le-Desert | Silene latifolia |
| | 1752 | Coeur.& Jous. | France, Haut-Rhin (68), Colmar | Silene latifolia |
| | 1762 | Coeur.& Jous. | France, Gard (30), Le Vigan, Col de Faubel | Silene dioica |
| B. malvae (Shaposhnikov, 1964) | s125 | G. Cocuzza | Italy, Lazio, Roma | Malva sylvestris |
| B. mordvilkoi (Hille Ris Lambers, 1931) | s248 | G. Cocuzza | Italy, Trentino Alto Adige, Ala | Echium vulgare |
| B. napelli (Schrank, 1801) | s316 | G. Cocuzza | Czech Republic, South Bohemia, Lužanská Udolí | Aconitum callybotrium |
| B. persicae (Passerini, 1860) | 1077 | Coeur d'Acier | France, Aude (11), Quillan, La Forge | Prunus spinosa |
| | 1696 | Coeur.& Jous. | France | Prunus sp. |
| | 1736 | Coeur.& Jous. | France, Drôme (26), St-Marcel-les-Valence | Prunus sp. |
| B. populi (del Guercio, 1911) | 1483 | Coeur d'Acier | Greece, Lakonia, Mystra | Silene vulgaris |
| | 1760 | Coeur.& Jous. | France, Gard (30), Le Vigan, Col de Faubel | Silene vulgaris |
| B. prunicola (Kaltenbach, 1843) | 1267 | Coeur d'Acier | France, Creuse (23), Vallieres, La Prades | Prunus sp. |
| B. rumexicolens (Patch, 1917) | 1764 | Coeur.& Jous. | France, Gard (30), Le Vigan, Col de Faubel | Rumex acetosella |
| | 1982 | Coeur.& Jous. | Italy, Sicily, Linguaglossa | Rumex acetosella |
| B. salicinae (Börner, 1939) | s307 | G. Cocuzza | Czech Republic, South Bohemia, Český Krumlov | Inula salicina |
| B. schwartzi (Börner, 1931) | 1717 | Coeur.& Jous. | France, Tarn-et-Garonne (82), Gramont, Hameau de Géran | Prunus persica |
| | 1730 | Jousselin | France, Centre, Loiret (45), Germigny-Des-Pres | Prunus persica |
| | 1738 | Coeur.& Jous. | France, Drôme (26), St-Marcel-les-Valence | Prunus persica |
| B. spiraeae (Börner, 1932) | 1775 | Coeur.& Jous. | France, Hautes-Alpes (05), La Grave | Spiraea sp. |
| | 2143 | Coeur.& Jous. | Scotland, Kinlochewe | Spiraea salicifoliae |
| B. tragopogonis (Kaltenbach, 1843) | 1378 | Coeur d'Acier | Greece, Korinthia, Némea | Tragopogon sp. |
| | 1715 | Coeur.& Jous. | France, Tarn-et-Garonne (82), Gramont, Hameau de Géran | Tragopogon sp. |
| | 1773 | Coeur.& Jous. | France, Hautes-Alpes (05), Villar-d’Arene | Tragopogon sp. |
| Outgroups | | | | |
| Myzus persicae (Sulzer, 1776) | 1948 | Coeur.& Jous. | France | |
| Myzus persicae (Sulzer, 1776) | 1956 | Coeur.& Jous. | France | |
Table S2: Names, sequences and references of the primers used for Buchnera PCR and sequencing.
| DNA fragment | Name of primer | Sequence of primer | References |
| TrpB | TrpBF | ACWGGHGCTGGWCAACATGGWGT | This study |
| | TrpBRlg | CAACCAAGCATGTTCAGGACCA | This study |
| HupA rpoC intron | hupAF | DTTAATTAATTGAGTTTTATTCAT | Gomez-Valero et al. 2007 |
| | rpoC | ACWGGATATGCATATCAYAAARAACG | Gomez-Valero et al. 2007 |
| Sbb-dnaB intron | sbbF | CGAACWTCVGGATCTTGWC | Carletto et al. unp. |
| | dnaB R | ATCCCATTGTTCATTATCTAACAT | Carletto et al. unp. |
Table S3: We chose to partition the combined dataset according to DNA fragment identity, coding (i.e. TrpB) versus non-coding regions, and
codon position within the coding region. We compared partitioning strategies using Bayes factors (Kergoat et al. 2007). The Bayes factors (2 ln (Bp))
are shown on the left side of the matrix. Critical values of the χ2 distribution (P < 0.001) are given on the right side of the matrix; ddl (degrees of freedom) refers to
the number of additional parameters required by the more complex of the two strategies being compared.
| Partitioning strategy | Harmonic mean | P1 | P2 | P3 | P4 | P5 |
| P1. Non-partitioned dataset | 13994.34 | - | 36.123 (ddl=14) | 55.476 (ddl=27) | 73.402 (ddl=40) | 90.573 (ddl=53) |
| P2. TrpB + (introns) | 13812.27 | 364.14 | - | 34.528 (ddl=13) | 54.052 (ddl=26) | 72.055 (ddl=39) |
| P3. TrpB + intron 1 + intron 2 | 13636.43 | 715.82 | 351.68 | - | 34.528 (ddl=13) | 54.052 (ddl=26) |
| P4. TrpB codon 1, 2, 3 + intron | 13641.26 | 706.16 | 342.02 | -9.66 | - | 34.528 (ddl=13) |
| P5. TrpB codon 1, 2, 3 + intron 1 + intron 2 | 13587.00 | 814.68 | 450.54 | 98.86 | 108.52 | - |
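The Bayes factors in Table S3 can be reproduced from the harmonic means with the short sketch below. It assumes that the tabulated harmonic means are log-likelihood magnitudes with the sign dropped, so that 2 ln (Bp) equals twice the difference between the simpler and the more complex strategy; this assumption is consistent with every value in the matrix. The critical values are copied from the table rather than computed, and the function name is illustrative.

```python
# Harmonic means from Table S3 (assumed to be negative log-likelihood
# magnitudes, so a smaller value means a better fit).
harmonic_mean = {
    "P1": 13994.34, "P2": 13812.27, "P3": 13636.43,
    "P4": 13641.26, "P5": 13587.00,
}

# Critical chi-square values (P < 0.001) copied from Table S3, keyed by
# (simpler strategy, more complex strategy).
critical = {
    ("P1", "P2"): 36.123, ("P1", "P3"): 55.476, ("P1", "P4"): 73.402,
    ("P1", "P5"): 90.573, ("P2", "P3"): 34.528, ("P2", "P4"): 54.052,
    ("P2", "P5"): 72.055, ("P3", "P4"): 34.528, ("P3", "P5"): 54.052,
    ("P4", "P5"): 34.528,
}


def two_ln_bayes_factor(simpler, richer):
    """2 ln (Bp) in favour of the more complex strategy."""
    return 2.0 * (harmonic_mean[simpler] - harmonic_mean[richer])


for (simpler, richer), threshold in critical.items():
    bf = two_ln_bayes_factor(simpler, richer)
    verdict = "favoured" if bf > threshold else "not favoured"
    print(f"{richer} over {simpler}: 2 ln Bp = {bf:.2f} "
          f"(critical value {threshold}), {verdict}")
```

Under these assumptions P5 exceeds the critical value against every other strategy, whereas P4 is not favoured over P3 (2 ln (Bp) = -9.66).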
Table S4: Results of cospeciation tests between aphid and Buchnera species trees. Maximum numbers of cospeciation events are given for
TreeMap analyses and numbers of significant links are given for ParaFit analyses.
| Brachycaudus tree (number of species) | Buchnera tree (number of species) | TreeMap 1 | TreeMap 2.02b | ParaFit |
| taxonomic species (27) | phylogenetic species, clustering method (21) | 13, P < 0.001 | 30, P < 0.01 | 28 (all), P < 0.001 |
| phylogenetic species, clustering method (21) | phylogenetic species, clustering method (21) | 14, P < 0.001 | 34, P < 0.01 | 22 (all), P < 0.001 |
| phylogenetic species, Pons et al. method (22) | phylogenetic species, Pons et al. method (24) | 16, P < 0.001 | 34, P < 0.01 | 24 (all), P < 0.001 |