Gene Trees and Species Trees - Southeastern Louisiana University

Gene Trees and Species Trees:
Lessons from morning glories
Lauren A. Eserman & Richard E. Miller
Department of Biological Sciences
Southeastern Louisiana University
Introduction
• DNA sequences are an important source of data for phylogenetic reconstruction
• Single-gene trees were considered exciting and sufficient at one time
• Chase and 41 other authors, 1993
• Phylogeny of angiosperms using rbcL
Introduction
• Next, additional gene regions were sequenced
– Philosophical argument for “total evidence”: more data will strengthen the ability to determine species relationships
– Used concatenated datasets to implement this idea
– Still dominates the way species trees are estimated
Introduction
• Population genetics and coalescent theory emphasize that genes have unique histories
– Gene trees do not always reflect the true species history
Introduction
• Gene tree heterogeneity can come about by:
– Gene duplication events
– Horizontal gene transfer
– Incomplete lineage sorting (deep coalescence)
– Branch length heterogeneity
Edwards, 2009
• This provides evidence against concatenation
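To make the incomplete lineage sorting point concrete, here is a minimal sketch of the standard three-taxon coalescent result, which is not taken from these slides: with one gene copy sampled per species, constant population size, and no gene flow, the probability that a gene tree disagrees with the species tree is (2/3)·exp(−T), where T is the length of the internal species-tree branch in coalescent units. The function name and the example branch lengths are illustrative assumptions.

```python
import math


def discordance_probability(internal_branch: float) -> float:
    """Probability that a three-taxon gene tree differs from the species tree.

    Assumes incomplete lineage sorting is the only source of conflict and
    that `internal_branch` is measured in coalescent units.
    """
    if internal_branch < 0:
        raise ValueError("branch length must be non-negative")
    # The two lineages fail to coalesce on the internal branch with
    # probability exp(-T); two of the three possible resolutions that
    # follow are discordant with the species tree.
    return (2.0 / 3.0) * math.exp(-internal_branch)


# Hypothetical branch lengths, chosen only to show the trend.
for t in (0.1, 1.0, 5.0):
    print(f"T = {t}: P(discordant) = {discordance_probability(t):.4f}")
```

Short internal branches give discordance probabilities approaching 2/3, which is why genes sampled from a rapid radiation can each tell a different story.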
Introduction
• Paradigm shift in systematics? (Edwards, 2009)
– Moving away from notion that gene trees show true species relationships
– Promotes synergism of phylogenetic systematics with population genetics and coalescent theory
Introduction
• Application of the paradigm shift:
– Use collective information from multiple gene trees to estimate a species tree
– Consider conflicting results as valid alternative hypotheses for species relationships
Research Objectives
1. Explore how gene trees with different phylogenetic signal influence the estimated species tree
– Using 28 gene trees
– Effects of concatenation on estimated species tree
2. Alternative objective is to obtain an understanding of species relationships for the organisms of interest (not discussed here)
Study Organisms
• Morning glories are generally species of the genus Ipomoea (not monophyletic)
• Focus on tribe Ipomoeeae
– Ipomoea + 9 other genera
– c. 900 species
– Distributed throughout the subtropics and tropics of the world
Ipomoea nil
Methods
1. Bayesian phylogenetic analysis of 28 gene trees
• Obtained 28 gene regions for species of Ipomoeeae based on our research and additional genes from GenBank
• Number of taxa ranged from 6 to 129
• Alignments made using MAFFT and manually adjusted
• Models of nucleotide substitution chosen using jModelTest
• Gene trees constructed using MrBayes v3.1.2
– 4 runs, 4 chains, sampling every 100-200 generations
– Runs were continued until a stationary distribution was reached
– Burn-in determined as the asymptote in plots of total tree length against generations
– Convergence criteria:
• Same topology among 4 runs
• PP of clade support ±3% among 4 runs
– Majority-rule consensus tree constructed from a combination of post-burn-in trees from all 4 runs
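The two convergence criteria above can be sketched as a simple check. This is an illustration under stated assumptions, not the authors' procedure: each run is summarized as a mapping from clade (a set of taxon names) to posterior probability, "same topology" is read as "the same set of clades with PP above 0.5", and "±3%" is read as a spread of at most 0.03 between the highest and lowest PP for a clade. The function name, taxon labels, and PP values are hypothetical.

```python
from typing import Dict, FrozenSet, List

Clade = FrozenSet[str]


def runs_converged(
    runs: List[Dict[Clade, float]],
    pp_tolerance: float = 0.03,
    consensus_threshold: float = 0.5,
) -> bool:
    """Check that independent runs agree on topology and clade support."""
    # Clades that would appear in each run's majority-rule consensus tree.
    consensus_sets = [
        {clade for clade, pp in run.items() if pp > consensus_threshold}
        for run in runs
    ]
    # Criterion 1: every run recovers the same consensus clades.
    if any(clades != consensus_sets[0] for clades in consensus_sets[1:]):
        return False
    # Criterion 2: support for each clade varies by no more than the tolerance.
    # The small epsilon guards against floating-point noise at the boundary.
    for clade in consensus_sets[0]:
        values = [run[clade] for run in runs]
        if max(values) - min(values) > pp_tolerance + 1e-9:
            return False
    return True


# Hypothetical clades and posterior probabilities for four runs.
ab = frozenset({"sp_A", "sp_B"})
abc = frozenset({"sp_A", "sp_B", "sp_C"})
runs = [
    {ab: 1.00, abc: 0.97},
    {ab: 0.99, abc: 0.98},
    {ab: 1.00, abc: 0.96},
    {ab: 0.99, abc: 0.97},
]
print(runs_converged(runs))
```

In practice the clade-to-PP summaries would be parsed from each run's output; that parsing step is left out here.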
Methods
1. Synthesis of 28 gene trees
• ITS tree used as working hypothesis
– Densest taxon sampling (129 species)
– Good intrageneric resolution
– NOTE: Not assuming this is the species tree; rather, a working hypothesis to compare to other gene trees
• Topology and clade support of 27 other gene trees compared to ITS gene tree
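Because the gene trees differ in taxon sampling (6 to 129 taxa), any comparison against the ITS tree has to be restricted to the taxa both trees share. The following sketch shows one way to do that, assuming rooted trees represented as sets of clades; it is not the comparison procedure used in the study, and all function names and taxon labels are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Clade = FrozenSet[str]


def prune(clades: Iterable[Clade], keep: FrozenSet[str]) -> Set[Clade]:
    """Restrict clades to the taxa in `keep`, dropping uninformative ones."""
    pruned = set()
    for clade in clades:
        reduced = clade & keep
        # Clades with fewer than two taxa, or with all taxa, carry no signal.
        if 2 <= len(reduced) < len(keep):
            pruned.add(frozenset(reduced))
    return pruned


def compatible(a: Clade, b: Clade) -> bool:
    """Two rooted clades can coexist in one tree if nested or disjoint."""
    return a.isdisjoint(b) or a <= b or b <= a


def compare_to_reference(
    gene_clades: Iterable[Clade],
    gene_taxa: Iterable[str],
    ref_clades: Iterable[Clade],
    ref_taxa: Iterable[str],
) -> Tuple[Set[Clade], Set[Clade]]:
    """Return (clades shared with the reference, clades conflicting with it)."""
    shared = frozenset(gene_taxa) & frozenset(ref_taxa)
    gene = prune(gene_clades, shared)
    ref = prune(ref_clades, shared)
    agree = gene & ref
    conflict = {c for c in gene if any(not compatible(c, r) for r in ref)}
    return agree, conflict


# Hypothetical reference tree (five taxa) and gene tree (four taxa).
ref_taxa = {"sp_A", "sp_B", "sp_C", "sp_D", "sp_E"}
ref_clades = [
    frozenset({"sp_A", "sp_B"}),
    frozenset({"sp_A", "sp_B", "sp_C"}),
    frozenset({"sp_D", "sp_E"}),
]
gene_taxa = {"sp_A", "sp_B", "sp_C", "sp_D"}
gene_clades = [frozenset({"sp_A", "sp_B"}), frozenset({"sp_C", "sp_D"})]

agree, conflict = compare_to_reference(gene_clades, gene_taxa, ref_clades, ref_taxa)
print("agree:", [sorted(c) for c in agree])
print("conflict:", [sorted(c) for c in conflict])
```

Clades that are neither shared nor conflicting are simply unresolved in the reference, which is a different outcome from an alternative topology.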
Results
1. Same relationships between ITS and other genes
[Figure: majority-rule consensus trees for DFRB-2 and myb1 with clade support values; labeled clades include PHAR and MINA]
Results
2. Individual species with unique positions not shown in any other gene tree
[Figure: majority-rule consensus trees for DFRB-2 and bHLH3 with clade support values; labeled clades include PHAR and MINA]
Results
3. Major alternative topology in CHSE
[Figure: majority-rule consensus tree for CHSE with clade support values; labeled clades include PHAR and MINA]
Results
4. Identify new unnamed clades
[Figure: majority-rule consensus trees for bHLH2 and waxy 1 with clade support values; labeled clades include ‘OBSC’, MINA, PHAR, TRIC, CALO, ‘AMNI’, BATA, and ‘VIOL’]
Methods
2. Concatenated dataset
• To address the issue of concatenation, a concatenated dataset was constructed using 10 genes
• All gene trees showed similar topologies
[Figure: majority-rule consensus trees for DFRB-1, UF3GT, and CHI with clade support values]
Results
10-gene concatenated dataset
• Maintains topologies of individual gene trees
[Figure: majority-rule consensus tree from the 10-gene concatenated dataset with clade support values]
Concatenated dataset
• What happens when one more gene is added?
• Add CHSE to 10-gene concatenated dataset
– Alternative topology
– All coding region
– No indels
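For readers unfamiliar with the mechanics, concatenation joins the per-gene alignments end to end so that a single tree is estimated from all sites at once, which is why one gene with strong signal can sway the result. The sketch below illustrates the joining step only; it assumes each alignment is a mapping from taxon name to an aligned sequence, and that taxa missing from a gene are padded with `?`. The function name, gene names, and sequences are hypothetical and are not the study's data.

```python
from typing import Dict, List, Tuple


def concatenate_alignments(
    alignments: Dict[str, Dict[str, str]],
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one supermatrix.

    Returns the concatenated sequences and the 1-based, inclusive column
    range that each gene occupies.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    supermatrix = {taxon: "" for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} are not the same length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa not sampled for this gene are filled with missing data.
            supermatrix[taxon] += aln.get(taxon, "?" * length)
        partitions.append((gene, start, start + length - 1))
        start += length
    return supermatrix, partitions


# Hypothetical two-gene example with uneven taxon sampling.
example = {
    "gene1": {"sp_A": "ACGT", "sp_B": "ACGA"},
    "gene2": {"sp_A": "GG", "sp_C": "GT"},
}
matrix, parts = concatenate_alignments(example)
for taxon, seq in matrix.items():
    print(taxon, seq)
print(parts)
```

Keeping the partition boundaries makes it possible to assign each gene its own substitution model, or to drop one gene and rerun the analysis.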
Results
11-gene concatenated dataset
• Exhibits topology of CHSE: the new gene overwhelms this analysis
[Figure: majority-rule consensus tree from the 11-gene concatenated dataset with clade support values]
[Figure: side-by-side majority-rule consensus trees from the 10-gene and 11-gene concatenated datasets]
BEST Analysis
Bayesian Estimation of Species Trees
(Liu, 2008)
• Incorporates a multispecies coalescent model to estimate species tree from many gene trees
• Methods:
– 11-gene concatenated dataset
– 2 runs, 4 chains
– 8 million generations (did not reach convergence on topology)
BEST Analysis
• Results:
– Clade present in CHSE appears in BEST tree
– Overall topology differs
– Species pairs supported throughout
[Figure: majority-rule consensus tree from the BEST analysis with clade support values]
Discussion
• Analysis of 28 gene trees
– Provides an estimate of species tree
– Alternative hypotheses for species relationships
have emerged
– Overall congruence among gene trees
Discussion – Concatenated datasets
• Total evidence is philosophically justified but can give misleading results because of gene tree heterogeneity
– Shown clearly in 11-gene concatenated dataset
• Left with idea that we have two alternative hypotheses of species relationships
– Two estimates of the species tree
Discussion – Concatenated datasets
• Can now appreciate how a single gene can overwhelm results of a concatenated dataset
– Topology of CHSE dominated
[Figure: majority-rule consensus trees from the 10-gene and 11-gene concatenated datasets, shown side by side]
Acknowledgements
Research Assistants: A. McDaniel, K. Robichaux, W. Terry, S. Major, H. Echlin, F. St. Cyr
Seed Donations: M. Clegg, M. Rausher, J. A. McDonald, J. Miller, P. Tiffin, B. Zufall, S.M. Chang
Ipomoea purpurea