The impact of gene-tree/species-tree discordance on estimates of

Gene-tree/species-tree discordance and diversification
Sara Ruane
CUNY (CSI, GC)
R. Alex Pyron
The GW Univ.
Frank Burbrink
CUNY (CSI, GC)
Understanding patterns of diversification
• Ecological Opportunity
• Key Innovations
• Competition
• Extinction
• Ecological Limits
History of the study
• Paleontological Research
• Understand patterns of speciation and
extinction through time
• Simpson (1944, 1953)
• Sloss (1950)
• Sepkoski (1979)
• Stanley (1979)
• Raup (1985)
• Foote (1993)
Sloss 1950
Molecular Phylogenetic Approaches
• Waiting times between speciation
• Slowinski and Guyer (1989)
• Harvey et al. (1991, 1994)
• Nee et al. (1992, 1994)
• Pybus and Harvey (2000)
• Rabosky (2006, 2008)
• Purvis et al. (2009)
Gamma and Rate-Variable Models
• Gamma (γ) examines the density of nodes relative to time
• Models of diversification:
1) Constant:
– Yule, Birth-Death
2) Variable:
– Yule2, Yule3, DDL, DDX
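For reference, the γ statistic of Pybus and Harvey (2000) can be written out as below. The notation is introduced here and is not from the slides: n is the number of tips, g_k is the time during which the tree has k lineages, and T is the weighted total of those intervals. Under a constant-rate Yule process γ is approximately standard normal with mean 0, so clearly negative values indicate nodes concentrated toward the root.

```latex
T = \sum_{j=2}^{n} j\, g_j ,
\qquad
\gamma =
\frac{\dfrac{1}{n-2}\displaystyle\sum_{i=2}^{n-1}\Bigl(\sum_{k=2}^{i} k\, g_k\Bigr) - \dfrac{T}{2}}
     {T\,\sqrt{\dfrac{1}{12\,(n-2)}}}
```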
Plethodon ouachitae Complex
[Figure: accumulation of lineages through time under three patterns]
• *Even: Gamma = -1.3 (Yule)
• Early: Gamma = -4.5 (DDL)
• Late: Gamma = 2.2 (LB/Ext)
External Problems with Molecular Trees
• Only looking at the winners
• Extinction dynamics must come from data external to the tree or tame diversification rate variation across a tree
• Can we see all speciation events?
Quental and Marshall (2010)
Internal Problems: Gene-Tree/Species-Tree Discordance
[Figure: lineages A and B traced back from the present; the gene divergence (G) predates the species/population divergence (S) by Δt]
Δt = 2Ne or Θ/2
(See Edwards and Beerli 2000)
What does Δt tell us?
• Increasing 2Ne (Θ/2) increases Δt
• Gene divergence > species divergence
• Δt matters less as divergence time between species becomes large, because Δt will be a small fraction of the total gene divergence
• At deep divergences S ≈ G
• At shallow divergences S << G
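The points above can be sketched numerically. This is a minimal illustration, assuming the expected gene divergence is G = S + Θ/2 and that S and Θ are measured in the same units (for example, substitutions per site). The function name and the values of Θ and S are hypothetical, chosen only to show the trend.

```python
def delta_t_fraction(species_divergence, theta):
    """Share of the expected gene divergence G = S + theta/2 contributed by delta-t."""
    delta_t = theta / 2.0                      # expected lag, as on the slide
    gene_divergence = species_divergence + delta_t
    return delta_t / gene_divergence


theta = 0.01  # hypothetical value, not an estimate from the study
for s in (0.001, 0.01, 0.1, 1.0):              # shallow to deep divergences
    share = delta_t_fraction(s, theta)
    print(f"S = {s:<6} delta-t share of G = {share:.3f}")
```

The share falls toward zero as S grows, which is the sense in which S ≈ G for deep divergences and S << G for shallow ones.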
Simulating gene trees from species trees
[Figure: Δt plotted against tree depth]
Impact on diversification estimation
• Disproportionately pushes younger nodes toward the root
• Density of nodes increases toward the root
• Decreasing γ (early burst!)
• Variable diversification models
• Impact of Θ (4Neµ)?
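A minimal sketch of why pushing nodes toward the root lowers γ. It makes a strong simplifying assumption that is not the simulation design used in the study: every node is moved back by the same expected lag Θ/2, rather than by random coalescent times drawn within a species tree. The helper names, the 25-tip tree, and the Θ values are hypothetical, and Θ is taken to be in the same time units as the node ages.

```python
import math


def gamma_statistic(node_ages):
    """Pybus & Harvey (2000) gamma from internal-node ages (time before present)."""
    ages = sorted(node_ages, reverse=True)       # root first
    n = len(ages) + 1                            # number of tips
    if n < 3:
        raise ValueError("need at least 3 tips")
    # g_k: time spent with k lineages, for k = 2..n
    intervals = [ages[i] - ages[i + 1] for i in range(n - 2)] + [ages[-1]]
    weighted = [(i + 2) * g for i, g in enumerate(intervals)]
    total = sum(weighted)
    running, inner = 0.0, 0.0
    for w in weighted[:-1]:                      # partial sums for i = 2..n-1
        running += w
        inner += running
    return (inner / (n - 2) - total / 2.0) / (total * math.sqrt(1.0 / (12 * (n - 2))))


def yule_expected_ages(n_tips, rate=1.0):
    """Node ages when each interval equals its Yule expectation 1/(k*rate)."""
    ages, age = [], 0.0
    for k in range(n_tips, 1, -1):               # youngest node first
        age += 1.0 / (k * rate)
        ages.append(age)
    return ages


species_ages = yule_expected_ages(25)            # stand-in "species tree", gamma ~ 0
for theta in (0.0, 0.1, 1.0, 10.0):              # hypothetical values
    gene_ages = [a + theta / 2.0 for a in species_ages]   # constant shift: an assumption
    gamma_error = gamma_statistic(gene_ages) - gamma_statistic(species_ages)
    print(f"theta = {theta:<5} gamma-error (GT - ST) = {gamma_error:.3f}")
```

Under this simplification a constant shift lengthens only the most recent interval, so γ can only decrease as Θ grows. A full coalescent simulation adds random variation around that trend.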
Simulations
• In R
– (Phybase, Ape, Geiger, Laser, Phangorn)
• Simulate Species Trees (ST)
– Yule [γ=0] and BD
– Taxa: 25-100
• Simulate Gene Trees (GT)
• Θ: 0.0001-100
• Θ: Γ distribution
Burbrink & Pyron 2011
What are we looking for?
• Impact of Θ on γ-error (GT – ST)
• Topology (RF Distance)
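As an illustration of the topology comparison, the sketch below computes a Robinson-Foulds (RF) distance between two small trees. It uses the rooted, clade-based form for simplicity; phylogenetics packages may default to the unrooted, bipartition-based form, so values need not match theirs. The nested-tuple tree encoding and the four-taxon trees are hypothetical.

```python
def clades(tree):
    """Collect every clade (as a frozenset of tip names) in a nested-tuple tree."""
    found = set()

    def visit(node):
        if isinstance(node, str):                # a tip
            return frozenset([node])
        tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted RF distance: number of clades present in one tree but not the other."""
    return len(clades(tree_a) ^ clades(tree_b))


species_tree = (("A", "B"), ("C", "D"))          # hypothetical species tree
gene_tree = (("A", "C"), ("B", "D"))             # hypothetical discordant gene tree
print(rf_distance(species_tree, species_tree))   # identical trees
print(rf_distance(species_tree, gene_tree))      # discordant trees
```

A distance of 0 means identical topologies; larger values mean more clades disagree between the gene tree and the species tree.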
Simulation Results
Implications
• Θ > 1 yields a pattern of early burst (negative γ)
• Θ > 1 also increases topological discordance
• Is Θ > 1 likely?
• Most estimates from studies of extant populations are well below 1.0.
Empirical Study
• Group with all species sampled
• Enough genes to construct a species tree
• Lampropeltis
• 21 species of snakes
• Throughout North America and South America
• 12 Loci
Comparisons
• Tested fit of model for all genes and partitions
• Species Trees (*Beast); 3-7 individuals/species
• Individual Gene Trees (*Beast)
• Concatenated gene trees with mtDNA
• Concatenated gene trees nuclear genes only
• Two calibration points
Lampropeltis Concat Gene Tree Depth Error
[Figure: Δt plotted against tree depth; R² = 0.87, P < 4.36 × 10⁻⁷]
[Figure series: extant lineages through time, compared with a Yule null distribution (quantiles 0.5, 0.75, 0.95, 0.99). Values are γ followed by the best-fitting model.]
• Species tree: -0.169, Yule
• Nuclear gene trees: -1.512 to 0.415, Yule
• mtDNA gene tree: -2.19*, DDL
• Concatenated gene tree (mtDNA and nuclear): -2.474**, DDL
• Concatenated gene tree (nuclear only): -2.11**, Yule2
Is γ-error associated with topological discordance?
[Figure: γ-error plotted against RF distance; R² = 0.083, P = 0.787]
Issues
• Increasing Θ increases γ-error
• Mean Θ for Lampropeltis = 0.0055 (Max=0.04)
• GT/ST divergence and γ discordance therefore shouldn’t be high
• Nuclear gene-tree γ values are similar to the ST
• γ estimates from mtDNA (alone or with other genes) are underestimated (early burst)
Why mtDNA Problems?
• Model underparametrization decreases γ (Revell et al. 2005)
• Saturation-Driven Compression*
– Deeper nodes are artificially compressed
– Decreases γ
– Usually at very old time scales
• In Lampropeltis, mtDNA Ε (substitutions) increases towards terminal branches (W = 604.5, P = 0.0002477)
[Figure: Distribution of terminals and internals; density plotted against estimated substitutions]
*Hugall et al. 2007, Sanders et al. 2008, Zheng et al. 2011
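The W reported for terminal versus internal branches has the form of a Wilcoxon rank-sum statistic. The sketch below shows how such a W is computed, counting the pairs in which a value from the first sample exceeds a value from the second, with ties counted as one half. The branch values are made up for illustration and are not the Lampropeltis data, and which sample was entered first in the original test is not stated in the slides.

```python
def wilcoxon_w(x, y):
    """Rank-sum statistic W for x versus y: pairs with x > y, ties counted as 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )


# Hypothetical estimated substitutions per branch (illustration only)
terminals = [0.04, 0.06, 0.08, 0.05]
internals = [0.01, 0.02, 0.03, 0.05]

w = wilcoxon_w(terminals, internals)
max_w = len(terminals) * len(internals)
print(f"W = {w} out of a possible {max_w}")
```

A W close to the maximum means values in the first sample are mostly larger than those in the second; a W near half the maximum means no systematic difference.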
Conclusions
• mtDNA GTs poorly estimate diversification dynamics
• Use Species Trees!
• And include fossils (if possible)
• Investigate Impact on other comparative methods
• Simulations account for:
– Θ
– Substitution variation
– GT uncertainty
Thanks to…
My coauthors (Sara and Alex), PSCCUNY, and NSF.