Gene-tree/species-tree discordance and diversification
Sara Ruane, CUNY (CSI, GC)
R. Alex Pyron, The GW Univ.
Frank Burbrink, CUNY (CSI, GC)

Understanding patterns of diversification
• Ecological Opportunity
• Key Innovations
• Competition
• Extinction
• Ecological Limits

History of the study
• Paleontological Research
• Understand patterns of speciation and extinction through time
• Simpson (1944, 1953)
• Sloss (1950)
• Sepkoski (1979)
• Stanley (1979)
• Raup (1985)
• Foote (1993)
[Figure: Sloss 1950]

Molecular Phylogenetic Approaches
• Waiting times between speciation events
• Slowinski and Guyer (1989)
• Harvey et al. (1991, 1994)
• Nee et al. (1992, 1994)
• Pybus and Harvey (2000)
• Rabosky (2006, 2008)
• Purvis et al. (2009)

Gamma and Rate Variable Models
• Gamma (γ) examines the density of nodes relative to time
• Models of diversification:
  1) Constant:
     – Yule, Birth-Death
  2) Variable:
     – Yule2, Yule3, DDL, DDX

Plethodon ouachitae Complex
[Figure: panels labeled *Even (Yule), Early, Late; axis: Time. Values shown: Gamma = -1.3 (Yule), Gamma = -4.5 (DDL), Gamma = 2.2 (LB/Ext)]

External Problems with Molecular Trees
• Only looking at the winners
• Extinction dynamics must come from data external to the tree or tame diversification rate variation across a tree
• Can we see all speciation events?
Quental and Marshall (2010)

Internal Problems: Gene-Tree/Species-Tree Discordance
[Figure: lineages A and B traced back from the Present; Species/Population Divergence (S) and Gene Divergence (G) are separated by Δt = 2Ne or Θ/2]
(See Edwards and Beerli 2000)

What does Δt tell us?
• Increasing 2Ne (Θ/2) increases Δt
• Gene divergence > species divergence
• Δt becomes small relative to the total as divergence time between species becomes large, because Δt is then a small fraction of the total gene divergence
• At deep divergences, S ≈ G
• At shallow divergences, S << G

Simulating gene trees from species trees
[Figure: y-axis: Δt; x-axis: Tree Depth]

Impact on diversification estimation
• Disproportionately pushes younger nodes toward the root
• Density of nodes increases toward the root
• Decreasing γ (early burst!)
• Variable diversification models
• Impact of Θ (4Neµ)?

Simulations
• In R
  – (Phybase, Ape, Geiger, Laser, Phangorn)
• Simulate Species Trees (ST)
  – Yule [γ=0] and BD
  – Taxa: 25-100
• Simulate Gene Trees (GT)
• Θ: 0.0001-100
• Θ: Γ distribution
Burbrink & Pyron 2011

What are we looking for?
• Impact of Θ on γ-error (GT – ST)
• Topology (RF Distance)

Simulation Results
[Figure]

Implications
• Θ > 1 yields a pattern of early burst (-γ)
• Θ > 1 also increases topological discordance
• Is Θ > 1 likely?
• Most estimates from studies of extant populations are well below 1.0.

Empirical Study
• Group with all species sampled
• Enough genes to construct a species tree
• Lampropeltis
• 21 species of snakes
• Throughout North America and South America
• 12 Loci

Comparisons
• Tested fit of model for all genes and partitions
• Species Trees (*Beast); 3-7 individuals/species
• Individual Gene Trees (*Beast)
• Concatenated gene trees with mtDNA
• Concatenated gene trees, nuclear genes only
• Two calibration points

Lampropeltis Concat Gene Tree Depth Error
[Figure: y-axis: Δt; x-axis: Tree Depth. R² = 0.87, P < 4.36 × 10⁻⁷]

[Figure series: Extant Lineages (y-axis) against Time (x-axis), with Null Distribution (Yule) contours labeled 0.5, 0.75, 0.95, 0.99. Panel labels, with value and model as shown on the slides:]
• Species Tree: -0.169, Yule
• Nucl Gene Trees: -1.512 to 0.415, Yule
• mtDNA GT: -2.19*, DDL
• Concat GT, mtDNA and nuc: -2.474**, DDL
• Concat nuc: -2.11**, Yule2

Is γ-error associated with topological discordance?
[Figure: y-axis: γ-error; R² = 0.083, P = 0.787]
[Figure x-axis: RF distance]

Issues
• Increasing Θ increases γ-error
• Mean Θ for Lampropeltis = 0.0055 (Max = 0.04)
• GT/ST divergence and γ discordance shouldn't be high
• Nuclear gene tree γ values are similar to the ST
• mtDNA (alone or with other genes) underestimates γ (early burst)

Why mtDNA Problems?
• Model underparametrization decreases γ (Revell et al. 2005)
• Saturation Driven Compression*
  – Deeper nodes are artificially compressed
  – Decreases γ
  – Usually at very old time scales
• In Lampropeltis, mtDNA Ε (substitutions) increases towards terminal branches (W = 604.5, P = 0.0002477)
[Figure: Distribution of terminals and internals. x-axis: Estimated substitutions; y-axis: Density]
*Hugall et al. 2007, Sanders et al. 2008, Zheng et al. 2011

Conclusions
• mtDNA GTs poorly estimate diversification dynamics
• Use Species Trees!
• And include fossils (if possible)
• Investigate impact on other comparative methods
• Simulations account for:
  – Θ
  – Substitution variation
  – GT uncertainty

Thanks to…
My coauthors (Sara and Alex), PSCCUNY, and NSF.
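
A minimal sketch of the two quantities the talk leans on, written in Python rather than the R packages listed on the Simulations slide. It assumes that γ is the statistic of Pybus and Harvey (2000), computed from the internode intervals of an ultrametric tree, and that Δt = Θ/2 is measured in the same units as the species divergence S (substitutions per site). The function names and the example divergence values are hypothetical and chosen only for illustration; only Θ = 0.0055 comes from the slides.

```python
import math
from typing import Sequence


def gamma_statistic(intervals: Sequence[float]) -> float:
    """Gamma statistic of Pybus and Harvey (2000).

    intervals[0] is the time the tree spends with 2 lineages,
    intervals[1] the time with 3 lineages, and so on up to n lineages,
    ordered from the root to the present.
    """
    n = len(intervals) + 1  # number of tips
    if n < 3:
        raise ValueError("gamma needs a tree with at least 3 tips")
    # Each interval is weighted by the number of lineages alive during it.
    weighted = [(k + 2) * g for k, g in enumerate(intervals)]
    total = sum(weighted)
    if total <= 0:
        raise ValueError("intervals must sum to a positive length")
    # Sum the running totals for lineage counts 2 .. n-1 (last interval excluded).
    running = 0.0
    cumulative = 0.0
    for w in weighted[:-1]:
        running += w
        cumulative += running
    mean_cumulative = cumulative / (n - 2)
    return (mean_cumulative - total / 2.0) / (
        total * math.sqrt(1.0 / (12.0 * (n - 2)))
    )


def expected_lag(theta: float) -> float:
    """Expected lag between gene and species divergence: delta_t = theta / 2."""
    return theta / 2.0


def lag_fraction(species_divergence: float, theta: float) -> float:
    """Share of the gene divergence G = S + delta_t that is due to delta_t."""
    dt = expected_lag(theta)
    return dt / (species_divergence + dt)


if __name__ == "__main__":
    # Hypothetical intervals: nodes crowded near the root vs. near the tips.
    print("nodes near root:", gamma_statistic([0.1, 0.1, 10.0]))
    print("nodes near tips:", gamma_statistic([10.0, 0.1, 0.1]))
    # Mean theta from the Lampropeltis slide; divergences S are hypothetical.
    for s in (0.001, 0.01, 0.1):
        print("S =", s, "lag fraction =", lag_fraction(s, 0.0055))
```

Crowding nodes toward the root gives a negative γ and crowding them toward the tips gives a positive one, which is the direction of bias described on the Impact on diversification estimation slide. The lag fraction shrinks as S grows, matching the point that S and G converge at deep divergences.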