Lecture 22: Introduction to Phylogenetics
April 4, 2014

Last Time
  Infinite alleles and stepwise mutation models
  Introduction to neutral theory
  Molecular clock

Today
  Introduction to phylogenetics
  Phylogeography
  Limitations of phylogenetic analysis
  Coalescence introduction
  Influence of demography on coalescence time

Phylogenetics
  Study of the evolutionary relationships among individuals, groups, or species
  Relationships often represented as dichotomous branching tree
  Extremely common approach for detecting and displaying relationships among genotypes
  Important in evolution, systematics, and ecology (phylogeography)

Evolution
  [Figure: taxa labeled A through Z and Ç. Slide adapted from Marta Riutart]

What is a phylogeny?
  [Figure: taxa labeled O through Z and Ç. Slide adapted from Marta Riutart]
  Homology: similarity that is the result of inheritance from a common ancestor

Phylogenetic Tree Terms
  [Figure: tree with leaves A through J, labeled with the terms below. Slide adapted from Marta Riutart]
  Group, cluster, clade
  Leaves, Operational Taxonomic Units (OTUs)
  Terminal branches
  Node
  Interior branches
  Root

Tree Topology
  [Figure: tree drawings with leaves Bacteria 1, Bacteria 2, Bacteria 3, Eukaryote 1, Eukaryote 2, Eukaryote 3, Eukaryote 4. Slide adapted from Marta Riutart]
  Parenthetical (Newick) notation:
  (Bacteria1,(Bacteria2,Bacteria3),(Eukaryote1,((Eukaryote2,Eukaryote3),Eukaryote4)))

Are these trees different? How about these?
  [Figure: tree drawings for comparison. Source: http://helix.biology.mcmaster.ca]

Rooted versus Unrooted Trees
  [Figure: an unrooted tree of archaea and eukaryotes, and the tree rooted by a bacteria outgroup, with the root and two monophyletic groups labeled. Slide adapted from Marta Riutart]

Rooting with D as outgroup
  [Figure: tree with leaves A through G, rooted with D as the outgroup. Slide adapted from Marta Riutart]

Now with C as outgroup
  [Figure: tree with leaves A through G, rooted with C as the outgroup]

Which of these four trees is different?
  [Figure: four trees for comparison. Baum et al.]
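
The question "Are these trees different?" can be checked mechanically for trees written in parenthetical notation. The sketch below is a minimal illustration, not a full Newick reader: it assumes taxon names and parentheses only (no branch lengths, support values, or quoted names), and it compares trees as rooted at the outermost parenthesis, so it does not test whether two different rootings share one unrooted topology. The function names `parse_newick`, `canonical`, and `same_topology` are hypothetical helpers introduced here for illustration.

```python
def parse_newick(text):
    """Parse a bare parenthetical tree (names and parentheses only) into nested tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def peek():
        # Current character, or "" once the end of the string is reached.
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        if peek() == "(":
            pos += 1                      # consume "("
            children = [parse_node()]
            while peek() == ",":
                pos += 1                  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                      # consume ")"
            return tuple(children)
        # Otherwise read a leaf name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos].strip()

    tree = parse_node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return tree


def canonical(node):
    """Order-independent form: leaves stay strings, clades become frozensets of children."""
    if isinstance(node, str):
        return node
    return frozenset(canonical(child) for child in node)


def same_topology(newick_a, newick_b):
    """True if the two trees differ only by rotating branches around nodes."""
    return canonical(parse_newick(newick_a)) == canonical(parse_newick(newick_b))


slide_tree = "(Bacteria1,(Bacteria2,Bacteria3),(Eukaryote1,((Eukaryote2,Eukaryote3),Eukaryote4)))"
# Same tree with branches rotated at every node.
rotated = "((Eukaryote1,(Eukaryote4,(Eukaryote3,Eukaryote2))),(Bacteria3,Bacteria2),Bacteria1)"
# Eukaryote1 and Eukaryote4 swapped: a genuinely different grouping.
regrouped = "(Bacteria1,(Bacteria2,Bacteria3),(Eukaryote4,((Eukaryote2,Eukaryote3),Eukaryote1)))"

print(same_topology(slide_tree, rotated))    # True
print(same_topology(slide_tree, regrouped))  # False
```

Representing each clade as a set of its children is what makes rotation irrelevant: the order in which sister branches are drawn carries no information, only which taxa group together.
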

UPGMA Method
  Use all pairwise comparisons to make dendrogram
  UPGMA: Unweighted Pair Group Method with Arithmetic Mean
  Hierarchically link most closely related individuals
  Read the Lab 11 Introduction!
  Phenetics (distance based) vs. Cladistics (character state based)
  Lowe, Harris, and Ashton 2004

Parsimony Methods
  Based on underlying genealogical relationships among alleles
  Occam's Razor: simplest scenario is the most likely
  Useful for depicting evolutionary relationships among taxa or populations
  Choose tree that requires smallest number of steps (mutations) to produce observed relationships

Choosing Phylogenetic Trees
  MANY possible trees can be built for a given set of taxa
  Very computationally intensive to choose among these
  Number of unrooted (U_n) and rooted (R_n) bifurcating trees for n taxa:

  $$U_n = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$$

  $$R_n = \frac{(2n-3)!}{2^{n-2}\,(n-2)!} = (2n-3)\,U_n$$

  Lowe, Harris, and Ashton 2004

Choosing Phylogenetic Trees
  Many algorithms exist for searching tree space
  Local optima are a problem: need to traverse valleys to get to other peaks
  Heuristic search: cut trees up systematically and reassemble
  Branch and bound: search for optimal path through tree space
  [Figure: search through tree space. Felsenstein 2004]

Choosing Phylogenetic Trees
  If multiple trees are equally likely, select a majority-rule or strict consensus tree
  Strict consensus is most conservative approach
  Bootstrap data matrix (sample with replacement) to determine robustness of nodes
  [Figure: trees of taxa A through F with node support values of 60. Lowe, Harris, and Ashton 2004; Felsenstein 2004]

Phylogeography
  The study of evolutionary relationships among individuals based on phylogenetic analysis of DNA sequences in geographic context
  Can be used to infer evolutionary history of populations
    Migrations
    Population subdivisions
    Bottlenecks/Founder Effects
  Can provide insights on current relationships among populations
    Connectedness of populations
    Effects of landscape features on gene flow

Phylogeography
  Topology of tree provides clues about evolutionary and ecological history of a set of populations
  Dispersal creates poor correspondence between geography and tree topology
  Vicariance (division of populations preventing gene flow among subpopulations) results in neat mapping of geography onto haplotypes

Example: Pocket gophers (Geomys pinetis)
  Fossorial rodent that inhabits 3-state area in the U.S.
  RFLP for mtDNA of 87 individuals revealed 23 haplotypes
  Parsimony network reveals geographic relationships among haplotypes
  Haplotypes generally confined to single populations
  Major east-west split in distribution revealed
  Avise 2004

Problems with using Phylogenetics for Inferring Evolution
  It is a black box: starting from end point, reconstructing past based on assumed evolutionary model
  Orthologs versus paralogs
  Hybridization
  Differential evolutionary rates
  Assumes coalescence

Gene Orthology
  Phylogenetics requires unambiguous identification of orthologous genes
  Paralogous genes are duplicated copies: they diverged by gene duplication rather than speciation, so their history does not track the species history
  Difficult to determine orthology relationships
  [Figure: gene tree with orthologs and paralogs labeled. Lowe, Harris, and Ashton 2004]
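
Worked example for the Choosing Phylogenetic Trees slides: the point that MANY trees are possible is easier to appreciate with numbers. The sketch below evaluates the standard counts of bifurcating trees for n taxa, (2n-5)!/(2^(n-3) (n-3)!) unrooted and (2n-3)!/(2^(n-2) (n-2)!) rooted. The function names `unrooted_trees` and `rooted_trees` are hypothetical, chosen here for illustration.

```python
from math import factorial


def unrooted_trees(n):
    """Number of unrooted bifurcating trees for n labeled taxa (n >= 3)."""
    if n < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def rooted_trees(n):
    """Number of rooted bifurcating trees for n labeled taxa (n >= 2)."""
    if n < 2:
        raise ValueError("a rooted bifurcating tree needs at least 2 taxa")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


for n in (4, 5, 10, 20):
    print(n, unrooted_trees(n), rooted_trees(n))
# For n = 4 this prints 3 unrooted and 15 rooted trees;
# for n = 10 it prints 2027025 unrooted and 34459425 rooted trees.
```

Each unrooted tree with n taxa has 2n-3 branches on which a root can be placed, which is why the rooted count is always (2n-3) times the unrooted count. The growth is fast enough that exhaustive evaluation becomes impractical well before 20 taxa, which is the motivation for heuristic and branch-and-bound searches.
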