3. The reconstruction of phylogeny

The first Darwinian principle states that every phylogenetic tree has one common ancestor.
Phylogenetic analysis is the study of taxonomic relationships among lineages.

Phylogenetic systematics, or cladistics (Greek κλάδος: branch): Willi Hennig (1913-1976)
Numerical taxonomy: Robert Sokal (1927-)

http://www.faunaeur.org/
http://tolweb.org/tree/phylogeny.html
http://www.eol.org/

The cladistic methodology

[Figure: cladogram of lineages A, B, C, and D descending from a common ancestor, with character sets (ade, adf, abc, abd) and the characters a to f mapped on the tree.]

Apomorphies are derived characters; those shared by several lineages are synapomorphies.
Autapomorphies are derived characters that are restricted to single lineages.
Plesiomorphies are ancestral characters; those shared by several lineages are symplesiomorphies.

e: Autapomorphy of lineage D
b: Synapomorphy of lineage C+D
d: Plesiomorphy of lineage A. It is a symplesiomorphy.
a: Apomorphy of the whole tree. It is the ancestral state.

The collective set of plesiomorphies defines the ground plan of a phylogenetic tree.

[Figure: cladogram of lineages A, B, and C descending from a common ancestor, with character sets ade, adf, and abd.]

C is the sister taxon of A+B.
Character d in lineages A, B, and C is not homologous because it was derived twice. It is homoplasious.
Character a in lineages A, B, and C is homologous because it is a synapomorphy.

[Figure: tree of lineages A to E descending from a common ancestor, illustrating a monophyletic taxon, a paraphyletic taxon, and a polyphyletic taxon.]

The ultimate aim of taxonomy is to group higher taxa into monophyletic subtaxa.
For this task we have to infer autapomorphies. Autapomorphy defines monophyly.

[Figure: cladogram of vertebrates descending from a common ancestor. Taxa: Actinopterygia, Dipnoi, Tetrapoda, Anura, Urodela, Amniota, Mammalia, Squamata, Archosauria, Aves, Therosauria; Reptilia is marked as paraphyletic. Characters: lungs (plesiomorph), tetrapod limbs (apomorph), amnion (apomorph), mammae (autapomorph), feathers (apomorph), loss of tail (apomorph).]

The diversification of an evolutionary tree is called cladogenesis.
The evolutionary change within a lineage is called anagenesis.

Linnean systematics and cladistics

Linnean approach | Hennigean approach
Hierarchical encaptic system | Hierarchical encaptic system
Phenomenological method based on similarity | Analytical method based on lineage branching
It uses grades (groups of similar body plan) | It uses clades (groups of identical root)
Different taxonomies are possible | Only one taxonomic solution is allowed
There is no clear decision instrument for taxonomies | Autapomorphies decide about taxonomic position
The number of higher taxa is rather small (Pisces, Amphibia, Reptilia, Aves, Mammalia) | The number of higher taxa is large (Pisces, Amphibia, Reptilia are not valid taxa)
It does not assume common evolutionary history | It is based on common evolutionary history
It does not reconstruct evolution | It does reconstruct evolution
Taxonomy is independent of evolution | Taxonomy is a part of evolutionary theory
Low resolution trees | High resolution trees

Phylogenetic tree of winged insect orders

[Figure: phylogenetic tree of winged insect orders on a time axis running from the Devonian through the Carboniferous, Permian, Triassic, Jurassic, and Cretaceous to the Paleogene and recent. Annotations: Devonian origin (Rhyniognatha hirsti), low resolution, two radiations.]

Orders shown: Palaeodictyoptera, Odonata, Ephemeroptera, Dictyoptera, Plecoptera, Zoraptera, Embioptera, Isoptera, Dermaptera, Grylloblattodea, Phasmida, Orthoptera, Mallophaga, Psocoptera, Thysanoptera, Heteroptera, Hymenoptera, Neuroptera, Coleoptera, Siphonaptera, Mecoptera, Diptera, Trichoptera, Lepidoptera.

The tree lacks 9 orders that went extinct by the end of the Permian.
In the Triassic period all extant taxa already existed.

The construction of
phylogenetic trees from numerical methods

The principle of maximum parsimony (Occam’s razor) holds that we should accept the phylogenetic tree that can be constructed with the smallest number of morphological changes.

The raw data (characters in rows, species in columns)

Character | A | B | C | D | E
1 | 1 | 1 | 0 | 0 | 1
2 | 1 | 1 | 1 | 0 | 0
3 | 0 | 1 | 1 | 1 | 0
4 | 0 | 1 | 1 | 1 | 1
5 | 1 | 1 | 1 | 0 | 0
6 | 1 | 1 | 0 | 1 | 1

[Figure: tree of species A, B, C, D, and E with the character strings 001101, 110111, 101101, 010010, and 111111 at its nodes; 8 changes.]

Distance matrix

Species | A | B | C | D | E
A | 0 | 1 | 3 | 4 | 3
B | 1 | 0 | 4 | 3 | 2
C | 3 | 5 | 0 | 5 | 6
D | 4 | 3 | 5 | 0 | 1
E | 3 | 2 | 6 | 1 | 0

We are looking for the tree that minimizes the sum of distances.

How to define the root?

[Figure: the same species with A as outgroup and the character strings 001101, 101101, 010010, 111111, 010111, and 110111 at the nodes; 7 changes.]

Parsimony analysis

To find the most parsimonious tree we have to cross all combinations of lineages (trees) with all character combinations at the root.

The number of possible trees

Species S | Number of trees N
2 | 1
3 | 3
4 | 15
5 | 105
6 | 945
7 | 10395
8 | 135135
9 | 2027025
10 | 34459425

\[ N = \frac{(2S-2)!}{2^{S-1}\,(S-1)!} \]

Neighbour joining

Neighbour joining is particularly used to generate phylogenetic trees.

[Figure: star tree with the elements A, B, and C attached to a root.]

You need dissimilarities (phylogenetic distances) $\delta(X,Y)$ between all elements X and Y.
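One simple way to obtain such pairwise dissimilarities from coded characters is to count the positions at which two character strings differ (the Hamming distance). The sketch below assumes this measure; it is not necessarily how the matrices in these slides were computed, and the taxon names and character strings are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equally long character strings differ."""
    if len(a) != len(b):
        raise ValueError("character strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(taxa: dict[str, str]) -> dict[tuple[str, str], int]:
    """Return a symmetric dissimilarity table keyed by (taxon, taxon)."""
    names = list(taxa)
    d = {(x, x): 0 for x in names}
    for x, y in combinations(names, 2):
        d[x, y] = d[y, x] = hamming(taxa[x], taxa[y])
    return d


# Hypothetical binary character states (illustration only, not the slide data).
toy = {"X": "110011", "Y": "111111", "Z": "001101"}
d = distance_matrix(toy)
for (x, y), value in sorted(d.items()):
    print(x, y, value)
```

Any other dissimilarity measure can be substituted for `hamming` as long as it is symmetric and zero for identical elements.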
[Figure: star tree with the elements A to F attached to a root; the selected pair X, Y is joined at a new node U.]

Dissimilarities: for each element X calculate the sum over all n elements

\[ \delta(X) = \sum_{i=1}^{n} \delta(X, Y_i) \]

Calculate

\[ Q(X,Y) = (n-2)\,\delta(X,Y) - \delta(X) - \delta(Y) \]

Select the pair with the lowest value of Q.

Calculate the distances of the joined pair A, B from the new node U

\[ \delta(A,U) = \frac{(n-2)\,\delta(A,B) + \delta(A) - \delta(B)}{2(n-2)} \]

\[ \delta(B,U) = \frac{(n-2)\,\delta(A,B) - \delta(A) + \delta(B)}{2(n-2)} \]

Calculate new dissimilarities

\[ \delta(X, U_{AB}) = \frac{\delta(X,A) + \delta(X,B) - \delta(A,B)}{2} \]

Example, step 1

Distance matrix | Mouse | Raven | Octopus | Lumbricus | Delta values
Mouse | 0 | 0.2 | 0.6 | 0.7 | 1.5
Raven | 0.2 | 0 | 0.6 | 0.8 | 1.6
Octopus | 0.6 | 0.6 | 0 | 0.5 | 1.7
Lumbricus | 0.7 | 0.8 | 0.5 | 0 | 2

Pair | Q-value
Mouse/Raven | -2.7
Mouse/Octopus | -2
Mouse/Lumbricus | -2.1
Raven/Octopus | -2.1
Raven/Lumbricus | -2
Octopus/Lumbricus | -2.7

Octopus and Lumbricus (Q = -2.7) are joined into Protostomia.

Example, step 2

Distance matrix | Mouse | Raven | Protostomia | Delta values
Mouse | 0 | 0.2 | 0.4 | 0.6
Raven | 0.2 | 0 | 0.45 | 0.65
Protostomia | 0.4 | 0.45 | 0 | 0.85

Pair | Q-value
Mouse/Raven | -1.05
Mouse/Protostomia | -1.05
Raven/Protostomia | -1.05

With only three elements left all pairs have the same Q-value. Mouse and Raven are joined into Vertebrata.

Example, step 3

Distance matrix | Vertebrata | Protostomia
Vertebrata | 0 | 0.325
Protostomia | 0.325 | 0

Assumptions of the numerical methods

Characters (or transitions) have to be independent.
Impossible character states have to be excluded.

[Figure: character transitions between fish (scales), mammals (hairs), and birds (feathers), including loss of hairs and loss of feathers.]

Characters are assumed to have equal importance. In reality transitions are not comparable.
To overcome this problem you give character weights.
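The neighbour-joining steps described above (delta values, Q-values, joining a pair, new dissimilarities) can be sketched as follows. This is a minimal illustration under the formulas given in this section, using the Mouse, Raven, Octopus, Lumbricus distances from the example; the function names and the `frozenset` keys are choices made for this sketch.

```python
from itertools import combinations


def row_sums(d, taxa):
    """Delta value of each element: the sum of its dissimilarities to all others."""
    return {x: sum(d[frozenset((x, y))] for y in taxa if y != x) for x in taxa}


def q_values(d, taxa):
    """Q(X,Y) = (n-2)*delta(X,Y) - delta(X) - delta(Y) for every pair."""
    n = len(taxa)
    delta = row_sums(d, taxa)
    return {(x, y): (n - 2) * d[frozenset((x, y))] - delta[x] - delta[y]
            for x, y in combinations(taxa, 2)}


def join(d, taxa, a, b, name):
    """Join a and b into a new node; requires more than two elements."""
    n = len(taxa)
    delta = row_sums(d, taxa)
    dab = d[frozenset((a, b))]
    len_a = dab / 2 + (delta[a] - delta[b]) / (2 * (n - 2))  # delta(A,U)
    len_b = dab - len_a                                      # delta(B,U)
    rest = [x for x in taxa if x not in (a, b)]
    new_d = {frozenset((x, y)): d[frozenset((x, y))]
             for x, y in combinations(rest, 2)}
    for x in rest:
        new_d[frozenset((x, name))] = (
            d[frozenset((x, a))] + d[frozenset((x, b))] - dab) / 2
    return new_d, rest + [name], (len_a, len_b)


pairs = {("Mouse", "Raven"): 0.2, ("Mouse", "Octopus"): 0.6,
         ("Mouse", "Lumbricus"): 0.7, ("Raven", "Octopus"): 0.6,
         ("Raven", "Lumbricus"): 0.8, ("Octopus", "Lumbricus"): 0.5}
d = {frozenset(k): v for k, v in pairs.items()}
taxa = ["Mouse", "Raven", "Octopus", "Lumbricus"]

q = q_values(d, taxa)
for pair, value in q.items():
    print(pair, round(value, 3))

# Mouse/Raven and Octopus/Lumbricus tie for the lowest Q; join the latter.
d2, taxa2, branches = join(d, taxa, "Octopus", "Lumbricus", "Protostomia")
print(taxa2, {tuple(sorted(k)): round(v, 3) for k, v in d2.items()})
```

Ties in Q, as in this example, have to be broken by some rule; here the pair is chosen by hand rather than by the code.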
Technically, character weighting means multiplying the occurrence of a character in a distance matrix by its weight.

http://evolution.genetics.washington.edu/phylip/software.html

Trees from molecular data

Species A B C D E
Sequence: A A C A C G C G G G T T T T T T T T G G A A T T T A A G G G C C G G C C C A A C C C A A C A A T T A A A G A A T T A A T A A C A A

Distance matrix

Species | A | B | C | D | E
A | 0 | 1 | 11 | 10 | 5
B | 1 | 0 | 10 | 9 | 5
C | 11 | 10 | 0 | 3 | 9
D | 10 | 9 | 3 | 0 | 6
E | 5 | 5 | 9 | 6 | 0

Evolutionary time scales

The molecular clock

Emile Zuckerkandl (1922-), Linus Pauling (1901-1994), Motoo Kimura (1924-1994), Tomoko Ohta (1933-)

Numbers of amino acid substitutions, and therefore the respective numbers of nucleotide substitutions, are for many proteins and genomes approximately proportional to time.
Hence, numbers of substitutions are a measure of the time of divergence from the latest common ancestor.
Substitutions alone provide a relative time scale.
An appropriate calibration adds the absolute time scale.

[Figure: number of amino acid differences in superoxide dismutase plotted against the paleontological divergence estimate, with errors indicated.]

Applying the molecular clock

[Figure: tree of lineages A, B, C, and D descending from a common ancestor, with nucleotide changes marked along the branches: single substitution (T→C), parallel substitution (A→G in two lineages), back substitution (G→C→G), and multiple substitution (G→C→A).]

The length of a tree segment is a measure of the duration of a lineage.
Is it possible to convert numbers of character changes into evolutionary time scales?

The Jukes-Cantor model assumes that the probability λ of any transition among the 4 nucleotides is the same.

[Figure: the four nucleotides A, C, G, and T, each change to one of the three other nucleotides having probability λ/3.]

Assuming that the transition probability is time independent (every period has the same transition probability), the probability distribution follows an Arrhenius model:

\[ p_{trans} = 1 - e^{-\lambda t} \]

\[ A \to T,\; A \to G,\; A \to C: \quad \frac{1}{4}\left(1 - e^{-\lambda t}\right) \text{ each} \]

\[ A \to A: \quad 1 - 3\cdot\frac{1}{4}\left(1 - e^{-\lambda t}\right) = 1 - \frac{3}{4}\left(1 - e^{-\lambda t}\right) \]

What is the probability to get exactly x differences out of n possible?
We apply the binomial, where $p = \frac{3}{4} - \frac{3}{4}e^{-\lambda t}$ is the probability that a site differs:

\[ L(x; t) = \binom{n}{x}\, p^{x} (1-p)^{n-x} \]

We apply the principle of maximum likelihood.

\[ \ln L(x; t) = \ln\binom{n}{x} + x \ln\left(\frac{3}{4} - \frac{3}{4}e^{-\lambda t}\right) + (n-x)\ln\left(1 - \left(\frac{3}{4} - \frac{3}{4}e^{-\lambda t}\right)\right) \]

We are interested in the time that maximizes this function. Hence we need the root of the first derivative:

\[ t = -\frac{1}{\lambda}\ln\left(1 - \frac{4x}{3n}\right) \]

The distances t are now used in distance matrices to construct the phylogenetic tree.

Paleontological versus molecular timescales

Molecular estimates frequently point to much more ancient divergences of lineages than estimates based on the fossil record.
The reason is the different speed of morphological and genetic change.

[Figure: time axis with curves of genetic and morphological change. Molecular divergence of placental orders (120-140 mya), Eomaia (125 mya), first fossils of placental orders (65 mya).]

Changes in genetic constitution involve first basic regulatory elements.
Changes in genetic constitution accumulate to a point where basic regulatory elements are involved.

[Figure: time axis with curves of genetic and morphological change. First fossils of erect hominids (6-7 mya), molecular divergence (4-5 mya), gene flow up to 2 mya.]

Matching of molecular and paleontological timescales in Echinodermata

[Figure: molecular divergence estimate plotted against paleontological divergence estimate for Echinodermata.]

For the majority of Echinoderm subtaxa molecular divergence estimates are higher than the paleontological estimates.
Data from Smith et al. (2006)

Divergences | Earliest fossil record (mya) | Molecular estimates (mya)
Placental-marsupials | 175–145 | 185–161
Amniotes-amphibians | 310 | 375–345
Myriapods-chelicerates | 530 | 705–579
Mosses-vascular plants | 450 | 899–515
Crustaceans-insects | 530 | 726–539
Echinoderms-chordates | <530 | 1001–586
Spiralian-Ecdysozoans | 560–540 | 643–544
Protostomes-deuterostomes | 560–540 | 678–556
Arthropods-chordates | 560–540 | 1200–588
Cnidaria-bilaterians | <600 | 724–615
Sponges-chordates | <600 | 1350–592

Data from Qun et al. (2007)

Do all phylogenetic trees have a single root?

Darwin’s first principle: All species of a given taxon have a common ancestor.
Parsimony analysis cannot answer this question. A brush would always have a lower number of character changes.

A brush means:
• No speciation.
• If we accept that extinction occurs, this would mean a constant decrease in the number of species.
• Character change within whole species.
• No genetic (character) variability within populations.
• Extreme longevity of lineages.

[Figure: theory of Lamarck. Scala naturae: scale of organization plotted against time, with spontaneous origin of simple life forms.]

But horizontal gene transfer might, at least in bacteria, result in networks and rings!

Evolution and development (EvoDevo)

August Weismann (1834-1914)
The soma-germ line distinction makes it impossible to transmit acquired characters to the next generation.

Ernst Haeckel (1834-1919)
Theory of recapitulation: The ontogeny of advanced species recapitulates respective stages in ancestral forms.
In fact, only basic genetic programs are conserved, and modifications appear at all stages of ontogenesis. Haeckel’s rule is only a crude approximation.

Today’s reading

Phylogenetic systematics: http://evolution.berkeley.edu/evolibrary/article/phylogenetics_01
Cladistics: http://en.wikipedia.org/wiki/Cladistics
Ernst Haeckel: Kunstformen der Natur (Internet exhibition of original drawings: http://caliban.mpiz-koeln.mpg.de/~stueber/haeckel/kunstformen/liste.html)
The modern molecular clock: http://awcmee.massey.ac.nz/people/dpenny/pdf/BromhamPenny_2003.pdf
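As a supplement to the Jukes-Cantor derivation in the section on applying the molecular clock, the following sketch evaluates the maximum likelihood estimate t = -(1/λ) ln(1 - 4x/(3n)) and checks numerically that it maximizes the binomial log-likelihood. The function names and the values x = 3, n = 13, λ = 1 are assumptions chosen for illustration only.

```python
import math


def jc_time(x: int, n: int, lam: float = 1.0) -> float:
    """Divergence time estimate t = -(1/lam) * ln(1 - 4x/(3n))."""
    p = x / n
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differing sites must lie in [0, 0.75)")
    return -math.log(1 - 4 * p / 3) / lam


def log_likelihood(x: int, n: int, t: float, lam: float = 1.0) -> float:
    """Binomial log-likelihood of x differences out of n sites at time t > 0."""
    p = 0.75 * (1 - math.exp(-lam * t))  # probability that a site differs
    return (math.log(math.comb(n, x))
            + x * math.log(p) + (n - x) * math.log(1 - p))


# Hypothetical example: 3 differing sites out of 13, rate lam = 1.
t_hat = jc_time(3, 13)
print(round(t_hat, 4))
for t in (0.9 * t_hat, t_hat, 1.1 * t_hat):
    print(round(t, 4), round(log_likelihood(3, 13, t), 4))
```

The estimate is undefined once three quarters or more of the sites differ, because at that point the sequences are as different as two random sequences; the sketch raises an error in that case.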