Could agreement among gene trees be used as evidence of common ancestry?
Jessica Clarke and Flor Rodriguez
March 21, 2006

Arguments for common ancestry
- The genetic code is a "frozen accident"
- When life first arises, it alters the environment so as to make subsequent start-ups much less probable
- Species with a common ancestor are more likely to exhibit congruence in character-state patterns than species that originated separately

Hypothesis of common ancestry by Penny et al. (1982)
- Prediction: orthologous genes should lead to similar trees because they are expected to share the same evolutionary history
- Developed an algorithm guaranteed to find all minimal-length trees
- Implemented a tree-comparison metric to measure closeness
- Calculated the expected distribution of this metric
- Conclusion: the theory of evolution leads to quantitative predictions that are testable and falsifiable

Measuring the difference
[Figure: trees T1-T11 (complete data set) and tree groups T12-T17, T18, T19-T26, T27-T32, T33-T39 for the individual proteins: cytochrome c, fibrinopeptide A, fibrinopeptide B and the two haemoglobin data sets]

Measuring the difference
- The symmetric difference metric on two trees counts the number of edges that occur in one tree but not in both

Critique by Sober and Steel (2002)
- Common ancestry might be untestable
- Long ages of time might have erased the pertinent evidence
- $I(X, Y) \le n e^{-4rt}$, with n = 1000, t = 20 million years, r = 1 in 2 million years
- No method can infer X from Y with a probability that is any better than simply ignoring Y and blindly guessing X
- $I(X, Z) \le 4k \max_i \{ n_i e^{-4 r t_i} \}$, with n = 100, t = 20 million years, r = 1 in 100 million years, k = 10,000
- No method can reliably determine from these data how these four groups are related historically

Response from Penny et al. 2003
- Methods of tree construction based on parsimony assume common ancestry
- Methods other than parsimony can be used, and should be favored if they give more consistent results when analyzing and comparing different data sets

Response from Penny et al. 2003
- The hypothesis of common ancestry (CA) might be untestable
- Some alternatives to the theory of common ancestry can be formulated, tested and rejected:
  - The theory of influenza viruses from outer space
  - The theory that every species was created separately (ID)

Influenza viruses continue to arrive from outer space via comets (Hoyle and Wickramasinghe 1984, 1986)
- Under the theory of descent, a linear tree is expected
- If each epidemic was carried on a different comet, a correlation between order of arrival and phylogeny is not expected
- Test 1: probability of sequences occurring on a linear tree in the same order as the year of appearance
- P < 10^-6 that the linear tree (observed order) occurs by chance
- The theory of descent was not rejected

Influenza viruses from space
- Test 2: binary tree vs. star tree
[Figure: binary tree and star tree, each on sequences t1, t2, t3, t4]
- 1 in 10^64
- The Steiner tree (binary tree) was not rejected
- It is not necessary that all possible alternatives to a model be rejected simultaneously

Intelligent design
- Theory of descent vs. theory of individual creation
- Example: photosynthetic enzymes from plants living in hot, dry environments and from those living in a moist, temperate lawn
- Correct prediction
- The theory of descent leads to testable predictions

Agreement Between Gene Trees
Evidence for common descent... or NOT?

History of Life
- 3.5 byo - oldest prokaryotic fossils
- 1.7 byo - oldest eukaryotic fossils
- 545-525 myo - Cambrian explosion
- 475 myo - first land plants
- 400 myo - origin of vascular tissue
- 300 myo - origin of seed plants
- 130 myo - origin of flowering plants
(Campbell, 1999)

Main Sources
- www.talkorigins.org
- www.trueorigins.org

Main Arguments
- Trees do not match
- Design not ruled out
- Evolution is not falsifiable
- Molecules do not evolve according to predictions

Predictions Violated
- Common ancestry predicts agreement among trees.
- Trees do not agree perfectly.
- Therefore, the common ancestry claim is rejected.
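
The claim that trees "do not agree perfectly" presupposes some way of measuring disagreement. The sketch below is one possible Python rendering of the symmetric difference metric (Robinson and Foulds 1981) described under "Measuring the difference". The nested-tuple tree encoding, the function names and the example trees are assumptions made for illustration; they are not taken from the slides or from Penny et al.

```python
def leaf_set(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (internal edges) of a tree."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # fixed leaf used to pick one canonical side
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        below = leaf_set(node)
        # Canonical side of the edge: the side that does NOT hold the anchor.
        side = all_leaves - below if anchor in below else below
        # Keep only edges with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def symmetric_difference(tree_a, tree_b):
    """Count edges (splits) present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("both trees must be built on the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon trees, written as nested tuples of leaf names.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))

print(symmetric_difference(tree_1, tree_1))
print(symmetric_difference(tree_1, tree_2))
```

The trees are treated as unrooted, so the position of the outermost pair of parentheses does not change the result. Identical trees score 0, and the score grows as fewer internal edges are shared.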
Response

Number of taxa          Number of possible trees
Rooted    Unrooted
2         3             1
3         4             3
4         5             15
5         6             105
6         7             945
7         8             10,395
8         9             135,135
9         10            2,027,025
10        11            34,459,425
20        21            8,200,794,532,637,890,000,000 (approx.)

$N_R = (2n-3)!! = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$
(Theobald, 2006)

Design Not Rejected
- Anatomy and biochemistry are not independent.
- Organisms that are similar anatomically are similar biochemically, and vice versa.
- Thus, gene agreement could reflect design.
(Brand 1997)

Response
- There is no biological reason, besides common descent, that similar morphologies should have similar biochemistry.
- Besides, we can use neutral genes, and genes with vastly different functions, to construct trees.
(Theobald, 2006)

Not Falsifiable / Not Science
- Evolutionary predictions are shown false
- Evolution is not falsified.
- Thus, evolution is not falsifiable, and is not science.
- Possible examples: horizontal transfer, hybridization

Predictions Violated
- Evolution predicts that divergence between lineages is proportional to evolutionary distance (constant rate of evolution).
- The number of bp changes between lineages does not match predictions
- Therefore, the claim is false (& molecular data are bunk).
(Camp, 2001)

[Figure: cytochrome c differences among turtle, rattlesnake and human (22 bp, 14 bp)]
[Figure: cytochrome c differences among kangaroo, horse and human (12 bp, 10 bp)]

Response
- Common ancestry does not predict uniform rates.
- Even given uniform rates, events are stochastic, and thus should not match predictions exactly.
[Figure: Distribution of genetic distances between human and mouse genes. The histogram is the actual data from 2,019 human and mouse genes. The solid curve shows the expected distribution of genetic distances assuming only a constant rate of background mutation (~10^-9 substitutions per site per year) (reproduced from Figure 3a in Kumar and Subramanian 2002).]
(Theobald, 2006)

References
- Brand, Leonard. 1997. Faith, Reason, and Earth History. Andrews University Press, Berrien Springs, MI.
- Camp, Ashby. 2001. A critique of Douglas Theobald's "29 Evidences for Evolution". 09 March, 2006. www.trueorigin.org/theobald1a.asp
- Campbell, N., Reece J., Mitchell, L. 1999. Biology, fifth edition. Benjamin/Cummings, Menlo Park, CA.
- Kumar, S., and Subramanian, S. 2002. Mutation rates in mammalian genomes. Proc Natl Acad Sci. 99:803-808.
- Penny D., Hendy M., Zimmer E. and R. Hamby. 1990. Trees from sequences: Panacea or Pandora's box? Aus. Syst. Bot. 3:21-38.
- Penny D., Hendy M. and M. Steel. 1991. Testing the theory of descent. In: Phylogenetic analysis of DNA sequences. 155-183.
- Penny D., Foulds L. and M. Hendy. 1982. Testing the theory of evolution by comparing phylogenetic trees constructed from five different protein sequences. Nature. 297:197-200.
- Penny D., Hendy M. and A. Poole. 2003. Testing fundamental evolutionary hypotheses. J. Theor. Biol. 223:377-385.
- Robinson D. and L. Foulds. 1981. Comparison of phylogenetic trees. Math. Biosc. 53:131-147.
- Rokas A. and S. Carroll. 2005. More genes or more taxa? The relative contribution of gene number and taxon number to phylogenetic accuracy. Mol. Biol. Evol. 22(5):1337-1344.
- Rokas A., Williams B., King N. and S. Carroll. 2003. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature. 425:798-804.
- Sober E. and M. Steel. 2002. Testing the hypothesis of common ancestry. J. Theor. Biol. 218:395-408.
- Theobald, Douglas L. "29+ Evidences for Macroevolution: The Scientific Case for Common Descent." The Talk.Origins Archive. Vers. 2.85. 8 Jan, 2006. http://www.talkorigins.org/faqs/comdesc/
- Theobald, Douglas. "29+ Evidences for Macroevolution: A Response to Ashby Camp's 'Critique'." 21 March, 2002. www.talkorigins.org/faqs/comdesc/camp.html

The competing hypotheses
- H0: CA-1 (single ancestral origin)
- Ha: CA-i, i > 1 (separate origination events)

The competing hypotheses
- H0: CA-1 (single ancestral origin)
- Ha: CA-i, i > 1 (separate origination events)
- Simplest model:
  - A: all traits follow the same rules
  - B: each trait follows the same rules on all branches
  - C: all the changes that a single character can experience on a given branch must have the same probability
- Most complex model:
  - not-A: allow traits to follow different rules
  - not-B: allow a single trait to follow different rules on different branches
  - not-C: allow each possible change of a single trait on a single branch to have its own probability

The competing hypotheses
- A graph G_i and a process model M_j are combined into G_i & M_j
- Estimate the parameters in the model to obtain L(G_i & M_j)
- Possible comparisons: L(G_i & M_j) vs. L(G_i & M_k), and L(G_i & M_j) vs. L(G_k & M_j)
- The likelihood ratio test (LRT) does not apply
- One cannot compare different topologies that have different process models attached to them

Akaike information criterion (AIC)
- AIC is based on a theorem that describes how the predictive accuracy of a model M containing adjustable parameters can be estimated
- L(M) is the hypothesis obtained from M by assigning to the adjustable parameters the values that maximize the probability of the data
- Good fit to data increases predictive accuracy
- Penalty for complexity
- Applies to nested and non-nested models

Trees from sequences
Advantages
- scope or domain
- a character
- range of evolutionary rates
- large number of characters
- mechanism of evolution
- easier data handling
- expectation of useful characters
- cost of obtaining data
(Penny et al. 1990)

Limitations
- Sampling errors:
  - sequences too short
  - unrepresentative sequences
- Methodological problems:
  - large number of possible trees
  - incomplete use of information
  - converging to an incorrect tree
  - deviations from the standard model
- Human error:
  - errors in data and programming
  - misreading the tree

The limits to phylogeny reconstruction depend on the model

A good method for reconstructing trees should have the properties of being:
- fast
- efficient
- consistent
- robust
- falsifiable

Results from current methods should be treated as hypotheses for future testing
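
The "large number of possible trees" listed among the methodological problems can be made concrete with the rooted-tree count N_R = (2n-3)!! cited from Theobald (2006). The following is a minimal Python sketch of that formula; the function names are illustrative assumptions, and the unrooted count relies on the slide's pairing of n rooted taxa with n + 1 unrooted taxa.

```python
from math import factorial


def rooted_tree_count(n_taxa):
    """Number of rooted binary trees on n taxa: (2n-3)!/(2^(n-2) (n-2)!)."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    numerator = factorial(2 * n_taxa - 3)
    denominator = 2 ** (n_taxa - 2) * factorial(n_taxa - 2)
    return numerator // denominator  # the division is always exact


def unrooted_tree_count(n_taxa):
    """Unrooted trees on n taxa equal rooted trees on n - 1 taxa."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    return rooted_tree_count(n_taxa - 1)


# Print the count for a few taxon numbers to show how fast it grows.
for n in (2, 3, 4, 5, 6, 20):
    print(n, rooted_tree_count(n))
```

Because the count grows faster than exponentially with the number of taxa, exhaustive evaluation of every tree becomes impractical well before 20 taxa, which is why the number of trees is listed as a methodological problem.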