Agreement among gene trees
could be used as evidence of
common ancestry?
Jessica Clarke and Flor Rodriguez
March 21st, 2006
Arguments for common ancestry

the genetic code is a “frozen accident”

when life first arises it alters the environment so
as to make subsequent start-ups much less
probable

species with a common ancestor are more likely to
exhibit congruence in character state patterns
than species that originated separately
Hypothesis of common ancestry
by Penny et al. (1982)
Prediction:
Orthologous genes should lead to similar trees
because they are expected to share the same
evolutionary history



developed an algorithm guaranteed to find all
minimal-length trees
implemented a tree-comparison metric to measure
closeness
calculated the expected distribution of this metric
Conclusion:
Theory of evolution leads to quantitative predictions
that are testable and falsifiable
Measuring the difference
[Figure: minimal-length trees T1-T39, grouped by data set]
T1-T11: complete data set
T12-T17, T18, T19-T26: cytochrome c, fibrinopeptide A, fibrinopeptide B
T27-T32, T33-T39: haemoglobin α, haemoglobin β
Measuring the difference
The symmetric difference metric on two trees counts
the number of edges that occur in one, but not both, trees
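The metric defined above can be sketched in a few lines. This is a minimal illustration, not the implementation used by Penny et al.: it assumes trees are written as nested Python tuples of leaf names, and the function names (`leaves`, `splits`, `symmetric_difference`) and the example trees are hypothetical. Each internal edge is represented by the split of the leaf set it induces, and the metric counts the splits found in exactly one of the two trees.

```python
from itertools import chain


def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def splits(tree):
    """Collect the non-trivial splits (internal edges) of an unrooted tree."""
    all_leaves = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_leaves - side
        # An edge is informative only if both sides hold at least two leaves.
        if len(side) >= 2 and len(other) >= 2:
            # Store the split as an unordered pair so that A|B equals B|A.
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def symmetric_difference(tree_a, tree_b):
    """Count the edges (splits) that occur in one tree but not in both."""
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon trees used only for illustration.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(symmetric_difference(tree_1, tree_2))  # 2: AB|CDE and AC|BDE are unshared
```

Because each split is stored as an unordered pair, the result does not depend on where the nested-tuple representation happens to be rooted.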
Critique by Sober and Steel (2002)
Common ancestry might be untestable
Long ages of time might have erased the pertinent evidence
$I(X, Y) \le n\,e^{-4rt}$
n = 1000
t = 20 million years
r = 1 in 2 million years
No method can infer X from Y
with a probability that is any
better than simply ignoring Y
and blindly guessing X
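To see how small this bound becomes, the slide values can be plugged into the inequality. The sketch below is only arithmetic on the stated numbers; the function name `information_bound` is hypothetical, and the rate is read as one change per 2 million years.

```python
import math


def information_bound(n, r, t):
    """Upper bound n * exp(-4 * r * t) on the information I(X, Y)."""
    return n * math.exp(-4.0 * r * t)


n = 1000      # value of n given on the slide
t = 20e6      # elapsed time in years
r = 1 / 2e6   # one change per 2 million years

bound = information_bound(n, r, t)
print(f"4rt = {4 * r * t:.0f}, bound = {bound:.2e}")
```

With these values the exponent 4rt equals 40, so the bound is on the order of 10^-15: Y carries essentially no information about X.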
$I(X, Z) \le 4k \max_i \{ n_i e^{-4rt_i} \}$
n = 100
t = 20 million years
r = 1 in 100 million years
k = 10,000
No method can reliably
determine from this data
how these four groups are
related historically
Response from Penny et al. 2003
Methods of tree construction based on parsimony
assume common ancestry
Methods other than parsimony can be
used, and should be favored if they give
more consistent results when analyzing
and comparing different data sets
Response from Penny et al. 2003
The hypothesis of common ancestry (CA)
might be untestable
Some alternatives to the
theory of common ancestry
can be formulated, tested and rejected


The theory of influenza viruses from outer space
The theory that every species was created separately
(ID)
Influenza viruses continue to arrive
from outer space via comets
Hoyle and Wickramasinghe 1984, 1986


under the theory of descent, a linear tree is expected
if each epidemic was carried on a different comet, a
correlation between their order of arrival and their
phylogeny is not expected
Test 1:
• Probability of the sequences occurring on a linear tree in the
same order as their year of appearance
• $P < 10^{-6}$ that the linear tree (observed order) occurs by
chance

The theory of descent was not rejected
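As a rough intuition for why Test 1 yields such a small probability, consider a simplified null in which every ordering of n sequences is equally likely. This is an assumption made for illustration, not the calculation reported by Penny et al.; the function names are hypothetical, and the number of sequences in the actual data set is not given on the slide.

```python
import math


def chance_of_matching_order(n_sequences):
    """Probability that a random ordering matches one fixed order (1 / n!)."""
    return 1 / math.factorial(n_sequences)


def smallest_n_below(threshold):
    """Smallest number of sequences for which 1 / n! drops below threshold."""
    n = 1
    while chance_of_matching_order(n) >= threshold:
        n += 1
    return n


print(smallest_n_below(1e-6))  # 10, since 10! = 3,628,800
```

Under this simplified null, ten or more sequences arranged in order of appearance would already be enough to push the chance probability below 10^-6.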
Influenza viruses from space

Test 2:
[Figure: a binary tree and a star tree, each with leaves labelled t1, t2, t3, t4]
Binary tree vs. star tree
1 in $10^{64}$


Steiner tree (Binary tree) was not rejected
It is not necessary to reject all possible alternatives to a
model simultaneously
Intelligent design
Theory of descent vs. theory of individual creation
Example:
Photosynthetic enzymes from plants living in hot-dry
environments and those living in a moist-temperate lawn
correct prediction
Theory of descent leads to testable predictions
Agreement Between
Gene Trees
Evidence for common descent...
or NOT?
History of Life
• 3.5 byo - oldest prokaryotic fossils
• 1.7 byo - oldest eukaryotic fossils
• 545-525 myo - Cambrian explosion
• 475 myo - first land plants
• 400 myo - origin of vascular tissue
• 300 myo - origin of seed plants
• 130 myo - origin of flowering plants
Campbell, 1999
Main Sources
www.talkorigins.org
www.trueorigin.org
Main Arguments
• Trees do not match
• Design not ruled out
• Evolution is not falsifiable
• Molecules do not evolve according to predictions
Predictions Violated
• Common ancestry predicts agreement among trees.
• Trees do not agree perfectly.
• Therefore, the common ancestry claim is rejected.
Response
Number of Taxa          Number of Possible Trees
Rooted    Unrooted
2         3             1
3         4             3
4         5             15
5         6             105
6         7             945
7         8             10,395
8         9             135,135
9         10            2,027,025
10        11            34,459,425
20        21            8,200,794,532,637,890,000,000 (rounded)

$N_R = (2n-3)!! = \frac{(2n-3)!}{2^{n-2}(n-2)!}$, where n is the number of taxa in a rooted tree
Theobald, 2006
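The counts in the table follow directly from the double-factorial formula. The sketch below (function names are hypothetical) computes $(2n-3)!!$ as a product of odd numbers, checks it against the closed form, and uses the fact shown in the table that an unrooted tree on n taxa has as many topologies as a rooted tree on n-1 taxa.

```python
import math


def rooted_trees(n):
    """Number of rooted binary trees on n >= 2 taxa: (2n-3)!!."""
    if n < 2:
        raise ValueError("need at least two taxa")
    count = 1
    for odd in range(3, 2 * n - 2, 2):  # odd factors 3, 5, ..., 2n-3
        count *= odd
    return count


def rooted_trees_closed_form(n):
    """Same count via (2n-3)! / (2^(n-2) * (n-2)!)."""
    return math.factorial(2 * n - 3) // (2 ** (n - 2) * math.factorial(n - 2))


def unrooted_trees(n):
    """Unrooted trees on n >= 3 taxa equal rooted trees on n-1 taxa."""
    if n < 3:
        raise ValueError("need at least three taxa")
    return rooted_trees(n - 1)


for taxa in (2, 3, 4, 5, 10, 20):
    print(taxa, rooted_trees(taxa))
```

The numbers grow so quickly that two independently estimated trees for even a modest number of taxa are very unlikely to resemble each other by chance.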
Design Not Rejected

Anatomy and biochemistry are not independent.
• Organisms that are similar anatomically are similar
biochemically, and vice versa.
• Thus, gene agreement could reflect design.
Brand 1997
Response
• There is no biological reason, besides common descent, that similar morphologies should have similar biochemistry.
• Besides, we can use neutral genes, and genes with vastly different functions, to construct trees.
Theobald, 2006
Not Falsifiable / Not Science



Evolutionary predictions are shown to be false.
Evolution is not falsified.
Thus, evolution is not falsifiable, and is not science.
Possible examples
• horizontal transfer
• hybridization
Predictions Violated

Evolution predicts that divergence between
lineages is proportional to evolutionary distance
(constant rate of evolution).
• The number of bp changes between lineages does not match
predictions.

Therefore, the claim is false (and molecular data are
bunk).
Camp, 2001
[Figure: cytochrome c comparisons. Panel 1: turtle, rattlesnake, human (22 bp, 14 bp). Panel 2: kangaroo, horse, human (12 bp, 10 bp).]
Response
• Common ancestry does not predict uniform rates.
• Even given uniform rates, events are stochastic, and thus should not match predictions exactly.
Figure: Distribution of genetic distances between human and mouse genes. The histogram is the
actual data from 2,019 human and mouse genes. The solid curve shows the expected distribution of
genetic distances assuming only a constant rate of background mutation (~$10^{-9}$ substitutions per
site per year) (reproduced from Figure 3a in Kumar and Subramanian 2002).
Theobald, 2006
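The point that stochastic events should not match predictions exactly can be illustrated with a simple counting model. The sketch below assumes, purely for illustration, that the number of changes on a lineage is Poisson distributed with a fixed expected value; the expected value of 12 and the function names are hypothetical and are not taken from the cited data.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected number is mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_between(mean, low, high):
    """Probability that the count falls in the inclusive range [low, high]."""
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))


expected_changes = 12                      # hypothetical expected count
spread = math.sqrt(expected_changes)       # Poisson standard deviation
p_exact = poisson_pmf(expected_changes, expected_changes)

print(f"standard deviation: {spread:.2f}")
print(f"P(count is exactly {expected_changes}): {p_exact:.3f}")
print(f"P(count within 2 of expectation): {prob_between(expected_changes, 10, 14):.3f}")
```

Even under a perfectly constant rate, the observed count matches the expected count only a minority of the time, so modest disagreements between lineages are not by themselves evidence against the model.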
References









Brand, Leonard. 1997. Faith, Reason, and Earth History. Andrews University Press, Berrien Springs, MI.
Camp, Ashby. 2001. A critique of Douglas Theobald’s “29 Evidences for Evolution”. 09 March, 2006. www.trueorigin.org/theobald1a.asp
Campbell, N., Reece J., Mitchell, L. 1999. Biology, fifth edition. Benjamin/Cummings, Menlo Park, CA.
Kumar, S., and Subramanian, S. 2002. Mutation rates in mammalian genomes. Proc. Natl. Acad. Sci. 99:803-808.
Penny D., Hendy M., Zimmer E. and R. Hamby. 1990. Trees from sequences: Panacea or Pandora’s box? Aus. Syst. Bot. 3:21-38.
Penny D., Hendy M. and M. Steel. 1991. Testing the theory of descent. In: Phylogenetic analysis of DNA sequences. 155-183.
Penny D., Foulds L. and M. Hendy. 1982. Testing the theory of evolution by comparing phylogenetic trees constructed from five different protein sequences. Nature. 297:197-200.
Penny D., Hendy M. and A. Poole. 2003. Testing fundamental evolutionary hypotheses. J. Theor. Biol. 223:377-385.
Robinson D. and L. Foulds. 1981. Comparison of phylogenetic trees. Math. Biosc. 53:131-147.
Rokas A. and S. Carroll. 2005. More genes or more taxa? The relative contribution of gene number and taxon number to phylogenetic accuracy. Mol. Biol. Evol. 22(5):1337-1344.
Rokas A., Williams B., King N. and S. Carroll. 2003. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature. 425:798-804.
Sober E. and M. Steel. 2002. Testing the hypothesis of common ancestry. J. Theor. Biol. 218:395-408.
Theobald, Douglas L. “29+ Evidences for Macroevolution: The Scientific Case for Common Descent.” The Talk.Origins Archive. Vers. 2.85. 8 Jan, 2006. http://www.talkorigins.org/faqs/comdesc/
Theobald, Douglas. “29+ Evidences for Macroevolution: A Response to Ashby Camp’s ‘Critique’.” 21 March, 2002. www.talkorigins.org/faqs/comdesc/camp.html
The competing hypotheses
H0: CA-1, single ancestral origin
Ha: CA-i, i > 1, separate origination events
The competing hypotheses
H0: CA-1, single ancestral origin
Ha: CA-i, i > 1, separate origination events
Simplest model
A: all traits follow the same rules
B: each trait follows the same rules on all branches
C: all the changes that a single character can experience on a given branch must have the same probability
Most complex model
not-A: allow traits to follow different rules
not-B: allow a single trait to follow different rules on different branches
not-C: allow each possible change of a single trait on a single branch to have its own probability
The competing hypotheses
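Graph: G_i
Process model: M_j
G_i & M_j: estimate parameters in the model to obtain L(G_i & M_j)
Same graph, different process models: L(G_i & M_j) vs. L(G_i & M_k)
Different graphs, same process model: L(G_i & M_j) vs. L(G_k & M_j)
The likelihood ratio test (LRT) does not apply
One cannot compare different topologies
that have different process models
attached to them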
Akaike information criterion (AIC)

AIC is based on a theorem that describes how the predictive accuracy of a model M containing adjustable parameters can be estimated

L(M) is the hypothesis obtained from M by assigning values to the adjustable parameters that maximize the probability of the data

Good fit to data increases predictive accuracy

Penalty for complexity

Applies to nested and non-nested models
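The trade-off between fit and complexity can be made concrete with the standard form of the criterion, AIC = 2k - 2 ln L, where k is the number of adjustable parameters and ln L is the maximized log-likelihood; lower scores indicate better estimated predictive accuracy. The slides do not state the formula, so this form, the function name `aic`, and the log-likelihood values below are all assumptions made for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion in its standard form: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical maximized log-likelihoods and parameter counts.
candidates = {
    "simple model": (-1250.0, 3),
    "complex model": (-1247.5, 9),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("preferred:", best)
```

Here the complex model fits slightly better, but its six extra parameters cost more than the improvement in fit is worth, so the simpler model is preferred. No nesting relationship between the two models is required for the comparison.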
Trees from sequences
Advantages
• scope or domain of a character
• range of evolutionary rates
• large number of characters
• mechanism of evolution
• easier data handling
• expectation of useful characters
• cost of obtaining data
Penny et al. 1990
Limitations
• Sampling errors:
- sequences too short
- unrepresentative sequences
• Methodological problems:
- large number of possible trees
- incomplete use of information
- converging to an incorrect tree
- deviations from the standard model
• Human error:
- errors in data and programming
- misreading the tree
The limits to phylogeny reconstruction
depend on the model
A good method for reconstructing trees should
have the properties of being
fast
efficient
consistent
robust
falsifiable
Results from current methods should be treated
as hypotheses for future testing