Phylogenetics Exercise

EPSc 230 Intro to Astrobiology
Goals:
1. construct a simple parsimony tree using characters
2. construct a simple distance matrix tree
3. see how easy it is to download DNA sequences from the web for free!
Two Types of Data for Constructing Trees
There are two major types of data that can be used to construct phylogenetic
trees. One type is character data. Using this approach, trees are constructed
based on the similarities that organisms share. For instance,
conceptually you would group humans and chimps together because they share many
morphological and behavioral features, and then you would place gorillas outside of this
group, because gorillas share fewer of these features with humans and chimps than
humans and chimps share with each other. The other major
type of data is DNA sequence data. We will provide a simple example of each.
1. Character-Based Trees
We can create a chart of characters, listing which organisms have which characters (where
0 = character absent and 1 = character present; for example, under Chordates, 0 = is not
a chordate and 1 = is a chordate). We can then look at which organisms have
similar characters and build up a tree from there.
Characters:
1. Chordates
2. Backbone
3. Tetrapods
4. Amniotes
5. Feathers
6. Diapsids
7. Single Jaw Bone
8. Warm Blooded

Character         1  2  3  4  5  6  7  8
Sea Squirt        1  0  0  0  0  0  0  0
Fish              1  1  0  0  0  0  0  0
Amphibians        1  1  1  0  0  0  0  0
Lizards/Snakes    1  1  1  1  0  1  0  0
Birds             1  1  1  1  1  1  0  1
Mammals           1  1  1  1  0  0  1  1
Using this dataset, try to build up a tree for yourself. Use characters 1 through 7 to build
your tree, but don’t use Character #8 (warm blooded) just yet. The answer appears below,
after the definitions. DO NOT LOOK AT THE ANSWER UNTIL YOU HAVE COMPLETED YOUR
TREE!
Need a hint? Try looking at the characters that organisms share. These shared
characters allow you to group them together into nested sets.
Definitions:
Chordates = have a dorsal nerve cord supported by a notochord (a stiffening rod) at some life stage
Tetrapods = have four limbs
Amniotes = have an additional sac around the embryo (amniotic egg)
Single Jaw Bone = mammals have one continuous jaw bone, but many other
chordates have multiple bones in their jaws (like 5 or 6).
Diapsids = have two holes on the side of their head. We, and many other
chordates, have only one hole in the side of our head (you can feel it on your
temple when you chew).
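Answer:
[Figure: tree with tips Sea Squirt, Fish, Amphibians, Lizards/Snakes, Birds, and Mammals.
Branch labels, from the root upward: Chordates, Vertebrates, Tetrapods, Amniotes.
Diapsids labels the branch leading to Lizards/Snakes + Birds, and Single Jaw Bone
labels the branch leading to Mammals.]
Written as nested groups:
(Sea Squirt, (Fish, (Amphibians, (Mammals, (Lizards/Snakes, Birds)))))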
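The "nested sets" idea from the hint can be checked mechanically. The following is a minimal Python sketch, not part of the original exercise: it lists the organisms that share each of characters 1 through 7 and tests whether every pair of groups is either nested or completely separate. The names `MATRIX`, `groups_by_character`, and `is_nested` are made up for this illustration.

```python
# Character matrix from the exercise (characters 1-7 only; 1 = present, 0 = absent).
CHARACTERS = ["Chordates", "Backbone", "Tetrapods", "Amniotes",
              "Feathers", "Diapsids", "Single Jaw Bone"]
MATRIX = {
    "Sea Squirt":     [1, 0, 0, 0, 0, 0, 0],
    "Fish":           [1, 1, 0, 0, 0, 0, 0],
    "Amphibians":     [1, 1, 1, 0, 0, 0, 0],
    "Lizards/Snakes": [1, 1, 1, 1, 0, 1, 0],
    "Birds":          [1, 1, 1, 1, 1, 1, 0],
    "Mammals":        [1, 1, 1, 1, 0, 0, 1],
}

def groups_by_character(matrix, characters):
    """Map each character to the set of organisms that have it."""
    return {
        name: {org for org, row in matrix.items() if row[i] == 1}
        for i, name in enumerate(characters)
    }

def is_nested(groups):
    """True if every pair of groups is either nested or disjoint."""
    sets = list(groups.values())
    return all(
        a <= b or b <= a or not (a & b)
        for i, a in enumerate(sets)
        for b in sets[i + 1:]
    )

groups = groups_by_character(MATRIX, CHARACTERS)
# Largest groups first: each later group sits inside, or apart from, the earlier ones.
for name, members in sorted(groups.items(), key=lambda kv: -len(kv[1])):
    print(f"{name:16s} {sorted(members)}")
print("All groups nested or disjoint:", is_nested(groups))
```

Because the groups nest cleanly, they can be drawn as a single tree with no conflicts among these seven characters.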
I’m sure you’ve noticed that not all characters are created equal in that:
1. some of the characters provide useful information on groupings, and
2. some of the characters provide no information on groupings, because they are
either present in all organisms, or found in only one organism.
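As a rough illustration of the two cases above, the Python sketch below applies the rule exactly as stated in this handout (a character is uninformative for grouping if it is present in all organisms or in only one) to characters 1 through 7. This is the handout's simplified rule, not the stricter definition used by parsimony software, and the names `COLUMNS` and `is_informative` are invented for the example.

```python
ORGANISMS = ["Sea Squirt", "Fish", "Amphibians", "Lizards/Snakes", "Birds", "Mammals"]

# One column of the character chart per entry, in the organism order above.
COLUMNS = {
    "Chordates":       [1, 1, 1, 1, 1, 1],
    "Backbone":        [0, 1, 1, 1, 1, 1],
    "Tetrapods":       [0, 0, 1, 1, 1, 1],
    "Amniotes":        [0, 0, 0, 1, 1, 1],
    "Feathers":        [0, 0, 0, 0, 1, 0],
    "Diapsids":        [0, 0, 0, 1, 1, 0],
    "Single Jaw Bone": [0, 0, 0, 0, 0, 1],
}

def is_informative(column):
    """Handout rule: useful for grouping only if shared by more than one
    organism but not by all of them."""
    present = sum(column)
    return 1 < present < len(column)

informative = [name for name, col in COLUMNS.items() if is_informative(col)]
uninformative = [name for name, col in COLUMNS.items() if not is_informative(col)]

print("Informative for grouping:", informative)
print("No grouping information: ", uninformative)
```

Characters found in a single organism, such as Feathers, can still be drawn on the tree as a label for that branch; they just do not help decide which organisms belong together.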
Also note the following on the tree:
We have outgroups and ingroups. An ingroup is the particular group you want to study,
and your outgroup is the next branch just outside of your ingroup. So, if our ingroup is
the vertebrates, sea squirts are the outgroup. If we are more interested in mammals,
lizards/snakes, and birds, then amphibians are the outgroup.
We also have sister groups. Mammals are the sister group to lizards/snakes and birds.
Now, look at Character #8 (warm blooded). Add it onto the tree above. What
explanation do you have for this evolutionary pattern? Remember that constructing trees
is an inference process, so there may be several explanations for this evolutionary pattern.
If there are several explanations (hypotheses to explain the observed pattern), how do we
choose among them? Usually, we use the Principle of Parsimony:
The Principle of Parsimony states that the simplest explanation is preferable over more
complex explanations.
So, let’s look at the two possibilities:
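Possibility 1:
[Figure: the same tree (tips Sea Squirt, Fish, Amphibians, Lizards/Snakes, Birds, Mammals;
branch labels Chordates, Vertebrates, Tetrapods, Amniotes, Diapsids, Single Jaw Bone),
with "Gain of warm blooded" marked on the branch leading to the amniotes and
"Loss of warm blooded" marked on the branch leading to Lizards/Snakes.]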
Here the character warm-blooded arose once, in the ancestor that gave rise to
lizards/snakes, birds, and mammals, and the warm-blooded character was lost early in the
lizards/snakes lineage. The net result of this pattern would be one gain and one loss.
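Possibility 2:
[Figure: the same tree (tips Sea Squirt, Fish, Amphibians, Lizards/Snakes, Birds, Mammals;
branch labels Chordates, Vertebrates, Tetrapods, Amniotes, Diapsids, Single Jaw Bone),
with "Gain of warm blooded" marked twice: once on the branch leading to Mammals and
once on the branch leading to Birds.]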
Here the character warm-blooded arose twice, once in the ancestor to mammals and
another time (independently) in the ancestor to birds. The net result of this pattern would
be two gains.
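So, which hypothesis is better? Both require two evolutionary changes (one gain plus one
loss, or two gains), so parsimony alone cannot choose between them. This is where we need
to gather more data to see if there is a larger body of evidence to support one hypothesis
over the other.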
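The count of changes can also be worked out by a program. The Python sketch below uses Fitch's algorithm, a standard small-parsimony method that this handout does not name or require, to find the minimum number of changes needed to explain the warm-blooded character on the answer tree. Writing the tree as nested pairs, and the names `TREE` and `fitch`, are assumptions made for this illustration.

```python
# The answer tree written as nested pairs (an assumed encoding for this sketch).
TREE = ("Sea Squirt",
        ("Fish",
         ("Amphibians",
          ("Mammals", ("Lizards/Snakes", "Birds")))))

# Character #8 from the chart: 1 = warm blooded, 0 = not warm blooded.
WARM_BLOODED = {"Sea Squirt": 0, "Fish": 0, "Amphibians": 0,
                "Lizards/Snakes": 0, "Birds": 1, "Mammals": 1}

def fitch(node, states):
    """Return (possible ancestral states, minimum changes) for the subtree at node."""
    if isinstance(node, str):              # a tip: its state is observed directly
        return {states[node]}, 0
    left_states, left_cost = fitch(node[0], states)
    right_states, right_cost = fitch(node[1], states)
    shared = left_states & right_states
    if shared:                             # the two sides can agree: no extra change
        return shared, left_cost + right_cost
    # The two sides disagree: at least one change is needed at this node.
    return left_states | right_states, left_cost + right_cost + 1

root_states, min_changes = fitch(TREE, WARM_BLOODED)
print("Minimum number of changes:", min_changes)
```

The minimum is two changes, which is what both Possibility 1 and Possibility 2 require, so the two hypotheses are tied under parsimony.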
2. Sequence-Based Trees
As described in lecture, we can build our trees with DNA sequences. One way to do that
involves obtaining the sequences, aligning the sequences, and then calculating the tree.
The following is an example of how to construct a tree using distance methods. There
are other methods you can use to construct your DNA-based trees, including using
parsimony (like you did above), or using other methods that use more complicated
models of evolution, such as maximum likelihood.
Sequence alignment:
Humans     CCAGTTCGGT
Gorilla    CCAGATCGGT
Pig        CCAGCACGGT
Rabbit     CCAGGCTGGT
The first step is to construct a distance matrix. Look at each pair of
sequences and count the number of positions at which they differ. For example, humans and
gorilla have one difference out of ten positions in the gene, or a 10% distance. You then
fill that part of the distance matrix out:
           Humans   Gorilla   Pig   Rabbit
Humans
Gorilla    10
Pig        20       20
Rabbit     30       30        30
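The counting step can be sketched in a few lines of Python. This is only an illustration of the pairwise counting described above, using the four sequences from the alignment; the names `ALIGNMENT` and `percent_distance` are invented for the example.

```python
from itertools import combinations

# The aligned sequences from the example above.
ALIGNMENT = {
    "Humans":  "CCAGTTCGGT",
    "Gorilla": "CCAGATCGGT",
    "Pig":     "CCAGCACGGT",
    "Rabbit":  "CCAGGCTGGT",
}

def percent_distance(seq_a, seq_b):
    """Percentage of aligned positions at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100 * differences / len(seq_a)

# One entry per pair of organisms, i.e. the lower triangle of the matrix.
distances = {
    (a, b): percent_distance(ALIGNMENT[a], ALIGNMENT[b])
    for a, b in combinations(ALIGNMENT, 2)
}

for (a, b), d in distances.items():
    print(f"{a:8s} {b:8s} {d:.0f}")
```

The same function works for any set of aligned sequences of equal length, so it can be used to check a hand-counted matrix.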
Then construct the tree. First put the two most similar organisms together,
and work your way out from there. Next, indicate the lengths of the
branches (how much sequence change has occurred along each branch). Note that the
vertical branches are not counted here; only the horizontal branches are. Look at the
distance between your two closest organisms (humans and gorilla, 10),
and divide that in half, so each of those branch lengths is 5. Then work your way
out. The next set of distances is pig to humans/gorillas, which are both 20.
Divide that by 2: the pig branch is 10, and the branches on the human/gorilla side must
also add up to 10 (5+5=10). Do the same thing with rabbit and all the other organisms.
Rabbit to pig is 30, so the total horizontal branch lengths between pig and rabbit must
add up to 30 (15 on the rabbit branch, and 5+10=15 on the pig side).
Here is how you would draw the tree (numbers are horizontal branch lengths):

          +--5-- Humans
     +--5-+
     |    +--5-- Gorilla
+--5-+
|    +----10---- Pig
|
+------15------- Rabbit
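The join-the-closest-pair procedure described above resembles the average-linkage clustering method known as UPGMA. The handout does not name its method, so treating it as UPGMA is an assumption. The Python sketch below repeatedly merges the two closest clusters, places the new node at half their distance, and reports the branch length on each side; the function name `upgma` and the data layout are invented for this example.

```python
from itertools import combinations

# Distances (percent) from the matrix above, keyed by unordered pairs of organisms.
DISTANCES = {
    frozenset({"Humans", "Gorilla"}): 10, frozenset({"Humans", "Pig"}): 20,
    frozenset({"Gorilla", "Pig"}): 20,    frozenset({"Humans", "Rabbit"}): 30,
    frozenset({"Gorilla", "Rabbit"}): 30, frozenset({"Pig", "Rabbit"}): 30,
}

def upgma(dist):
    """Return a list of merges: (cluster_a, cluster_b, branch_a, branch_b)."""
    taxa = sorted({t for pair in dist for t in pair})
    # A cluster is a tuple of its member organisms; height = distance from node to tips.
    height = {(t,): 0.0 for t in taxa}
    d = {frozenset({(a,), (b,)}): dist[frozenset({a, b})]
         for a, b in combinations(taxa, 2)}
    merges = []
    while len(height) > 1:
        pair = min(d, key=d.get)           # the two closest clusters
        a, b = sorted(pair)
        new, new_height = a + b, d[pair] / 2
        merges.append((a, b, new_height - height[a], new_height - height[b]))
        for c in height:                   # distance from the new cluster to the others
            if c in (a, b):
                continue
            d_ac = d.pop(frozenset({a, c}))
            d_bc = d.pop(frozenset({b, c}))
            d[frozenset({new, c})] = (len(a) * d_ac + len(b) * d_bc) / (len(a) + len(b))
        del d[pair], height[a], height[b]
        height[new] = new_height
    return merges

for a, b, branch_a, branch_b in upgma(DISTANCES):
    print(f"join {a} (branch {branch_a:g}) with {b} (branch {branch_b:g})")
```

For the four-organism example this gives the same branch lengths as the hand calculation: 5 and 5 for gorilla and humans, then 5 and 10 when pig joins, then 5 and 15 when rabbit joins. If two pairs tie for the smallest distance, this sketch simply takes whichever it encounters first.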
Now, try a new example on your own:
Escherichia coli      AACGTTCTAGGCCCATACGG
Bacillus anthracis    AACGTTCTAGGGCCATACGG
Synechococcus         AACGTCGTAGGACCATCCGG
Chlorobium            AACGTCATAGGACCATGCGG
Methanococcus         ATCGTATAACGTCGATTCGG
In this list of organisms, E. coli is a gut bacterium, some strains of which can make you
sick if you ingest them (for example, in contaminated food); Bacillus anthracis is the
organism that causes anthrax; Synechococcus is a cyanobacterium; Chlorobium is a green
sulfur bacterium (photosynthetic); and Methanococcus is a methanogen. (Notice that genus
and species names are italicized and the genus name is capitalized.)
So, count the number of sequence differences and calculate the distances. Fill out the
distance matrix below, try to come up with a tree shape, and then the branch lengths:
                 E. coli   B. anthracis   Synechococcus   Chlorobium   Methanococcus
E. coli
B. anthracis
Synechococcus
Chlorobium
Methanococcus