EPSc 230 Intro to Astrobiology
Phylogenetics Exercise

Goals:
1. construct a simple parsimony tree using characters
2. construct a simple distance matrix tree
3. see how easy it is to download DNA sequences from the web for free!

Two Types of Data for Constructing Trees

There are two major types of data that can be used to construct phylogenetic trees. One type is character data. Using this approach, trees are constructed based on the similarities that organisms share. For instance, conceptually you would group humans and chimps together because they share many morphological and behavioral features, and then you would place gorillas outside of this group, because gorillas share fewer features with humans and chimps than humans and chimps share with each other. The other major type of data is DNA sequence data. We will provide a simple example of each.

1. Character-Based Trees

We can create a chart of characters, listing which organisms have which characters (0 = lacks the character, 1 = has the character; for example, for character 1, 0 = is not a chordate and 1 = is a chordate). We can then look at which organisms have similar characters and build up a tree from there.

Characters:
1. Chordates
2. Backbone
3. Tetrapods
4. Amniotes
5. Feathers
6. Diapsids
7. Single Jaw Bone
8. Warm Blooded

Character:        1  2  3  4  5  6  7  8
Sea Squirt        1  0  0  0  0  0  0  0
Fish              1  1  0  0  0  0  0  0
Amphibians        1  1  1  0  0  0  0  0
Lizards/Snakes    1  1  1  1  0  1  0  0
Birds             1  1  1  1  1  1  0  1
Mammals           1  1  1  1  0  0  1  1

Using this dataset, try to build a tree for yourself. Use characters 1 through 7 to build your tree, but don't use Character #8 (warm blooded) just yet. The answer follows the definitions below. DO NOT LOOK AT THE ANSWER UNTIL YOU HAVE COMPLETED YOUR TREE!

Need a hint? Try looking at the characters that organisms share. These shared characters allow you to group the organisms together into nested sets.
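The hint can be made concrete with a small sketch. It assumes the character table is typed in as a Python dictionary; the names `CHARACTERS`, `MATRIX`, `taxa_with`, and `is_nested_or_disjoint` are illustrative and not part of the exercise. It lists which organisms share each character other than #8 (warm blooded) and leaves reading the nested sets, and drawing the tree, to you.

```python
# Character table from the exercise: 1 = organism has the character, 0 = it does not.
CHARACTERS = ["Chordates", "Backbone", "Tetrapods", "Amniotes",
              "Feathers", "Diapsids", "Single Jaw Bone", "Warm Blooded"]

MATRIX = {
    "Sea Squirt":     [1, 0, 0, 0, 0, 0, 0, 0],
    "Fish":           [1, 1, 0, 0, 0, 0, 0, 0],
    "Amphibians":     [1, 1, 1, 0, 0, 0, 0, 0],
    "Lizards/Snakes": [1, 1, 1, 1, 0, 1, 0, 0],
    "Birds":          [1, 1, 1, 1, 1, 1, 0, 1],
    "Mammals":        [1, 1, 1, 1, 0, 0, 1, 1],
}


def taxa_with(character):
    """Return the set of organisms scored 1 for the named character."""
    col = CHARACTERS.index(character)
    return {taxon for taxon, row in MATRIX.items() if row[col] == 1}


def is_nested_or_disjoint(a, b):
    """True if one set contains the other, or the two share no members."""
    return a <= b or b <= a or not (a & b)


if __name__ == "__main__":
    # Skip the last character (Warm Blooded), as the exercise asks.
    for name in CHARACTERS[:-1]:
        print(f"{name:16s} {sorted(taxa_with(name))}")
```

Two characters whose sets pass `is_nested_or_disjoint` can sit on the same tree without conflict, which is what makes grouping by nested sets possible.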
Definitions:
Chordate = has a notochord and a dorsal nerve cord at some stage of life
Tetrapods = have four limbs
Amniotes = have an additional sac around the embryo (amniotic egg)
Single Jaw Bone = mammals have one continuous lower jaw bone, but many other chordates have multiple bones in their jaws (like 5 or 6)
Diapsids = have two holes on each side of their head. We, and many other chordates, have only one hole on each side of our head (you can feel it on your temple when you chew).

Answer:

[Figure: answer tree. Tips, from the outermost branch inward: Sea Squirt, Fish, Amphibians, Mammals, Lizards/Snakes, Birds. Nested groups, from the root upward: Chordates (all six organisms), Vertebrates (all but sea squirts), Tetrapods (amphibians, mammals, lizards/snakes, birds), Amniotes (mammals, lizards/snakes, birds). Diapsids marks the branch leading to lizards/snakes and birds; Single Jaw Bone marks the branch leading to mammals.]

I'm sure you've noticed that not all characters are created equal:
1. some of the characters provide useful information on groupings, and
2. some of the characters provide no information on groupings, because they are either present in all organisms or found in only one organism.

Also note the following on the tree:

We have outgroups and ingroups. An ingroup is the particular group you want to study, and your outgroup is the next branch just outside of your ingroup. So, if our ingroup is the vertebrates, sea squirts are the outgroup. If we are more interested in mammals, lizards/snakes, and birds, then amphibians are the outgroup.

We also have sister groups. Mammals are the sister group to lizards/snakes and birds.

Now, look at Character #8 (warm blooded). Add it onto the tree above. What explanation do you have for this evolutionary pattern? Remember that constructing trees is an inference process, so there may be several explanations for this evolutionary pattern.

If there are several explanations (hypotheses to explain the observed pattern), how do we choose among them? Usually, we use the Principle of Parsimony:

The Principle of Parsimony states that the simplest explanation is preferable to more complex explanations.
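One way to make "simplest" concrete is to count the fewest gains or losses a character needs on a given tree. The sketch below does this with Fitch's parsimony counting, which is an assumption here (the exercise does not name a counting method), applied to the answer tree written as nested pairs. The names `TREE`, `WARM_BLOODED`, and `fitch_changes` are illustrative.

```python
# Answer tree as nested pairs (left, right); leaves are organism names.
TREE = ("Sea Squirt",
        ("Fish",
         ("Amphibians",
          ("Mammals", ("Lizards/Snakes", "Birds")))))

# Character #8 from the table: 1 = warm blooded, 0 = not.
WARM_BLOODED = {"Sea Squirt": 0, "Fish": 0, "Amphibians": 0,
                "Lizards/Snakes": 0, "Birds": 1, "Mammals": 1}


def fitch_changes(tree, states):
    """Return (candidate ancestral states, minimum changes) for a binary tree."""
    if isinstance(tree, str):
        # Leaf: the observed state, and no changes yet.
        return {states[tree]}, 0
    left_set, left_cost = fitch_changes(tree[0], states)
    right_set, right_cost = fitch_changes(tree[1], states)
    shared = left_set & right_set
    if shared:
        # The two children can agree, so no extra change is needed here.
        return shared, left_cost + right_cost
    # The children cannot agree: one change is required just below this node.
    return left_set | right_set, left_cost + right_cost + 1


if __name__ == "__main__":
    _, changes = fitch_changes(TREE, WARM_BLOODED)
    print("Minimum number of changes:", changes)  # prints 2
```

The count is only a minimum. It does not say which branches changed, so more than one reconstruction can reach the same number.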
So, let's look at the two possibilities:

Possibility 1:

[Figure: the same tree (tips: Sea Squirt, Fish, Amphibians, Lizards/Snakes, Birds, Mammals; groups: Chordates, Vertebrates, Tetrapods, Amniotes, Diapsids, Single Jaw Bone), with "Gain of warm blooded" marked on the branch leading to the amniotes and "Loss of warm blooded" marked on the branch leading to lizards/snakes.]

Here the warm-blooded character arose once, in the common ancestor of lizards/snakes, birds, and mammals, and was lost early in the lizards/snakes lineage. The net result of this pattern would be one gain and one loss.

Possibility 2:

[Figure: the same tree, with "Gain of warm blooded" marked on the branch leading to birds and again on the branch leading to mammals.]

Here the warm-blooded character arose twice, once in the ancestor of mammals and another time (independently) in the ancestor of birds. The net result of this pattern would be two gains.

So, which hypothesis is better? Both require two evolutionary changes, so parsimony alone cannot decide between them. This is where we need to gather more data to see if there is a larger body of evidence to support one hypothesis over the other.

2. Sequence-Based Trees

As described in lecture, we can build our trees with DNA sequences. One way to do that involves obtaining the sequences, aligning the sequences, and then calculating the tree. The following is an example of how to construct a tree using distance methods. There are other methods you can use to construct your DNA-based trees, including parsimony (like you did above) or methods that use more complicated models of evolution, such as maximum likelihood.

Sequence alignment:

Humans     CCAGTTCGGT
Gorilla    CCAGATCGGT
Pig        CCAGCACGGT
Rabbit     CCAGGCTGGT

The first step is to construct a distance matrix. Look at each pair of sequences and count the number of positions at which they differ. For example, humans and gorilla have one difference out of ten positions in the gene, or a 10% distance.
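The counting step can be sketched in a few lines. This assumes the sequences are already aligned and of equal length, and it reports the percentage of positions that differ; the names `ALIGNMENT` and `percent_distance` are illustrative.

```python
from itertools import combinations

# Aligned sequences from the example (10 positions each).
ALIGNMENT = {
    "Humans":  "CCAGTTCGGT",
    "Gorilla": "CCAGATCGGT",
    "Pig":     "CCAGCACGGT",
    "Rabbit":  "CCAGGCTGGT",
}


def percent_distance(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100 * differences / len(seq_a)


if __name__ == "__main__":
    # Print one line per pair of organisms: name, name, percent distance.
    for (name_a, seq_a), (name_b, seq_b) in combinations(ALIGNMENT.items(), 2):
        print(f"{name_a:8s} {name_b:8s} {percent_distance(seq_a, seq_b):.0f}")
```

The same function works for longer alignments; only the dictionary of sequences needs to change.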
You then fill in that part of the distance matrix:

           Humans   Gorilla   Pig   Rabbit
Humans
Gorilla    10
Pig        20       20
Rabbit     30       30        30

Next, construct the tree. First put the two most similar organisms together, and work your way out from there.

Then indicate the lengths of the branches (how much sequence change has occurred along each branch). Note that only the horizontal branches are counted here, not the vertical ones. Take the distance between your two closest organisms (humans and gorilla, 10) and divide it in half, so each of those branch lengths is 5. Then work your way out. The next set of distances is pig to humans/gorillas, which is 20. Divide that by 2: the pig branch is 10, and the branches on the human/gorilla side must also add up to 10 (5 + 5 = 10). Do the same with rabbit and all the other organisms. The distance between rabbit and pig is 30, so the total horizontal branch length between pig and rabbit must add up to 30.

Here is how you would draw the tree:

          +--5-- Humans
     +--5-+
     |    +--5-- Gorilla
+--5-+
|    +----10---- Pig
|
+------15------- Rabbit

Now, try a new example on your own:

Escherichia coli      AACGTTCTAGGCCCATACGG
Bacillus anthracis    AACGTTCTAGGGCCATACGG
Synechococcus         AACGTCGTAGGACCATCCGG
Chlorobium            AACGTCATAGGACCATGCGG
Methanococcus         ATCGTATAACGTCGATTCGG

In this list of organisms, E. coli is a gut organism that can make you sick if you ingest it (for example, in contaminated food), Bacillus anthracis is the organism that causes anthrax, Synechococcus is a cyanobacterium, Chlorobium is a green sulfur bacterium (photosynthetic), and Methanococcus is a methanogen. (Notice that genus and species names are italicized and the genus name is capitalized.)

So, count the number of sequence differences and calculate the distances. Fill out the distance matrix below, try to come up with a tree shape, and then work out the branch lengths:

                 E. coli   B. anthracis   Synechococcus   Chlorobium   Methanococcus
E. coli
B. anthracis
Synechococcus
Chlorobium
Methanococcus
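The tree-building recipe used for the human/gorilla/pig/rabbit example (join the closest pair, halve the distance, work outward) can be sketched as average-linkage clustering. Treat this as an illustration under assumptions: it uses the UPGMA scheme, which the exercise does not name, and the helper `upgma`, the `DIST` dictionary, and the Newick-style output string are choices made for this sketch. It is run on the worked example only, so the bacterial matrix is still yours to fill in.

```python
from itertools import combinations

# Worked-example distances (percent differences) from the filled-in matrix.
DIST = {
    frozenset(["Humans", "Gorilla"]): 10,
    frozenset(["Humans", "Pig"]): 20,
    frozenset(["Gorilla", "Pig"]): 20,
    frozenset(["Humans", "Rabbit"]): 30,
    frozenset(["Gorilla", "Rabbit"]): 30,
    frozenset(["Pig", "Rabbit"]): 30,
}


def upgma(names, dist):
    """Join the closest clusters repeatedly; return (tree_string, merges).

    Each merge is (cluster_a, cluster_b, node_height), where node_height is
    half the average distance between the members of the two clusters.
    """
    # cluster label -> (tuple of member organisms, height of the cluster's node)
    clusters = {name: ((name,), 0.0) for name in names}
    merges = []

    def average_distance(a, b):
        pairs = [(x, y) for x in clusters[a][0] for y in clusters[b][0]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        height = average_distance(a, b) / 2
        # Branch length = height of the new node minus height of the child node.
        label = (f"({a}:{height - clusters[a][1]:g},"
                 f"{b}:{height - clusters[b][1]:g})")
        members = clusters[a][0] + clusters[b][0]
        merges.append((a, b, height))
        del clusters[a], clusters[b]
        clusters[label] = (members, height)
    return next(iter(clusters)) + ";", merges


if __name__ == "__main__":
    tree, merges = upgma(["Humans", "Gorilla", "Pig", "Rabbit"], DIST)
    print(tree)
```

In the printed string, the number after each colon is a horizontal branch length, so the values can be compared directly with the tree drawn for the worked example. Averaging over all members of a cluster is a design choice; with the worked example every such average is already a single value (20 or 30), so it matches the by-hand method.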