Creating and comparing phylogenies using morphological data and molecular data

Bioinformatics Tools

The purpose of this exercise is to develop a basic understanding of bioinformatics and to learn some of the skills involved in constructing and interpreting phylogenetic trees and cladograms. The basic tools we will use are banks of data (DNA sequences and your character matrix) and programs that compare the similarities and differences between sequences or between the character states in your matrix. These programs display their results in the form of a tree. Simply generating the tree is not the end of the story: interpretation is a time-consuming and extensive process, and we will spend most of our time in this lab interpreting our trees.

Creating a phylogeny based on morphological data

The first step is to create a data file using Word. The format is very precise, so follow the directions very carefully. There must be a code (character state) for each character and phylum.

On the first line:
space three times
enter the number of taxa
space three times
enter the number of characters for each taxon
press Enter

On the second line:
type a three-letter code in capital letters to represent the phylum
space 10 times
enter the codes for each character state (they can wrap to the next line if necessary)
press Enter

Repeat for each phylum.

Save the file as a text file. This is very important: PHYLIP can’t read Word files.

Go to: http://bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html or http://portal.litbio.org/Registered/Option/phylip.html
Scroll down and choose PARS.
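Before the matrix is pasted into PARS, it may help to see the file layout described above produced by a short script instead of typed by hand. The following is only a minimal sketch in Python that follows the spacing rules exactly as this handout states them. The function name format_matrix, the phylum codes, the character states, and the file name matrix.txt are all made-up placeholders, not data from this lab.

```python
# Sketch only: builds the text file described in the directions above.
# The taxa, states, and file name below are hypothetical placeholders.


def format_matrix(matrix):
    """Return the file text for a dict of {three-letter code: state string}."""
    lengths = {len(states) for states in matrix.values()}
    if len(lengths) != 1:
        # Every phylum must have a code for every character.
        raise ValueError("every phylum needs a state for every character")
    n_chars = lengths.pop()
    # First line: three spaces, number of taxa, three spaces, number of characters.
    lines = ["   {}   {}".format(len(matrix), n_chars)]
    for code, states in matrix.items():
        if len(code) != 3 or not code.isupper():
            raise ValueError("use a three-letter capital code: " + code)
        # Each following line: the code, 10 spaces, then the character states.
        lines.append(code + " " * 10 + states)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {"BRY": "00110", "PTE": "01110", "ANG": "11111"}
    text = format_matrix(example)
    with open("matrix.txt", "w", encoding="ascii") as handle:
        handle.write(text)  # plain text, not a Word document
    print(text, end="")
```

The script refuses to write a matrix in which one phylum has fewer character states than another, which is the mistake most likely to make the program reject the file.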
You should now be at: http://bioweb.pasteur.fr/seqanal/interfaces/pars-simple.html
Enter your e-mail address.
Cut and paste your character matrix into the box provided.
Press “Run PARS”.

You should now be at: http://bioweb.pasteur.fr/cgi-bin/seqanal/pars.pl
In the pulldown box select “Drawgram”.
Press “Run the selected program on OUTTREE”.

You should now be at: http://bioweb.pasteur.fr/cgi-bin/seqanal/lib/connect.pl
Click on “Advanced drawgram form”.
Scroll down until you see “Drawgram options”.
Select “W:MS-Windows Bitmap” from the pulldown menu.
Go back to the top of the page and select “Run drawgram”.

You should now be at: http://bioweb.pasteur.fr/cgi-bin/seqanal/drawgram.pl
Double click on “plotfile”.
You should see your tree. Save your tree as a file and print a copy by right-clicking and selecting the appropriate option.

Attach your tree to the end of this lab, calling it Figure 1 and adding an appropriate title. 10 pts.

Creating a phylogeny based on molecular data

1. Get on to the internet and type in http://workbench.sdsc.edu/
2. Click on “Set up a free account” and follow the instructions to set up an account, OR use the account that you set up several weeks ago.
3. Click on “Enter the Biology Workbench 3.2”. All you have to do is type your name and password.
4. Once you are in, click on “Nucleic Tools”.
5. Highlight “Ndjinn – Multiple Database Search”. Click “Run”. There are many databases available. For our project we will use GBPLN (GenBank Plant Sequences, which includes the fungi and algae). We will be working on the plants in Table 1.

Group: Group 1 (flowering plants); Group 2 (bryophytes); Group 3 (algae); Group 4 (ferns); Group 5 (seedless vascular plants); Group 6 (ancestral gymnosperms); Group 7 (derived gymnosperms); Group 8

GBPLN #           18S rRNA gene sequence      Phylum
GBPLN:7595531     Salix reticulata            Angiosperm
GBPLN:19515       Lapageria rosea             Angiosperm
GBPLN:547571      Marchantia polymorpha       Hepatophyta
GBPLN:987712      Anthoceros agrestis         Anthocerotophyta
GBPLN:2463094     Hypnum cupressiforme        Bryophyta
GBPLN:515787      Siphonocladus tropicus      Chlorophyta
GBPLN:7242518     Geosiphon pyriforme         Bacillariophyta
GBPLN:2564932     Batrachospermum turfosum    Rhodophyta
GBPLN:2358227     Vaucheria bursata           Chrysophyta
GBPLN:511872      Prymnesium patelliferum     Prymnesiophyta
GBPLN:7981618     Ceratium furca              Dinophyta
GBPLN:5825256     Euglena anabaena            Euglenophyta
GBPLN:3401953     Fucus distichus             Phaeophyta
GBBCT:1668771     Nostoc sp. (16S rRNA)       Cyanobacteria (different database: GBBCT)
GBPLN:603755      Tmesipteris tannensis       Pteridophyta
GBPLN:7670254     Azolla pinnata              Pteridophyta
GBPLN:10765099    Dicksonia antarctica        Pteridophyta
GBPLN:861086      Psilotum nudum              Psilophyta
GBPLN:861139      Selaginella umbrosa         Lycopodophyta
GBPLN:19131       Lycopodium annotinum        Lycopodophyta
GBPLN:860933      Equisetum robustum          Equisetophyta
GBPLN:403026      Pinus wallichiana           Gymnosperms
GBPLN:2588905     Pinus luchuensis            Gymnosperms
GBPLN:2588904     Pinus elliottii             Gymnosperms
GBPLN:7670259     Gnetum montanum             Gnetophyta
GBPLN:7595579     Welwitschia mirabilis       Gnetophyta
GBPLN:556508      Ginkgo biloba               Ginkgophyta
GBPLN:1777635     Amborella                   Angiosperm (ancestral)

Table 1. List of the organisms.

6. Click on “GBPLN”. (You have to scroll down for this option.)
7. Scroll up and type in the number for one of the species on the list in Group 1. Click on “Show 10 hits”. Make sure that every team chooses a different combination of species, i.e. don’t always take the first species in each group.
8. Click on “Search”. You will see a list of choices. Scroll down until you see the number you are searching for in the GBPLN column. Highlight this line and click on the “Import Sequence(s)” button located at the end of the first line of the interactive box. If you get too many sequences, search by species name and then look for the specific GBPLN number.
9. Now we need to repeat the process to retrieve more sequences from the gene banks. Highlight “Ndjinn” and click “Run”.
10. Search and import the sequence of one organism from each group in Table 1.
11. Now we are ready to generate a tree. Select all the sequences EXCEPT Group 8. Click “Run”. All the boxes in front of these organism names should be checked.
12. Using the scroll box, scroll down and highlight “CLUSTALW – Multiple Sequence Alignment”. Click “Run”.
13. A new screen will appear. You can choose to make a rooted or unrooted tree by clicking on the arrow next to the box labeled “Guide tree display” and choosing rooted or unrooted. Then click “Submit”. The screen will go blank and you may have to wait several minutes. Wait until a screen titled “CLUSTALW” with “Sequence alignment” appears. Scroll down to examine the DNA sequences and how they align with each other. Scroll further down and you will see your tree! Right click on your tree and select “Copy” from the menu.
14. Open Microsoft Word. Go to “Edit” and “Paste” and your tree will reappear. Adjust the size of your tree by selecting the tree image and resizing it from the lower right corner so that two trees can fit on a page. Type in a label for each tree using consecutive figure numbers and adding appropriate titles.
15. Look up each number and write the corresponding phylum beside the number by hand. (Consult Table 1.) You can also label each one by pulling down the “Insert” menu and selecting “TextBox”. Click where you want the textbox to be located and type the name of the species. Drag the textbox next to the corresponding number.

Attach your tree to the lab and label it “Phylogeny of Kingdom Plantae”.

Time to think:

A. Compare your phylogeny from the character matrix with the molecular data tree. Where are there differences/similarities? Why do you think this might be the case? 3 pts.

B. Compare your phylogeny from the character matrix and the molecular data tree with information from your textbook. Where are there differences/similarities? What might account for any differences you see? 3 pts.

C. Re-examine your character matrix and the sequences you’ve chosen from GenBank. Make any adjustments you think are desirable. State the changes you make here and why you made them:

D. Rerun the programs to generate new trees. Number them, add appropriate titles, and attach them to this lab report. What changes occurred? Was this what you expected? Why or why not?

E. Using the information above and information from our work in class, draw a cladogram and clearly label at least one, preferably two, characteristics that separate each phylum we have studied. Indicate, using sequential numbers, where the evidence for this cladogram is in your phylogenetic trees and character matrix. 5 pts.

Turn over for last page of lab.

Reflection (4 pts.): As a result of the series of labs we have completed on the evolution of photosynthetic organisms, what is the most important idea/concept that you have learned? What is the most important understanding you have gained about systematics and phylogeny construction?

For this lab turn in:
This worksheet with the answers completed. (7 answers at 3 pts. each)
Phylogenetic trees as indicated in the lab, each numbered and titled, with all representative phyla labeled (at least 4 trees, depending on how many revisions you make). 16 pts.
Your character matrix in table form, with an additional table including the character, the coding for the character states, and the importance of the character in determining phylogenetic relationships. 9 pts.

Revised 10-29-07 BJB