Name ___________________________

BSC 695 – Molecular Systematics 2011 Final Examination

Instructions:

1. Type your name on each page of your exam.
2. You may use any resources you choose; however, you must cite every source you use (except class notes) at the end of each answer. You must work independently of other people.
3. Type your answers to these questions using this file. Make sure that I can tell where the question ends and your answer begins (make the questions bold, your answers plain text). Add additional rows as needed to accommodate your answers.
4. For short answer or essay questions, your answer should be in the form of complete sentences, paragraphs, etc. Grammar and spelling will be considered in awarding points.
5. When you have completed this exam, rename the file Lastname_finalexam.doc and return the exam to me via email.
6. This examination is due no later than 2 PM, Monday, May 2, 2011 (although I would like to begin grading much sooner than this date!).
7. Failure to follow any instructions will reduce your point score for this examination.

Questions

1. (4 pts.) Identify two advantages and two disadvantages of molecular phylogenetic methods.

2. (6 pts.) Below are two chromatograms from one specimen sequenced in the forward and reverse directions for a nuclear gene. The specimen sequenced is from a diploid species. Each pair of upper and lower red arrows corresponds to the same base position in the gene. Notice that at each of these positions two bases were read by the sequencer (e.g., for the first pair, the sequencer read c/g at this position; the second and third pairs were a/g). What is the most logical explanation to account for the double base-pair reads at these positions? Explain your answer.

3. (25 pts.) Why is alignment of DNA sequences critical to phylogenetic analyses, and why is it often a difficult problem?
For what kinds of DNA sequences is alignment typically most problematic, and for what kinds of sequences is it often straightforward, and why? What factors must be considered when aligning DNA sequences?

4. (3 pts.) Why are random addition sequences used in conjunction with heuristic searches?

5. (15 pts.) The three main types of tree searches we discussed were Exhaustive, Branch and Bound, and Heuristic. Briefly describe each of these, and describe each method in terms of: a) thoroughness of search; b) probability of finding the ‘best’ tree; c) speed.

6. (2 pts.) Describe the distinction between an algorithmic method and an optimality method.

7. (10 pts.) Nonparametric bootstrapping is a random resampling method widely used in phylogenetic analysis. Describe how a bootstrap analysis is performed, and indicate what information it provides. If bootstrap analysis finds strong (100%) support for a branch on a tree, what does that mean?

8. (15 pts.) We discussed three major families of phylogenetic methods: parsimony, likelihood, and distance methods. Briefly describe each of these methods, and contrast them against each other. Be sure to note important differences in their assumptions, and give a general outline of the steps involved in performing an analysis with each method.

9. (20 pts.) Bayesian and maximum likelihood analyses are both types of likelihood methods. Describe and distinguish between the concepts of probability and likelihood. What is a prior probability? What is Bayes’ Theorem? What assumptions are shared by both analyses? Finally, describe the Metropolis algorithm (use plain English, no math; I’m looking for a straightforward explanation).

10. (10 pts.) Explain how the likelihood of a tree is calculated in maximum likelihood analysis of DNA data.

11. (30 pts.) Describe four models of DNA sequence evolution and indicate the relationship among these models (i.e., are they nested?).
If so, indicate the hierarchy among these models. Given these four models, what assumptions can you identify that are shared by all of these models?

12. (10 pts.) We spent a great deal of time discussing models of molecular evolution and their application in phylogenetic reconstruction. Discuss what can happen in a phylogenetic analysis when the wrong model is used in maximum parsimony, maximum likelihood, and distance-based analyses.

13. (5 pts.) Maximum parsimony analyses during the mid- to late 1990s often used weighting methods to compensate for saturation of transitions relative to transversions in the data matrix. Systematists suddenly stopped weighting their parsimony analyses around 2000–2001. Provide the complete citation for the paper that you believe served as the basis for discontinuing the use of weighted parsimony analyses in molecular phylogenetics.

14. (20 pts.) “It is always better to have more sequence data for fewer taxa than to have more taxa with less sequence data.” Provide a thorough critique supporting or refuting this statement.

15. (10 pts.) Why is the Akaike Information Criterion considered superior to the likelihood ratio test when determining an appropriate model of sequence evolution using ModelTest?

16. (10 pts.) Consider the concept of branch lengths in the context of parsimony, distance, and likelihood analyses of DNA sequence data. Is branch length a similar metric with each of these methods? If so, what is the definition of branch length that applies to each of these methods? If not, how do they differ?

17. (5 pts.) Compare and contrast the ‘Felsenstein Zone’ with the ‘Farris Zone.’ Make sure to include in your discussion the conditions under which either situation might occur and how one might correct their data to account for these phenomena.

18. (15 pts.) Describe three ways that phylogenetic hypotheses can be used outside of evolutionary biology (e.g., medicine, agriculture, etc.).

19. (15 pts.) Imagine that you submitted sequences to ClustalW and got back the following alignment. Describe the changes you would make to this alignment, if any. Explain what effects on phylogenetic analysis your changes would have, and any parameters you would change if you were to submit the sequences to ClustalW again.

       Nucleotide Position
Taxa   1  2  3  4  5  6  7  8  9 10 11 12 13 14
Aus    A  C  G  -  A  A  T  A  C  -  A  A  T  A
Bus    A  C  C  A  -  A  C  A  T  G  -  -  T  A
Cus    A  C  C  A  -  A  C  G  C  G  -  -  T  A
Dus    G  C  C  A  -  A  C  A  C  G  -  -  T  G
Eus    A  C  C  -  C  A  T  A  C  G  -  C  T  A
Fus    A  C  G  -  A  A  T  A  C  -  A  A  T  A
Gus    A  C  C  C  -  A  T  A  C  G  -  -  T  A
Hus    A  C  G  -  A  A  T  A  C  -  A  A  T  A

20. (20 pts.) The field of systematic biology has its roots in centuries of comparative biology, but it is changing rapidly. Considering all the topics we covered over the past two semesters, what advice would you give to a beginning student in systematic biology, in terms of the 3-5 scientific or technical areas that will be most critical to her or his success?

21. (50 pts.) In some cases, we find a great deal of genetic differentiation between species, and in other cases we do not. Why is this so? Is it reasonable to set threshold values for the amount of genetic differentiation we would expect to find between species? For example, one often sees proposals that if two entities show a certain level of difference in DNA sequence, they should be considered to be separate species. What do different models or case studies of speciation tell us about the amount of genetic differentiation we expect to find between species? Given your answer, what recommendations would you make to those designing protocols for DNA-based taxonomy, for example, the Consortium for the Barcode of Life? I am sure they would appreciate constructive recommendations, something other than “give up, it’s hopeless.”