Name _____________________
BSC 695 – Molecular Systematics 2009 Final Examination

Instructions:
1. Type your name on each page of your exam.
2. You may use any resources you choose; however, you must cite every source you use (except class notes) at the end of each answer. You must work independently of other people.
3. Type your answers to these questions using this file. Make sure that I can tell where the question ends and your answer begins (make the questions bold, your answers plain text). Add additional rows as needed to accommodate your answers.
4. For short answer or essay questions, your answer should be in the form of complete sentences, paragraphs, etc. Grammar and spelling will be considered in awarding points.
5. When you have completed this exam, rename the file Lastname_finalexam.doc and return the exam to me via email.
6. This examination is due no later than 2 PM, Thursday, May 7, 2009 (although I would like to begin grading much sooner than this date!).
7. Failure to follow any instructions will reduce your point score for this examination.

Questions
1. (4 pts.) Identify two advantages and two disadvantages of molecular phylogenetic methods.
2. (6 pts.) Below are two chromatograms from one specimen sequenced in the forward and reverse directions for a nuclear gene. The specimen sequenced is from a diploid species. Each pair of upper and lower red arrows corresponds to the same base position in the gene. Notice that at each of these positions two bases were read by the sequencer (e.g., for the first pair, the sequencer read c/g at this position; the second and third pairs were a/g). What is the most logical explanation to account for the double base-pair reads at these positions? Explain your answer. (You can delete this image when you have finished with this question.)
3. (25 pts.) Why is alignment of DNA sequences critical to phylogenetic analyses, and why is it often a difficult problem?
For what kinds of DNA sequences is alignment typically most problematical, and for what kinds of sequences is it often straightforward, and why? What factors must be considered when aligning DNA sequences?
4. (3 pts.) Why are random addition sequences used in conjunction with heuristic searches?
5. (15 pts.) The three main types of tree searches we discussed were: Exhaustive, Branch and Bound, and Heuristic. Briefly describe each of these, and describe each method in terms of: a) thoroughness of search; b) probability of finding the ‘best’ tree; c) speed.
6. (2 pts.) Describe the distinction between an algorithmic method and an optimality method.
7. (10 pts.) Nonparametric bootstrapping is a random resampling method widely used in phylogenetic analysis. Describe how a bootstrap analysis is performed, and indicate what information it provides. If bootstrap analysis finds strong (100%) support for a branch on a tree, what does that mean?
8. (15 pts.) We discussed three major families of phylogenetic methods: parsimony, likelihood, and distance methods. Briefly describe each of these methods, and contrast them against each other. Be sure to note important differences in their assumptions, and give a general outline of the steps involved in performing an analysis with each method.
9. (20 pts.) Bayesian and maximum likelihood analyses are both types of likelihood methods. Describe and distinguish between the concepts of probability and likelihood. What is a prior probability? What is Bayes’ theorem? What assumptions are shared by both analyses? Finally, describe the Metropolis algorithm (use plain English, no math; I’m looking for a straightforward explanation).
10. (10 pts.) Explain how the likelihood of a tree is calculated in maximum likelihood analysis of DNA data.
11. (30 pts.) Describe four models of DNA sequence evolution and indicate the relationship among these models (i.e., are they nested?).
If so, indicate the hierarchy among these models. Given these four models, what assumptions can you identify that are shared by all of these methods?
12. (10 pts.) We spent a great deal of time discussing models of molecular evolution and their application in phylogenetic reconstruction. Discuss what can happen in a phylogenetic analysis when the wrong model is used in maximum parsimony, maximum likelihood, and distance-based analyses.
13. (5 pts.) Maximum parsimony analyses during the mid-to-late 1990s often used weighting methods to compensate for saturation of transitions relative to transversions in the data matrix. Systematists suddenly stopped weighting their parsimony analyses around 2000–2001. Provide the complete citation for the paper that you believe served as the basis for discontinuing the use of weighted parsimony analyses in molecular phylogenetics.
14. (20 pts.) “It is always better to have more sequence data for fewer taxa than to have more taxa with less sequence data.” Provide a thorough critique supporting or refuting this statement.
15. (10 pts.) Why is the Akaike Information Criterion considered superior to the log likelihood ratio test when determining an appropriate model of sequence evolution using ModelTest?
16. (10 pts.) Consider the concept of branch lengths in the context of parsimony, distance, and likelihood analyses of DNA sequence data. Is branch length a similar metric with each of these methods? If so, what is the definition of branch length that applies to each of these methods? If not, how do they differ?
17. (5 pts.) Compare and contrast the ‘Felsenstein Zone’ with the ‘Farris Zone.’ Make sure to include in your discussion the conditions under which either situation might occur and how one might correct their data to account for these phenomena.