Name:

Are you a graduate or undergraduate student? Please circle one.

Bioinformatics Take Home Test #4
Due Date: Monday 10/14/2013, before class

(This is an open book exam based on the honors system: you can use notes, lecture notes, online manuals, and textbooks. Teamwork is not allowed on the exams. Write down your own answers; do not cut and paste from webpages. If your answer uses a citation, give the source of the quoted text.)

Notes on Formatting Quizzes:
Please make sure each answer is on only one page, by using page breaks. Splitting an answer across two pages tends to lead to grading errors. Please do not write or type in a font smaller than 12 point, and do not write in cursive. If you submit your quiz via email, please remove the instructions and extras (blank lines, alternative answers for multiple choice questions) from your document, so that only your answers, a minimal amount of white space, and optionally the questions, are left.

Note: Some consider alternative definitions of monophyletic, or place the root of the tree of life inside one of the three domains of life. In answering the questions, follow the definitions established by Willi Hennig; do not use the terminology proposed by Ashlock (see http://en.wikipedia.org/wiki/Holophyletic). Assume that the tree of life is rooted between the bacteria on one side of the root, and the archaea and the nucleocytoplasmic component of the eukaryotes on the other (i.e. the rooting pioneered by your instructor and assumed in most textbooks).

1. 1pt True/False Apicomplexa and Haptophytes have primary plastids, i.e. plastids that evolved from an endosymbiosis with a cyanobacterium.

2. 1pt True/False rRNA was the first biomolecule used to place microorganisms onto the tree of life.

3. 1pt True/False The mitochondria originated from an endosymbiosis between a host cell and an alphaproteobacterium.

4. 1pt True/False According to Hennig, a natural taxonomy should be based on shared analogous characters.

5. 1pt True/False Hennig’s principles for taxonomy are also known as cladistics.

6. 1pt True/False An autapomorphy of a group of organisms IS useful in establishing the relation between this group and other groups.

7. 1pt True/False You canNOT define clades using an unrooted tree (i.e. you don't know where inside the tree the ancestor is located).

8. 1pt True/False Cyanobacteria (also sometimes called blue-green algae) are the ancestors of plastids.

9. 2pt Prokaryotes are characterized by the absence of a complex internal membrane system; in particular, they do not have a membrane system surrounding their genetic material (= nuclear envelope). (Assume the textbook rooting of the tree of life in your answer.)
   a. True/False The absence of the nuclear envelope is a synapomorphy.
   b. True/False The group formed by the presence of this character IS a proper taxonomic category, i.e. monophyletic.

10. 2pt Archaea and Eukaryotes both have so-called TATA binding proteins, which play an important role in directing the RNA polymerase to the promoter, whereas bacteria do not have a homologous protein but use sigma factors to control transcription. (Assume the textbook rooting of the tree of life in your answer.)
   a. True/False The presence of a TATA binding protein is a shared derived character for the group containing archaea and bacteria.
   b. True/False The presence of a TATA binding protein supports shared ancestry between Archaea and at least part of the eukaryotic nucleocytoplasmic component (the nucleus and cytoplasm, but not the mitochondria or plastids).

11. 1pt Which organisms have primary plastids, i.e. plastids that evolved directly from a cyanobacterial endosymbiont? (More than one right answer; you must select at least one right answer and no wrong answers.)
   A) Everything that can photosynthesize
   B) Red algae and green algae
   C) Green algae and plants
   D) The Glaucophytes
   E) The Archaeplastida

12. 1pt In defining protein space, JALVIEW uses
   A) Each of the possible amino acid trimers as a dimension, and the frequency of the trimer as value.
   B) The presence or absence of conserved sequence motifs to define protein space.
   C) Each column in the multiple sequence alignment as a dimension to define protein space.
   D) A tree based on percent identity to define groups that are close to each other in sequence space.
   E) None of the above.

13. 1pt A group of organisms that is defined by a shared derived character is
   A) monophyletic
   B) synapomorphic
   C) paraphyletic
   D) symplesiomorphic
   E) polyphyletic

14. 1pt A group of organisms that is defined by a shared primitive character is
   A) monophyletic
   B) synapomorphic
   C) paraphyletic
   D) symplesiomorphic
   E) polyphyletic

15. 3pt Match these terms to the following definitions:
   A. clade
   B. monophyletic group
   C. paraphyletic group
   D. synapomorphies
   E. symplesiomorphies
   F. autapomorphies
   G. polyphyletic group
   H. homoplasy

   1. ____ A group that goes back to a common ancestor, but does not include all of the descendants of that ancestor (based on a symplesiomorphy).
   2. ____ A group that goes back to a common ancestor and DOES include all of the descendants of that ancestor (based on a synapomorphy).
   3. ____ A group that does NOT go back to a common ancestor (based on a homoplasy).
   4. ____ A proper monophyletic taxonomic group.
   5. ____ A shared derived character (a new change that happened between the time when the members of a clade split off from each other and the time when that group split from a previous, ancestral group).
   6. ____ A character that is present in only one taxon and is therefore of limited phylogenetic use (only useful for measuring the rate of evolution).
   7. ____ A shared ancestral character that is also present in members outside of the group.
   8. ____ A character that evolved independently. It is shared, but not due to homology.

16. 1pt According to the currently favored version of the tree of life, which prokaryotic domain is the closest relative of the nucleocytoplasm?
   A. Bacteria
   B. Viruses
   C. Archaea
   D. Protista
   E. Eukarya
   F. Crenarchaea
   G. Inteins

17. 1pt What Mac or PC application can be used to access the cluster through a command line interface? (Circle the one most correct answer for the operating system you use.)
   a. Putty
   b. Clustal
   c. Terminal
   d. SSHclient
   e. Filezilla

18. 1pt Which command, when executed in a terminal window, would establish an ssh command line connection to the server?
   a. putty username@bbcsrv3.biotech.uconn.edu
   b. ftp username@bbcsrv3.biotech.uconn.edu
   c. sftp username@bbcsrv3.biotech.uconn.edu
   d. ssh username@bbcsrv3.biotech.uconn.edu

19. 1pt Which of the following is the host name of the cluster in the biotech center?
   a. mcb221u016
   b. bbcsrv3.biotech.uconn.edu
   c. mcb221u016@bbcsrv3.biotech.uconn.edu
   d. It is variable, because it is defined by the user on first login
   e. It doesn’t have one, because the computer automatically remembers the cluster

20. 1pt When using the cluster, what is the first thing that has to be typed after establishing a connection and entering your password?
   a. cd folder_name (i.e. whatever you named the folder your files are in)
   b. qlogin
   c. your user name and password
   d. mcb221u016@bbcsrv3.biotech.uconn.edu
   e. Nothing, you can go directly to running your program

21. 1pt What does the command qlogin do?
   a. Tells the cluster to reset your password
   b. Tells the cluster to use a secure connection to your computer, so that other people cannot hack the connection
   c. Logs you onto a subnode, so that your programs are not being run on the headnode; running things on the headnode is very bad and rude, because it will bog down the entire cluster, or worse, crash it.
   d. Logs you onto the headnode, so that your programs are not being run on a subnode. Programs must be run on the headnode, where they are installed.
   e. Allows you to move from your home directory to the folder where your files are stored.

22. 1pt You want to know what processes are currently running on the node of the cluster you are logged into.
   a. ps aux lists all processes currently registered
   b. ls -l lists all processes currently registered in long form
   c. Log out by typing exit; when you execute a qlogin, the queue-managing software will give you detailed information on all processes
   d. All of the above.

23. 1pt What does the command cd do?
   a. cd allows you to read in information from an attached compact disk drive
   b. cd lists the contents of the directory you are currently in
   c. cd logs you onto a subnode
   d. cd logs you onto the headnode (aka masternode)
   e. cd transfers a file from your computer to the cluster
   f. cd allows you to change directories
   g. cd changes the permissions of the current directory to the default set of permissions.

24. 1pt What does the command ls do?
   a. ls lists the contents of the directory you are currently in
   b. ls changes the directory to the lower directory or folder
   c. ls logs you onto the headnode
   d. ls logs you onto a subnode
   e. ls transfers a file from the cluster to your computer

25. 1pt What are the advantages of a command line interface?
   a. It is easier to collect several commands issued through the command line into a single script
   b. It is easy to create a protocol of all commands executed in a session
   c. Individual commands can be recalled and issued repeatedly, with or without modifications.
   d. All of the above

26. 1pt How can you transfer files from your local computer to the bbcsrv3 server?
   a. using Filezilla or a program like sshclient
   b. using sftp from the command line
   c. using the ftp protocol in your browser (i.e. ftp://ftp.bbcsrv3.biotech.uconn.edu)
   d. a and b
   e. a and c
   f. b and c

27. 1pt You try to run a program on the server using a data file that you edited in Word under Windows or Mac OS X and saved as a text file. The first line of the data file contains a comment line starting with #. The script reports that the file does not contain any data. What is the most likely scenario for what has gone wrong?
   a. The program is not equipped to ignore comment lines
   b. The data file does not contain the correct end-of-line symbols
   c. MS Word text files are similar to PDFs. Their format is so complex that most programs cannot handle them

28. 1pt You try to run a program on a unix server using a data file that you edited in Word under Windows or Mac OS X and saved as a text file. The first line of the data file contains a comment line starting with #. The program reports that the file does not contain any data. What could you try to fix the problem?
   a. When saving the text file in MS Word, do not use the default setting provided by MS Word, but select the option "end lines with LF only".
   b. Use a program/application that converts end-of-line symbols between different operating systems
   c. Use a text editor under unix to edit the data file
   d. Use a serial editor to convert the end-of-line symbols.
   e. All of the above.

For graduate students; extra credit for undergraduate students:

29. 1pt What happened to cause the downfall of the Five Kingdoms of Life? What does this have to do with Hennig’s cladistics?