CS 609/BMI 609 Computational Genomics
Fall 2014

Credits: 3 units
Contact Hours: Monday and Wednesday 1400-1515
Instructor: Robert Edwards
Office: GMCS 536
Email: redwards@mail.sdsu.edu
Office Hours: Mondays and Wednesdays 1530 – 1700 (and by appointment)

Course Materials

1. CS 609 lecture notes/slides (available on Blackboard)
2. Supplementary textbooks:
   i. Essential Bioinformatics by Jin Xiong ISBN: 0521600820

Course Information for CS 609

Description from the Official Course Catalog

Biological and genomics data. Application of computational algorithms to biological questions. Post-genomic techniques in annotation and comparison of microbial and eukaryotic genome sequences.

Prerequisites: Computer Science 503 or 514
Course Type: Selected elective course in the program

Specific Goals for CS 609

Course-Level Student Learning Outcomes

1. Ability to use existing bioinformatics algorithms to analyze genome sequences
2. Ability to explain how those algorithms function and the complexity of each algorithm
3. Ability to understand the merits of different bioinformatics algorithms
4. Ability to read, understand, and discuss relevant scientific literature

Relationship to CS Program Course Outcomes

CS 609 addresses the following CS Program course outcomes:

a) An ability to apply knowledge of computing and mathematics appropriate to the program’s student outcomes and to the discipline
b) An ability to analyze a problem, and identify and define the computing requirements appropriate to its solution
c) An ability to communicate effectively with a range of audiences
d) Recognition of the need for and an ability to engage in continuing professional development
e) An ability to use current techniques, skills, and tools necessary for computing practice

Topics Covered

The following topics are covered in CS 609:

1. Annotation
2. BLAST
3. Codon usage
4. Comparative genomics tools
5. DNA Sequencing
6. Hidden Markov models
7. Machine learning
8. Metagenomics tools
9. Multiple sequence alignments
10. Pangenome analysis
11. Phylogenomics
12. Protein families
13. Protein-encoding gene identification
14. Sequence alignments
15. Sequence assembly
16. tRNA identification
17. Whole genome comparisons

Course Schedule

Week  Topic
1     Introduction; DNA Sequencing, Sanger, Solexa, SOLiD, Ion
2     Sequence assembly; codon usage
3     Protein-encoding gene identification
4     tRNA identification; Multiple sequence alignments
5     Sequence alignments
      October 6th 2014. The first progress report towards the final write-up is due.
6     BLAST and International databases
7     Protein families
8     Hidden Markov models
9     Whole genome comparisons
10    Annotation
      November 10th 2014. The second progress report towards the final write-up is due.
11    Comparative genomics tools
12    Pangenome analysis
13    Machine learning
14    Phylogenomics - whole genome phylogeny
15    Metagenomics tools
      December 10th. The Final Report is due.

Major assignments

Paper presentations

Everyone must present a paper in class, and we will have approximately two papers per week. Please choose from the topics below and we will discuss the selections in class:

   General papers
   Annotation
   BLAST
   DNA Sequencing
   EST
   HMMs
   International Databases
   Machine learning
   Metagenomics
   Protein encoding gene identification - ORF Calling
   PanGenome Analysis
   Protein families
   RNA Identification
   Sequence Alignments
   Sequence Assembly
   Structural Analysis
   Whole Genome Comparisons
   Whole Genome Phylogeny

Written assignments

The written assignment will be in three parts. At the beginning of the semester you will be provided with a new genome to annotate. This is a new sequence that has never been published. During the course of the semester you will identify the genes and proteins in that genome, curate the annotations in that genome, and investigate the roles that the genome can fulfill in the environment. The paper writing will accompany the discussions in class.
At the end of the assignment you will have a complete article written in the format of Genome Papers (http://genomepapers.org/) and ready for submission.

The first installment of the paper (due October 6th) will include basic statistics about the organism and the genome that was sequenced. The second installment (due November 10th) will include information about the genes and functions that were found in the organism. The third installment, due at the end of the semester, will include a detailed description of the genome.

Grading Policies

The final grade will be composed of:

In-class participation: 35%
Written assignments: 65%

Grade  Score
A      92 and above
A-     90-92
B+     88-90
B      82-88
B-     80-82
C+     78-80
C      72-78
C-     70-72
D+     68-70
D      62-68
D-     60-62
Fail   Below 60

Special Assistance

If you are a student with a disability and believe you will need accommodations for this class, it is your responsibility to contact Student Disability Services at (619) 594-6473. To avoid any delay in the receipt of your accommodations, you should contact Student Disability Services as soon as possible. Please note that accommodations are not retroactive, and that accommodations based upon disability cannot be provided until you have presented your instructor with an accommodation letter from Student Disability Services. Your cooperation is appreciated.