COURSE INFORMATION
Winter 2013
60-592 Selected Topics in Network Systems Biology
School of Computer Science, University of Windsor

Instructor: Dr. Alioune Ngom, School of Computer Science
Email: angom@cs.uwindsor.ca
Phone Extension: 3789
Room: 5107 LT
Office Hours: Tuesday 1:00pm to 2:00pm

PURPOSE OF THIS COURSE AND LEARNING RESOURCES:

Course Description and Objectives

Network-based Systems Biology (or Network Systems Biology) is an emerging field focusing on various types of bio-molecular networks, such as Gene Regulatory Networks, Protein Interaction Networks, and Metabolic Networks, to name a few. A major challenge in Network Systems Biology (NSB) is to investigate how cellular systems facilitate biological functions through various interactions between genes, proteins, and metabolites. NSB studies how an organism, viewed as an interacting network of biomolecules and bio-chemical reactions, gives rise to complex life. Vast amounts of biological network data have recently been generated due to advances in experimental biology. These data sets are increasingly being studied to obtain a systems-level understanding of biological structures and processes. Various mathematical and computational tools are being used and developed to analyze and model these data, aiming to achieve a better description and understanding of biological processes and disease, and to contribute to the time and cost effectiveness of biological experimentation. This course will give an overview of the existing types of biological network data, point to sources of errors and biases in the data, and introduce the current methods, models, and literature on graph theoretic modeling and discrete algorithmic analyses applied to these data.

This course will cover basic biological concepts behind complex networks in general and biological networks in particular; fundamental graph theoretic algorithms; computational complexity and challenges in network analysis; existing post-genomic approaches for analyzing, modeling, and comparing biological networks; and applications of these approaches to understanding biological function, disease, and evolution. The course will also cover topics specific to network-based systems biology, such as bio-molecular network reconstruction, comparison, and data mining. Selected computational methods from Pattern Recognition, Machine Learning, Computational Intelligence, and Data Mining that are currently used in NSB studies will be introduced.

Bioinformatics has become an important discipline at the intersection of computing and biology. Biological data sets produced by modern biotechnologies are very large, and hence they can only be understood by using computational techniques. Starting from the analysis of genetic sequences, the field has progressed towards the analysis and modeling of entire biological systems. One means of analysis is the use of networks (or graphs), which have been used to model phenomena in many research domains, including computational and systems biology. The explosion in the availability of biological network data has led to the development of mathematical and computational algorithms for analyzing and modeling the data; the expectation is that the network data will be as useful as the sequence data in uncovering new biology. However, biological network research faces considerable challenges, owing not only to the incompleteness of the currently available data, but also to the computational infeasibility of many graph theoretic problems. Hence, a variety of approximate (or heuristic) computational approaches have emerged for analyzing and modeling biological networks.

The objective of this course is to introduce students to 1) different types and issues of bio-molecular networks, 2) databases and sources of bio-molecular data, 3) fundamental topics in networks, graph theory, and algorithms for analyzing bio-molecular networks, 4) important computational concepts of network analysis, and 5) some existing methods for analyzing and modeling bio-molecular networks.

Lectures

Lecture Hours and Location: Tuesday 08:30am to 11:20am, ER 2137

Students will have the opportunity to meet with the instructor and with other students through the following:
o Some formal lectures every week (see schedule for times).
o One-to-one consultations with the instructor when required.
o Discussion and interaction with other students.

Course notes, textbook and web page

Homepage: http://cs.uwindsor.ca/~angom

Recommended Textbook (no particular textbook is required):
o Luonan Chen, Rui-Sheng Wang, and Xiang-Sun Zhang, Bio-Molecular Networks: Methods and Applications in Systems Biology, Wiley, ISBN 978-0-470-24373-2, 2009.
o M. E. J. Newman, Networks: An Introduction, Oxford University Press, 2010.
o Any book on Complex Networks, Complex Systems, or Social Networks.
o Ethem Alpaydin, Introduction to Machine Learning, MIT Press, ISBN 0-262-01211-1, 2004.
o Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, ISBN 978-0-387-84857-0, 2009.
o Bjorn H. Junker and Falk Schreiber, Analysis of Biological Networks, Wiley, 2008.
o Bornholdt and Schuster (Editors), Handbook of Graphs and Networks: From the Genome to the Internet, Wiley, 2003.
o Douglas B. West, Introduction to Graph Theory, 2nd edition, Prentice Hall, 2001.
o François Képès (Author, Editor), Biological Networks (Complex Systems and Interdisciplinary Science), World Scientific Publishing Company, 1st edition, 2007.
o Kurt Mehlhorn and Stefan Näher, LEDA: A Platform for Combinatorial and Geometric Computing, Cambridge University Press, 1999.

Internet resources:
o Dr. Andrew Moore’s tutorials at http://www.autonlab.org/tutorials/.
o Dr. Ethem Alpaydin’s slides at http://www.cmpe.boun.edu.tr/~ethem/i2ml/.
o Data Mining textbook at http://infolab.stanford.edu/~ullman/mmds.html.
o Video Lectures at http://videolectures.net/Top/Computer_Science/.
o And many more notes on the World Wide Web.

Software resources
o WEKA can be downloaded from http://www.cs.waikato.ac.nz/~ml
o Matlab, and other programming language compilers
o Basic knowledge of Java, C or C++ is expected.

WORK TO BE UNDERTAKEN BY STUDENTS:

Preparation for lectures

Attendance at all lectures is highly recommended. Some of the concepts covered are difficult, and the professor will attempt to present them in a way that makes them easier to understand. Students should read the course notes and textbook ahead of lectures. A detailed schedule showing the topic of each lecture is given later. Lectures are not substitutes for student reading. Students who do not read ahead may find themselves lost in the lectures.

Evaluation Scheme

The final mark will consist of:
1. Class Participation: 10%
2. Presentation: 15%
3. Assignments: 25%
4. Project: 50%

Students who wish to appeal a class-test, exam, or assignment mark should do so within a week of receiving the mark. If disagreements between a student and the instructor persist, the student should wait until he or she has received the final grade at the end of the semester and then follow the procedure outlined in the University Calendar for the appeal of that grade. No remarking of class tests or the final exam will be undertaken unless a formal grade appeal is submitted at the end of the semester, after the student has received the final grade for the course. Numerical errors in adding marks on class tests and the final exam will be corrected when identified.
There are no make-up tests, so please do not miss a test.

The final letter grade will be calculated from the raw scores using the following table:

Raw score       Letter grade
93 to 100       A+
86 to < 93      A
80 to < 86      A-
77 to < 80      B+
73 to < 77      B
70 to < 73      B-
67 to < 70      C+
63 to < 67      C
60 to < 63      C-
57 to < 60      D+
53 to < 57      D
50 to < 53      D-
35 to < 50      F
below 35        F-

POLICY ON CHEATING:

The professor will put a great deal of effort into helping students to understand and to learn the material in the course. However, he will not tolerate any form of cheating. The professor will report any suspicion of cheating to the Director of the School of Computer Science. If sufficient evidence is available, the Director will begin a formal process according to the University Senate Bylaws. The instructor will not negotiate with students who are accused of cheating but will pass all information to the Director of the School of Computer Science.

The following behaviour will be regarded as cheating (together with other acts that would normally be regarded as cheating in the broad sense of the term):
o Copying assignments
o Allowing another student to copy an assignment from you and present it as their own work
o Copying from another student during a test or exam
o Referring to notes, textbooks, etc. during a test or exam
o Talking during a test or exam
o Not sitting at the pre-assigned seat during a test or exam
o Communicating with another student in any way during a test or exam
o Having access to the exam/test paper prior to the exam/test
o Asking a teaching assistant for the answer to a question during an exam/test
o Presenting another’s work as your own
o Modifying answers after they have been marked
o Any other behaviour which attempts unfairly to give you an advantage over other students in the grade-assessment process
o Refusing to obey the instructions of the officer in charge of an examination

Students who are found guilty of any form of cheating will be given a grade of F- for the whole course.

Several University of Windsor students have been caught cheating during the last few years. In most cases the evidence was sufficient to invoke a disciplinary process, which resulted in various forms of punishment including letters of censure, loss of marks, failing grades, and expulsions. As an example, a student who copied a project from another student and presented it as his own was expelled from the university. In course 60-100, a student who copied answers during a final exam received a failing grade. Do not cheat. If you are caught and found guilty, you could be thrown out of the university and will have to explain why when you go looking for a job.

Tentative course lecture schedule:

Week   Lecture Topic
1      Introduction to Network-Based Systems Biology
2      Topological Structure and Properties of Bio-Molecular Networks
3      Protein Interaction Networks: Community Detection Methods
4      Protein Interaction Networks: Network Comparison Methods
5      Drug-Target Networks: Interaction Prediction Methods
6      Gene Regulatory Networks: Reconstruction Methods
7      Mining Large Scale Bio-Molecular Networks
8      Software tools and libraries for network analysis (e.g., LEDA, Cytoscape, Pajek)
       Other Selected Topics in NSB