Topics in Computational Biology

Course Approval Information:

As high-throughput methods for biological data generation become more prominent and the amount and complexity of the data increase, computational methods have become essential to biological research in this post-genome age. In turn, biological problems are motivating innovations in the computational sciences, such as computer science, information science, mathematics, operations research and statistics. There is high demand for scientists who are capable of bridging these disciplines.

This course aims to create an environment that transcends traditional departmental boundaries and facilitates communication between researchers from the life sciences and the computational sciences. Through reading and discussion of the literature and through small research and data analysis projects, students will be introduced to current problems in computational biology (e.g., regulatory motif finding, microarray data analysis, biomedical literature mining, signal transduction network modeling, and cis-regulatory network discovery) and to some of the methods for studying them.

Each student is required to give a presentation on published research or on their own current research. Each presentation will normally last 60 minutes, with additional time for questions and discussion. Each presenter must submit a written report on the presentation. The report should contain 1) the background of the research, 2) the motivation for the research, 3) the approach, 4) the results, and 5) criticisms and/or suggested ways to improve the discussed methods.

By mid-term, each student should propose a small research project for the course. Teamwork is strongly encouraged. Grading will be based on the project and on class participation.

Suggested paper list:

Liu et al. (2002) An Algorithm for Finding Protein-DNA Interaction Sites with Applications to Chromatin Immunoprecipitation Microarray Experiments. Nat Biotech.
Zhou et al. (2004) CisModule: De Novo discovery of cis-regulatory modules by hierarchical mixture modeling. PNAS.
Frazer et al. (2003) Cross-Species Sequence Comparisons: A Review of Methods and Available Resources. Genome Research. 13(1):1-12.
Ren et al. (2000) Genome-wide location and function of DNA-binding proteins. Science. 290:2306-2309.
Conlon et al. (2003) Motif regressor. PNAS. 100:3339.
Lee et al. (2002) Transcriptional regulatory networks in S. cerevisiae. Science. 298:799.
Siepel A, Haussler D. (2004) Combining phylogenetic and hidden Markov models in biosequence analysis. J Comput Biol. 11(2-3):413-28.
Wang and Stormo (2003) Combining phylogenetic data with co-regulated genes to identify regulatory motifs. Bioinformatics. 19:2369.
Boffelli et al. (2004) Comparative genomics at the vertebrate extremes. Nat Rev Genet. 5(6):456-65.
Miller et al. (2004) Comparative genomics. Annu Rev Genomics Hum Genet. 5:15-56.
Tusher, V. G., Tibshirani, R., and Chu, G. (2001) Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA. 98:5116-5121.
Storey, J. D. and Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. USA. 100:9440-9445.
Stuart et al. (2003) A gene-coexpression network for global discovery of conserved genetic modules. Science. 302(5643):249-255.
Friedman, N. (2004) Inferring cellular networks using probabilistic graphical models. Science. 303:799-805.
Pe'er D. (2005) Bayesian network analysis of signaling networks: a primer. Sci STKE. 281, pl4.
Raychaudhuri et al. (2002) Associating Genes with Gene Ontology Codes Using a Maximum Entropy Analysis of Biomedical Literature. Genome Research. 12(1):203-214.
Koike et al. (2005) Automatic extraction of gene/protein biological functions from biomedical text. Bioinformatics. 21(7):1227-1236.
Westhof and Fritsch (2000) RNA folding: beyond Watson-Crick pairs. Structure. 8:R55-R65.
Troyanskaya et al. (2003) A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae). PNAS. 100(14):8348-8353.
Segal et al. (2003) Module Networks: Identifying Regulatory Modules and their Condition Specific Regulators from Gene Expression Data. Nature Genetics. 34(2):166-76.
Ge et al. (2003) Integrating 'omic' information: a bridge between genomics and systems biology. Trends in Genetics. 19(10):551-60.