ICB Fall 2004
G4120: Introduction to Computational Biology
Oliver Jovanovic, Ph.D.
Columbia University Department of Microbiology
Copyright © 2004 Oliver Jovanovic, All Rights Reserved.
Lecture 1 Overview September 16, 2004

Growth of GenBank

A Brief History of Computing
30,000 BC  Tally systems                     African & European
8500 BC    Prime system                      African
1000 BC    Abacus                            Chinese & Babylonian
1500       Mechanical calculator             Leonardo da Vinci
1621       Slide rule                        William Oughtred
1642       Arithmetic Machine                Blaise Pascal
1822       Difference Engine                 Charles Babbage
1831       Computer program                  Lady Ada Lovelace
1936       Z1 Computer                       Konrad Zuse
1936       Turing Machine                    Alan Turing
1938       Boolean Circuits                  Claude Shannon
1943       COLOSSUS                          Alan Turing
1945       von Neumann Machine               John von Neumann
1946       ENIAC                             J. Presper Eckert & John W. Mauchly
1947       Transistor                        William Shockley, John Bardeen & Walter Brattain
1958       Integrated Circuit                Jack Kilby & Robert Noyce
1964       Mouse & Graphical User Interface  Douglas Engelbart

The Era of Modern Computing
1969  ARPAnet                UCLA, Stanford, UC Santa Barbara & University of Utah
1969  UNIX                   Ken Thompson & Dennis Ritchie, Bell Laboratories
1971  Email                  Ray Tomlinson, BBN
1972  Telnet                 Jon Postel, BBN
1973  C                      Dennis Ritchie & Brian Kernighan, Bell Laboratories
1973  Ethernet               Robert Metcalfe, Harvard University/Xerox PARC
1973  FTP                    Alex McKenzie, BBN
1974  TCP                    Vint Cerf & Robert Kahn
1975  Microsoft Corporation  Bill Gates & Paul Allen
1976  Apple Computer         Steve Wozniak & Steve Jobs
1978  Usenet                 Tom Truscott, Jim Ellis & Steve Bellovin
1981  IBM PC                 IBM Corporation
1982  TCP/IP                 ARPA
1984  DNS                    Jon Postel
1984  Macintosh              Apple Computer
1985  Windows                Microsoft Corporation
1986  NeXT Computer          Steve Jobs
1989  HTTP & HTML            Tim Berners-Lee, CERN
1993  Mosaic                 Marc Andreessen
2001  OS X                   Apple Computer

History of Computational Biology
1869  DNA                           Johann Friedrich Miescher
1924  Chromosomal DNA               Robert Feulgen
1928  Transforming principle        Frederick Griffith
1944  DNA transformation            Oswald Avery, Maclyn McCarty & Colin MacLeod
1948  Information Theory            Claude Shannon
1949  Chargaff’s Rule               Erwin Chargaff
1953  Double helix                  James Watson & Francis Crick
1955  Protein sequencing            Fred Sanger
1961  Codons                        Sidney Brenner & Francis Crick
1966  Genetic code                  Marshall Nirenberg, Robert Holley & Har Gobind Khorana
1970  Restriction enzyme            Hamilton Smith, Johns Hopkins
1970  Needleman-Wunsch              S. Needleman & C. Wunsch
1971  MEDLINE                       NIH/NLM
1973  Brookhaven Protein Data Bank  Brookhaven National Laboratory
1972  Recombinant DNA               Stanley Cohen & Herbert Boyer
1977  DNA sequencing                Allan Maxam & Walter Gilbert/Frederick Sanger
1977  Staden programs               Roger Staden
1981  Smith-Waterman                Temple Smith & Michael Waterman
1982  GenBank                       LANL/EMBL/NCBI
1988  NCBI                          NIH/NLM
1988  FASTA                         William Pearson & David Lipman
1988  DNA Strider                   Christian Marck
1990  BLAST                         Stephen Altschul & David Lipman, NCBI
1994  DNA computer                  Leonard Adleman
1997  PubMed                        NCBI

The Genomics Era
Overview of Published Genomes
1980  φX174 (5,386 bp)
1981  Human mitochondria (16,569 bp)
1981  Poliovirus (7,440 bp)
1990  Human Genome Project
1992  The Institute for Genomic Research
1994  RK2 (60,099 bp)
1995  Haemophilus influenzae (1.8 Mb)
1995  Mycoplasma genitalium (0.58 Mb)
1996  Methanococcus jannaschii (1.6 Mb)
1996  Saccharomyces cerevisiae (12.1 Mb)
1997  Escherichia coli (4.7 Mb)
1998  Celera, Inc.
1998  Caenorhabditis elegans (97 Mb)
2000  Drosophila melanogaster (180 Mb)
2000  Arabidopsis thaliana (115 Mb)
2001  Salmonella typhimurium (4.8 Mb)
2001  Homo sapiens (2.9 Gb)
2002  Mus musculus (2.9 Gb)
2003  Nanoarchaeum equitans (0.49 Mb)
2004  Dictyostelium discoideum (34 Mb)

Growth of Sequenced Prokaryotic Genomes
Source: David W. Ussery, Genome Update: 161 prokaryotic genomes sequenced, and counting, Microbiology. 2004 Feb;150 (Pt. 2): 261-3.
Evolution of Operating Systems
Unix, Apple, Windows

Macintosh OS X Frameworks

Macintosh OS X and Sequence Analysis
SeqMatrix E. coli promoter output:
DNA Location: 3,075
Spacer Length: 11
Similarity Score: 55.29
CGACATTGCTTGACCC <11> GCGTGTTCAATTCG

Macintosh OS X and Phylogenetic Analysis

Macintosh OS X and Presentations

Macintosh OS X and Multimedia

L27758. Birmingham IncP-a...[gi:508311]
Related Sequences, PubMed, Taxonomy
LOCUS       BIACOMGEN 60099 bp DNA linear BCT 08-JUL-1994
DEFINITION  Birmingham IncP-alpha plasmid (R18, R68, RK2, RP1, RP4) complete genome.
ACCESSION   L27758
VERSION     L27758.1 GI:508311
KEYWORDS    complete genome.
SOURCE      Birmingham IncP-alpha plasmid (plasmid Birmingham IncP-alpha plasmid, kingdom Prokaryotae) DNA.
  ORGANISM  Birmingham IncP-alpha plasmid
            broad host range plasmids.
REFERENCE   1 (bases 1 to 60099)
  AUTHORS   Pansegrau,W., Lanka,E., Barth,P.T., Figurski,D.H., Guiney,D.G., Haas,D., Helinski,D.R., Schwab,H., Stanisich,V.A. and Thomas,C.M.
  TITLE     Complete nucleotide sequence of Birmingham IncP-alpha plasmids: compilation and comparative analysis
  JOURNAL   J. Mol. Biol. 239, 623-663 (1994)
  MEDLINE   94285211
FEATURES             Location/Qualifiers
     source          1..60099
                     /organism="Birmingham IncP-alpha plasmid"
                     /plasmid="Birmingham IncP-alpha plasmid"
                     /db_xref="taxon:35419"
BASE COUNT  10839 a 18681 c 18448 g 12131 t
ORIGIN
        1 ttcacccccg aacacgagca cggcacccgc gaccactatg ccaagaatgc ccaaggtaaa
       61 aattgccggc cccgccatga agtccgtgaa tgccccgacg gccgaagtga agggcaggcc
      121 gccacccagg ccgccgccct cactgcccgg cacctggtcg ctgaatgtcg atgccagcac
      181 ctgcggcacg tcaatgcttc cgggcgtcgc gctcgggctg atcgcccatc ccgttactgc
      241 cccgatcccg gcaatggcaa ggactgccag cgccgcgatg aggaagcggg tgccccgctt
      301 cttcatcttc gcgcctcggg cctcgaggcc gcctacctgg gcgaaaacat cggtgtttgt
etc.
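
As a small illustration of how the fields in the L27758 record above can be used, the following Python sketch reads the BASE COUNT line and the first two ORIGIN lines. It is only a sketch: the function names (parse_base_count, gc_fraction, origin_to_sequence) are made up for this example and are not part of any GenBank tool, and the text is copied from the record rather than downloaded.

```python
import re

# Text copied from the L27758 record shown above (not fetched from GenBank).
BASE_COUNT_LINE = "BASE COUNT 10839 a 18681 c 18448 g 12131 t"
ORIGIN_LINES = [
    "1 ttcacccccg aacacgagca cggcacccgc gaccactatg ccaagaatgc ccaaggtaaa",
    "61 aattgccggc cccgccatga agtccgtgaa tgccccgacg gccgaagtga agggcaggcc",
]


def parse_base_count(line):
    """Return a dict such as {'a': 10839, ...} from a BASE COUNT line."""
    pairs = re.findall(r"(\d+)\s+([acgt])\b", line.lower())
    return {base: int(count) for count, base in pairs}


def gc_fraction(counts):
    """Fraction of bases that are G or C."""
    total = sum(counts.values())
    return (counts["g"] + counts["c"]) / total


def origin_to_sequence(lines):
    """Drop the position numbers and spaces from ORIGIN lines."""
    return "".join(re.sub(r"[^acgtn]", "", line.lower()) for line in lines)


counts = parse_base_count(BASE_COUNT_LINE)
print(sum(counts.values()))            # total bases; matches the 60099 bp in LOCUS
print(round(gc_fraction(counts), 4))   # G+C fraction of the plasmid
print(len(origin_to_sequence(ORIGIN_LINES)))  # bases in the two lines used here
```

The four base counts add up to the 60099 bp reported in the LOCUS line, which is a quick consistency check on any record.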
Macintosh OS X Startup

Startup Sequence
BootROM → Open Firmware → Startup Manager → BootX → Kernel extensions → System and kernel initialization → StartupItems → Log in

Useful Startup Keys
Hold X key at startup: boots into OS X (if set to boot from OS 9)
Hold C key at startup: boots from CD drive (if a bootable CD is in it)
Hold Mouse button at startup: ejects CD from CD drive
Hold T key at startup: boots into FireWire Target Mode
Hold Shift key at startup: boots into Safe Mode

Preventive Maintenance
OS X 10.3 features a journaled file system which automatically corrects problems and defragments files, so file system maintenance is generally not required. Normally, other daily, weekly and monthly maintenance tasks are scheduled to run automatically between 3:15 and 5:30 A.M. These tasks cannot run if the computer is always turned off or asleep at these times, so it is wise to occasionally leave it plugged in and turned on overnight, set to never sleep in the Energy Saver System Preference pane. In addition, to head off possible file permissions problems, run Disk Utility (located in /Applications/Utilities) at least once a month and Repair Disk Permissions on the startup volume.

Macintosh OS X Finder

Finder
Finder Preferences… in the Finder menu lets you alter the Finder’s behavior: in General, try Always open folders in a new window and Spring-loaded folders and windows; in Advanced, try Always show file extensions. Install all new applications in the Applications folder, and save all other files in your Home folder (found in the /Users folder).
Useful Finder Key Combinations
Hold Option key while dragging a file to duplicate it
Hold Command and Option keys down while dragging a file to create an alias
Hold Option while double clicking a folder to close the previous folder
Press Shift while clicking to select more than one item
In List view, press Command while clicking for discontinuous selection
Press Control while clicking to get a contextual menu
Press Shift and Command and N to create a new folder
Press Command and I with an item selected to Show Info
Press Command and K to connect to a server

Help
Press Command and ? to get Mac Help, or select it from the Help menu. It has extensive documentation on OS X, your computer, and various applications. See the Shortcuts section in the List of Topics (and the handout) for other useful key combinations.

The Toolbar, Sidebar and Dock
Red button to close windows, Yellow button to dock windows, Green button to resize windows (note that document windows with unsaved changes will have a dot in the Red button).

Toolbar
• Back/Forward Arrows; Icon, List, Column Views; Shortcuts and Search
• Drag items to the Toolbar to add them (hold Command while dragging to rearrange or remove an item).
• To hide the Toolbar, click the clear lozenge in the upper right corner (click it again to show the Toolbar)
• Select Customize Toolbar… from the Finder View menu to modify the Toolbar

Sidebar
• Drag folders into the sidebar to add them, out of the sidebar to remove them
• Eject CDs and network mounted volumes by clicking on the eject icon next to them

Dock
• Drag items on to add, off to remove, or left or right to rearrange
• Can add a folder or folders with aliases to your favorite applications
• Click and hold to get application options, or hierarchical submenus for a folder or hard drive item
• Press Command and Option and click an item to bring it forward and hide everything except it (click on an item in the Dock to make it visible again, or select Show All from the Finder menu to make everything visible again)
• To hide the Dock, press Command and Option and D (same keys to show it again)

Press and hold Command, then press Tab repeatedly to cycle through all open applications one by one. To cycle backwards, press Command, Shift and Tab.

System Preferences

Setting System Preferences
Use Show All to view all preference panes, or drag commonly used preferences to the top.
Dock: set Dock preferences
Appearance: adjust interface appearance and behavior (recommended Font smoothing style: Medium)
Energy Saver: lets you adjust when your computer goes to sleep, with different settings for battery and power adapter use
Mouse: allows you to adjust double click and tracking speed
Displays: allows you to adjust screen resolution and colors
Network: set TCP/IP and AppleTalk settings for Built-in Ethernet and AirPort.
You can create multiple settings for different locations from the popup Location menu (select New Location or Edit Locations)
Sharing: turn various file sharing options on or off, activate/deactivate Firewall
Software Update: check at least once a month for critical updates
Classic: controls OS 9 emulation (Classic mode)
Startup Disk: controls the system you will start up with (OS X or OS 9)

Health Sciences Internet Setup

Overview
The CUMC (Columbia University Medical Center) campus network has two core routers, both redundantly linked to a router in each building. Each floor of a building then has its own router. Several microwave links and a high speed cable connect the core routers to the downtown Columbia campus, which has multiple high speed cable connections to the rest of the Internet. The CUMC network is walled off from the rest of the Internet by a firewall and is centrally administered by a group called CUbhis (Columbia University Biomedical and Health Information Services). See http://www.cubhis.org/ for details.

You will need an IP Address for Internet access. One has been provided for your use in this classroom, but your lab may need to provide you with another for use there. For Internet access from a dorm room, see http://library.cpmc.columbia.edu/cait/register.html for details. If you have problems getting connected, you can also try calling the CUMC campus computer help line at 5-HELP.
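
The Network Settings for the campus network pair each IP address with the subnet mask 255.255.255.0 and a router at address .1 of the same subnet. The Python sketch below shows how those three values relate, using the standard ipaddress module. The host address 156.111.60.25 is a hypothetical example chosen only for illustration (the slides give the address as 156.111.x.x), and the function name describe is likewise made up for this example.

```python
import ipaddress


def describe(ip, mask):
    """Return the subnet and the conventional .1 router for a host address."""
    interface = ipaddress.ip_interface(f"{ip}/{mask}")
    network = interface.network
    # Assumption taken from the slides: the router is host .1 of the subnet.
    router = network.network_address + 1
    return network, router


# Hypothetical host address; substitute the one assigned to your machine.
network, router = describe("156.111.60.25", "255.255.255.0")
print(network)  # the subnet the host belongs to
print(router)   # the router address implied by the .1 convention

# A DNS server from the slides, checked against the same subnet.
dns_local = ipaddress.ip_address("156.111.60.150") in network
print(dns_local)
```

With a 255.255.255.0 mask, only the last of the four numbers distinguishes machines on the same subnet, so two addresses that differ in the third number must communicate through the router.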
Network Settings
IP Address: 156.111.x.x or 156.145.x.x
Subnet Mask: 255.255.255.0
Router: 156.111.x.1 or 156.145.x.1
DNS Servers: 156.111.60.150, 156.111.70.150
Search Domains (Optional): columbia.edu

Columbia University Email Setup

Mail Preferences for Columbia University Accounts
1) Open Mail, then select Preferences from the Mail menu
2) Select Accounts, then click on the + symbol to create a new account
3) Select Account Type: IMAP, then fill out the other fields as follows: Email Address: your full email address; Full Name: your full name; Incoming Mail Server: imap.columbia.edu; Username: your Columbia UNI (username) only; Password: leave this blank
4) Select Add Server... from Outgoing Mail Server (SMTP), then fill out the fields as follows: Outgoing Mail Server: send.columbia.edu; Server port: 25; Use Secure Sockets Layer (SSL) must be checked; Authentication: Password; Username: your Columbia UNI (username) only; Password: leave this blank
5) Click OK
7) Click on the Advanced tab, check Use SSL (the port should change to 993), and select Authentication: Password (then provide your password when asked; it is stored securely in the Keychain if “remembered”)
8) Click OK
9) Select Composing in Mail Preferences, then click Configure LDAP…
10) Click on the + symbol to add a new LDAP server
11) Enter Name: Columbia LDAP; Server: ldap.columbia.edu
12) Click Save, then click Close

The first time you use Mail you should see a pop-up asking whether you want to save the certificate to the Keychain. This is the security certificate Mail needs to connect securely, and you can answer yes. Detailed instructions, with screenshots, are available at: http://www.columbia.edu/acis/email/pcmail/applemail/config.html

References

Recommended Macintosh OS X Books
Mac OS X Unleashed, Third Edition by John Ray & William C. Ray
Mac OS X: The Missing Manual, Panther Edition by David Pogue
Mac OS X Panther Killer Tips by Scott Kelby

Recommended Computational Biology Books
Fundamental Concepts of Bioinformatics by Dan E. Krane & Michael L. Raymer
Developing Bioinformatics Computer Skills by Cynthia Gibas & Per Jambeck
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Third Edition edited by Andreas D. Baxevanis & B. F. Francis Ouellette
BLAST: An Essential Guide to the Basic Local Alignment Search Tool by Ian Korf, Mark Yandell & Joseph Bedell