ICB Fall 2004
G4120: Introduction to Computational Biology
Oliver Jovanovic, Ph.D.
Columbia University
Department of Microbiology
Copyright © 2004 Oliver Jovanovic, All Rights Reserved.
Lecture 1
Overview
September 16, 2004
Growth of GenBank
A Brief History of Computing
30,000 BC   Tally systems                     African & European
8500 BC     Prime system                      African
1000 BC     Abacus                            Chinese & Babylonian
1500        Mechanical calculator             Leonardo da Vinci
1621        Slide rule                        William Oughtred
1642        Arithmetic Machine                Blaise Pascal
1822        Difference Engine                 Charles Babbage
1843        Computer program                  Lady Ada Lovelace
1936        Z1 Computer                       Konrad Zuse
1936        Turing Machine                    Alan Turing
1938        Boolean Circuits                  Claude Shannon
1943        COLOSSUS                          Tommy Flowers
1945        von Neumann Machine               John von Neumann
1946        ENIAC                             J. Presper Eckert & John W. Mauchly
1947        Transistor                        William Shockley, John Bardeen & Walter Brattain
1958        Integrated Circuit                Jack Kilby & Robert Noyce
1964        Mouse & Graphical User Interface  Douglas Engelbart
The Era of Modern Computing
1969   ARPAnet                UCLA, Stanford, UC Santa Barbara & University of Utah
1969   UNIX                   Ken Thompson & Dennis Ritchie, Bell Laboratories
1971   Email                  Ray Tomlinson, BBN
1972   Telnet                 Jon Postel, BBN
1973   C                      Dennis Ritchie & Brian Kernighan, Bell Laboratories
1973   Ethernet               Robert Metcalfe, Harvard University/Xerox PARC
1973   FTP                    Alex McKenzie, BBN
1974   TCP                    Vint Cerf & Robert Kahn
1975   Microsoft Corporation  Bill Gates & Paul Allen
1976   Apple Computer         Steve Wozniak & Steve Jobs
1978   Usenet                 Tom Truscott, Jim Ellis & Steve Bellovin
1981   IBM PC                 IBM Corporation
1982   TCP/IP                 ARPA
1984   DNS                    Jon Postel
1984   Macintosh              Apple Computer
1985   Windows                Microsoft Corporation
1986   NeXT Computer          Steve Jobs
1989   HTTP & HTML            Tim Berners-Lee, CERN
1993   Mosaic                 Marc Andreessen
2001   OS X                   Apple Computer
History of Computational Biology
1869   DNA                           Johann Friedrich Miescher
1924   Chromosomal DNA               Robert Feulgen
1928   Transforming principle        Frederick Griffith
1944   DNA transformation            Oswald Avery, Maclyn McCarty & Colin MacLeod
1948   Information Theory            Claude Shannon
1949   Chargaff’s Rule               Erwin Chargaff
1953   Double helix                  James Watson & Francis Crick
1955   Protein sequencing            Fred Sanger
1961   Codons                        Sidney Brenner & Francis Crick
1966   Genetic code                  Marshall Nirenberg, Robert Holley & Har Khorana
1970   Restriction enzyme            Hamilton Smith, Johns Hopkins
1970   Needleman-Wunsch              S. Needleman & C. Wunsch
1971   MEDLINE                       NIH/NLM
1973   Brookhaven Protein Data Bank  Brookhaven National Laboratory
1972   Recombinant DNA               Stanley Cohen & Herbert Boyer
1977   DNA sequencing                Allan Maxam & Walter Gilbert/Frederick Sanger
1977   Staden programs               Roger Staden
1981   Smith-Waterman                Temple Smith & Michael Waterman
1982   GenBank                       LANL/EMBL/NCBI
1988   NCBI                          NIH/NLM
1988   FASTA                         William Pearson & David Lipman
1988   DNA Strider                   Christian Marck
1990   BLAST                         Stephen Altschul & David Lipman, NCBI
1994   DNA computer                  Leonard Adleman
1997   PubMed                        NCBI
The Genomics Era
Overview of Published Genomes
1980   φX174 (5,386 bp)
1981   Human mitochondria (16,569 bp)
1981   Poliovirus (7,440 bp)
1990   Human Genome Project
1992   The Institute for Genomic Research
1994   RK2 (60,099 bp)
1995   Haemophilus influenzae (1.8 Mb)
1995   Mycoplasma genitalium (0.58 Mb)
1996   Methanococcus jannaschii (1.6 Mb)
1996   Saccharomyces cerevisiae (12.1 Mb)
1997   Escherichia coli (4.7 Mb)
1998   Celera, Inc.
1998   Caenorhabditis elegans (97 Mb)
2000   Drosophila melanogaster (180 Mb)
2000   Arabidopsis thaliana (115 Mb)
2001   Salmonella typhimurium (4.8 Mb)
2001   Homo sapiens (2.9 Gb)
2002   Mus musculus (2.9 Gb)
2003   Nanoarchaeum equitans (0.49 Mb)
2004   Dictyostelium discoideum (34 Mb)
Growth of Sequenced Prokaryotic Genomes
Source: David W. Ussery, Genome Update: 161
prokaryotic genomes sequenced, and counting,
Microbiology. 2004 Feb;150 (Pt. 2): 261-3.
Evolution of Operating Systems
Unix
Apple
Windows
Macintosh OS X Frameworks
Macintosh OS X and Sequence Analysis
SeqMatrix E. coli promoter output:
DNA Location: 3,075
Spacer Length: 11
Similarity Score: 55.29
CGACATTGCTTGACCC <11> GCGTGTTCAATTCG
Macintosh OS X and Phylogenetic Analysis
Macintosh OS X and Presentations
Macintosh OS X and Multimedia
L27758. Birmingham IncP-a...[gi:508311]
Related Sequences, PubMed, Taxonomy
LOCUS       BIACOMGEN    60099 bp    DNA     linear   BCT 08-JUL-1994
DEFINITION  Birmingham IncP-alpha plasmid (R18, R68, RK2, RP1, RP4) complete
            genome.
ACCESSION   L27758
VERSION     L27758.1  GI:508311
KEYWORDS    complete genome.
SOURCE      Birmingham IncP-alpha plasmid (plasmid Birmingham IncP-alpha
            plasmid, kingdom Prokaryotae) DNA.
  ORGANISM  Birmingham IncP-alpha plasmid
            broad host range plasmids.
REFERENCE   1  (bases 1 to 60099)
  AUTHORS   Pansegrau,W., Lanka,E., Barth,P.T., Figurski,D.H., Guiney,D.G.,
            Haas,D., Helinski,D.R., Schwab,H., Stanisich,V.A. and Thomas,C.M.
  TITLE     Complete nucleotide sequence of Birmingham IncP-alpha plasmids:
            compilation and comparative analysis
  JOURNAL   J. Mol. Biol. 239, 623-663 (1994)
  MEDLINE   94285211
FEATURES             Location/Qualifiers
     source          1..60099
                     /organism="Birmingham IncP-alpha plasmid"
                     /plasmid="Birmingham IncP-alpha plasmid"
                     /db_xref="taxon:35419"
BASE COUNT    10839 a  18681 c  18448 g  12131 t
ORIGIN
        1 ttcacccccg aacacgagca cggcacccgc gaccactatg ccaagaatgc ccaaggtaaa
       61 aattgccggc cccgccatga agtccgtgaa tgccccgacg gccgaagtga agggcaggcc
      121 gccacccagg ccgccgccct cactgcccgg cacctggtcg ctgaatgtcg atgccagcac
      181 ctgcggcacg tcaatgcttc cgggcgtcgc gctcgggctg atcgcccatc ccgttactgc
      241 cccgatcccg gcaatggcaa ggactgccag cgccgcgatg aggaagcggg tgccccgctt
      301 cttcatcttc gcgcctcggg cctcgaggcc gcctacctgg gcgaaaacat cggtgtttgt
etc.
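As a small illustration of how a flat file record like this can be read by a program, the sketch below parses the BASE COUNT line from the record above and derives the total length and GC content. The function name parse_base_count is a hypothetical helper chosen for this example, not part of any GenBank tool, and the approach is only one simple way to do it.

```python
import re

# BASE COUNT line copied from the L27758 record shown above.
BASE_COUNT_LINE = "BASE COUNT    10839 a  18681 c  18448 g  12131 t"


def parse_base_count(line):
    """Return a dict such as {'a': 10839, ...} from a GenBank BASE COUNT line.

    Hypothetical helper for illustration: it looks for "<number> <base>" pairs.
    """
    pairs = re.findall(r"(\d+)\s+([acgt])\b", line.lower())
    return {base: int(count) for count, base in pairs}


counts = parse_base_count(BASE_COUNT_LINE)
total = sum(counts.values())
gc_fraction = (counts["g"] + counts["c"]) / total

print(f"total bases: {total}")
print(f"GC content: {gc_fraction:.1%}")
```

The total computed this way should agree with the 60099 bp length given on the LOCUS line, which is a quick consistency check on a record.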
Macintosh OS X Startup
Startup Sequence
BootROM → Open Firmware → Startup Manager → BootX → Kernel extensions → System and
kernel initialization → StartupItems → Log in
Useful Startup Keys
Hold X key at startup: boots into OS X (if set to boot from OS 9)
Hold C key at startup: boots from CD drive (if a bootable CD is in it)
Hold Mouse button at startup: ejects CD from CD drive
Hold T key at startup: boots into FireWire Target Mode
Hold Shift key at startup: boots into Safe Mode
Preventive Maintenance
OS X 10.3 features a journaled file system which automatically corrects problems and defragments
files, so file system maintenance is generally not required. Normally, other daily, weekly and
monthly maintenance tasks are scheduled to automatically run between 3:15 and 5:30 A.M. These
tasks cannot run if the computer is turned off or asleep at those times, so it is wise to
occasionally leave it plugged in and turned on overnight, with sleep set to Never in the Energy Saver
pane of System Preferences. In addition, to head off possible file permissions problems, run Disk Utility
(located in /Applications/Utilities) at least once a month and Repair Disk Permissions on the
startup volume.
Macintosh OS X Finder
Finder
Finder Preferences… in the Finder menu lets you alter the Finder’s behavior: in General, try
Always open folders in a new window and Spring-loaded folders and windows;
in Advanced, try Always show file extensions.
Install all new applications in the Applications folder, save all other files in your Home folder
(found in the /Users folder).
Useful Finder Key Combinations
Hold Option key while dragging a file to duplicate it
Hold Command and Option keys down while dragging a file to create an alias
Hold Option while double clicking a folder to close the previous folder
Press Shift while clicking to select more than one item
In List view, press Command while clicking for discontinuous selection
Press Control while clicking to get a contextual menu
Press Shift and Command and N to create a new folder
Press Command and I with an item selected to Show Info
Press Command and K to connect to a server
Help
Press Command and ? to get Mac Help, or select it from the Help menu. It has extensive
documentation on OS X, your computer, and various applications. See the Shortcuts section in
the List of Topics (and the handout) for other useful key combinations.
The Toolbar, Sidebar and Dock
Red button to close windows, Yellow button to dock windows, Green button to resize
windows (note that document windows with unsaved changes will have a dot in the Red button).
Toolbar
• Back/Forward Arrows; Icon, List, Column Views; Shortcuts and Search
• Drag items to the Toolbar to add them (hold Command while dragging to rearrange or
remove an item).
• To hide the Toolbar, click the clear lozenge in the upper right corner (click it again to show the Toolbar)
• Select Customize Toolbar… from the Finder View menu to modify the Toolbar
Sidebar
• Drag folders into the sidebar to add them, out of the sidebar to remove them
• Eject CDs and network mounted volumes by clicking on the eject icon next to them
Dock
• Drag items on to add, off to remove, or left or right to rearrange
• Can add a folder or folders with aliases to your favorite applications
• Click and hold to get application options, or hierarchical submenus for a folder or hard drive item
• Press Command and Option and click an item to bring it forward and hide everything
except it (click on an item in the dock to make it visible again, or select Show All from the Finder
menu to make everything visible again)
• To hide the Dock, press Command and Option and D (same keys to show it again)
Press and hold Command, then press and keep pressing Tab to cycle through all open
applications one by one. To cycle backwards, press Command, Shift and Tab.
System Preferences
Setting System Preferences
Use Show All to view all preference panes, or drag commonly used Preferences to top
Dock: set Dock preferences
Appearance: adjust interface appearance and behavior (recommended Font smoothing style:
Medium)
Energy Saver: lets you adjust when your computer goes to sleep, with different settings for
battery and power adapter use
Mouse: allows you to adjust double click and tracking speed
Displays: allows you to adjust screen resolution and colors
Network: set TCP/IP and AppleTalk settings for Built-in Ethernet and AirPort. Can create
multiple settings for different locations from the popup Location menu (select New Location or
Edit Locations)
Sharing: turn various file sharing options on or off, activate/deactivate Firewall
Software Update: check at least once a month for critical updates
Classic: controls OS 9 emulation (Classic mode)
Startup Disk: controls the system you will start up with (OS X or OS 9)
Health Sciences Internet Setup
Overview
The CUMC (Columbia University Medical Center) campus network has two core routers, both
redundantly linked to a router in each building. Each floor of a building then has its own router.
Several microwave links and a high speed cable connect the core routers to the downtown
Columbia campus, which has multiple high speed cable connections to the rest of the Internet.
The CUMC network is walled off from the rest of the Internet by a firewall and is centrally
administered by a group called CUbhis (Columbia University Biomedical and Health Information
Services). See http://www.cubhis.org/ for details. You will need an IP Address for Internet
access. One has been provided for your use in this classroom, but your lab may need to provide
you with another for use there. For Internet access from a dorm room, see
http://library.cpmc.columbia.edu/cait/register.html for details. If you have problems getting
connected, you can also try calling the CUMC campus computer help line at 5-HELP.
Network Settings
IP Address: 156.111.x.x or 156.145.x.x
Subnet Mask: 255.255.255.0
Router: 156.111.x.1 or 156.145.x.1
DNS Servers
156.111.60.150
156.111.70.150
Search Domains (Optional)
columbia.edu
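To make the relationship between these settings concrete, the following sketch uses Python's standard ipaddress module to show how a 255.255.255.0 subnet mask determines which addresses are local and where the router sits. The specific host address 156.111.10.25 is a made-up value standing in for the elided 156.111.x.x, so treat it purely as an assumption for illustration.

```python
import ipaddress

# Hypothetical example values: the settings above give only 156.111.x.x,
# so the last two octets here are invented for illustration.
host_ip = "156.111.10.25"
subnet_mask = "255.255.255.0"

# Combine address and mask; the mask keeps the first three octets as the network.
interface = ipaddress.ip_interface(f"{host_ip}/{subnet_mask}")
network = interface.network

# The router pattern above (156.111.x.1) is the first address in the subnet.
router = network.network_address + 1

# A DNS server listed above; it lies outside this subnet.
dns_server = ipaddress.ip_address("156.111.60.150")
dns_is_local = dns_server in network

print(f"network: {network}")
print(f"router: {router}")
print(f"DNS server on local subnet: {dns_is_local}")
```

Because the DNS server is not on the local subnet in this example, lookups would be forwarded through the router, which is why both settings are needed.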
Columbia University Email Setup
Mail Preferences for Columbia University Accounts
1) Open Mail, then select Preferences from the Mail menu
2) Select Accounts, then click on the + symbol to create a new account
3) Select Account Type: IMAP, fill out the other fields as follows: Email Address: your full email address;
Full Name: your full name; Incoming Mail Server: imap.columbia.edu; Username: your Columbia
UNI (username) only; Password: leave this blank
4) Select Add Server... from Outgoing Mail Server (SMTP): then fill out the fields as follows: Outgoing
Mail Server: send.columbia.edu; Server port: 25; Use Secure Sockets Layer (SSL) must be checked;
Authentication: Password; Username: your Columbia UNI (username) only; Password: leave this
blank
5) Click OK
7) Click on the Advanced tab, check Use SSL (the port should change to 993), and select
Authentication: Password (then provide your password when asked; it is stored securely in the
Keychain if “remembered”)
8) Click OK
9) Select Composing in Mail Preferences, then click Configure LDAP…
10) Click on the + symbol to add a new LDAP server
11) Enter Name: Columbia LDAP; Server: ldap.columbia.edu
12) Click Save, then click Close
The first time you use Mail you should see a pop-up asking whether you want to save the certificate
to the Keychain. This is the security certificate it needs to connect securely, and you can answer yes.
Detailed instructions, with screenshots, are available at:
http://www.columbia.edu/acis/email/pcmail/applemail/config.html
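For readers who prefer to see the same settings outside of Mail, the sketch below records the server names and ports from the steps above in a small Python structure and shows how an IMAP-over-SSL login could look with the standard imaplib module. The names MailSettings and open_inbox are hypothetical, the function is defined but deliberately not called, and whether these servers accept such a connection is an assumption, not something confirmed here.

```python
import imaplib
from dataclasses import dataclass


@dataclass(frozen=True)
class MailSettings:
    """Server settings as listed in the steps above (hypothetical container)."""
    imap_host: str = "imap.columbia.edu"
    imap_port: int = 993          # port used once Use SSL is checked
    smtp_host: str = "send.columbia.edu"
    smtp_port: int = 25
    use_ssl: bool = True


def open_inbox(settings, uni, password):
    """Log in over IMAP with SSL and open the inbox read-only.

    Illustrative only: requires network access and valid credentials,
    so it is not called in this sketch.
    """
    conn = imaplib.IMAP4_SSL(settings.imap_host, settings.imap_port)
    conn.login(uni, password)
    conn.select("INBOX", readonly=True)
    return conn


settings = MailSettings()
print(f"IMAP: {settings.imap_host}:{settings.imap_port} (SSL={settings.use_ssl})")
print(f"SMTP: {settings.smtp_host}:{settings.smtp_port}")
```

As in Mail, the username passed to open_inbox would be the Columbia UNI only, not the full email address.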
References
Recommended Macintosh OS X Books
Mac OS X Unleashed, Third Edition by John Ray & William C. Ray
Mac OS X: The Missing Manual, Panther Edition by David Pogue
Mac OS X Panther Killer Tips by Scott Kelby
Recommended Computational Biology Books
Fundamental Concepts of Bioinformatics by Dan E. Krane & Michael L. Raymer
Developing Bioinformatics Computer Skills by Cynthia Gibas & Per Jambeck
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Third Edition
edited by Andreas D. Baxevanis & B. F. Francis Ouellette
BLAST: An Essential Guide to the Basic Local Alignment Search Tool
by Ian Korf, Mark Yandell & Joseph Bedell