Powerpoint - Wishart Research Group

advertisement
3D Structure
Visualizing, Comparing, Classifying
David Wishart
Athabasca 3-41
david.wishart@ualberta.ca
Outline & Objectives*
•
•
•
•
•
•
Visualization Programs
Vectors & Matrices
Difference Distance Matrices
Molecular Superposition
Measuring Superposition
Classifying 3D Structures
PDB Viewers
Jmol*
• Java-based program
• Open source applet and application
– Compatible with Linux, MacOS, Windows
• Menus access by clicking on Jmol
icon on lower right corner of applet
• Works with all major web browsers
– Internet Explorer (Win32)
– Mozilla/Firefox (Win32, OSX, *nix)
– Safari (Mac OS X) and Opera 7.5.4
WebMol*
WebMol*
• Both a Java Applet and a downloadable
application
• Offers many tools including distance,
angle, dihedral angle measurements,
detection of steric conflicts, interactive
Ramachandran plot, diff. distance plot
• Compatible with most Java (1.3+) enabled
browsers including:
– Internet Explorer
– Safari on Mac OS
– Mozilla 1.6/Firefox on Linux (Redhat 8.0)
PDB SimpleViewer
Requires Java WebStart (~30 sec install)
Chime*
Chime*
• http://www.umass.edu/microbio/chime/neccsoft.h
tm#download_install
• Among first PDB viewing programs
with limited manipulation capacity
• Uses Rasmol for its back end source
• View both large and small molecules
• Browser Plug-in (Like PDF reader)
• Interesting from historical
perspective (now mostly phased out)
Protein Explorer (Chime)
Protein Explorer*
• http://www.umass.edu/microbio/chime/pe_beta/
pe/protexpl/
• Uses Chime or Jmol for its back-end
• Very flexible, user friendly, well
documented, offers morphing, sequence
structure interface, comparisons, contextdependent help, smart zooming, off-line
• Browser Plug-in (Like PDF reader)
• Compatible with Netscape (Mac & Win)
QuickPDB
Quick PDB*
• http://www.sdsc.edu/pb/Software.html
• Very simple viewing program with
limited manipulation and very limited
rendering capacity -- Very fast
• Java Applet (Source code available)
• Compatible with most browsers and
computer platforms
Rasmol
Rasmol*
• http://www.umass.edu/microbio/rasmol/
• Very simple viewing program with limited
manipulation capacity, easy to use!
• “Grand-daddy” of all visual freeware
• Runs as installed “stand-alone” program
• Source code available
• Runs on Mac, Windows, Linux, SGI and
most other UNIX platforms
B (Biomer)
Biomer (B)
• http://casegroup.rutgers.edu/Biomer/index.html
• Very sophisticated molecular rendering
and modelling package for both large
and small molecules (kind of rough)
• Supports molecular dynamics & En. min
• Written in Java (source code available)
• Can run as an applet or stand-alone
• Compatible on most platforms
Swiss PDB Viewer
Swiss PDB Viewer*
• http://spdbv.vital-it.ch/
• Among most sophisticated molecular
rendering, manipulation and modelling
packages (commercial or freeware)
• Supports threading, hom. modelling,
energy minimization, seq/struc interface
• Stand-alone version only
• Compatible on Mac, Win, Linux, SGI
Swiss PDB Tutorial*
http://spdbv.vital-it.ch/TheMolecularLevel/SPVTut/index.html
MolMol
MolMol*
• http://www.mol.biol.ethz.ch/wuthrich/software/molmol/
• Very sophisticated molecular rendering,
and manipulation package (among the
best graphics of all freeware)
• Special focus on NMR compatibility,
supports many calculations/plots
• Stand-alone version only
• Compatible on Win, Unix (nearly all)
Summary*
Mac Win Unix Rendr SeqView Super E Min Modeling
Rasmol
+
+
+
++
-
-
-
-
Chime
+
+
-
+
-
-
-
-
Prot. Expl.
+
+
-
++
+
+
-
-
Quick PDB
+
+
+
+
+
-
-
-
Biomer
+
+
+
++
-
+
+
+
SwP Viewer
+
+
+
+++
+
+
+
+
MolMol
-
+
+
+++
-
+
-
+
Visualization Hub
http://www.umass.edu/microbio/chime/top5.htm
Graphics Formats
•
•
•
•
•
•
•
•
GIF
JPEG
PNG
TIFF (Tag Image)
BMP
EPS
PS
RGP (SGI)
Graphics Formats*
• GIF (Graphical Interchange Format)
– pronounced “JIF”
– introduced in 1987 by CompuServe
– handles 8 bit colour (256 colours)
– lossy compression (up to 10 X)
– best for drawings, simple B+W or colour
diagrams, images with hard edges
– supported by Perl graphics library (GD.pm)
– supports animation & transparency
Graphics Formats*
• JPEG (Joint Photographic Experts Group)
– pronounced “JAY-peg”
– exploits eye’s poor perception of small
changes in colour variation
– handles 24 bit colour (1.6 million colours)
– allows adjustable lossy compression
– best for colour pictures of real objects
with varied colour, shadow, fuzzy edges
– among most common web image formats
Graphics Formats*
• PNG (Portable Network Graphics)
– designed to replace GIF and TIFF
– supports lossless compression
– supports 24 bit, grayscale and 8 bit
– supports transparency & interlacing
– offers better compression than GIF (15%)
– supported by new GD.pm Perl library
– problems with many early browsers in
viewing PNG (now fixed)
PovRay (www.povray.org)
Aliasing & Antialiasing*
True Image
Aliased Image
Anti-aliased Image
Ray Casting (from 3D to 2D)*
• Ray = beam of light
• For each pixel on
screen, cast ray from
eye thru pixel
• Test every object in
scene to see if ray
intersects object
• Each ray intersection
nearest to eye is made
visible, color pixel
Ray Tracing & Reflection*
• Used to determine
surface appearance
• Begins with ray
casting, determine
intersects, then
recursively sends
2ndary rays to see
which objects reflect,
which are transparent,
which absorb, etc.
Shadowing*
• Uses ray tracing
algorithm
• Sends out 2ndary
rays towards light
sources to see if
opaque objects are
in the way, if so,
then surface is in
shadow
• “shadow feelers”
HAV-3C Protease - Alan Gibbs
Outline
•
•
•
•
•
•
Visualization Programs
Vectors & Matrices
Difference Distance Matrices
Molecular Superposition
Measuring Superposition
Classifying 3D Structures
Vectors Define Bonds and
Atomic Positions
O
H3N+
z
y
x
Origin
H
CO bond
O
R
Review – Vectors*
z
(1,2,1)
^
^
^
u = 1i + 2j + 1k
y
u
1
u = 2
1
x
(0,0,0)
u = (1-0)2 + (2-0)2 + (1-0)2 = 6
Vectors have a length & a direction
Review - Vectors
• Vectors can be added together
• Vectors can be subtracted
• Vectors can be multiplied (dot or
cross or by a matrix)
• Vectors can be transformed (resized)
• Vectors can be translated
• Vectors can be rotated
Matrices*
• A matrix is a table or “array” of characters
• A matrix is also called a tensor of “rank 2”
6
5
1
6
3
8
7
0
4
4
9
9
1
3
3
4
3
0
5
4
A 5 x 6 Matrix
# columns
4
3
0
4
4
# rows
2
1
1
9
3
column
row
Different Types of Matrices
2
1
1
9
3
3
4
3
0
4
4
6
6
5
1
6
3
7
8
7
0
4
4
9
9
9
1
3
3
1
A square
Matrix
4
3
0
5
4
0
2
4
6
8
9
4
4
3
5
7
9
3
6
5
1
0
1
0
8
7
0
4
3
5
9
9
1
3
3
4
4
3
0
5
4
0
A symmetric
Matrix
1
3
5
9
7
3
A column
Matrix
(A vector)
Different Types of Matrices*
cosq
A
G
M
S
B
H
N
T
C
I
O
U
D
J
P
V
E
K
Q
W
F
L
R
X
A rectangular
Matrix
sinq
0
sinq -cosq
0
24689
0
0
A rotation
Matrix
1
A row
Matrix
(A vector)
Review - Matrix Multiplication
2 4 0
1 0 2
1 3 1 x 2 1 3
1 0 0
0 1 0
2x1
2x0
2x2
1x1
1x0
1x2
1x1
1x0
1x2
+
+
+
+
+
+
+
+
+
4x2
4x1
4x3
3x2
3x1
3x3
0x2
0x1
0x3
+
+
+
+
+
+
+
+
+
0x0
0x1
0x0
1x0
1x1
1x0
0x0
0x1
0x0
10 4 16
7 4 11
1 0 0
Rotation*
1
0
0 cosq
0 -sinq
z
q
y
f
x
cosf
-sinf
0
0
sinq
cosq
sinf
cosf
0
0
0
1
Rotate
about x
Rotate
about z
Rotation*
Counterclockwise about x Counterclockwise about z
1
0
0
0
0
cosq -sinq
sinq cosq
Clockwise about x
1
0
0 cosq
0 -sinq
0
sinq
cosq
cosf -sinf
sinf cosf
0
0
0
0
1
Clockwise about z
cosf
-sinf
0
sinf
cosf
0
0
0
1
Rotation
X
z
y
x
X
1
0
0 cosq
0 -sinq
1
0
0 cosq
0 -sinq
0
sinq
cosq
0
sinq
cosq
=
z
=
y
x
Rotation (Detail)*
1
0
0 cosq
0 -sinq
1
0
0 cosq
0 -sinq
0
sinq
cosq
0
sinq
cosq
z
z
=
X
y
y
x
x
1
1
1
1
cosq + sinq
-sinq + cosq
=
Comparing 3D Structures
•
•
•
•
•
Visual or qualitative comparison
Difference Distance Matrices
Superimposition or superposition
Root mean square deviation (RMSD)
Subgraph isomorphisms (Ullman’s
algorithm)
• Combinatorial extension (CE)
Qualitative Comparison
Same or Different?
Outline
•
•
•
•
•
•
Visualization Programs
Vectors & Matrices
Difference Distance Matrices
Molecular Superposition
Measuring Superposition
Classifying 3D Structures
Difference Distance Matrix*
D
E
D
C
B
E
C
B
A
A
Object A
Object B
Difference Distance Matrix*
A B C
A 0 4 5
B
0 4
C
0
D
E
D
6
6
3
0
E
4
7
6
3
0
A B C
A 0 4 5
B
0 4
C
0
D
E
C
A
E
4
3
5
3
0
A B C
A 0 0 0
B
0 0
C
0
D
E
D
0
1
0
0
E
0
4
1
0
0
D
D
E
D
6
5
3
0
B
E
C
B
A
Hinge motion
Difference Distance Matrix
Difference Distance
Matrices or DDM’s*
• Simplest method to perform structural
comparisons
• Requires no transfomations, no
rotations or superpositions
• Very effective at identifying “hinge”
motions or localized changes
• Produces a visually pleasing,
quantitative measure of similarity
Superposition*
• Objective is to match or overlay 2 or
more similar objects
• Requires use of translation and
rotation operators (matrices/vectors)
• Recall that very three dimensional
object can be represented by a plane
defined by 3 points
Superposition*
z
z
b
c’
b
c’
c
b’
c
b’
a
a
a’
a’
y
x
y
x
Identify 3 “equivalence” points in objects to be aligned
Superposition
z
z
b
c’
c’
c
b’
b
a
c
b’
a’
a
y
x
x
Translate points a,b,c and a’,b’,c’ to origin
y
Superposition
z
c’
z
c’
b
b
c
b’
a
x
b’
q
a
y
c
y
x
Rotate the a,b,c plane clockwise by q about x axis
Superposition
z
z
c’
c’
b
b’
b’
a
f
c
a
y
f
c
x
b
x
Rotate the a,b,c plane clockwise by f about z axis
y
Superposition
z
z
c’
b’
c’
b’
a
a
y
y
y
c
x
b
c
b
x
Rotate the a,b,c plane clockwise by y about x axis
Superposition
z
z
c’
q’
b’
c’
a
a
y
y
b’
c
x
b
c
b
x
Rotate the a’,b’,c’ plane anticlockwise by q’ about x axis
Superposition
z
c’
z
a
y
f‘
b’
c
x
a
b’
y
c’
b
c
b
x
Rotate the a’,b’,c’ plane anticlockwise by f’ about z axis
Superposition
z
z
a
c’
c
x
a
b’
y’
y
y
c’
b
c
b’
b
x
Rotate the a’,b’,c’ plane clockwise by y’ about x axis
Superposition
z
z
a
a
y
c’
c
x
y
c’
b’
b
c
b’
b
x
Apply all rotations and translations to remaining points
Superposition
z
z
b
c’
c
b’
a
a’
a
y
y
c’
c
x
b’
b
x
Before
After
Returning to the “red” frame
z
z
b
c
a
a
y
c’
c
y
b’
b
x
x
Before
After
Returning to the “red”
frame*
• Begin with the superimposed
structures on the x-y plane
• Apply counterclockwise rot. By y
• Apply counterclockwise rot. By f
• Apply counterclockwise rot. By q
• Apply red translation to red origin
Just do things in reverse order!
Shortcomings*
• Requires some initial assumptions
regarding the anchoring points for
superposition
• Anchoring points can’t always a priori
be known or easily calculated
• It “privileges” the first point “a” over
“c” which is in turn privileged over “b”
More General Approaches*
• Monte Carlo or Genetic Algorithms
• Matrix methods using least squares
or conjugate gradient minimization
(McLachlan/Kabsch)
• Lagrangian multipliers
• Rotation Angle Methods
• Quaternion-based methods (fastest)
Superposition – Applications*
• Ideal for comparing or overlaying two or
more protein structures
• Allows identification of structural
homologues (CATH and SCOP)
• Allows loops to be inserted or replaced from
loop libraries (comparative modelling)
• Allows side chains to be replaced or
inserted with relative ease
Outline
•
•
•
•
•
•
Visualization Programs
Vectors & Matrices
Difference Distance Matrices
Molecular Superposition
Measuring Superposition
Classifying 3D Structures
Measuring Superpositions
Molecule b
Molecule a
RMSD - Root Mean Square Deviation*
• Method to quantify structural similarity same as standard deviation
• Requires 2 superimposed structures
(designated here as “a” & “b”)
• N = number of atoms being compared
RMSD =
Si (xai - xbi)2+(yai - ybi)2+(zai - zbi)2
N
Superpositions for Multiple
Structures
RMSD - For Multiple Structures*
• Requires multiple superimposed
structures over a single “averaged”
structure (x,y,z)
• N = number of atoms being compared
• M = number of structures superimposed
RMSD = S
a
Si (xai - xi)2+(yai - yi)2+(zai - zi)2
N
M
RMSD without Superposition*
A B C
A 0 4 5
B
0 4
C
0
D
E
D
6
6
3
0
E
4
7
6
3
0
A B C
A 0 4 5
B
0 4
C
0
D
E
C
A
E
4
3
5
3
0
A B C
A 0 0 0
B
0 0
C
0
D
E
D
0
1
0
0
E
0
4
1
0
0
D
D
E
D
6
5
3
0
B
E
C
B
A
RMS = 1+4+1 = 1.89
10
RMSD*
•
•
•
•
•
•
0.0-0.5 Å
<1.5 Å
< 5.0 Å
5.0-7.0 Å
> 7.0 Å
> 12.0 Å
•
•
•
•
•
•
Essentially Identical
Very good fit
Moderately good fit
Structurally related
Dubious relationship
Completely unrelated
SuperPose Web Server
http://wishart.biology.ualberta.ca/SuperPose/
Outline
•
•
•
•
•
•
Visualization Programs
Vectors & Matrices
Difference Distance Matrices
Molecular Superposition
Measuring Superposition
Classifying 3D Structures
Classifying Protein Folds*
Lactate
Dehydrogenase:
Mixed a / b
Immunoglobulin
Fold: b
Hemoglobin B
Chain: a
Detecting Unusual
Relationships
Similarity between Calmodulin and Acetylcholinesterase
Classifying Protein Folds
SCOP Database
http://scop.mrc-lmb.cam.ac.uk/scop
SCOP
• Class folding class derived from
secondary structure content
• Fold derived from topological
connection, orientation, arrangement
and # 2o structures
• Superfamily clusters of low sequence
ID but related structures & functions
• Family clusers of proteins with seq ID
> 30% with v. similar struct. & function
SCOP Structural
Classification
The eight most frequent SCOP superfolds
The CATH Database
http://www.cathdb.info
CATH
• Class [C] derived from secondary
structure content (automatic)
• Architecture (A) derived from
orientation of 2o structures (manual)
• Topology (T) derived from topological
connection and # 2o structures
• Homologous Superfamily (H) clusters
of similar structures & functions
CATH - Class
Class 1:
Mainly Alpha
Class 2:
Mainly Beta
Class 3:
Mixed
Alpha/Beta
Class 4:
Few Secondary
Structures
Secondary structure content (automatic)
CATH - Architecture
Roll
Super Roll
Barrel
2-Layer
Sandwich
Orientation of secondary structures (manual)
CATH - Topology
L-fucose Isomerase
Serine Protease
Aconitase,
domain 4
TIM Barrel
Topological connection and number of secondary structures
CATH - Homology
Alanine racemase
Dihydropteroat
e (DHP)
synthetase
FMN dependent
fluorescent
proteins
7-stranded
glycosidases
Superfamily clusters of similar structures & functions
Other Servers/Databases
•
•
•
•
•
•
Dali - http://ekhidna.biocenter.helsinki.fi/dali_server/
VAST - http://www.ncbi.nlm.nih.gov/Structure/VAST/vast.shtml
Matras - http://biunit.aist-nara.ac.jp/matras/
CE - http://cl.sdsc.edu/ce.html
TopMatch - http://topmatch.services.came.sbg.ac.at/
PDBsum - http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/
CE Search
http://cl.sdsc.edu/ce/all-to-all/1-to-all.html
CE Search
Summary
• Many different tools and formats to
visualize 3D structure – learn how to
use at least one of them
• Visualization on computers is mostly
about matrix and vector manipulation
• Structure comparison also requires
the use of linear algebra
• Protein structures can be compared
and aligned – just like sequences
Download