Structural Biology Practical, Nov. 2012

Structural Biology Practical
MolBio / GGNB A57
December 2012
Tim Grüne
Tutors:
Caroline Behrens
Inessa De
Kevin Pröpper
Tales Rocha de Moura
Course Dates:
Monday – Friday, 1pm – 5pm
1
Data Integration with HKL2000
1.1
Getting Started
The major part of this practical is computer based. The computing environment is completely Linux-based. This section contains a few
basics to get familiar with it.
1.2
Logging in and setting up our files
There are five accounts set up for this practical with
usernames: mb01, mb02, mb03, mb04, mb05
passwords: mb01, mb02, mb03, mb04, mb05
Each group should pick one unique username and password and use it on both days. This prevents mangling of the data.
The computers have their names printed on them. The ones suited for this course are
• ganymede
• klio
• medusa
• stheno
• urania
All computers have the same setup and your files are accessible from all these computers. Therefore it is not important that you stick to the
same computer all the time.
The computer network in the Sheldrick group is separated from the internet. In order to access the internet, different computers must be
used (the tutors will tell you which ones they are).
The usernames on those computers are
usernames: pg1, pg2, pg3, pg4, pg5
and the password for all these usernames is
ohMou9Oo
Passwords are case sensitive
After logging in your desktop looks a little desolate. You are going to need a terminal from which you can type commands. In order to get
your first terminal, type
Alt-F2
which opens a small window that allows you to type in commands. Type
konsole
to get a terminal.
From now on, the terminal will appear in your control menu (the blue button at the bottom of the screen with the “K”).
(The screenshots of this tutorial stem from the GGNB practical, so please don’t get confused when some directories in the pictures read
ggnb instead of mb.)
In order to keep track of the process you should keep the different parts of this course in separate directories. For the first part, create a
directory integration using the Linux command mkdir by typing in the terminal window
#> mkdir integration
#> cd integration
The first command creates the directory, the second one changes into that directory.
1.3
Data Integration with HKL2000
Now you can start the integration program HKL2000 by typing this command in the same terminal. The first window that appears asks for
the detector type. Our data were collected on a Mar 345 image plate.
Selecting the correct detector.
Once you have selected the Mar345 detector and clicked OK, the main window of HKL2000 appears:
Make sure that the New Output Data Dir lists the directory
/net/home/mbXX/integration
where ’XX’ corresponds to the number in your username.
Correct output directory?
Next you have to tell HKL2000 where the data frames are (New Raw Data Dir), because they are not in the directory you just created.
In the subwindow Directory Tree you have to double-click on the net-folder, so that you can see ganymede and home. Double-click
your way through to
net->ganymede->ggnb-I->frames
Thereafter click on the >> below New Raw Data Dir (not the one below New Output Data Dir!)
When the New Raw Data Dir points to the correct directory /net/ganymede/ggnb-I/frames, you can click on the Load Data
Sets-button and you can see the frames that HKL2000 found in that directory.
Loading data sets.
After OK, the HKL2000 window looks like the next picture. Note that the bottom fields are now filled in as far as HKL2000 could extract
the information from the file header.
Data set information.
When you click on the Display button, HKL2000 shows the first frame. (If you do not see anything, move the main window aside.
Sometimes HKL2000 opens the new window behind the main window.)
Frame number 1.
1.3.1
Indexing
We must next ask HKL2000 to index our data, i.e.
1. find an (approximate) unit cell consistent with the pattern in the frame
2. assign the Miller indices to the reflections
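To make the link between a unit cell and the Miller indices concrete, here is a minimal Python sketch. It is not part of HKL2000, and the cell edges are invented for illustration (they are not those of the course data). It assumes a tetragonal cell (a = b, all angles 90°), for which the d-spacing of a reflection (h, k, l) follows from 1/d² = (h² + k²)/a² + l²/c².

```python
import math

def d_spacing_tetragonal(h, k, l, a, c):
    """Return the d-spacing (in Å) of reflection (h, k, l) for a tetragonal cell.

    a and c are the cell edges in Å (a = b, all angles 90 degrees).
    """
    if (h, k, l) == (0, 0, 0):
        raise ValueError("(0, 0, 0) is not a reflection")
    inv_d_squared = (h * h + k * k) / (a * a) + (l * l) / (c * c)
    return 1.0 / math.sqrt(inv_d_squared)

# Hypothetical cell edges, chosen only for illustration.
A_EDGE, C_EDGE = 60.0, 100.0
for hkl in [(1, 0, 0), (0, 0, 1), (3, 4, 0)]:
    print(hkl, round(d_spacing_tetragonal(*hkl, A_EDGE, C_EDGE), 2))
```

Reflections with larger indices have smaller d-spacings, i.e. they lie further out on the detector and carry the high-resolution information.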
First select the Index-tab at the top of the main window. HKL2000 must find the reflections on the image. This is what the Peak Search
button in the main window is for.
Click the Index button. HKL2000 opens a new window which offers a choice of Bravais lattices. The green ones fit well with the diffraction
pattern, the red ones fit only poorly. Furthermore, the Bravais lattices are ordered according to their degree of symmetry. Pick the top green
one, primitive tetragonal. The colour of the peaks in the frame window changes.
Once an initial cell has been found, the unit cell parameters and the experimental settings (detector distance, ...) can be refined, i.e., their
values improved based on the collected data. Hit the Refine button in the main window 3-4 times until the numbers in that
window no longer change much. Now click Fit All in order to include all parameters, also select the button Mosaicity, and click
on the Refine button again several times until the numbers stabilise. The colour on the frame display has
changed again. The colour indicates whether a reflection is a full reflection, i.e. whether the complete reflection has been recorded on that
image, or whether part of it is on one of the adjacent frames.
Question: Why are the spots on the circles not enclosed? Is HKL2000 making a mistake?
1.3.2
Box- and Spot-Size for Integration
You can give HKL2000 an idea about how big the spots are and how much area the program should use in order to determine the background
around each spot. Both are important settings for a proper data integration.
Click the Zoom window-button in the Frame Display. This opens a third window. Middle-click in the Frame Display in an area with spots
and adjust the brightness (in the Frame Display) so that you can clearly see the spots.
The Int. box-button in the Zoom window shows the current settings for the background area (square) and the spot size (circle). There are
actually two circles, the area in between the two is the “transition” between spot and background. You can see that some of the boxes overlap
with each other, and the circles seem too small to encompass the spots, at least the larger ones.
Click Zoom in twice for a good view.
First click on Box size in the main window. A value of 20-25 seems reasonable for this data set. You must click Refine in
the main window to see the effect of the change.
Similarly, increase the Spot size so that the spots fit into the circles. It does not matter if the circles are too big, but it does if they are too
small. A Spot size of about 0.75 seems a good choice.
Now click the Refine-button in the main window two to three more times to adjust to the new settings. Before advancing to the integration
step, you can tell the program where the shadow of the beam stop is so that it excludes this area during integration. With weak data this can
be important to improve the data quality, but it is also good practice for high resolution data.
Please ask the tutors to show how to carry out this step.
Then click the Integrate-button to start the integration, and lean back for a few minutes.
1.3.3
Integration
Integration starts automatically. The Integration-tab of the main window shows the progress of integration. The program now further
improves the experimental parameters. The bottom right window shows the variation of some of the parameters. They usually fluctuate a
little.
With most decent data sets, there is little to adjust during or after integration with HKL2000.
2
Data Preparation for Rest of Practical
Before you can continue, you have to get a copy of the files required for the rest of this practical.
Find the terminal window so that you can type commands and type
#> cd
The command cd without an argument takes you directly into your home-directory /net/home/mbXX.
Create a directory for the practical, change into it and copy the required files:
#> mkdir practical
#> cd practical
#> cp -rv /net/ganymede/molbio/ex* .
(The period at the end of the last line is part of the command. In the UNIX world it means “here”, i.e. the current directory.)
3
Electron Density, Model, and Secondary Structure
This section introduces the program coot. This program is a graphical model building interface. It can display PDB–files and electron
density maps.
The data are in the directory practical/ex2, so find a terminal, and change into this directory:
#> cd ~/practical/ex2
(The “~” is a short-cut for your home-directory)
1. Open a terminal and have a look at the file exercise2.pdb with the text editor kwrite by typing
kwrite exercise2.pdb
As you can see, PDB–files are plain text files. The information in the “header” of the file (i.e., the lines not starting with ATOM) tells
you a few things about which program created the file and about the refinement statistics. Because this file contains data which were
not deposited, there is no author information or other information about the molecule itself.
2. Start the program coot from the terminal.
3. Create an electron density map by loading the file exercise2.mtz:
File -> Auto Open MTZ
At first nothing seems to happen. This is because we are looking at the origin of the coordinate system. In this particular case, there is
no density at the origin (solvent region). Move around by holding the Ctrl-key and the left mouse button at the same time and move
the mouse while you keep on holding the Ctrl-key!
The default diameter of the map is 10Å.
In order to see more, select the
Edit->Map Parameters...
entry and increase it to 15Å.
WARNING: If you set this value too high, you may overload the graphics card of the computer and make your computer
really slow!
This shows electron density in three different colours. The blue colour displays the main map which we want to explain with a model.
The red and green part show differences between our model and our data. Red indicates “too much model” and green indicates “not
enough model”. We will come back to what this means shortly.
4. To rotate the map, hold down the left mouse button and move the mouse. To move (translate) the map, hold the Ctrl–key plus the
left mouse button while you move the mouse.
The yellow cube indicates the origin (0,0,0), while the small pink box indicates the centre of rotation.
To zoom in or out, hold down the right mouse button while you move the mouse left or right or up or down.
To change the level of detail of the map (the sigma–level), use the scroll wheel. With a lower sigma level one sees more, but at about
1σ or less, the noise overcomes the meaningful data.
Can you already make out features, e.g. secondary structure elements, or side chains?
What is the rule of the “Christmas tree”?
5. Now load the PDB-file exercise2.pdb.
File -> Open Coordinates...
The electron density becomes much easier to understand.
6. This is a low resolution structure. The data (and hence the map) have a resolution of 3.4Å, which is rather poor for a protein crystal
structure. At this resolution, the atomic positions have to be considered with care.
Instead look at the secondary structure of the protein. To better see it, select the
Display Manager
and select C-alphas for the Molecule exercise2.pdb. You can also switch off the map by clicking the Display–button next
to the two map entries in the Display Manager.
How many α–helices, and how many β–strands does this protein consist of?
7. Now use the centre mouse button to centre on the C–terminus of the model and redisplay the electron density map and look around a
little. How do you judge the quality of this model?
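Step 1 above noted that PDB–files are plain text. As a minimal sketch of what that means in practice, the following Python snippet reads the coordinate records using the fixed column layout of the PDB format. The two records are invented for illustration and are not taken from exercise2.pdb; the function names are likewise only illustrative.

```python
def parse_atom_line(line):
    """Parse one ATOM/HETATM record using the fixed PDB column layout."""
    return {
        "name": line[12:16].strip(),      # atom name, e.g. "CA"
        "res_name": line[17:20].strip(),  # residue name, e.g. "THR"
        "chain": line[21],                # chain identifier
        "res_seq": int(line[22:26]),      # residue number
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "b_factor": float(line[60:66]),   # temperature factor in Å^2
    }

def read_atoms(lines):
    """Return the parsed coordinate records, skipping all header lines."""
    return [parse_atom_line(l) for l in lines if l.startswith(("ATOM", "HETATM"))]

# Two made-up records, for illustration only.
EXAMPLE = [
    "REMARK   invented example",
    "ATOM      1  N   THR A 278      11.104  13.207   2.100  1.00 20.50           N",
    "ATOM      2  CA  THR A 278      12.560  13.100   2.300  1.00 22.75           C",
]
atoms = read_atoms(EXAMPLE)
c_alphas = [a for a in atoms if a["name"] == "CA"]
print(len(atoms), c_alphas[0]["res_name"], c_alphas[0]["b_factor"])
```

Selecting only the atoms named CA is the text-file equivalent of the C-alphas display in the Display Manager.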
4
Model Building
In this exercise you are going to look at the model and density of Thermolysin, a heat-stable metalloproteinase produced by Bacillus
stearothermophilus that hydrolyses peptide bonds on the amino side of bulky hydrophobic residues such as Leu, Ile, Val, and Phe. You are
going to use coot to build some missing residues and correct the placement of a few side chains.
4.1
Major Corrections
1. Start coot again from a terminal.
2. Auto Load the electron density map of exercise3.mtz. These data have a resolution of 1.7Å. You should be able to see a
difference to exercise 1. Find some aromatic side chains. They have holes now.
3. Open Coordinates from exercise3.pdb.
4. Centre on residue Thr 278:
Draw -> Go To Atom ...
and scroll down until you find Thr 278. Double-click on it to centre.
Downstream of Thr 278 starts a stretch of blue density which overlaps with some green (difference) density: the model is missing atoms
to explain the density and you have to add them.
5. First tell the program which map you want to build against:
Calculate -> Model/Fit/Refine... -> Select Map... -> OK
6. Now add a new peptide to residue Thr 278:
Add Terminal Residue...
then click on any atom of residue Thr 278. If the peptide fits more or less in density, accept.
7. Residue 279 is now an Alanine. But according to the sequence it should be a different residue. Try to guess which one.
8. Coot lets you Mutate & Auto Fit ... the new residue.
9. Fill the whole gap by alternately adding a terminal residue and mutating it to the correct type.
4.2
Minor Corrections — Rotamers
1. Centre on Trp55. Remember that “red” indicates “too many atoms” and “green” means “not enough atoms”: The Tryptophan is looking
the wrong way. To correct it, click on Rotamers ... and then on some atom of Trp55. Select and accept the rotamer that best fits
the density.
2. To improve the fit, Edit Chi Angles and move the Tryptophan into the density as well as possible.
3. To finish off, do a Real Space Refine Fit on Trp55: click the button and then two atoms of Trp55.
4. What you did may have violated stereo-chemical restrictions (ideal bond lengths, angles, etc.). To correct for this, click the Regularise
Zone–button. Click once on a residue about two residues before Trp55 and once on a residue about two residues after Trp55.
5. Look at Met120. There is quite a bit of red and green density around that residue. Do you have an idea to explain the density?
5
The Effect of Low and High Resolution
Now you are going to examine two maps with different resolution ranges to learn the importance both of high and low resolution data. You
are going to look at data from Tendamistat, a bacterial inhibitor of mammalian α–amylases (a digestive enzyme that breaks down starch).
1. Load the low–resolution map of Tendamistat, exercise4a.mtz and look around. It is difficult to recognise side chains, like with
the first map you have seen.
2. Load the file exercise4.pdb. Now it is easier to see that the model fits the density so that the density makes sense.
3. Go to residue Asn272. Some residues are missing there. Can you make out their types?
4. Load the map from exercise4b.mtz. This map was calculated with data between 1Å and 2Å only, i.e., all reflections at lower
resolution than 2Å (d-spacing larger than 2Å) were omitted. It looks very noisy, but it shows peaks around the atoms.
5. Does the second map help to guess the right residues?
6. Load a map from all data, which covers data between 1.0Å and 20Å, exercise4c.mtz.
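The resolution limits in steps 4 and 6 are easy to misread, because “high resolution” corresponds to a small d-spacing. The following Python sketch shows the selection logic only; the d-spacings are invented, and this is not how the exercise maps were actually produced.

```python
def in_resolution_range(d_spacing, d_min, d_max):
    """True if a reflection with the given d-spacing (Å) lies within [d_min, d_max]."""
    return d_min <= d_spacing <= d_max

# Made-up d-spacings in Å, for illustration only.
d_values = [15.0, 4.2, 2.5, 2.0, 1.6, 1.1]

# "Data between 1Å and 2Å only": the low-resolution reflections are dropped.
high_only = [d for d in d_values if in_resolution_range(d, 1.0, 2.0)]
# "Data between 1.0Å and 20Å": everything in this example is kept.
all_data = [d for d in d_values if in_resolution_range(d, 1.0, 20.0)]
print(high_only)
print(all_data)
```

In this toy list, the 1Å to 2Å selection discards the three reflections with d-spacings above 2Å, which are the ones that carry the overall shape of the molecule.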
6
Model Analysis and Validation
6.1
Secondary Structure
(a) Change to directory ex5.
(b) Load exercise5.pdb.
(c) Confirm the topology drawing.
(d) To aid your checking, activate the “environment distances” for the centred atom:
Measures -> Environment Distances
click on “Show Environment Distances” and limit the distances to 2.5–3.3Å.
(e) Confirm the beginnings and the ends of the α–helices by checking the hydrogen bonds between the O of residue n and the
N of residue n + 4. For a hydrogen bond between an Oxygen and a Nitrogen atom, their distance ought to be roughly 2.7–3.2Å.
(f) Look at the same file with the pymol viewer program: From a terminal/ console, type
pymol exercise5.pdb
In the top right part of the graphical window there are two menu entries, one saying all and the other saying exercise5. The
five boxes to their right stand for
• Action
• Show
• Hide
• Label
• Colour
Click the S->cartoon and H->lines to remove the chicken wire model and show the molecule as cartoon. For a nicer view,
click C->by ss->Helix Sheet Loop
(g) You see, the program detects fewer β–strands. For a publication one would have to check at least beginning and end of each
secondary structure element and compare with at least one other program.
(h) Next, select A->preset->b factor putty. This colours the atoms according to their B–value. The diameter of the tube
also corresponds to the B-factor.
(i) Can you explain which regions are blue (low B-factor) and which are green (medium B-factor) or even red (high B-factor)?
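The hydrogen-bond check from step (e) can also be written down as a short calculation. The sketch below assumes the usual α–helical pattern, in which the backbone O of residue n accepts a hydrogen bond from the backbone N of residue n + 4, and uses the rough 2.7–3.2Å range quoted above. The coordinates are invented and not taken from exercise5.pdb; the function names are illustrative only.

```python
import math

H_BOND_MIN, H_BOND_MAX = 2.7, 3.2  # rough O...N distance range in Å from step (e)

def distance(p, q):
    """Euclidean distance between two (x, y, z) points in Å."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def helix_h_bonds(backbone):
    """Yield (n, n + 4, distance) for residue pairs whose O(n)...N(n+4)
    distance falls in the rough hydrogen-bond range.

    backbone maps residue number -> {"N": xyz, "O": xyz}.
    """
    for n in sorted(backbone):
        partner = backbone.get(n + 4)
        if partner is None:
            continue  # no residue n + 4 in the model
        d = distance(backbone[n]["O"], partner["N"])
        if H_BOND_MIN <= d <= H_BOND_MAX:
            yield n, n + 4, d

# Invented coordinates, for illustration only.
backbone = {
    10: {"N": (0.0, 0.0, 0.0), "O": (1.0, 2.0, 2.0)},
    11: {"N": (20.0, 0.0, 0.0), "O": (21.0, 0.0, 0.0)},
    14: {"N": (1.0, 2.0, 5.0), "O": (9.0, 9.0, 9.0)},
    15: {"N": (30.0, 0.0, 0.0), "O": (31.0, 0.0, 0.0)},
}
for n, m, d in helix_h_bonds(backbone):
    print(f"O of residue {n} ... N of residue {m}: {d:.2f} Å")
```

A distance criterion alone is a crude test; programs that assign secondary structure use additional geometric criteria, which is one reason why different programs disagree on where a helix or strand begins and ends.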
6.2
Structure Validation — Ramachandran Plots et al.
Back in coot, you are going to get to know some of its tools for judging the quality of a structure.
(a) Close pymol by selecting File -> Quit and go back to the coot window.
(b) Load the file exercise5b.pdb. It is the structure of ketosteroid isomerase.
(c) Click
Validate -> Ramachandran Plot -> 0 exercise5b.pdb
This should open an interactive window showing the Ramachandran plot of this structure.
(Figure: Ramachandran plot, with the labels General, Proline, Glycine, and Outlier.)
Depending on where you place the mouse pointer, the plot changes to the Ramachandran plot of the corresponding residue class:
Prolines are much more restricted while Glycines are less restricted than general peptides.
(d) Click on the outlier marked red. Coot focuses the main window on that residue.
(e) Load the corresponding map from exercise5b.mtz
(f) From the Model/Fit/Refine–window, select Edit Backbone Torsion and click on Asn93.
(g) Play around with the Φ and Ψ angles and try to improve the fit to the Ramachandran plot. Since there is nearly no density there
to judge, we should at least make sure the loop fits the Ramachandran plot.
(h) Now let coot try to refine the part: From the Calculate -> Model/Fit/Refine... menu, select
Real Space Refine Zone
and select the residues 90–96. Is the suggestion acceptable? When you accept, look at the Ramachandran plot!
(i) Now have a look at the B–factor plot of the Cα main chain atoms of this structure.
(Figure: B–factors of Cα. Horizontal axis: residue number, 0 to 140; vertical axis: temperature factor in Å², 0 to 90.)
The largest peak is around residue 95, just around where you fixed the Ramachandran plot before. Look at the Cα trace of the
protein. Can you imagine why the B–factors are high in this region?
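The Φ and Ψ angles shown in the Ramachandran plot are dihedral (torsion) angles defined by four consecutive backbone atoms: Φ by C(i−1), N(i), Cα(i), C(i) and Ψ by N(i), Cα(i), C(i), N(i+1). The following Python sketch shows one common way to compute a signed dihedral angle from four points. It is not coot's own code, and the test points are an idealised geometry rather than protein coordinates.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond (cis = 0)."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(prev_c, n, ca, c, next_n):
    """Backbone torsions of one residue: phi = C(i-1)-N-CA-C, psi = N-CA-C-N(i+1)."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)

# Idealised test geometry: the first three points are fixed,
# only the fourth one moves around the central bond.
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
for p3 in [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (-1.0, 0.0, 1.0)]:
    print(p3, round(dihedral(p0, p1, p2, p3), 1))
```

The three test points correspond to a cis arrangement (0°), a quarter turn (90°), and a trans arrangement (180° in magnitude).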
7
Finding the active site
This time you are going to look at the structure of the protease thermolysin. Structures of thermolysin usually retain a Val–Lys
peptide in the active site. You are going to locate the active site and fit the peptide.
(a) Load the file exercise6.pdb and the map from exercise6.mtz. You may already notice a major “green” area, indicating
that something is missing there.
(b) Load the file val-lys.pdb. It contains a Valine–Lysine di-peptide.
(c) Instead of trying to find the ligand yourself, you can ask coot to do it for you:
Calculate -> Other Modelling Tools -> Find Ligands
(d) Leave the default values except for three parts:
i. select the Di-peptide as ligand to search for
ii. select the ’flexible’ search. This lets the program take into account that the peptide can make rotations about certain bonds,
e.g. the C − N bond between the two residues.
iii. set the σ–level for the search to a value slightly lower than what you use for looking at the map, e.g. around 1.3
(e) The peptide should end up within the extra density, but the fit is far from good. Manual adjustment is obviously necessary. Since
the Lysine looks more obvious, we will start with the C–terminal residue.
(f) Move the Cα atom of the Lysine to where it belongs:
Calculate -> Model/Fit/Refine -> Translate/ Rotate Zone
and click once on some atom of the Valine and once on some atom of the Lysine.
(g) What do you notice about the density near the Oxygen of the Lysine?
(h) Add the second terminal Oxygen:
Calculate -> Other Modelling Tools ... -> Add OXT to Residue ... -> Fitted Ligand #0
(i) Flip the peptide between the two residues: On the Model/Fit/Refine Menu, select Flip Peptide and click on an atom
of the Valine.
(j) Now try to move and rotate the Valine alone into density, just coarsely.
(k) When both residues are roughly in density, the stereochemical aspects are most likely awful. Use the Real Space Refine
Zone option for coot to correct the fit and finally do a Regularise Zone to “polish” it.
(l) If the side chains of the Lysine are still not where they should be, you can Edit Chi Angles to make them fit better.
(m) In order to put the protein and its ligand into a refinement program, they must be within one PDB–file, i.e., the two structures
must now be merged:
Calculate -> Merge Molecules
and select the fitted ligand and the molecule exercise6.pdb. Now we can
File -> Save Coordinates
our modified molecule in order to hand it over to a refinement program which will make further improvements and calculate new
phases for us.
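To illustrate what “within one PDB–file” means at the level of the text file, here is a deliberately simplified Python sketch. It is not what coot does internally: it only concatenates the coordinate records of two models and renumbers the atom serial numbers, and it ignores chain identifiers, TER records, and the header. The records and the function name are invented for illustration.

```python
def merge_pdb_records(protein_lines, ligand_lines):
    """Concatenate the coordinate records of two models and renumber the atom serials."""
    merged = []
    for line in list(protein_lines) + list(ligand_lines):
        if line.startswith(("ATOM", "HETATM")):
            serial = len(merged) + 1
            # Columns 7-11 of a coordinate record hold the atom serial number.
            merged.append(line[:6] + f"{serial:5d}" + line[11:].rstrip("\n"))
    merged.append("END")
    return merged

# Made-up records, for illustration only.
protein = [
    "REMARK   invented example",
    "ATOM      7  CA  THR A 278      12.560  13.100   2.300  1.00 22.75           C",
]
ligand = [
    "ATOM      1  CA  VAL B   1       5.000   6.000   7.000  1.00 30.00           C",
]
for record in merge_pdb_records(protein, ligand):
    print(record)
```

For real work, use Merge Molecules in coot as described in step (m); the sketch only shows that the result is again an ordinary list of coordinate records in a single text file.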
8
Playing with Pymol
If there is time left, you can open pymol again with any of the above PDB-files and play around with its options to make nice pictures.
E.g. you can calculate the electrostatic surface potential with A -> generate -> vacuum electrostatics -> protein
contact potential. You can see strongly charged patches and less strongly charged parts on the surface.
Next generate the symmetry related molecules in the crystal: A->generate ->symmetry mates -> within 12A.
With the file exercise5.pdb you can observe that the contacts are actually at regions where the electrostatic potential is comparatively
weak. This supports the notion that protein-protein contacts are usually controlled by van der Waals (hydrophobic) interaction and not
electrostatic forces.