X-ray structure re-refinement

Using X-ray structures
for bioinformatics
Robbie P. Joosten
Netherlands Cancer Institute
Autumnschool 2013
Introduction
Structures in bioinformatics
• Understand biology
– Direct interpretation
– Data mining
– Homology modeling
• Drug design
• Molecular dynamics
Basic rule:
Better structures → Better results
Introduction
Right structure(s) for the job
1. Selection: find (a number of)
PDB entries
2. Validation: check the quality of
your selection
3. Optimisation: maximise the
quality of your selection
Focus on X-ray structures
Selection
X-ray structures have a history
1. Protein expression
2. Crystallisation
3. X-ray diffraction experiment
4. Model building and refinement
5. Deposition at the PDB
All these steps affect
the final PDB file
History
Protein expression
A ‘construct’ is made
• Partial proteins
– E.g. only extracellular domain
of membrane protein
• Frankenstein proteins
– Fusion proteins or chimeras
• Mutants are introduced
– Some by accident!
• Poly-histidine tags added for purification
• Altered glycosylation state
– Large sugars hamper crystallisation
History
Crystallisation
The protein stacks regularly to form
a crystal
• Protein still functional in the crystal
• Much solvent in the crystal (~40%)
• Some residues can move
– Disorder: missing loops/side chains
– Alternate conformation
History
Crystallisation
Beware of crystal packing
• One copy of the protein can influence
the next
History
Crystallisation
Chemicals are used for crystallisation
• Buffers to stabilise the pH
• Precipitants
– Change solubility of the protein
– Neutralise local charges
– Bind water
– High concentrations are used
• Compounds compete with natural ligands
• Examples:
– Polyethylene glycol (PEG)
– Ammonium sulphate
History
Crystallisation
Beware of the crystallisation conditions
History
X-ray diffraction
Typical experiment
Detector
X-ray source
History
X-ray diffraction
• X-rays interact with electrons
– Atoms with few electrons (H, Li) do not
diffract well
• X-rays cause damage to the protein
– Acidic groups (Asp and Glu) can be destroyed
– Disulphide bridges are broken
– Hydrogens are stripped
– Cooling crystals in liquid nitrogen helps
• Glycerol added to the crystal!
History
X-ray diffraction
• We are not using a microscope
• We don’t measure everything we need
$$\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l} F_{hkl}\, e^{-2\pi i\,[hx + ky + lz - \alpha_{hkl}]}$$
Measured: F_hkl
Missing: phase α_hkl
X-ray diffraction gives an indirect
and incomplete measurement
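Why the missing phase matters can be shown with a toy one-dimensional synthesis. This is only a sketch: the amplitudes and phases are made-up numbers, the function name `density_1d` is hypothetical, and phases are assumed to be given as fractions of a cycle.

```python
import cmath
import math


def density_1d(amplitudes, phases, x, volume=1.0):
    """Toy 1-D Fourier synthesis.

    rho(x) = (1/V) * sum over h of F_h * exp(-2*pi*i*(h*x - alpha_h))

    amplitudes[h] and phases[h] cover h = 0..H. The terms for -h are
    added via Friedel's law (same amplitude, opposite phase) so the
    density is real. Phases are fractions of a cycle (assumption).
    """
    total = 0j
    for h, (f, alpha) in enumerate(zip(amplitudes, phases)):
        total += f * cmath.exp(-2j * math.pi * (h * x - alpha))
        if h > 0:  # Friedel mate: index -h, phase -alpha
            total += f * cmath.exp(-2j * math.pi * (-h * x + alpha))
    return total.real / volume


# Same (hypothetical) amplitudes, two different phase sets,
# sampled at x = 0, 1/4, 1/2, 3/4 of the unit cell.
amplitudes = [1.0, 0.5]
right = [density_1d(amplitudes, [0.0, 0.25], x / 4) for x in range(4)]
wrong = [density_1d(amplitudes, [0.0, 0.00], x / 4) for x in range(4)]
```

With identical amplitudes the density peak sits at a different position for each phase set, which is why the measured data alone do not define the map.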
History
Model building and refinement
Iterative process
[Diagram: measured X-ray diffraction data + initial phases → FT → electron density maps → structure model → phases + calculated X-ray data → back to new electron density maps]
History
Model building and refinement
Two types of maps
1. Regular electron density map (2mFo-DFc)
2. Difference map (mFo-DFc)
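The two map names spell out their Fourier coefficients. A minimal sketch for a single reflection, assuming Fo and Fc are the observed and calculated amplitudes and that the weights m (figure of merit) and D (scale factor) are supplied by the refinement program; the function name is hypothetical. Both amplitudes are then combined with the phases calculated from the model.

```python
def map_amplitudes(f_obs, f_calc, m, d):
    """Return the (2mFo-DFc, mFo-DFc) amplitudes for one reflection.

    f_obs, f_calc: observed and calculated structure-factor amplitudes.
    m, d: per-reflection weights, assumed to be given (not estimated here).
    """
    regular = 2.0 * m * f_obs - d * f_calc   # regular electron density map
    difference = m * f_obs - d * f_calc      # difference map
    return regular, difference
```

For example, `map_amplitudes(100.0, 90.0, 1.0, 1.0)` gives `(110.0, 10.0)`. In the resulting difference map, positive peaks suggest density the model does not explain and negative peaks suggest atoms the data do not support.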
History
Model building and refinement
Fitting atoms to the ED map and trying to
remove difference density peaks
History
Model building and refinement
• Requires skill and experience
• Requires time and patience
• Requires good software
Lack of any of these can be
seen in the final PDB file
History
Deposition at the PDB
• Both coordinates and experimental
X-ray data are deposited
• PDB standardises files and adds
annotation
• Sometimes things go wrong
History
Deposition at the PDB
LINKs between alternate conformations
History
Deposition at the PDB
Un-biological LINKs
(in 1a1a)
LINK  C   ACE C 100    N   PTH C 101
LINK  C   PTH C 101    N   GLU C 102
LINK  CF  PTH C 101    OG  SER A 188
LINK  N   DIP C 103    C   GLU C 102
LINK  C   ACE D 100    N   PTH D 101
LINK  C   PTH D 101    N   GLU D 102
LINK  N   DIP D 103    C   GLU D 102
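LINK records are fixed-column text, so they are easy to screen before trusting them. The sketch below assumes the standard PDB-format column layout for LINK records; the names `LinkAtom`, `parse_link` and `joins_different_altlocs` are hypothetical. It only flags links that join two different alternate conformations. Whether a link makes biological sense still has to be judged by a person.

```python
from typing import NamedTuple


class LinkAtom(NamedTuple):
    atom: str
    altloc: str
    resname: str
    chain: str
    resseq: int


def _read_atom(line, start):
    """Read one atom description; start is the 0-based index of the atom name."""
    return LinkAtom(
        atom=line[start:start + 4].strip(),
        altloc=line[start + 4].strip(),
        resname=line[start + 5:start + 8].strip(),
        chain=line[start + 9],
        resseq=int(line[start + 10:start + 14]),
    )


def parse_link(line):
    """Return the two atoms joined by a PDB LINK record."""
    if line[:6].strip() != "LINK":
        raise ValueError("not a LINK record")
    line = line.ljust(80)
    # First atom starts at column 13, second atom at column 43 (1-based).
    return _read_atom(line, 12), _read_atom(line, 42)


def joins_different_altlocs(atom1, atom2):
    """True when both atoms carry an alternate-location label and the labels differ."""
    return bool(atom1.altloc and atom2.altloc and atom1.altloc != atom2.altloc)
```

A typical use is to loop over all lines of a PDB file that start with `LINK`, parse each one, and print those for which `joins_different_altlocs` returns `True`.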
Think of what happened
to the structure before
you downloaded it
Validation
X-ray specific validation
Use the experimental data
• Resolution says very little about the
structure
• (free) R-factor gives the overall fit of
the structure to the experimental data
• For biological interpretation more
detail is needed
Use the maps
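For reference, the R-factor is a single number for the whole data set: R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. The free R-factor uses the same formula on a set of reflections kept out of refinement. A minimal sketch, assuming the calculated amplitudes are already scaled to the observed ones and that both reflection sets are non-empty; the function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) for amplitudes on the same scale."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, is_free):
    """Split reflections by a boolean free flag and return (R-work, R-free)."""
    work = [(o, c) for o, c, flag in zip(f_obs, f_calc, is_free) if not flag]
    free = [(o, c) for o, c, flag in zip(f_obs, f_calc, is_free) if flag]
    return r_factor(*zip(*work)), r_factor(*zip(*free))
```

Because every reflection contributes to one overall sum, a locally wrong side chain or ligand barely changes R, which is why the maps are needed for detail.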
Validation
X-ray specific validation
Which is the better structure of berenil
bound to DNA?
PDB id   Resolution (Å)   R
268d     2.0              0.160
1d63     2.0              0.183
Validation
X-ray specific validation
The real-space R-factor (RSR)
• A per-residue score of how well the atoms
fit the map
• Works like the R-factor (lower is better)
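A common definition of the RSR compares the observed and calculated density on the map grid points around one residue: RSR = Σ|ρobs − ρcalc| / Σ|ρobs + ρcalc|. A minimal sketch under that assumption; selecting the grid points that belong to a residue is left out, and the function name is hypothetical.

```python
def real_space_r(rho_obs, rho_calc):
    """RSR = sum(|rho_obs - rho_calc|) / sum(|rho_obs + rho_calc|).

    rho_obs, rho_calc: density values sampled at the same grid points
    around one residue (how those points are chosen is not shown here).
    """
    numerator = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    denominator = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    return numerator / denominator
```

A residue whose calculated density matches the observed density exactly scores 0, and the score rises as the fit gets worse.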
Validation
X-ray specific validation
Maps can help distinguish the good and bad
bits of a structure
Validation
Things you can find in maps
Poorly fitted
side-chains
Evil peptides
Validation
Things you can find in maps
The wrong drug
Validation
Things you can find in maps
Sequence error K -> R
• Accidental mutant
• Also a missing sulfate
Validation
Things you can find in maps
Missing water
Missing alternate
conformation
Validation
Checking maps
• Visualisation in Coot
– http://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/
• Get maps and real-space R values from the
Electron Density Server
– http://eds.bmc.uu.se/eds/index.html
– Direct interface with Coot
• Get maps and updated models from PDB_REDO
Practical session
Maps show things you
cannot see otherwise
Optimisation
Structures in the PDB
• Solved by a diverse group of scientists
– People make errors & gain experience
• Since 1976
– Structures are not updated
• Solved with the methods of their era
– Methods improve over time
Structures in the PDB do not
represent the best we can do NOW
Optimisation
Improve structures in PDB
• Take structure + experimental data
• Use latest X-ray crystallography methods
– Decision making: use case-specific methods
– Create new methods when needed
• Improve model quality
– Fit with experimental data
– Geometric quality
• Fix errors
PDB_REDO
Optimisation
PDB_REDO method
Step 1: prepare data
• Clean-up structure and X-ray data
• Data mining
Step 2: establish baseline
• Fit with experimental data (R-factors)
• Geometric quality
– Validation with WHAT_CHECK
Optimisation
PDB_REDO method
Step 3: re-refine structure
(with Refmac)
• Improve fit with experimental data
– Use restraints to improve geometric quality
• Improve description of protein dynamics
– Concerted movement of groups of atoms (TLS)
– Anisotropic movement
of individual atoms
Optimisation
PDB_REDO method
Step 4: rebuild structure
• Delete nonsense waters
• Flip peptide planes
• Rebuild side-chains
– Add missing ones
– Optimise H-bonding
Step 5: validate structure
• Geometry
• Density map fit
• Ligand interactions
Availability
PDB_REDO databank
• www.cmbi.ru.nl/pdb_redo
– > 72,000 structures (98%)
– Detailed methods & reprints
• Directly in molecular graphics software
– YASARA
– CCP4mg
– Coot (needs plugin)
– PyMOL (needs plugin)
• Linked via PDBe & RCSB
Optimisation
Does it work?
(12,000 structures)
• Improved fit with the data
• Better geometry
[Figure: bar charts for Ramachandran plot, fine packing and R-free, each showing the percentage of structures that became better, stayed the same, or became worse]
Optimisation
MolProbity validation
PDB
PDB_REDO
(1eoi)
Optimisation
Electrostatics calculations
• ‘Missing’ positive lysine atoms distort
electrostatics calculations
• Adding missing atoms correctly describes
C-terminus interaction with side chains
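Truncated lysines can be found before running an electrostatics calculation. The sketch below assumes standard fixed-column PDB ATOM records and ignores insertion codes and alternate conformations; `incomplete_lysines` and `demo_atom_line` are hypothetical names. It lists every lysine that lacks one of its nine expected heavy atoms.

```python
LYS_HEAVY_ATOMS = {"N", "CA", "C", "O", "CB", "CG", "CD", "CE", "NZ"}


def incomplete_lysines(pdb_lines):
    """Map (chain, residue number) -> sorted list of missing lysine heavy atoms."""
    seen = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[17:20] == "LYS":
            key = (line[21], int(line[22:26]))
            seen.setdefault(key, set()).add(line[12:16].strip())
    return {
        key: sorted(LYS_HEAVY_ATOMS - atoms)
        for key, atoms in seen.items()
        if LYS_HEAVY_ATOMS - atoms
    }


def demo_atom_line(name, chain, resseq):
    """Build a minimal ATOM line for lysine (coordinates omitted, demo only)."""
    return f"{'ATOM':<12}{' ' + name:<4} LYS {chain}{resseq:>4}"


# A lysine modelled only up to CB: the charged NZ atom is among the missing ones.
truncated = [demo_atom_line(n, "A", 10) for n in ("N", "CA", "C", "O", "CB")]
missing = incomplete_lysines(truncated)
```

Here `missing` maps `("A", 10)` to `["CD", "CE", "CG", "NZ"]`, so the positive charge at NZ would be absent from any calculation on this model.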
Optimisation
Protein-ligand interaction
• Wrong peptide plane in
peptide ligand
• Fixed by PDB_REDO
• Better understanding of
H-bonds in the interaction
Optimisation
Protein-protein interaction
• Packing interface with poor ionic
interactions
• Rebuilt interface properly describes
ionic dimerisation interactions
Optimised structures
give a better view of the
biology of the protein
PDB_REDOers
Amsterdam:
• R Joosten
• K Joosten
• A Perrakis
Nijmegen:
• T te Beek
• M Hekkelman
• G Vriend
Cambridge:
• G Murshudov
• F Long
Key contributors:
Eleanor Dodson, Ian Tickle, Paul Emsley, Ethan Merritt, Elmar
Krieger, Thomas Lütteke, Rachel Kramer Green, Sanchayita Sen