Structure determination by multiple isomorphous replacement

Phasing by multiple isomorphous
replacement
Mike Lawrence
Walter & Eliza Hall Institute of
Medical Research
Parkville, Melbourne
Lecture 1
Heavy atom derivatives and how to
make them
Some history....
• Bernal (1934) – showed that pepsin crystals diffracted, suggesting
that it may be possible to determine the atomic structure of
proteins.
• Patterson (1934) – showed that the distances between “heavy”
atoms in a molecule could be computed from diffraction intensities
alone, without the need for phases.
• Kendrew, J. et al. (1958) A three-dimensional model of the
myoglobin molecule obtained by x-ray analysis. Nature 181, 662-666.
• Muirhead, H. & Perutz, M.F. (1963) Structure of haemoglobin. A
three-dimensional Fourier synthesis of reduced human haemoglobin
at 5.5 Å resolution. Nature 199, 633-638.
Why heavy atoms can be used to
yield phases
Consider a protein crystal doped with heavy atoms.
The structure factors then obey the equation
FPH = FP + FH
where FPH is the structure factor for the derivatized crystal, FP is
the structure factor for the protein crystal, and FH is the structure
factor for the heavy atoms alone,
provided that the structure of the protein and the basic assembly of
the crystal is not changed by the addition of the heavy atoms (i.e.
the derivative is isomorphous!).
Expected differences in diffraction
intensities due to heavy atoms
It can be shown that for acentric reflections, the magnitude of the
heavy atom differences ∆I = |IPH – IP| is related to the relevant
cumulative atomic numbers of the protein and heavy atoms by the
equation
√<(∆I)²> / <IP> ≈ √(2<IH> / <IP>), where <I> = Σi fi².
Hence, for a protein with one mercury atom (at sin θ / λ = 0):
MW (kDa)    √<(∆I)²> / <IP>
28          0.36
56          0.25
112         0.18
224         0.13
448         0.09
i.e. a significant change in intensity results from the addition of even a single heavy atom.
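As a rough numerical check, the estimate can be sketched in Python. The mean atomic number of 7 and the mean mass of 14 Da per non-hydrogen protein atom are assumptions made here for illustration only; with them the numbers come out close to the tabulated values.

```python
import math

def expected_fractional_change(mw_kda, z_heavy=80, n_heavy=1,
                               mean_z=7.0, da_per_atom=14.0):
    """Estimate sqrt(<dI^2>)/<IP> at sin(theta)/lambda = 0 (acentric case).

    mean_z and da_per_atom are illustrative assumptions about an average
    protein atom; z_heavy=80 corresponds to mercury.
    """
    n_protein_atoms = mw_kda * 1000.0 / da_per_atom
    mean_i_protein = n_protein_atoms * mean_z ** 2   # <IP> = sum of f_i^2
    mean_i_heavy = n_heavy * z_heavy ** 2            # <IH> = sum of f_i^2
    return math.sqrt(2.0 * mean_i_heavy / mean_i_protein)

for mw in (28, 56, 112, 224, 448):
    print(mw, round(expected_fractional_change(mw), 2))
```

The ratio falls off as 1/√MW, which is why quadrupling the protein mass only halves the expected signal.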
Why heavy atoms work
We have FPH = FP + FH
We can measure the amplitudes of FP and FPH.
Patterson methods allow us to compute FH, the vector quantity (I’ll
show you this later…)
Simple geometry then shows that there are, in general, two
possibilities for the phase of FP, assuming that there is no anomalous
signal.
A second (i.e. multiple ! ) derivative will yield another two possibilities
for the phase of FP, and the phase that is in common should then be
the correct phase for FP (QED !)
[Figure: construction showing the two possible phases for FP, labelled "Phase choice 1" and "Phase choice 2"]
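The geometry can be sketched in Python for the idealized, error-free case. The two candidate phases follow from the cosine rule applied to FPH = FP + FH. The function names and all numerical values below are made up for illustration.

```python
import cmath
import math

def phase_choices(fp_amp, fph_amp, fh):
    """Two candidate phases (radians) of FP, given |FP|, |FPH| and complex FH."""
    fh_amp = abs(fh)
    alpha_h = cmath.phase(fh)
    # |FPH|^2 = |FP|^2 + |FH|^2 + 2|FP||FH| cos(alpha_P - alpha_H)
    cos_term = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2.0 * fp_amp * fh_amp)
    cos_term = max(-1.0, min(1.0, cos_term))  # guard against rounding error
    delta = math.acos(cos_term)
    return (alpha_h + delta, alpha_h - delta)

def angle_diff(a, b):
    """Smallest absolute difference between two angles (radians)."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

# Synthetic example: a "true" FP and two hypothetical heavy atom contributions
true_phase = math.radians(50.0)
fp = cmath.rect(100.0, true_phase)
fh1 = cmath.rect(20.0, math.radians(110.0))
fh2 = cmath.rect(15.0, math.radians(-30.0))

choices1 = phase_choices(abs(fp), abs(fp + fh1), fh1)
choices2 = phase_choices(abs(fp), abs(fp + fh2), fh2)

# The phase that the two derivatives have in common is taken as the answer
best_pair = min(((a, b) for a in choices1 for b in choices2),
                key=lambda pair: angle_diff(*pair))
common_phase = best_pair[0]
print(round(math.degrees(common_phase) % 360.0, 1))
```

With real data the two circles never intersect exactly, which is why the later lectures replace this exact intersection with phase probabilities.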
Where the problems lie…
1. Not all derivatives diffract or diffract as well as the native.
2. Not all derivatives are isomorphous to the native.
3. Determination of the heavy atom positions may not be possible.
4. Errors in |FP| and |FPH| and lack of isomorphism can make the
theory difficult to apply, i.e. the phases generated can be hopelessly
inaccurate.
Powerful algorithms have thus been developed to handle the error
issues, yielding carefully weighted phase sets and unbiased
determination of the heavy atom detail.
Making heavy atom derivatives
There are four fundamental strategies for making heavy atom
derivatized crystals:
1. Soak the crystal in a solution containing heavy atoms.
2. Co-crystallize the protein with a heavy atom containing solution.
3. Simply use a heavy atom already in the protein (e.g.
metalloprotease).
4. Covalently modify the protein to include a heavy atom.
Crystal soaking
Advantages
Easy to do. Simply make up a solution of the heavy atom in the
crystallization solution and soak the crystal in it (hours to days).
Disadvantages
Not easy to control the derivatization.
May destroy the crystal or reduce its resolution.
Co-crystallization
Advantages
Controlled stoichiometry.
Avoids steric hindrance problems.
Disadvantages
Protein may no longer crystallize or may no longer crystallize in the
same space group and unit cell (i.e. no longer isomorphous).
Use heavy atoms already within the
protein
Advantages
Particularly suitable for MAD (multi-wavelength anomalous dispersion)
techniques
Disadvantages
May not be sufficient anomalous signal at the wavelengths available.
Requires synchrotron radiation.
Covalently modify the protein to
include a heavy atom
Advantages
Suitable for MAD and MIR. Should yield good derivatives.
Disadvantages
Requires additional work at the protein production stage.
May not be suitable for all proteins.
May change crystallization conditions.
Which heavy metal do I use?
There are some rules, in particular which relate to the pH of the
precipitant solution, the chemical composition of the precipitant
solution and to the sequence of the native protein.
However, these are guides only and provide no guarantee of success.
For example - platinum binds preferentially to histidine residues.
- platinum derivatization may not work in ammonium sulphate solutions.
Native gels may also be used to pre-evaluate heavy atom binding.
http://www.doembi.ucla.edu/~sawaya/m230d/Crystallization/crystallization.html
Making up heavy metal solutions
These are toxic. Care must be taken, particularly with methyl mercury
derivatives, with uranium salts, and with some explosive mixtures.
Study the MSDS and read the literature. Clean up after yourself and
avoid contaminating tips, lab, other people's protein, etc.
Work with as small a volume as possible. Start with, say, a 2 mM solution
and soak overnight. The solubility of heavy atoms in the precipitant may not
be high; organic solvents such as DMSO may help.
Transfer the crystal to the heavy atom solution. Monitor the crystal for
signs of decay during soaking: check for dissolving, cracking, colour
change. Leave it to soak overnight.
Collect some diffraction frames. Check if the cell is the same. Check
the resolution. If the diffraction is strong, process these frames
and compare the data with the native data.
Survey of successful derivatives
Some useful sites:
http://www.sbg.bio.ic.ac.uk/had/
http://hatodas.harima.riken.go.jp/
http://eagle.mmid.med.ualberta.ca/tutorials/HA/
http://www-bio3d-igbmc.u-strasbg.fr/~romier/heavyatom.html
Platinum derivatives have proven the most popular, in particular
K2PtCl4, K2Pt(NO2)4 and K2PtCl6.
Mercury, gold, tungsten, uranium derivatives are also popular.
The di-platinum-di-iodo derivative PIP is very useful for large
proteins.
Xenon is also particularly interesting.
Lecture 2 Anomalous scattering
&
the collection and processing of
diffraction data from derivatized
crystals
X-ray scattering
X-ray scattering by atoms involves the following process:
1. The electromagnetic field of the incident X-ray beam exerts a
force on the electrons within the atoms.
2. To a first approximation the electrons can be regarded as free
electrons which oscillate at the same frequency as the
incident X-rays.
3. These electrons hence emit X-rays of the same wavelength as
the incident X-rays.
4. The phase of the scattered X-rays differs by 180° from that
of the incident X-rays.
This is called the free-electron approximation to X-ray scattering.
Anomalous X-ray scattering
The free electron approximation breaks down for "high" Z atoms.
The inner electrons are more tightly attached to the nucleus than
the outer electrons, and at certain X-ray wavelengths these
electrons may be ejected from the inner shell (say K) into the
continuous energy region. This happens if the incident wavelength is
close to what is termed the absorption edge.
The electron then re-emits an X-ray photon as it falls back into a
lower energy shell (say L).
The emitted photon does not necessarily differ in phase by 180°
from the incident photon.
Anomalous X-ray scattering
This diagram shows the total scattering by an atom. It consists of the
free electron component fiso and an anomalous component Δf + if". It is
the magnitude of f" that is of most importance in phase
determination. The anomalous component is a function of both Z and
wavelength, and tables of Δf and f" exist for each wavelength and Z.
[Figure: Argand diagram of ftotal as the sum of fiso and fanom, with fanom made up of Δf and if"]
ftotal = fiso + Δf + if"
= fiso + fanom
Friedel's Law
Friedel's law states that the reflections FP(+) and FP(-) are
symmetrically located with respect to the horizontal (real) axis,
i.e. the phase of the reflection (-h,-k,-l) is opposite in sign to the
phase of the reflection (h,k,l), while the amplitudes are equal.
[Figure: Argand diagram showing FP(+) at phase α and FP(-) at phase -α]
The same applies to FH(+) and FH(-),
provided that there is no anomalous scattering !!
Breakdown of Friedel's Law
[Figure: Argand diagram showing FP(+), FH(+), FPH(+) and FP(-), FH(-), FPH(-) when the heavy atom scatters anomalously]
Friedel’s law breaks down if the heavy atom scatters anomalously
and there is then no phase or amplitude relationship between
FPH(+) and FPH(-).
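A small numerical sketch in Python makes the breakdown concrete. It builds a toy structure of three light atoms plus one heavy atom and compares |F(hkl)| with |F(-h,-k,-l)|, first with a purely real heavy atom scattering factor and then with an added anomalous part. The coordinates and the values Δf = -5 and f" = 8 are invented for illustration and are not tabulated values.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j exp(2 pi i (h x_j + k y_j + l z_j)); f_j may be complex."""
    h, k, l = hkl
    total = 0j
    for f, (x, y, z) in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

# Hypothetical toy structure in fractional coordinates
LIGHT_ATOMS = [(6.0, (0.10, 0.20, 0.30)),
               (7.0, (0.40, 0.15, 0.70)),
               (8.0, (0.65, 0.80, 0.25))]
HEAVY_XYZ = (0.22, 0.47, 0.11)

def friedel_amplitudes(f_heavy, hkl=(1, 2, 3)):
    """Return (|F(hkl)|, |F(-h,-k,-l)|) for a given heavy atom scattering factor."""
    atoms = LIGHT_ATOMS + [(f_heavy, HEAVY_XYZ)]
    minus_hkl = tuple(-index for index in hkl)
    return abs(structure_factor(hkl, atoms)), abs(structure_factor(minus_hkl, atoms))

print(friedel_amplitudes(80.0))               # no anomalous scattering
print(friedel_amplitudes(80.0 - 5.0 + 8.0j))  # fiso + delta_f + i f"
```

Only the imaginary part f" separates the two amplitudes; a real Δf alone changes both members of the pair equally.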
Anomalous scattering
The consequences of anomalous scattering are as follows:
1. The Friedel pairs no longer have the same intensity, nor do they
have related phases.
2. The magnitude of the anomalous scattering for a particular atom
varies considerably as a function of incident wavelength - it is
strongest near the so-called absorption edges.
3. The anomalous signal can be used to break the phase ambiguity
inherent in SIR (SIRAS).
4. The anomalous scattering can be used to "generate" multiple
derivatives by changing the incident wavelength (MAD).
5. The incident wavelength can be tuned to match scatterers within
the protein itself (SAD).
Centric and acentric reflections
Space group symmetry sometimes constrains the phases of certain
reflections to a limited, finite number of
possibilities. For example, in orthorhombic space groups such as P222, any
reflection of the type (hk0), (h0l) or (0kl) may only have a phase of
0° or 180°. The axial planes in reciprocal space are thus referred to
as centric zones, and the reflections within these planes are
referred to as centric reflections. Reflections with unrestricted
phases are referred to as acentric reflections.
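One way to test whether a reflection is centric is to ask whether any rotational symmetry operation of the space group maps (hkl) onto (-h,-k,-l). The Python sketch below does this for P222; the function names are illustrative and only the rotation parts of the operators are needed for this test.

```python
def is_centric(hkl, rotations):
    """True if some symmetry rotation maps hkl onto -hkl."""
    h, k, l = hkl
    for r in rotations:
        # Row vector (h, k, l) multiplied by the 3x3 rotation matrix r
        mapped = tuple(h * r[0][j] + k * r[1][j] + l * r[2][j] for j in range(3))
        if mapped == (-h, -k, -l):
            return True
    return False

def diag(a, b, c):
    """Diagonal 3x3 matrix as nested tuples."""
    return ((a, 0, 0), (0, b, 0), (0, 0, c))

# Rotation parts of the four operators of P222:
# (x,y,z), (-x,-y,z), (-x,y,-z), (x,-y,-z)
P222_ROTATIONS = [diag(1, 1, 1), diag(-1, -1, 1), diag(-1, 1, -1), diag(1, -1, -1)]

for hkl in [(1, 2, 0), (1, 0, 3), (0, 2, 3), (1, 2, 3)]:
    print(hkl, "centric" if is_centric(hkl, P222_ROTATIONS) else "acentric")
```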
The probability distributions associated with these two classes of
reflections are different, and in any phasing program, statistics are
often reported separately for the two classes of reflections. It is in
general "easier" to determine the phase angle of centric reflections.
Processing heavy atom data
Collecting derivative data
Derivative data are collected and processed in much the same way as
native data, except that the Friedel pairs are not merged with
each other (as they are not equivalent if anomalous scattering is
present). This implies that a greater range of oscillation data will
have to be collected to achieve a complete data set, or at least a
data set of equal multiplicity to that of the native.
Once the first derivative oscillation image has been collected, it
should be processed to ensure that the derivative unit cell is the
same as that of the native and that the crystal diffracts to
adequate resolution.
If the derivative and native unit cells are the "same" and the crystal
exhibits a useful level of resolution, then it is worth collecting
more data. Once a reasonable amount of data has been collected,
it can be compared with the native to check for isomorphous
derivatization.
Scaling the derivative data to the
native data
Comparison of the derivative data with those of the native requires
first that they be placed on the same scale.
There are a variety of ways of doing this; we will consider only the
least squares approach in SCALEIT.
Minimize the sum of weighted squares of isomorphous differences:
Σ w (k|FPH| - |FP|)² with respect to the scale factor k.
w (weight) = reciprocal variance of the isomorphous difference:
w = 1/((k σPH)² + σP²)
The assumption is that |FH| can be ignored; this introduces an error
of 5-10% in the scale factor.
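For a single overall scale factor, the least squares problem can be sketched in a few lines of Python. This is an illustration of the principle only and is not how SCALEIT is implemented: it ignores the anisotropic temperature factor, and it handles the dependence of the weights on k by simple iteration, which is an assumption made here.

```python
def scale_factor(fph, fp, sig_ph, sig_p, n_iter=10):
    """Scale k minimizing sum w (k|FPH| - |FP|)^2, w = 1/((k sig_PH)^2 + sig_P^2)."""
    k = 1.0
    for _ in range(n_iter):
        # Weights depend on k, so recompute them with the current estimate
        weights = [1.0 / ((k * s_ph) ** 2 + s_p ** 2)
                   for s_ph, s_p in zip(sig_ph, sig_p)]
        # For fixed weights the minimum is k = sum(w FPH FP) / sum(w FPH^2)
        numerator = sum(w * a * b for w, a, b in zip(weights, fph, fp))
        denominator = sum(w * a * a for w, a in zip(weights, fph))
        k = numerator / denominator
    return k

# Made-up amplitudes: the "derivative" is the native on half the scale
native = [100.0, 50.0, 80.0, 120.0]
derivative = [0.5 * f for f in native]
sigmas = [1.0] * len(native)
print(scale_factor(derivative, native, sigmas, sigmas))
```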
Scaling the derivative data to the
native data
In CCP4 this is done with the program SCALEIT.
SCALEIT provides a variety of very useful features:
(i) The derivative data is scaled w.r.t. the native data by means of a
single scale factor plus an anisotropic temperature factor matrix.
(ii) The scaling can be determined from a selected reliable subset of
reflections and then applied to all reflections (this avoids using
poor data in the scaling exercise).
(iii) Anomalous data can be treated separately.
(iv) Various measures of the isomorphous differences are reported.
These can be of great value in assessing whether to continue with
data collection, whether to re-soak for a different time, whether
to move on to another condition or whether the derivative is
isomorphous.
Assessing derivatization
Some rules of thumb for looking at SCALEIT output.
1. Calculate the isomorphous R-factor on F's. This should be in the
range 12% - 25% overall.
2. The isomorphous R-factor will in all likelihood be higher at very low
resolution (due to problems in scaling) and higher at high
resolution (due to data inaccuracy). Nevertheless it should exhibit
a large "flat" regime across the useful resolution range.
3. The weighted R-factor is also a useful number. This should more or
less match the R-factor. If it is much lower, then most of the
differences are due to noise in the data rather than to isomorphous
signal.
Assessing derivatization
Some rules of thumb (continued)
4. The scale factor should be fairly uniform across the resolution
range.
5. There should not be many outliers. These should be rejected and
SCALEIT re-run.
6. The derivative data should not be significantly anisotropic or
there may be problems in scaling it to the native.
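The isomorphous R-factor in rule 1 can be sketched in Python as follows, assuming the conventional definition Riso = Σ| |FPH| - |FP| | / Σ|FP| on amplitudes that are already on a common scale. The exact quantities printed by SCALEIT may differ in detail, and the numbers are made up.

```python
def r_iso(fph_scaled, fp):
    """Isomorphous R-factor on amplitudes (data assumed already scaled)."""
    numerator = sum(abs(a - b) for a, b in zip(fph_scaled, fp))
    denominator = sum(fp)
    return numerator / denominator

native = [100.0, 100.0, 50.0, 150.0]
derivative = [120.0, 90.0, 60.0, 130.0]
print(round(r_iso(derivative, native), 3))
```

In practice the same sum would be evaluated in resolution shells, so that the "flat" regime in rule 2 can be inspected.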
How many heavy atom sites?
Having established that the derivative shows significant isomorphous
differences with respect to the native, the next questions are
- how many heavy atom sites are there in the crystal ?
- what are their coordinates ?
- in what stoichiometric ratio are the heavy atoms bound at these
sites ?
- what are their thermal (B) values ?
Remember, in soaking experiments not all sites are necessarily 100%
occupied.
Ideally, one wants as few sites as possible commensurate with the
overall Riso that one seeks to achieve.
Lecture 3 Patterson functions
Patterson functions
The Patterson function is the auto-correlation function of the
electron density ρ(x) of the structure. The key points here are
(i) The Patterson function is independent of the structure factor
phases and can thus be computed directly from the reduced
intensity data for the reflections.
(ii) The peaks in the Patterson function of the difference electron
density (derivative-native) correspond approximately to the interatomic
vectors between the heavy atoms themselves, provided
that the derivative is isomorphous and that the differences in
structure factor amplitudes are small (i.e. in the Riso range of
about 10-28%).
Calculation of the Patterson
function
The Patterson (auto-correlation) function P of ρ(x) is
P(x) = ∫ρ(u)ρ(u-x)du
Peaks in P(x) correspond to the vectors between peaks in ρ(x).
P(x) is easily computed using the fact that correlation in real space
corresponds to multiplication in Fourier space.
Remember, no phases are needed!
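The statement that no phases are needed can be checked numerically. The Python sketch below uses a one-dimensional, 16-point periodic "cell" containing three point atoms (made-up positions and weights) and compares the direct autocorrelation with the inverse Fourier transform of |F|². The naive transform is used only to keep the example dependency-free.

```python
import cmath
import math

def structure_factors(rho):
    """Naive forward DFT: F(h) = sum_x rho(x) exp(-2 pi i h x / n)."""
    n = len(rho)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(rho))
            for h in range(n)]

def patterson_from_intensities(intensities):
    """Inverse DFT of |F|^2. Phases never enter the calculation."""
    n = len(intensities)
    return [sum(i_h * cmath.exp(2j * math.pi * h * u / n)
                for h, i_h in enumerate(intensities)).real / n
            for u in range(n)]

def autocorrelation(rho):
    """Direct evaluation of P(u) = sum_x rho(x) rho(x - u) with periodic wrap."""
    n = len(rho)
    return [sum(rho[x] * rho[(x - u) % n] for x in range(n)) for u in range(n)]

rho = [0.0] * 16
rho[2], rho[5], rho[11] = 6.0, 8.0, 16.0   # three point "atoms"

intensities = [abs(f) ** 2 for f in structure_factors(rho)]
p_fourier = patterson_from_intensities(intensities)
p_direct = autocorrelation(rho)
print([round(v) for v in p_fourier])
```

The non-zero entries sit at the separations between the atoms (and at the origin), with heights given by the products of the atomic weights.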
What does a Patterson function
look like ?
Here is a simple 2D Patterson function for a set of three atoms.
[Figure: two panels. Real space: the blue dots are atoms, linked by their inter-atomic vectors. Patterson space: the red dots plot the inter-atomic vectors.]
Properties of a Patterson
Properties of the Patterson function:
1. It is centrosymmetric i.e. for every pair of heavy atoms the
displacement between them can be considered in either the positive or
negative sense. Hence, for a given heavy atom set within the crystal,
the Patterson cannot distinguish between the set and its mirror image.
2. The height of a peak in the Patterson is proportional to the product of
the atomic numbers of the atoms responsible for the peak.
3. The symmetry of the Patterson function is that of the Laue group of the
diffraction pattern (i.e. screw axes become non-screw axes and a
centre of symmetry is added)
4. Symmetry elements in real space give rise to peaks in special planes or
lines in Patterson space, termed Harker planes or lines.
5. The Patterson function has a large origin peak (why?).
How do we deconvolute the
Patterson function ?
Deconvolution has traditionally been done by hand, starting with an
inspection of the Harker sections of the Patterson function.
Peaks in these sections arise from vectors between a particular
heavy atom and its symmetry-related mates and allow coordinates
to be assigned with limited ambiguity to all the heavy atoms.
However, peaks between one heavy atom and another (not a
symmetry mate) are not constrained to be in any special position.
Interpretation of these peaks allows one to check the coordinates
assigned from Harker sections and to resolve the ambiguity in the
peak coordinates.
Simple Harker sections
Consider the space group P222 with a heavy atom at (x,y,z). The
symmetry related heavy atoms are therefore at (-x,y,-z), (x,-y,-z)
and (-x,-y,z). Interatomic vectors are therefore of the form
(2x,0,2z), (2x,2y,0) and (0,2y,2z), i.e. they all lie on the axial
planes.
The axial planes are termed the Harker planes corresponding to
P222, and one need look only in these planes for the vectors
between atoms and their symmetry-related partners.
Given peak coordinates of the form (u,v,w) within the Harker planes,
simple algebra allows the determination of (x,y,z). But note that
one can always add or subtract 1/2 from any one or more of the
final coordinates! Why?
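The algebra for P222 can be sketched in Python. The helper names are illustrative and the site coordinates are made up. The last lines compare two sites that differ by 1/2 in some coordinates, which is a quick way to explore the question above numerically.

```python
def harker_vectors_p222(site):
    """Self-vectors from (x,y,z) to its three P222 mates, wrapped into [0, 1)."""
    x, y, z = site
    return [((2 * x) % 1.0, 0.0, (2 * z) % 1.0),   # to (-x,  y, -z)
            ((2 * x) % 1.0, (2 * y) % 1.0, 0.0),   # to (-x, -y,  z)
            (0.0, (2 * y) % 1.0, (2 * z) % 1.0)]   # to ( x, -y, -z)

def site_from_harker(peak_v0, peak_w0):
    """One possible (x,y,z) from a (u,0,w) peak and a (u,v,0) peak."""
    x = peak_v0[0] / 2.0
    z = peak_v0[2] / 2.0
    y = peak_w0[1] / 2.0
    return (x, y, z)

def close_mod1(a, b, tol=1e-9):
    """Compare two fractional coordinates modulo one unit cell."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d) < tol

def same_vectors(vectors_a, vectors_b):
    return all(close_mod1(a, b)
               for va, vb in zip(vectors_a, vectors_b)
               for a, b in zip(va, vb))

site = (0.10, 0.20, 0.30)
peaks = harker_vectors_p222(site)
print(site_from_harker(peaks[0], peaks[1]))

shifted = (0.60, 0.20, 0.80)   # x and z shifted by 1/2
print(same_vectors(peaks, harker_vectors_p222(shifted)))
```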
How do we deconvolute the
Patterson function ?
Thus the manual search process is as follows.
1. List all the peaks in the Harker section. Try to find self consistent
sets that yield trial coordinates of each and every heavy atom.
Note that in practice the peaks may be of varying height, some
peaks may be entirely absent, some peaks may be the
superposition of more than one vector. Note further that there
can be both a hand and origin ambiguity associated with the
vectors. Try to account for all the peaks present.
2. Then take the possible coordinates of each pair of atoms and
search for all the cross peaks, in an endeavour to resolve the
ambiguities and to check the assignment.
In practice...
Even once the ambiguity in the heavy atom coordinates has been
resolved, one may still be left with overall ambiguity in the hand
of the space group or in the hand of the heavy atom cluster.
Clearly the process is complicated but it can be automated to some
extent. A particularly useful semi-automated search program in
CCP4 is RSPS.
Automated Patterson does not start from the list of Harker peaks.
Instead it considers every possible coordinate (x,y,z) in the
crystallographic asymmetric unit, computes the corresponding
Harker coordinates (uvw) and then evaluates a score function of
the Patterson values at these positions (say the sum of the
Patterson values). The (xyz) coordinates with the highest score
function in Patterson space are then retained as potential heavy
atom sites.
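The search described above can be sketched in Python for P222. This is a toy illustration and is not the algorithm used by RSPS: the Patterson is a synthetic function (Gaussians at the self-vectors of one made-up site, with no origin peak), the grid is coarse, and the score is the plain sum of Patterson values.

```python
import itertools
import math

def p222_mates(site):
    x, y, z = site
    return [(x, y, z), (-x, -y, z), (-x, y, -z), (x, -y, -z)]

def harker_vectors(site):
    """Vectors from a site to each of its three P222 symmetry mates."""
    first, *others = p222_mates(site)
    return [tuple(a - b for a, b in zip(first, mate)) for mate in others]

def make_toy_patterson(site, sigma=0.03):
    """Synthetic Patterson: a periodic Gaussian at every self-vector of one site."""
    mates = p222_mates(site)
    peaks = [tuple(a - b for a, b in zip(p, q))
             for p, q in itertools.permutations(mates, 2)]

    def patterson(u):
        total = 0.0
        for peak in peaks:
            # Shortest periodic separation in fractional coordinates
            d2 = sum((((a - b + 0.5) % 1.0) - 0.5) ** 2 for a, b in zip(u, peak))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        return total

    return patterson

def harker_search(patterson, steps=10):
    """Score each grid point in 0 <= x,y,z < 1/2 by the sum of the Patterson
    values at its Harker vectors; return the best site and its score."""
    grid = [i / (2.0 * steps) for i in range(steps)]
    best_site, best_score = None, -1.0
    for site in itertools.product(grid, repeat=3):
        score = sum(patterson(v) for v in harker_vectors(site))
        if score > best_score:
            best_site, best_score = site, score
    return best_site, best_score

true_site = (0.10, 0.20, 0.30)
site, score = harker_search(make_toy_patterson(true_site))
print(site, round(score, 2))
```

Several grid points score equally well because of the hand and origin ambiguities, so the site returned need not be numerically identical to the one used to build the toy Patterson.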
In practice...
One may then generate all the possible cross vector sets, allowing
for ambiguity and use a score function to check these, finally
taking the set of heavy atom positions that have the highest
overall score.
Alternatively one may perform cross-vector searches directly. From
a starting set of heavy atom positions one may search for a
further site by simply computing the coordinates of all possible
cross-vectors between it and the starting set and scoring these
positions in Patterson space. Then check for the corresponding
This technique is sometimes a better way to proceed than Harker
searches as there are more vectors involved and that leads to
better averaging of scores.
Lecture 4 - How do we calculate
the phases
The heavy atom model
Assume we have m derivatives. We then have the following measurements:
The native data : |FP|, σP
The derivative data sets : |FPHn|, σPHn (n=1,…,m)
We are seeking to describe the difference between these by means of a
heavy atom model made up of rn heavy atoms for each derivative n.
Coordinates xi,n,yi,n,zi,n
Temperature factors bi,n
Occupancies oi,n
Anomalous scattering ∆fi,n, f"i,n
Scale factor Sn and anisotropic temperature factor matrix [B]n
i = 1, … , rn
The heavy atom model
i.e. we are seeking to use these few 10’s of numbers to describe the
differences between a few 10’s of thousands of reflections!
Furthermore, both the native and derivative data themselves may be quite
inaccurate. So inevitably we cannot expect the model on the whole to
succeed in accounting for all the differences between each native
reflection amplitude and each corresponding derivative reflection
amplitude. Thus there are going to be errors.
We start by minimizing Σ w[Δ|Fiso| - |FH|calc]² over all reflections as a
function of the model, where w is some weight. This gives some preliminary
values to the model parameters. Remember that this is only an
approximation!
The heavy atom model
Now we need to compute the phases. To do this we need to realize that
there will be what is termed lack of closure i.e. the triangles do not quite
close. So how does one compute the phase and its error?
The standard way is to use what are
called phase probabilities – these are
based on assuming that the lack of
closure itself has a gaussian error
distribution with standard deviation of
E = <|Δhkl|> (i.e. on average for a
particular resolution range) .
Then P(αhkl) = N exp(-Δ²(αhkl)/ 2E²)
where P(αhkl) is the probability that
reflection (hkl) has phase αhkl.
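The phase probability can be sketched in Python for a single reflection. The lack of closure is taken here as Δ(α) = |FPH|obs - |FP exp(iα) + FH|, which is the usual definition but is an assumption as far as this text is concerned. Probabilities from several derivatives are combined by multiplication, and all numbers are made up.

```python
import cmath
import math

def phase_probabilities(fp_amp, fph_amp, fh, e_rms, step_deg=5):
    """P(alpha) = N exp(-Delta(alpha)^2 / 2E^2) sampled on a grid of phases."""
    angles = [math.radians(a) for a in range(0, 360, step_deg)]
    raw = []
    for alpha in angles:
        # Lack of closure for this trial protein phase
        closure = fph_amp - abs(cmath.rect(fp_amp, alpha) + fh)
        raw.append(math.exp(-closure ** 2 / (2.0 * e_rms ** 2)))
    norm = sum(raw)
    return angles, [p / norm for p in raw]

def combine(prob_sets):
    """Multiply the distributions from several derivatives and renormalize."""
    combined = [math.prod(values) for values in zip(*prob_sets)]
    norm = sum(combined)
    return [p / norm for p in combined]

# Synthetic reflection with a "true" protein phase of 50 degrees
fp = cmath.rect(100.0, math.radians(50.0))
fh1 = cmath.rect(20.0, math.radians(110.0))
fh2 = cmath.rect(15.0, math.radians(-30.0))

angles, p1 = phase_probabilities(abs(fp), abs(fp + fh1), fh1, e_rms=5.0)
_, p2 = phase_probabilities(abs(fp), abs(fp + fh2), fh2, e_rms=5.0)
p_mir = combine([p1, p2])
best = angles[p_mir.index(max(p_mir))]
print(round(math.degrees(best)))
```

A single derivative gives a bimodal distribution; the second derivative suppresses the wrong peak.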
Lack of closure
The next strategy used to refine the heavy atom parameters is to
minimize, using least squares, the lack of closure across the entire
data set. Minimize
Σhkl mhkl Δ²hkl
where m is a weight (termed the figure of merit) indicating the
reliability of the protein phase angle based on its probability.
So this means that we iterate a procedure of refining HA
parameters, computing phase angle distributions, refining HA
parameters, computing phase angle distributions … until
convergence.
Problems, problems, problems
All of this is rather complicated, and leads to a number of
fundamental problems: poor convergence, and incorrect and
biased results.
In particular, the least squares minimization assumes that the
native data are error free. Furthermore, the native data are used
for every derivative. This means that there is considerable
overweighting of the native data ("tyranny of the native").
Furthermore, the phase angles themselves are functions of the
parameters being refined, so we are assuming we know the very
quantities we are trying to determine!
Maximum likelihood
Maximum likelihood is a concept that underlies much of modern
refinement procedures in protein crystallography, including molecular
replacement (BEAST), crystallographic refinement (CNS, Refmac),
isomorphous replacement (SHARP, MLPHARE), solvent flattening
(RESOLVE).
The idea here is quite simple, but highly powerful and gets rid of all
the problems in traditional heavy atom refinement.
It is based on so-called Bayesian statistics, wherein one assumes
that one can calculate the probability of a particular observation
occurring as a function of a set of underlying parameters. One then
seeks to select a set of values for the parameters that have the
maximum likelihood of resulting in that particular observation.
Maximum likelihood
These principles are increasingly applied to crystallography. The most
well-defined formulation of a maximum likelihood treatment of the
MIR problem is embedded in the program SHARP.
SHARP treats the native protein itself as a derivative with no heavy
atoms and hence does not favour it above others. The formulae
involved are quite complicated and lead to probability expressions
that are a function of the heavy atom coordinates, occupancy, B-value, and scattering as well as of the scale factors and lack of
isomorphism error.
I will not be describing SHARP.
SOLVE
SOLVE on the other hand is not a maximum likelihood program.
However, it has some other advantages which make it valuable:
It solves the Patterson functions automatically, refines the heavy
atom positions by standard non-ML techniques and then
evaluates the solutions according to a number of criteria:
(i) A Patterson score
(ii) A difference Fourier score
(iii) A figure of merit score
(iv) An evaluation of protein/solvent partitioning in the native
Fourier
SOLVE keeps detailed lists of all solutions it builds up and tries to
work out the best one based on the overall scores.
Monitoring the success of MIR
There are a number of statistics which are of value in MIR.
RCullis = Σhkl Σα P(α) | |FP(obs) + FH(calc)| - FPH(obs) | / Σ hkl |FPH(obs) - FP(obs)|
or <probability-weighted lack of closure> / <isomorphous
difference>
RCullis(centrics) < 0.6 excellent, < 0.9 usable.
RCullis (ano) = Σhkl Σα P(α) |Δano(obs) - Δano(calc)| / Σhkl |Δano(obs)|
or <probability-weighted |observed - calculated anomalous
difference|> / <|observed anomalous difference|>
For anomalous data any RCullis < 1 is useful.
Monitoring the success of MIR
Phasing power (PP) = <|FH|> / <(|∆|)>
or <heavy-atom amplitude> / <probability-weighted lack of closure
error>
PP > 1.5 excellent, > 1 good, > 0.5 usable.
The Cullis R-factor and the phasing power are the two most useful
statistics for individual derivatives.
Monitoring the success of MIR
Mean Figure of Merit (FOM)
FOM = mean of <cos(Δα)>
FOM is a measure of the precision of the "best" phase.
The FOM can be computed for both single derivatives as well as for
the overall phase set.
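For one reflection, the figure of merit and the "best" phase can be sketched in Python as the length and direction of the centroid of the phase probability distribution. The function name and the example distributions are illustrative.

```python
import cmath
import math

def figure_of_merit(angles, probs):
    """Return (m, best_phase): modulus and argument of the centroid of
    P(alpha) on the unit circle; m equals the weighted mean of
    cos(alpha - best_phase)."""
    total = sum(probs)
    centroid = sum(p * cmath.exp(1j * a) for a, p in zip(angles, probs)) / total
    return abs(centroid), cmath.phase(centroid)

# A bimodal single-derivative case: two equally likely phases 120 degrees apart
angles = [math.radians(50.0), math.radians(170.0)]
m, best = figure_of_merit(angles, [0.5, 0.5])
print(round(m, 3), round(math.degrees(best), 1))

# A sharply determined phase
m_sharp, _ = figure_of_merit(angles, [1.0, 0.0])
print(round(m_sharp, 3))
```

In the bimodal case the best phase lies midway between the two choices and the figure of merit is correspondingly reduced.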
All statistics should be computed for a range of resolution. What is
normally true is that good statistics can be obtained at low resolution
and these become progressively poorer at high resolution. The final
overall values will depend on the resolution cutoff applied.
What do MIR maps look like?
Maps calculated from heavy atom phases can vary considerably in
quality, depending on the number of derivatives, the degree of
anomalous scattering and the resolution. Sometimes it may not even
be possible to see much beyond differentiation of protein and
solvent.
Nevertheless density modification can be used to improve these
maps further. Density modification involves systematic alteration to
the map in order to force it to obey certain constraints - the most
powerful of these are non-crystallographic symmetry averaging and
solvent flattening.
Lecture 5 - Density modification
Density modification (DM) is one of the most powerful phasing
techniques. Essentially it involves applying known constraints to
an electron density map in order to improve the phases.
The basic idea behind density modification is as follows
i) Calculate an (Fobs, αcalc) map. Call this ρcalc.
ii) Apply the density constraints to the map to generate a new map
ρcalc'.
iii) Calculate the Fourier transform of this map to generate a new set
of structure factors (F'calc, α'calc).
iv) Return to step (i) and calculate the map (Fobs, α'calc).
v) Repeat this until the procedure converges.
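The cycle can be sketched in Python on a one-dimensional toy "cell", with solvent flattening as the density constraint. Everything here is an assumption made for illustration: the 16-point density, the known solvent mask, the naive Fourier transform and the fixed number of cycles. Real programs also weight and combine the new phases with the starting ones rather than simply replacing them.

```python
import cmath
import math

def dft(values, sign):
    """Naive DFT; sign=-1 is the forward transform, sign=+1 the unnormalized inverse."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * h * x / n)
                for x, v in enumerate(values))
            for h in range(n)]

def dm_cycle(f_obs, phases, solvent_mask):
    """One density modification cycle; returns the new phases."""
    n = len(f_obs)
    # (i) map from observed amplitudes and current phases (real part kept)
    structure_factors = [cmath.rect(a, p) for a, p in zip(f_obs, phases)]
    rho = [v.real / n for v in dft(structure_factors, +1)]
    # (ii) constraint: set every solvent point to the mean solvent density
    solvent = [r for r, in_solvent in zip(rho, solvent_mask) if in_solvent]
    level = sum(solvent) / len(solvent)
    rho_mod = [level if in_solvent else r for r, in_solvent in zip(rho, solvent_mask)]
    # (iii) back-transform; (iv) keep only the new phases for the next map
    return [cmath.phase(v) for v in dft(rho_mod, -1)]

# Toy density: 8 "protein" points followed by 8 flat "solvent" points
true_rho = [3.0, 7.0, 2.0, 9.0, 4.0, 6.0, 8.0, 5.0] + [1.0] * 8
solvent_mask = [False] * 8 + [True] * 8
true_sf = dft(true_rho, -1)
f_obs = [abs(v) for v in true_sf]

# Start from deliberately perturbed phases and (v) repeat the cycle
phases = [cmath.phase(v) + 0.3 * math.sin(h) for h, v in enumerate(true_sf)]
for _ in range(20):
    phases = dm_cycle(f_obs, phases, solvent_mask)
```

A useful sanity check on any such implementation is that the correct phases are a fixed point: a map that already satisfies the constraint is returned unchanged.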
Density modification in practice
The algorithm should in general lead to an improved map, provided
that the density modification procedure is correct. However,
there is always a bias in the calculation towards the starting
phases, and much of the implementation of the procedure is
directed towards breaking this bias, particularly if the
constraints are relatively weak.
Solvent flattening
A particularly powerful algorithm is solvent flattening, wherein one seeks
to bring in the prior knowledge that a large percentage of the unit cell
consists of solvent, which is amorphous. As a consequence this will have
uniform electron density. If one can predetermine which portion of the
unit cell consists of solvent then one can modify the calculated electron
density so that this will indeed be the case. As a consequence, around
50% of the electron density to be determined is 'automatically known'.
The trick is how to determine the protein-solvent boundary. Various
algorithms exist, and in common implementations of solvent flattening
the molecular envelope itself is iteratively improved as the phases
improve. This technique is particularly powerful when the solvent
content is high, less powerful when it is low.
RESOLVE is a maximum likelihood solvent flattening and auto-building program.
Non crystallographic symmetry
averaging
This technique employs, in addition to solvent flattening, the
constraint that the asu may contain identical copies of the same
molecule. In this case, one has the constraint that the electron
density associated with these regions of the unit cell should be
identical, whilst all volume outside of the protomers is solvent and
should have uniform e.d.
This constraint is enormously powerful and in the case of high copy
number (e.g. viruses) can be used to improve enormously even
extremely poor starting phases. The technique can also be applied
to phase extension, a technique whereby one can progressively
generate higher resolution phases.
Histogram matching
This technique is not very powerful, but is easily implemented and can
lead to some phase improvement. The idea is that the electron
density distributions of all proteins are more or less identical. If
the electron density distribution of a calculated map does not
match that expected, histogram equalization techniques can be
applied to force the experimental e.d. distribution to match that
expected. Do not expect much gain in phase accuracy by this
technique.
Other DM techniques
Powerful high resolution techniques include skeletonization (wherein
one attempts to discern backbone and then force the backbone to
have a tube-like form) and atomicity constraints (wherein one
forces the distribution to have an atom-like appearance through the
implementation of a Sayre's equation constraint). These
techniques are not particularly useful for MIR as the resolution
of MIR maps is in general rather low, and if the resolution is high
then auto-building techniques can be employed instead.